Deepak's Algorithm

Introduction Deepak's algorithm is a generic idea. It applies to the following class of problems:

  You are given array Arr of length N and Q queries. Each query is represented by two numbers L and R, and it asks       you to compute some function Func with subarray Arr[L..R] as its argument.

For the sake of brevity we will denote Func([L, R]) as the value of Func on subarray Arr[L..R]. If this sounds too abstract, let’s look at specific example:

   There is an integer array Arr of length N and Q queries. For each i, query #i asks you to output the sum of            numbers on subarray [Li, Ri], i.e. Arr[Li] + Arr[Li + 1] + … + Arr[Ri].

Here we have Func([L, R]) = Arr[L] + Arr[L + 1] + ... + Arr[R].

This does not sound so scary, does it? You’ve probably heard of solutions to this problem using Segment Trees or Binary Indexed Trees, or even prefix sums.

Deepak's algorithm provides a way to answer all queries in O((N + Q) * sqrt(N) * F) time with at least O(Q) additional memory. Meaning of F is explained below.

The algorithm is applicable if all following conditions are met:

    Arr is not changed by queries;

    All queries are known beforehand (techniques requiring this property are often called “offline algorithms”);

    If we know Func([L, R]), then we can compute Func([L + 1, R]), Func([L - 1, R]), Func([L, R + 1]) and Func([L,          R - 1]), each in O(F) time.

Arr is not changed by queries; All queries are known beforehand (techniques requiring this property are often called “offline algorithms”); If we know Func([L, R]), then we can compute Func([L + 1, R]), Func([L - 1, R]), Func([L, R + 1]) and Func([L, R - 1]), each in O(F) time.

Basic overviewoverview We have Q queries to answer. Suppose we answer them in order they are asked in the following manner:

           for i = 0..Q-1:

           L, R = query #i

            for j = L..R:

           do some work to compute Func([L, R])

This can take Ω(N * Q) time. If N and Q are of order 105, then this would lead to time limit exceeded.

But what if we answer queries in different order? Can we do better then?

Definition #1: Segment [L, R] is a continuous subarray Arr[L..R], i.e. array formed by elements Arr[L], Arr[L + 1], …, Arr[R]. We call L left endpoint and R right endpoint of segment [L, R]. We say that index i belongs to segment [L, R] if L ≤ i ≤ R.

Notation

Throughout this tutorial “x⁄y” will mean “integer part of x divided by y”. For instance, 10⁄4 = 2, 15⁄3 = 5, 27⁄8 = 3; By “sqrt(x)” we will mean “largest integer less or equal to square root of x”. For example, sqrt(16) = 4, sqrt(39) = 6; Suppose a query asks to calculate Func([L, R]). We will denote this query as [L, R] - the same way as the respective argument to Func; Everything is 0-indexed. We will describe Mo’s algorithm, and then prove its running time.

This can take Ω(N * Q) time. If N and Q are of order 105, then this would lead to time limit exceeded.

But what if we answer queries in different order? Can we do better then?

Definition #1: Segment [L, R] is a continuous subarray Arr[L..R], i.e. array formed by elements Arr[L], Arr[L + 1], …, Arr[R]. We call L left endpoint and R right endpoint of segment [L, R]. We say that index i belongs to segment [L, R] if L ≤ i ≤ R.

Notation

Throughout this tutorial “x⁄y” will mean “integer part of x divided by y”. For instance, 10⁄4 = 2, 15⁄3 = 5, 27⁄8 = 3; By “sqrt(x)” we will mean “largest integer less or equal to square root of x”. For example, sqrt(16) = 4, sqrt(39) = 6; Suppose a query asks to calculate Func([L, R]). We will denote this query as [L, R] - the same way as the respective argument to Func; Everything is 0-indexed. We will describe Mo’s algorithm, and then prove its running time.

Time complexitycomplexity Let’s view Arr as a union of disjoint segments of size BLOCK_SIZE, which we will call “blocks”. Take a look at the picture for better understanding:

Suppose we’ve just started processing queries from Qr for some r, and we’ve already answered first (following Mo’s order) query from it. Let that query be [L, R0]. This means that now mo_right = R0.

Let R0, R1, ..., R|Qr| - 1 be right endpoints of queries from Qr, in order they appear in Mo’s order. From proposition #3 we know that

R0 ≤ R1 ≤ R2 ≤ … ≤ R|Qr| - 1

Changes to mo_right

From proposition #2 and definition of Qr we know that

r * BLOCK_SIZE ≤ L

Since right endpoint is not lower that left endpoint (i.e. L ≤ R), we conclude that

r * BLOCK_SIZE ≤ R0

We have N elements in Arr, so any query has right endpoint less than N. Therefore,

R|Qr| - 1 ≤ N - 1

Since R’s are not decreasing, the total amount of times mo_right changes is

R|Qr| - 1 - R0 ≤ N - 1 - r * BLOCK_SIZE = O(N)

(we’ve substituted R|Qr| - 1 with its highest possible value, and R0 with its lowest possible value to maximize the subtraction).

There are O(sqrt(N)) different values of r (proposition #1), so in total we will have O(N * sqrt(N)) changes, assuming that we’ve already started from the first query of each Qr.

Now suppose we’ve just ended processing queries from Qr and we must process first query from Qr + 1 (assuming that it is not empty. If it is, then choose next non-empty set). Currently, mo_right is constrained to be:

r * BLOCK_SIZE ≤ mo_right ≤ N - 1

Let the first query from Qr + 1 be [L', R']. We know (similarly to previous paragraph) that

(r + 1) * BLOCK_SIZE ≤ R' ≤ N - 1

Hence, mo_right must be changed at most

max(|r * BLOCK SIZE - (N - 1)|, N - 1 - (r + 1) * BLOCK SIZE)

times (we took lowest value of mo_right with highest value of R’ and vice-versa). This is clearly O(N) (and it does not matter whether it is r+1’th set of r+k’th for some k > 1).

There are O(sqrt(N)) switches from r to r + 1 (and it's true even if we skip some empty sets), so in total we will have O(N * sqrt(N)) mo_right changes to do this.

There are no more cases when mo_right changes. Overall, we have O(N * sqrt(N)) + O(N * sqrt(N)) = O(N * sqrt(N)) changes of mo_right.

Corollary #3: All mo_right changes combined take O(N * sqrt(N) * F) time (because each change is done in O(F) time).

Theorem #2:

Mo_left changes its value O(Q * sqrt(N)) times throughout the run of Mo’s algorithm.

Proof:

Suppose, as in the proof of Theorem #1, we’ve just started processing queries from Qr for some r, and currently mo_left = L, mo_right = R0, where [L, R0] ∈ Qr. For every query [L', R'] ∈ Qr by definition #2 and proposition #2 we have:

r * BLOCK_SIZE ≤ L' ≤ (r + 1) * BLOCK_SIZE - 1

So, when we change mo_left from one query to other (both queries ∈ Qr), we must do at most

(r + 1) * BLOCK_SIZE - 1 - r * BLOCK_SIZE = BLOCK_SIZE - 1 = O(sqrt(N))

changes to mo_left. At the picture below we represented leftpoints that lie in block #r, where the subscript shows relative order of queries:

There are |Qr| queries in Qr. That means, that we can estimate upper bound on number of changes for single r as O(|Qr| * sqrt(N)). Let’s sum it over all r:

O(sqrt(N) * (|Q0| + |Q1| + ... + |QK|)) = O(sqrt(N) * Q)

(because each query's leftpoint belongs to exactly one block, and we have Q queries in total).

Now suppose we’re done with queries from Qr and want to process queries from non-empty set Qr + k (for some k > 0). Any query [L', R'] ∈ Qr + k has

(r + k) * BLOCK_SIZE ≤ L' ≤ (r + k + 1) * BLOCK_SIZE - 1

Similarly, any query [L, R] ∈ Qr has

r * BLOCK_SIZE ≤ L ≤ (r + 1) * BLOCK_SIZE - 1

We can see that the maximum number of changes needed to transit from some query from Qr to some query from Qr + k is

(r + k + 1) * BLOCK_SIZE - 1 - r * BLOCK_SIZE = k * BLOCK_SIZE - 1

Now if we sum this over all r, we will get at most

K * BLOCK_SIZE = O(sqrt(N)) * O(sqrt(N)) = O(N),

because sum of all possible k’s does not exceed K.

There are no more cases when mo_left changes. Overall, we have O(sqrt(N) * Q) + O(N) = O(sqrt(N) * Q) changes of mo_left.

Corollary #4: All mo_left changes combined take O(Q * sqrt(N) * F) time (because each change is done in O(F) time).

Corollary #5: Time complexity of Mo’s algorithm is O((N + Q) * sqrt(N) * F).

Example problemproblem

       You have an array Arr of N numbers ranging from 0 to 99. Also you have Q queries [L, R]. For each such query            you must print

        V([L, R]) = ∑i=0..99 count(i)2 * i

        where count(i) is the number of times i occurs in Arr[L..R].

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
.gitignore		.gitignore
Deepak's Algorithm Explanation.cpp		Deepak's Algorithm Explanation.cpp
Example problem		Example problem
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Deepak's Algorithm

About

Uh oh!

Releases

Packages

Languages

License

deepak-411/Deepak-s-Algorithm

Folders and files

Latest commit

History

Repository files navigation

Deepak's Algorithm

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages