# Range Minimum Query and Lowest Common Ancestor in O(n)/O(1) Time: An Overview

We show that the Range Minimum Query (RMQ) problem and Lowest Common Ancestor (LCA) problem can both be solved in O(1) per query after O(n) preprocessing time. The method is to reduce the two problems to +1/-1 RMQ, and solve the +1/-1 RMQ problem in O(n)/O(1) time.

## 1. The Three Problems

Problem 1 (Range Minimum Query, RMQ): Given an integer array of size $n$: $a[1], …, a[n]$. A range minimum query $\text{RMQ}_a(l, r) = (\arg) \min_{l \leq i \leq r} a[i]$, returns the value or the position of the minimum element in subarray $a[l…r]$.

Problem 2 (Lowest Common Ancestor, LCA): Given a rooted tree $T$. A lowest common ancestor query $\text{LCA}_T(u, v)$ returns a lowest node that has both $u, v$ as descendants.

Problem 3 (+1/-1 RMQ): +1/-1 RMQ is a special case of RMQ, where the difference of two adjacent elements in the given array is either +1 or -1.

## 2. Equivalence Between the Three Problems

In this section, we show that the three problems are mutually reducible in linear time. Note that the reduction from +1/-1 RMQ to RMQ is immediate.

### 2.1 Reduction from RMQ to LCA

The reduction from RMQ to LCA is to convert the array into Cartesian tree. The Cartesian tree of an array is a binary tree with heap property, and the in-order traversal gives the original array. Note that the minimum element in $a[l…r]$ corresponds to the LCA of $l$ and $r$ in Cartesian tree, and vice versa.

We may convert the array into Cartesian tree in linear time. Just add the element one by one, and maintain a pointer to the rightmost element of the current tree. When a new element comes, moves the pointer up until the number of current node is less than the new element, or until reaching the root of the tree. Then insert a node here, and place the original child subtree as the left subtree of the new node. Define the potential as the depth of the pointee, so the amortized time is $O(1)$ per insertion, and the total time complexity is $O(n)$.

### 2.2 Reduction from LCA to +1/-1 RMQ

To reduce the LCA problem into +1/-1 RMQ problem, we first label each vertex its depth. Then we run depth first search, and output the depth of the current node before visiting any child node and after visiting every child node. The resulting sequence is the Euler tour traversal of the original tree. To compute the LCA of $u$ and $v$, just find any occurrence position of $u$ and $v$, and find the range minimum between them. Since the depth of two adjacent nodes in Euler tour differ by at most 1, this is a +1/-1 RMQ problem.

Note that the number of occurrences of each node is the number of its children, plus 1. The length of the resulting sequence is $2n-1$, which is still linear.

## 3. Solving RMQ in O(n log n)/O(1) Time via Sparse Table

To achieve O(1) query time, a naive solution is to precalculate all $O(n^2)$ queries, which takes too much preprocessing time. In fact, we do not need to preprocess so many answers. The sparse table technique only preprocesses the RMQ of intervals of length power of 2. To answer RMQ query $\text{RMQ}(l, r)$, just return $\min\{\text{RMQ}(l, l+2^k-1), \text{RMQ}(r-2^k+1, r)\}$, where $k$ is the maximum integer such that $2^k \leq r – l + 1$. This actually uses two intervals to cover the entire range; due to the idempotence of minimum operation, the overlapped part does not affect the answer. There are at most $O(\log n)$ powers of 2 not exceeding $n$, and for each power of 2 we have $O(n)$ intervals to preprocess, so the total preprocess time is $O(n \log n)$.

## 4. Solving +1/-1 RMQ in O(n)/O(1) Time via Indirection

Our final step is to shave off the log factor. We use the indirection technique to remove the log factor, but only for +1/-1 RMQ.

We split the original array $a$ into segments of length $\frac{1}{2} \log n$. For each segment, we replace it with the minimum value, and we obtain a new sequence of length $O(\frac{n}{\log n})$. Now every interval that spans multiple segments can generally be decomposed into several contiguous segments and two intervals entirely within some segment. Using sparse table technique we may answer the minimum value in some contiguous segments in $O(1)$ time, with $O(n)$ preprocessing time (the log factor cancels). The remaining part is to answer the RMQ for intervals within segments.

Note that the minimum operation distributes over addition: $\min\{a, b\} + c = \min\{a + c, b + c\}$. Hence, for RMQ problem, the first element of the array doesn’t matter; only the difference matters. How many essentially difference +1/-1 RMQ instances of length $\frac{1}{2}\log n$ are there? Only $O(\sqrt{n})$, since there are only so many different difference arrays. And, for each array of length $\frac{1}{2} \log n$, there are only $O(\log^2 n)$ different intervals. We may set up a lookup table for all intervals of all possible instances of size $\frac{1}{2} \log n$ in $O(\sqrt{n} \log^2 n)$ time. This is dominated by the sparse table part, so the total preprocessing time is still $O(n)$.

For queries that entirely lies in some segment, just read the corresponding value in the lookup table; otherwise, the interval is decomposed into several contiguous segments, which can be answered by sparse table in constant time, and two intervals within some segments, which can be answered by lookup table. The total time per query is therefore $O(1)$.

# [2017-2018 ACM-ICPC Latin American Regional Programming Contest] D. Daunting device

### Statement

Given an array of length $1 \leq L \leq 10^5$, initially filled with 1. Define cnt[i] as the number of elements in the array that are equal to i. You should perform no more than $10^5$ operations. Each operation is four integers $P, X, A, B$, and let $M_1 = (A + S^2) \bmod L, M_2 = (A + (S + B)^2) \bmod L$, where S = cnt[P], then assign all elements with indices in $[min(M_1, M_2), max(M_1, M_2)]$ with value X. After all operations performed, output the maximum cnt[i] over all possible values of i.

### Solution

The main idea of the solution is just to simulate the operations in a reasonable time complexity. In fact, various possible solutions exist, including segment tree and square root decomposition. However, the fastest solution (at least theoretically) I know is employing balanced binary search trees.

We use the nodes of the balanced binary search tree to represent contiguous intervals of the array with same values. Additionally, we maintain an array “cnt”, which has the meaning as described in problem statement.

When updating an interval [l, r) to value X, we delete all nodes representing the interval [l, r), updating the array “cnt”. Note that this may split the nodes that across the boundaries of given interval. Finally, we inserting a new node which represents [l, r) with value X, and updating “cnt”.

Each operation can be done in amortized $O(\log L)$ time, since in each operation, only constant number of nodes are inserted, though it is possible that a great number of nodes are deleted. Hence the total time complexity is $O(n \log L)$.

### Source Code

#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
using namespace std;
using namespace __gnu_pbds;

#ifdef __LOCAL_DEBUG__
# define _debug(fmt, ...) fprintf(stderr, "\033[94m%s: " fmt "\n\033[0m", \
__func__, ##__VA_ARGS__)
#else
# define _debug(...) ((void) 0)
#endif
#define rep(i, n) for (int i=0; i<(n); i++)
#define Rep(i, n) for (int i=1; i<=(n); i++)
#define range(x) (x).begin(), (x).end()
typedef long long LL;
typedef unsigned long long ULL;

int l, c, n;
int cnt[100005];

typedef tree<int, pair<int, int>, less<int>, rb_tree_tag> rbtree;

#define tree shuorgsrh
rbtree tree;

void update(int l, int r, int c) {
rbtree mpart, rpart;

tree.split(l - 1, mpart);
int dif;
if (tree.size() &&
(dif = tree.rbegin()->first + tree.rbegin()->second.first - l) > 0) {
tree.rbegin()->second.first -= dif;
mpart.insert(make_pair(l, make_pair(dif, tree.rbegin()->second.second)));
}
mpart.split(r, rpart);

if (mpart.size() &&
(dif = mpart.rbegin()->first + mpart.rbegin()->second.first - r) > 0) {
mpart.rbegin()->second.first -= dif;
rpart.insert(make_pair(r, make_pair(dif, mpart.rbegin()->second.second)));
}
for (auto& p : mpart) {
cnt[p.second.second] -= p.second.first;
}
cnt[c] += r - l;
tree.insert(make_pair(l, make_pair(r - l, c)));
tree.join(rpart);
}

int main() {
scanf("%d%d%d", &l, &c, &n);
cnt[1] = l;
tree.insert(make_pair(0, make_pair(l, 1)));
rep (i, n) {
int p, x, a, b;
scanf("%d%d%d%d", &p, &x, &a, &b);
int m1 = (a + 1ll * cnt[p] * cnt[p]) % l;
int m2 = (a + 1ll * (cnt[p] + b) * (cnt[p] + b)) % l;
if (m1 > m2) swap(m1, m2);
m2++;
update(m1, m2, x);
}
cout << *max_element(cnt+1, cnt+c+1) << endl;
return 0;
}