# Solving Sparse Linear System in O(nm) Time

There are many practical applications that involve solving sparse linear system $\mathbf{A}\mathbf {x} = \mathbf{b}$, such as random walk in sparse graphs. Traditional techniques such as Gauss elimination or LU decomposition applies for general linear systems, and they run in cubic time. Fastest known algorithms for general matrix run in about $O(n^{2.373})$ time. However, for sparse matrix, we may achieve a much faster result. Specifically, we may solve the system in $O(mn)$ time, where $m$ is the number of nonempty entries in $\mathbf{A}$.  We assume that the matrix $\mathbf{A}$ is non-singular (i.e., invertible) throughout this passage.

### 1. Inverting a matrix from its annealing polynomial

Given polynomial $p(x)$, if $p(\mathbf{A}) = \mathbf{0}$ for matrix $\mathbf{A}$, then we say $p(x)$ is an annealing polynomial for matrix $\mathbf{A}$. Due to Hamilton-Cayley theorem, the characteristic polynomial $\det( x\mathbf{I} – \mathbf{A} )$ is an annealing polynomial for $\mathbf{A}$. Among all annealing polynomials for matrix $\mathbf{A}$, the one with minimum degree is called the minimal polynomial for $\mathbf{A}$. It can be proved that the minimum polynomial for given matrix is unique up to a constant factor, and any other annealing polynomial is a polynomial multiple of the minimum polynomial.

We can invert a matrix $\mathbf{A}$ if we have a minimal polynomial for $\mathbf{A}$ :
$a_0 I + a_1 \mathbf{A} + a_2 \mathbf{A}^2 + \cdots + a_k \mathbf{A}^k = 0 \tag{*}$Since $\mathbf{A}$ is invertible, 0 is never a root of its characteristic polynomial, hence we must have $a_0 \neq 0$. Multiplying $\mathbf{A}^{-1}$ on both sides yields
$\mathbf{A}^{-1} = -\frac{a_1 I + a_2 \mathbf{A} + a_3 \mathbf{A}^2 + \cdots + a_k \mathbf{A}^{k-1}}{a_0}$this means that we may represent the inversion of $\mathbf{A}$ by the linear combination of powers of  $\mathbf{A}$.

### 2. The Berlekamp-Massey algorithm

Berlek-Massey algorithm solves the following problem in $O(n^2)$ time:

Given a finite sequence $\{x_i\}_{i=1}^n$, find a minimum order linear recurrence consistent with the given sequence. Formally, find a shortest sequence $c_0 = 1, c_1, \cdots, c_{k-1}$, such that $\sum_{l=0}^{k-1} x_{j-l} c_l = 0$ holds for all possible $j$.

This algorithm has many real world applications. The most typical one is to find the shortest linear feedback shift register for a given binary sequence. Also, it can be viewed as interpolating a sequence with exponential terms. One important fact for Berlekamp-Massey algorithm is, for an order-$r$ linearly recurrent sequence, taking the first $2r$ elements as the input of the algorithm suffices to recover the recurrence.

### 3. Finding the minimum polynomial

Note that the annealing polynomial is exactly the linear recurrence of powers of a matrix. However, it is infeasible to compute the minimum polynomial from the powers of $\mathbf{A}$. However, we may randomly pick vectors $\mathbf{u}$ and $\mathbf{v}$ and compute the minimum polynomial from $\mathbf{u}^T \mathbf{A}^i \mathbf{v}$. We claim without proof that, with high probability, the coefficients of the recurrence of the sequence are exactly those of the minimum polynomial. The sequence can be computed in $O(mn)$ time by iteratively doing sparse matrix-vector multiplication in $O(m)$ time. Finally, apply Berlekamp-Massey algorithm to the given sequence.

### 4. Solving the linear system

Since the inverse of a sparse matrix is generally not sparse, we won’t actually compute the inverse of $\mathbf{A}$.  Actually, we can compute $\mathbf{A}^{-1}\mathbf{b}$ via formula (*) in $O(mn)$ time. The procedure is exactly the same as in finding minimum polynomial: just iteratively perform spare matrix-vector multiplication.

# Nordic Collegiate Programming Contest 2018 [NCPC2018] 题解

unsolved

### B. Baby Bites

solved (0:08, +1)

solved (0:17)

solved (4:55)

solved (3:23)

unsolved

unsolved

solved (0:34)

solved (1:00 +1)

solved (1:57 +2)

solved (3:07)

# [Codeforces Round #513] H. Sophisticated Device

### 题目描述

1. ADD %dest, %src1, %src2：R[dest] = R[src1] + R[src2]
2. POW %d, %s：R[dest] = pow(R[src], d)

# [2017-2018 ACM-ICPC Latin American Regional Programming Contest] D. Daunting device

### Statement

Given an array of length $1 \leq L \leq 10^5$, initially filled with 1. Define cnt[i] as the number of elements in the array that are equal to i. You should perform no more than $10^5$ operations. Each operation is four integers $P, X, A, B$, and let $M_1 = (A + S^2) \bmod L, M_2 = (A + (S + B)^2) \bmod L$, where S = cnt[P], then assign all elements with indices in $[min(M_1, M_2), max(M_1, M_2)]$ with value X. After all operations performed, output the maximum cnt[i] over all possible values of i.

### Solution

The main idea of the solution is just to simulate the operations in a reasonable time complexity. In fact, various possible solutions exist, including segment tree and square root decomposition. However, the fastest solution (at least theoretically) I know is employing balanced binary search trees.

We use the nodes of the balanced binary search tree to represent contiguous intervals of the array with same values. Additionally, we maintain an array “cnt”, which has the meaning as described in problem statement.

When updating an interval [l, r) to value X, we delete all nodes representing the interval [l, r), updating the array “cnt”. Note that this may split the nodes that across the boundaries of given interval. Finally, we inserting a new node which represents [l, r) with value X, and updating “cnt”.

Each operation can be done in amortized $O(\log L)$ time, since in each operation, only constant number of nodes are inserted, though it is possible that a great number of nodes are deleted. Hence the total time complexity is $O(n \log L)$.

### Source Code

#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
using namespace std;
using namespace __gnu_pbds;

#ifdef __LOCAL_DEBUG__
# define _debug(fmt, ...) fprintf(stderr, "\033[94m%s: " fmt "\n\033[0m", \
__func__, ##__VA_ARGS__)
#else
# define _debug(...) ((void) 0)
#endif
#define rep(i, n) for (int i=0; i<(n); i++)
#define Rep(i, n) for (int i=1; i<=(n); i++)
#define range(x) (x).begin(), (x).end()
typedef long long LL;
typedef unsigned long long ULL;

int l, c, n;
int cnt[100005];

typedef tree<int, pair<int, int>, less<int>, rb_tree_tag> rbtree;

#define tree shuorgsrh
rbtree tree;

void update(int l, int r, int c) {
rbtree mpart, rpart;

tree.split(l - 1, mpart);
int dif;
if (tree.size() &&
(dif = tree.rbegin()->first + tree.rbegin()->second.first - l) > 0) {
tree.rbegin()->second.first -= dif;
mpart.insert(make_pair(l, make_pair(dif, tree.rbegin()->second.second)));
}
mpart.split(r, rpart);

if (mpart.size() &&
(dif = mpart.rbegin()->first + mpart.rbegin()->second.first - r) > 0) {
mpart.rbegin()->second.first -= dif;
rpart.insert(make_pair(r, make_pair(dif, mpart.rbegin()->second.second)));
}
for (auto& p : mpart) {
cnt[p.second.second] -= p.second.first;
}
cnt[c] += r - l;
tree.insert(make_pair(l, make_pair(r - l, c)));
tree.join(rpart);
}

int main() {
scanf("%d%d%d", &l, &c, &n);
cnt[1] = l;
tree.insert(make_pair(0, make_pair(l, 1)));
rep (i, n) {
int p, x, a, b;
scanf("%d%d%d%d", &p, &x, &a, &b);
int m1 = (a + 1ll * cnt[p] * cnt[p]) % l;
int m2 = (a + 1ll * (cnt[p] + b) * (cnt[p] + b)) % l;
if (m1 > m2) swap(m1, m2);
m2++;
update(m1, m2, x);
}
cout << *max_element(cnt+1, cnt+c+1) << endl;
return 0;
}


# [2017-2018 ACM-ICPC Latin American Regional Programming Contest] A. Arranging tiles

### 代码

#include <bits/stdc++.h>
using namespace std;

#define rep(i, n) for (int i = 0; i < (n); i++)
#define Rep(i, n) for (int i = 1; i <=(n); i++)
#define range(x) begin(x), end(x)

constexpr double EPS = 1e-8;

int fcmp(double x, double y = 0.0) {
if (fabs(x - y) < EPS) return 0;
return x < y ? -1 : 1;
}

typedef double T;
typedef struct pt {
T x, y;
T operator,(pt a) { return x * a.x + y * a.y; }  // inner product
T operator*(pt a) { return x * a.y - y * a.x; }  // outer product
pt operator+(pt a) { return {x + a.x, y + a.y}; }
pt operator-(pt a) { return {x - a.x, y - a.y}; }

pt operator*(T k) { return {x * k, y * k}; }
pt operator-() { return {-x, -y}; }
} vec;

typedef pair<pt, pt> seg;

int n;
int height;

struct polygon {
int width;
vector<pair<int, int>> lpart, rpart;

void init(vector<pair<int, int>>& pts) {
int minx = INT_MAX, maxx = INT_MIN;
for (auto& p : pts) {
height = max(height, p.second);
minx = min(minx, p.first);
maxx = max(maxx, p.first);
}
width = maxx - minx;
for (auto& p : pts) p.first -= minx;
int i = 1;
while (pts[i].second < height) rpart.push_back(pts[i++]);
rpart.push_back(pts[i++]);
while (i < pts.size()) lpart.push_back(pts[i++]);
lpart.push_back(pts[0]);
reverse(range(lpart));
}
} poly[16];

pt getMidPoint(pair<int, int> pt1, pair<int, int> pt2, double y) {
pt a { (double) pt1.first, (double) pt1.second },
b { (double) pt2.first, (double) pt2.second };
vec v = (b - a) * (1.0 / (b.y - a.y));
return a + v * (y - a.y);
}

double max_overlap(polygon& polyl, polygon& polyr) {
vector<pair<int, int>> &lpts = polyl.rpart, &rpts = polyr.lpart;
int ln = lpts.size(), rn = rpts.size(), yn;
vector<int> ys;
{
vector<int> yl(ln), yr(rn);
rep (i, ln) yl[i] = lpts[i].second;
rep (i, rn) yr[i] = rpts[i].second;
set_union(range(yl), range(yr), back_inserter(ys));
yn = ys.size();
}
int lptr = 0, rptr = 0;
double min_dif = 0.0;
rep (i, yn) {
pt lp = getMidPoint(lpts[lptr], lpts[lptr+1], ys[i]),
rp = getMidPoint(rpts[rptr], rpts[rptr+1], ys[i]);
min_dif = max(min_dif, lp.x - rp.x);
while (lptr < ln-1 && lpts[lptr+1].second <= ys[i]) lptr++;
while (rptr < rn-1 && rpts[rptr+1].second <= ys[i]) rptr++;
}
return polyl.width - min_dif;
}

double g[16][16];
double dp[1 << 14][16];

int main() {
scanf("%d", &n);
rep (i, n) {
int k; scanf("%d", &k);
vector<pair<int, int>> pts;
rep (j, k) {
int x, y; scanf("%d%d", &x, &y);
pts.emplace_back(x, y);
}
poly[i].init(pts);
}
rep (i, n) {
rep (j, n) {
if (i == j) continue;
g[i][j] = max_overlap(poly[i], poly[j]);
}
}
while (rem) {
unsigned bit = rem & -rem;
int i = __builtin_ctz(bit);
rep (j, n) {
if (j == i || ((1 << j) & mask) == 0) continue;
}
rem -= bit;
}
} else {