Graph Coloring

We need some definitions and theorems to discuss about the coloring.

Definitions

A coloring of graph $G$ is an assignment of colors to its vertices.
A $k$-coloring assigns each vertex to exactly one of $k$ colors.
A coloring is proper if no two adjacent vertices share the same color.
A graph is $k$-colorable if it has a proper $k$-coloring.
We write $g(n) = \tilde{O}(f(n))$ if there exists a constant $c \ge 0$ such that $g(n) = O(f(n) \ln^c n)$, ignoring logarithmic factors.

A semicoloring is a coloring where at most $\frac{	V	}{4}$ edges have both endpoints assigned the same color.
A semicoloring need not be proper; it tolerates at most $\frac{	V	}{4}$ monochromatic edges.

Theorems

2-colorability can be decided in polynomial time by BFS/DFS: check whether any adjacent pair is assigned the same color.
Deciding 3-colorability is NP-hard.
Deciding whether a graph is 3-colorable or requires at least 5 colors is NP-hard (i.e., there are no 4-colorable inputs in the hard instances).
Finding a 3-coloring of a 3-colorable graph is NP-hard (since an efficient algorithm would solve 3-colorability).
A graph with maximum degree $\Delta$ can be $(\Delta + 1)$-colored in polynomial time: each vertex always has at least one available color since it has at most $\Delta$ neighbors.
A 3-colorable graph can be $O(\sqrt{n})$-colored in polynomial time.
There is a randomized polynomial-time algorithm that finds a semicoloring with $4\Delta^{\log_3 2}$ colors with probability at least $\frac{1}{2}$, where $\Delta$ is the maximum degree.
There is a randomized polynomial-time algorithm that finds a $\tilde{O}(n^{\log_6 2})$-coloring of a given 3-colorable graph with probability at least $\frac{1}{2}$.

Proof of theorems

Theorem 6 can be done by following algorithm.

$\operatorname{while} \exists$ a vertex $v$ with at least degree $\sqrt{n}$

Choose three new colors $c_1, c_2, c_3$
Color $v$ with $c_1$
Color neighbors of $v$ with $c_2, c_3$
Remove colored vertices

Color the rest of graph with $\sqrt{n}$ new colors

Since the given graph is 3-colorable, the neighbors of $v$ form a 2-colorable subgraph (because $v$ occupies $c_1$, leaving only 2 colors for its neighbors). 2-colorability can be checked and colored in polynomial time. Each iteration uses 3 fresh colors, so no color conflicts arise with other iterations. Finally, the remaining graph after removing all high-degree vertices has maximum degree at most $\sqrt{n}$, requiring at most $\sqrt{n}$ colors. The number of iterations $I$ satisfies $\sqrt{n} \cdot I \le n$, so $I \le \sqrt{n}$. The total number of colors used is at most $3\sqrt{n} + \sqrt{n} = 4\sqrt{n} = O(\sqrt{n})$, and the algorithm runs in polynomial time.

Theorem 7 can be done by following algorithm.

Consider the following vector program for given graph $G$ $=$ $(V,E)$. Let $n$ $=$ $\left\vert V \right\vert$ Minimize $\lambda$
such that $v_i \cdot v_j \le \lambda$ for all $(i, j) \in E$
$v_i \cdot v_i = 1$ for all $i \in V$
$v_i \in \mathcal{R}^n$ for all $i \in V$

Then, there is a feasible solution such that $v_i \cdot v_j$ $=$ $-\frac{1}{2}$ for all $(i,j) \in E$ if given graph is a 3-colorable graph. Proof is like follow.

Consider an equilateral triangle inscribed in the intersection of the unit hypersphere and a hyperplane containing the origin. Fix a 3-coloring and assign one of the three vertices of the triangle as the vector of each vetex in $V$ so that vertices are assigned the same vector iff they are assigned the same color. Notice that all $v_i$ will be on this hypersphere because “$v_i \cdot v_i = 1$ for all $i \in V$”. Now, we have $v_i \cdot v_j$ $=$ $\left\vert v_i \right\vert \left\vert v_j \right\vert \cos (\frac{2\pi}{3})$ $=$ $-\frac{1}{2}$. Notice that angle is $\frac{2\pi}{3}$ because it’s an equilateral triangle.

Then, we can design an algorithm like follow.

Solve algorithm above to obtain optimal solution $v_1, \cdots, v_n$
Choose vector $r_1, \cdots, r_t$ independently and uniformly at random from the unit hypersphere where $t = 2 + \log_3 \Delta$.
Divides the hyper sphere into $2^t$ different regions by half space $H_i = \{x | x \cdot r_i \ge 0\}$ for $0$ $\le$ $i$ $\le$ $t$
Assign a distinct color for all vectices in each resign.

Polynomial running time of semidefinite program/random picking will be updated later. Then, all other process will be in the polynomial time. Notice that this algorithm uses $2^t$ $=$ $2^{2 + \log_3 \Delta}$ $=$ $4 \times 2^{\log_3 \Delta}$ $=$ $4 {\Delta}^{\log_3 2}$ colors. Therefore, it’s enough to show that this makes a semicoloring with $\frac{1}{2}$ probability.

Now, let’s think about edge $e$ $=$ $(v_i, v_j)$. With this, consider some $H_k$ and $r^{\star}_k$. Which $r^{\star}_k$ is the projection of $r_k$ on to the plane defined by $v_i, v_j, 0$. Then, $v_i \cdot r_k$ $=$ $v_i \cdot r^{\star}_k$ and $v_i \cdot r_k$ $=$ $v_i \cdot r^{\star}_k$. Notice that $r_k$ can be decomposed to $r^{\star}_k$ and $r^{\circ}_k$. Then, $v_i \cdot r_k$ $=$ $v_i \cdot (r^{\star}_k + r^{\circ}_k)$ $=$ $v_i \cdot r^{\star}_k$ $+$ $v_i \cdot r^{\circ}_k$ $=$ $v_i \cdot r^{\star}_k$. This works in the same way for $v_j$ either. Now, $Pr[\text{Both } v_i \text{ and } v_j \text{ are in the } H_k]$ $=$ $Pr[\text{Both } v_i \cdot r_k \text{ and } v_j \cdot r_k \text{ has the same sign}]$ $=$ $Pr[\text{Both } v_i \cdot r^{\star}_k \text{ and } v_j \cdot r^{\star}_k \text{ has the same sign}]$

Then, $Pr[\text{Both } v_i \cdot r^{\star}_k \text{ and } v_j \cdot r^{\star}_k \text{ has the same sign}]$ $=$ $\frac{\pi - \theta}{\pi}$ which $\theta$ is the angle between $v_i$ and $v_j$. Proof is like follow. Let’s denote $\theta_i, \theta_j, \theta_k$ as the angle of $v_i, v_j, r^{\star}_k$. Then, there are two cases for $\theta_k$ which both $v_i \cdot r^{\star}_k$ and $v_j \cdot r^{\star}_k$ are positive or negative. With this two cases, there are 2 cases per each case for each of them from criteria “$\theta_i = \theta_j + \pi$” to choose the range of possible angle. Therefore, there are 4 cases in total.

$\theta_j - \frac{\pi}{2}$ $\le$ $\theta_k$ $\le$ $\theta_i + \frac{\pi}{2}$ if $\theta_i$ $\le$ $\theta_j$ $\le$ $\theta_i$ $+$ $\pi$ and both $v_i \cdot r^{\star}_k$ and $v_j \cdot r^{\star}_k$ are positive.
$\theta_j + \frac{\pi}{2}$ $\le$ $\theta_k$ $\le$ $\theta_i + \frac{3\pi}{2}$ if $\theta_i$ $\le$ $\theta_j$ $\le$ $\theta_i$ $+$ $\pi$ and both $v_i \cdot r^{\star}_k$ and $v_j \cdot r^{\star}_k$ are negative.
$\theta_i - \frac{\pi}{2}$ $\le$ $\theta_k$ $\le$ $\theta_j + \frac{\pi}{2}$ if $\theta_i$ $-$ $\pi$ $\le$ $\theta_j$ $\le$ $\theta_i$ and both $v_i \cdot r^{\star}_k$ and $v_j \cdot r^{\star}_k$ are positive.
$\theta_i + \frac{\pi}{2}$ $\le$ $\theta_k$ $\le$ $\theta_j + \frac{3\pi}{2}$ if $\theta_i$ $-$ $\pi$ $\le$ $\theta_j$ $\le$ $\theta_i$ and both $v_i \cdot r^{\star}_k$ and $v_j \cdot r^{\star}_k$ are negative.

Then, we can know claim holds in each cases.

$2\pi Pr[\text{Both } v_i \cdot r^{\star}_k \text{ and } v_j \cdot r^{\star}_k \text{ has the same sign}]$ $=$ $(\theta_i + \frac{\pi}{2})$ $-$ $(\theta_j - \frac{\pi}{2})$ $+$ $(\theta_i + \frac{3\pi}{2})$ - $(\theta_j + \frac{\pi}{2})$ $=$ $2(\theta_i - \theta_j)$ $+$ $2\pi$ $=$ $2(\pi + \theta_i - \theta_j)$ $=$ $2(\pi - (\theta_j - \theta_i))$ if $\theta_i$ $\le$ $\theta_j$ $\le$ $\theta_i$ $+$ $\pi$. As a result, $Pr[\text{Both } v_i \cdot r^{\star}_k \text{ and } v_j \cdot r^{\star}_k \text{ has the same sign}]$ $=$ $\frac{\pi - (\theta_j - \theta_i)}{\pi}$. Therefore claim holds in this case.
$2\pi Pr[\text{Both } v_i \cdot r^{\star}_k \text{ and } v_j \cdot r^{\star}_k \text{ has the same sign}]$ $=$ $(\theta_j + \frac{\pi}{2})$ $-$ $(\theta_i - \frac{\pi}{2})$ $+$ $(\theta_j + \frac{3\pi}{2})$ $-$ $(\theta_i + \frac{\pi}{2})$ $=$ $2(\theta_j - \theta_i)$ $+$ $2\pi$ $=$ $2(\pi + \theta_j - \theta_i)$ $=$ $2(\pi - (\theta_i - \theta_j))$ if $\theta_i$ $-$ $\pi$ $\le$ $\theta_j$ $\le$ $\theta_i$. As a result, $Pr[\text{Both } v_i \cdot r^{\star}_k \text{ and } v_j \cdot r^{\star}_k \text{ has the same sign}]$ $=$ $\frac{\pi - (\theta_i - \theta_j)}{\pi}$. Therefore claim holds in this case.

Notice that $Pr[\text{Both } v_i \text{ and } v_j \text{ are in the } H_k]$ $=$ $\frac{\pi - \theta}{\pi}$ $=$ $1$ $-$ $\frac{\theta}{\pi}$ $=$ $1$ $-$ $\frac{\arccos(v_i \cdot v_j)}{\pi}$ $=$ $1$ $-$ $\frac{1}{\pi}\arccos(v_i \cdot v_j)$.

Therefore, $Pr[\text{Both } i \text{ and } j \text{ assigned to the same color}]$ $=$ $Pr[\text{Both } v_i \text{ and } v_j \text{ are in the same reigon for all half space}]$ $=$ $(1 - \frac{1}{\pi}\arccos(v_i \cdot v_j))^t$. From the semidefinite program and the proof above, two facts hold.

$v_i \cdot v_j \le \lambda$ for all $(i, j) \in E$.
There is a feasible solution with $v_i \cdot v_j = -\frac{1}{2}$ for all $(i, j) \in E$.

Let $\lambda^\star$ denote the optimal value of the semidefinite program. Since the program minimizes $\lambda$, we have $\lambda^\star \le -\frac{1}{2}$.

Therefore, $Pr[\text{Both } i \text{ and } j \text{ assigned to the same color}]$ $=$ $(1 - \frac{1}{\pi}\arccos(v_i \cdot v_j))^t$ $\le$ $(1 - \frac{1}{\pi}\arccos({\lambda}^{\star}))^t$ $\le$ $(1 - \frac{1}{\pi}\arccos(-\frac{1}{2}))^t$ $=$ $(1 - \frac{1}{\pi}\frac{2\pi}{3})^t$ $=$ $(1 - \frac{2}{3})^t$ $=$ $(\frac{1}{3})^{2 + \log_3 \Delta}$ $=$ $\frac{1}{9\Delta}$.

Notice that $\arccos x$ is a monotonic decreasing function for $-1$ $\le$ $x$ $\le$ $1$. Therefore, $1$ $-$ $\frac{1}{\pi}\arccos x$ is a monotonic increasing function.

Now, sum of degree is $2\left\vert E \right\vert$ and possible maximum sum of degree is $n\Delta$. Therefore, $2\left\vert E \right\vert$ $\le$ $n\Delta$ which means $\frac{\left\vert E \right\vert}{\Delta}$ $\le$ $\frac{n}{2}$. As a result, expected number of edges whose endpoints are colred the same is at most $\left\vert E \right\vert \frac{1}{9\Delta}$ $=$ $\frac{\left\vert E \right\vert}{9\Delta}$ $\le$ $\frac{n}{18}$.

By Markov’s inequality, $\Pr[X \ge \frac{n}{4}] \le \frac{E[X]}{n/4} = \frac{4}{n}E[X] \le \frac{4}{n} \cdot \frac{n}{18} = \frac{2}{9} \le \frac{1}{2}$, where $X$ is the number of monochromatic edges. Therefore the claim holds.

Theorem 8 can be done by following algorithm. Let $\sigma = n^{\log_6 3}$.

Let's call the below part as "part 1"

$\operatorname{while} \exists$ a vertex $v$ with degree $\ge$ $\sigma$

Color $v$ and it's neighbors using three new colors
Remove the colored vertices

Let's call the below part as "part 2"

$\operatorname{while}$ the graph is not empty, repeat at most $\log_2 n$ times

Try running Algorithm at theorem 7 with $[\log_2(\log_2 n)] + 1$ times to find a semicoloring
$R \leftarrow$ endpoints of edges whose endpoints are colored the same
Color $V - R$ according to the semicoloring, using new colors
Remove the colored vetices

Color the remaining vertices with distinct new colors

Now, it is trivial that this algorithm runs in a polynomial time. Therefore, it is enough to show it returns $\tilde{O}(n^{\log_6 2})$-coloring at least $\frac{1}{2}$ probability. Now, let’s think about part 1. Then, maximum number of iteration is at most $[\frac{n}{\sigma}] + 1$ because it remvoes at least $\sigma$ vertices at once. Therefore, color used at part 1 is at most $3([\frac{n}{\sigma}] + 1)$. Now, let’s think about part 2. Left part of graph has at most $\sigma$ degree because we removed every possible vertices with more than degree $\sigma$. Now, let’s denote the maximum of degree after part 1 as $\Delta$ and the number of vertices as $n’$. If we see each iteration, it fails to find a semicoloring with $(\frac{1}{2})^{[\log_2(\log_2 n)] + 1}$ probability. Which is $(\frac{1}{2})^{[\log_2(\log_2 n)] + 1}$ $=$ $(2)^{-[\log_2(\log_2 n)] - 1}$ $\le$ $(2)^{-\log_2(\log_2 n) - 1}$ $=$ $(2)^{-\log_2(\log_2 n)}\frac{1}{2}$ $=$ $(2)^{\log_2(\log_2 n)^{-1}}\frac{1}{2}$ $=$ $(\log_2 n)^{-1}\frac{1}{2}$ $=$ $\frac{1}{2\log_2 n}$. Therefore, it fails at most $\frac{1}{2\log_2 n}$ probability. As a result, this algorithm success to find a semicoloring in every iteration at least $(1 - \frac{1}{2\log_2 n})^{\log_2 n}$ $\ge$ $\frac{1}{2}$ probability if $n$ $\ge$ $2$.

Proof is like follow. Notice that if we change $log_2 x$ to $t$ then $(1 - \frac{1}{2\log_2 n})^{\log_2 n}$ $=$ $(1 - \frac{1}{2t})^{t}$. Now it can be derivated easily about $t$ and result will be $(1 - \frac{1}{2t})(\ln(1 - \frac{1}{2t}) + \frac{1}{2t - 1})$. Notice that $t \ge 1$. Then, $1 - \frac{1}{2t} \ge 0$ because $t \ge 1$. For the other part, $\ln(1 - \frac{1}{2t})$ $+$ $\frac{1}{2t - 1}$ $=$ $\ln(\frac{2t - 1}{2t})$ $+$ $\frac{1}{2t - 1}$ $=$ $\ln(\frac{2t - 1}{2t})$ $+$ $\frac{2t}{2t - 1}$ $-$ $\frac{2t - 1}{2t - 1}$ $=$ $\ln(\frac{2t - 1}{2t})$ $+$ $\frac{2t}{2t - 1}$ $-$ $1$ $=$ $\ln X$ $+$ $\frac{1}{X}$ $-$ $1$. for $X$ $=$ $\frac{2t -1}{2t}$. Notice that $0$ $<$ $X$ $<$ $1$. Then, $\ln X$ $+$ $\frac{1}{X}$ $-$ $1$ $\ge$ $0$. Proof is like follow. If we derivated $y$ $=$ $\ln X$ $+$ $\frac{1}{X}$ $-$ $1$, $y’$ $=$ $\frac{1}{X}$ $-$ $\frac{1}{X^2}$ $=$ $\frac{X - 1}{X^2}$. As a result, $y$ decreases in $0$ $<$ $X$ $<$ $1$. However, it’s zero when $X$ $=$ $1$. Therefore, $\ln X$ $+$ $\frac{1}{X}$ $-$ $1$ $\ge$ $0$. As a result, this algorithm success to find a semicoloring in every iteration at least $\frac{1}{2}$ probability if $n$ $\ge$ $2$.

Now, it doesn’t have too much meaning if you think about $n$ $=$ $1$. Therefore, assumption isn’t too tight.

Now, let’s see how many vertices are removed at each iteration. If we start with $n’$ vertices, $\left\vert R \right\vert$ $\le$ $2 \frac{n’}{4}$ $=$ $\frac{n’}{2}$. As a result, it will reduce to be half at least. Therefore, there will be at most $O(1)$ vertices after part 2. Notice that we will do part 2 at most $log_2 n$ time and it will be enough to be so.

As a result color used in each part is like follow.

Part 1 uses at most $3([\frac{n}{\sigma}] + 1)$ $=$ $3([\frac{n}{n^{\log_6 3}}] + 1)$ $\le$ $3(\frac{n}{n^{\log_6 3}} + 1)$ $=$ $3(\frac{n^{\log_6 6}}{n^{\log_6 3}} + 1)$ $=$ $3(n^{\log_6 2} + 1)$ colors.
Part 2 uses at most $4\Delta^{\log_3 2}(\log_2 n)$ $\le$ $4\sigma^{\log_3 2}(\log_2 n)$ $=$ $4(n^{\log_6 3})^{\log_3 2}(\log_2 n)$ $=$ $4n^{\log_6 3 \cdot \log_3 2}(\log_2 n)$ $=$ $4n^{\log_6 2}(\log_2 n)$ colors.
Left part will use at most $O(1)$ colors.

As a result, algorithm uses at most $3n^{\log_6 2}$ $+$ $3$ $+$ $4n^{\log_6 2}(\log_2 n)$ $+$ $O(1)$. Therefore, claim holds.