Interval Scheduling
Problem Definition
The lecture hall scheduling problem: accept as many non-overlapping requests as possible.
Problem 1. Given $n$ intervals $1, \ldots, n$, find the maximum subset of mutually compatible intervals.
- $s(i)$, $f(i)$: start and finish times of interval $i$ ($s(i) < f(i)$)
- Intervals $i$ and $j$ are compatible if $f(i) \leq s(j)$ or $f(j) \leq s(i)$
Incorrect Attempt
1 | |
Counterexample: Choosing the interval with the fewest conflicts first can fail to find the optimal solution.
Correct Algorithm: Select Earliest-Finishing Interval
Intuition: Selecting the interval with the earliest finish time leaves the room free “as soon as possible.”
1 | |
Observation 1. The algorithm returns a set of compatible intervals (each selected interval’s conflicts are immediately removed).
Correctness Proof (Stay-Ahead Argument)
Notation:
- Algorithm’s solution: $\text{sol} = \lbrace i_1, \ldots, i_k\rbrace$ ($i_\ell$ selected in iteration $\ell$)
- Optimal solution: $O = \lbrace j_1, \ldots, j_m\rbrace$ ($f(j_1) \leq \cdots \leq f(j_m)$)
Lemma 1. For all $1 \leq r \leq m$: $k \geq r$ and $f(i_r) \leq f(j_r)$.
Proof. By induction on $r$.
-
Base case ($r=1$): The algorithm selects the earliest finish time, so $k \geq 1$ and $f(i_1) \leq f(j)$ for all $j \in I$.
-
Inductive step ($r \geq 2$): By hypothesis, $f(i_{r-1}) \leq f(j_{r-1})$. Since $O$ is compatible, $f(j_{r-1}) \leq s(j_r)$. Therefore:
So $j_r$ is not removed before iteration $r$, meaning the $r$-th iteration exists and $f(i_r) \leq f(j_r)$. ∎
Theorem 1. The algorithm is correct.
Proof. By Lemma 1, $k \geq m$. Since the algorithm returns a compatible set, $k \leq m$. Therefore $k = m$. ∎
Theorem 2. Time complexity: $O(n \log n)$ (sort by finish time, then scan once).
Certifying Algorithm
We can output a certificate of optimality alongside the solution.
Observation 2. If $J \subseteq I$ is compatible, then $\text{OPT}(I) \geq \lvert J\rvert$.
Lemma 2. If $I_1, \ldots, I_k$ partition $I$ and every pair of intervals within each $I_i$ overlaps, then $\text{OPT}(I) \leq k$.
Proof. For the optimal solution $O$, $\lvert I_i \cap O\rvert \leq 1$ for each $I_i$. Therefore $\text{OPT}(I) = \lvert O\rvert \leq k$. ∎
1 | |
This algorithm outputs sol together with $C_1, \ldots, C_{k^}$ as the optimality certificate. In each $C_k$, all intervals contain $f(i_k)$ and therefore mutually overlap. By Lemma 2, $\text{OPT}(I) \leq k^$, and since the algorithm returns $k^*$ compatible intervals, the solution is optimal.
Characteristics of Greedy Algorithms
The defining characteristic of a greedy algorithm: it makes a myopic choice at each step. While combinatorial choices are often coupled, myopic choices under a carefully selected criterion can yield an optimal algorithm for certain problems.
Minimizing Maximum Lateness
Problem Definition
Problem 2. Given $n$ jobs with processing times $p_i$ and deadlines $d_i$, find a schedule that minimizes the maximum lateness $L := \max_{1 \leq i \leq n} \ell(i)$.
Definitions:
- $s(i)$: start time of job $i$, $f(i) := s(i) + p_i$: completion time
- $\ell(i) := \max(f(i) - d_i,\ 0)$: lateness of job $i$
Failed Approaches
Shortest job first: $p_1=1, d_1=100$, $p_2=2, d_2=2$ → doing job 1 first gives $L=1$, optimally job 2 first gives $L=0$.
Smallest slack first ($d_i - p_i$): $p_1=1, d_1=2$, $p_2=10, d_2=10$ → counterexample exists.
EDD (Earliest Due Date) Rule
1 | |
Correctness Proof
Definition 1. A schedule has idle time if the machine is empty at some time $t$ but has a job scheduled at some $t’ > t$.
Lemma 3. There exists an optimal schedule with no idle time. (Compressing idle time does not increase maximum lateness.)
Definition 2. An inversion $(i,j)$ occurs if job $i$ is scheduled before $j$ but $d_i > d_j$.
Lemma 4. There exists an optimal schedule with no inversions and no idle time.
Proof sketch. If an inversion $(i,j)$ exists, there is an adjacent inversion pair. Swapping them:
- Jobs other than $i, j$: completion time unchanged.
- $j$: $\bar{s}(j) = s(i) < s(j)$ so $\bar{\ell}(j) \leq \ell(j) \leq L$.
- $i$: $\bar{\ell}(i) = \max\lbrace 0, f(j) - d_i\rbrace \leq \max\lbrace 0, f(j) - d_j\rbrace = \ell(j) \leq L$ (since $d_i > d_j$).
Each swap reduces inversions by 1. After at most $\binom{n}{2}$ swaps, we reach an inversion-free schedule. ∎
Lemma 5. All schedules with no inversions and no idle time have the same maximum lateness.
Proof. In any inversion-free schedule, jobs with the same deadline are scheduled consecutively after all earlier-deadline jobs. The last completion time among jobs with deadline $d$ is independent of their internal order, so the maximum lateness is the same. ∎
Theorem 3. The EDD algorithm is correct. (Follows from Lemmas 3, 4, and 5.)
Time complexity: $O(n \log n)$ (dominated by sorting).