Fundamentals Deep Dive

Worklist Algorithms

The engine that drives every dataflow analysis to its fixed point

Topics: Round-Robin · Worklist Queue · Reverse Postorder · Convergence

The Problem: Fixed Points on CFGs

We need to compute IN/OUT sets for every block in a CFG. The challenge: blocks depend on each other, especially in loops.

The dependency problem:

• OUT[B2] depends on IN[B2]
• IN[B2] = OUT[B1] ⊔ OUT[B4]
• OUT[B4] depends on IN[B4]
• IN[B4] = OUT[B3]
• OUT[B3] depends on IN[B3]
• IN[B3] = OUT[B2] ← circular!
Analogy: Imagine a spreadsheet where cell A1 references B1, B1 references C1, and C1 references A1. You can't compute any cell in one pass — you need to iterate until all cells stabilize.
Solution: Start with ⊥ everywhere, then iteratively recompute until nothing changes. The question is: in what order do we process blocks?

Naive Round-Robin Iteration

The simplest approach: scan all blocks every round. Stop when no OUT set changes. Simple but wasteful.

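(* Assumes helpers from the surrounding framework: init_all_to_bottom,
   merge, preds, transfer, and a mutable array out indexed by block. *)
let round_robin cfg =
  init_all_to_bottom cfg;
  let changed = ref true in
  while !changed do
    changed := false;
    List.iter (fun b ->
      let new_in = merge (preds b) in
      let new_out = transfer b new_in in
      (* begin/end keeps both updates under the condition *)
      if new_out <> out.(b) then begin
        changed := true;
        out.(b) <- new_out
      end
    ) cfg.blocks
  done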
Wasted work: In each round, we recompute blocks whose inputs didn't change. If only B3's output changed, why reprocess B1 and B2?
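To make the cost concrete, here is a minimal runnable sketch of round-robin iteration for reaching definitions. The four-block loop (B1 → B2 → B3 → B4 → B2), the gen/kill values, and the visit counters are illustrative assumptions rather than course code; definition sets are encoded as integer bitmasks so the example stays self-contained.

```ocaml
(* Hypothetical CFG, blocks B1..B4 at indices 0..3:
   B1 -> B2 -> B3 -> B4, plus the back edge B4 -> B2. *)
let preds = [| []; [0; 3]; [1]; [2] |]

(* Definition sets as bitmasks: bit i stands for definition d(i+1).
   Assumed program: B1: x = 5 (d1), B2: y = x + 1 (d2),
   B3: x = y (d3), B4: z = x (d4). d1 and d3 kill each other. *)
let gen  = [| 0b0001; 0b0010; 0b0100; 0b1000 |]
let kill = [| 0b0100; 0b0000; 0b0001; 0b0000 |]

(* Join of the predecessors' OUT sets; set union is bitwise or. *)
let merge out ps = List.fold_left (fun acc p -> acc lor out.(p)) 0 ps

(* Reaching definitions: OUT = gen + (IN - kill). *)
let transfer b in_set = gen.(b) lor (in_set land (lnot kill.(b)))

(* Render a bitmask as a list of definition names. *)
let show s =
  List.filter (fun i -> s land (1 lsl i) <> 0) [0; 1; 2; 3]
  |> List.map (fun i -> Printf.sprintf "d%d" (i + 1))
  |> String.concat ","

(* Returns the final OUT array, the number of rounds, and block visits. *)
let round_robin () =
  let n = Array.length preds in
  let out = Array.make n 0 in
  let rounds = ref 0 and visits = ref 0 in
  let changed = ref true in
  while !changed do
    changed := false;
    incr rounds;
    for b = 0 to n - 1 do
      incr visits;
      let new_out = transfer b (merge out preds.(b)) in
      if new_out <> out.(b) then begin
        changed := true;
        out.(b) <- new_out
      end
    done
  done;
  (out, !rounds, !visits)

let () =
  let out, rounds, visits = round_robin () in
  Array.iteri (fun b s -> Printf.printf "OUT[B%d] = {%s}\n" (b + 1) (show s)) out;
  Printf.printf "rounds = %d, block visits = %d\n" rounds visits
```

On this CFG the loop runs three rounds and visits 12 blocks, even though the third round changes nothing.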

The Worklist Idea

Instead of scanning all blocks, maintain a queue of "dirty" blocks — blocks whose inputs may have changed. Only process what's needed.

Round-Robin
  • Process ALL blocks every round
  • Many blocks unchanged = wasted
  • O(n) block visits per round, up to h × n rounds in the worst case
  • Simple to implement
Worklist
  • Process only "dirty" blocks
  • Skip stable blocks entirely
  • O(h × |E|) block visits in total
  • Slightly more complex
Analogy: Round-robin is like a teacher grading ALL exams every day. Worklist is like only re-grading exams that students resubmitted. Same final grades, much less work.
Core invariant: If any predecessor's OUT set has changed since a block was last processed, that block is on the worklist. (The converse need not hold: initially every block is on the worklist.) When the worklist is empty, no block has a stale input, so we've reached the fixed point.

Worklist Algorithm in Code

Initialize worklist with all blocks. Pop a block, process it, and if its OUT changed, add its successors to the worklist.

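(* Assumes the same helpers as round_robin: init_all_to_bottom, merge,
   preds, succs, transfer, and a mutable array out indexed by block. *)
let worklist_solve cfg =
  init_all_to_bottom cfg;
  let wl = Queue.create () in
  (* Queue.push takes the element first, then the queue *)
  List.iter (fun b -> Queue.push b wl) cfg.blocks;
  while not (Queue.is_empty wl) do
    let b = Queue.pop wl in
    let new_in = merge (preds b) in
    let new_out = transfer b new_in in
    if new_out <> out.(b) then begin
      out.(b) <- new_out;
      (* OUT changed: every successor has a new input *)
      List.iter (fun s -> Queue.push s wl) (succs b)
    end
  done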
Key differences from round-robin:
  1. The Queue.pop call: pop one block (not iterate all)
  2. The if test on new_out: only act if OUT actually changed
  3. The push over succs b: only add successors (the blocks affected by this change)
  4. The while condition: empty worklist = fixed point (no more dirty blocks)
Why add successors? If OUT[B] changed, then any block C where B → C has a new input. C needs to be reprocessed. Blocks NOT downstream of B are unaffected — skip them.
Duplicate prevention: Many implementations check if the successor is already on the worklist before adding it. This avoids redundant processing.
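The duplicate check can be sketched with one boolean flag per block. This is a minimal runnable version, assuming a hypothetical four-block loop (B1 → B2 → B3 → B4 → B2) and bitmask-encoded reaching definitions; the names on_list and enqueue are illustrative, not taken from any particular tool.

```ocaml
(* Hypothetical CFG, blocks B1..B4 at indices 0..3:
   B1 -> B2 -> B3 -> B4, plus the back edge B4 -> B2. *)
let succs = [| [1]; [2]; [3]; [1] |]
let preds = [| []; [0; 3]; [1]; [2] |]

(* Bit i stands for definition d(i+1); d1 and d3 both define x. *)
let gen  = [| 0b0001; 0b0010; 0b0100; 0b1000 |]
let kill = [| 0b0100; 0b0000; 0b0001; 0b0000 |]

let merge out ps = List.fold_left (fun acc p -> acc lor out.(p)) 0 ps
let transfer b in_set = gen.(b) lor (in_set land (lnot kill.(b)))

(* Returns the final OUT array and the number of block visits. *)
let worklist_solve () =
  let n = Array.length succs in
  let out = Array.make n 0 in
  let wl = Queue.create () in
  (* on_list.(b) is true exactly while b sits in the queue *)
  let on_list = Array.make n false in
  let enqueue b =
    if not on_list.(b) then begin
      on_list.(b) <- true;
      Queue.push b wl
    end in
  for b = 0 to n - 1 do enqueue b done;
  let visits = ref 0 in
  while not (Queue.is_empty wl) do
    let b = Queue.pop wl in
    on_list.(b) <- false;
    incr visits;
    let new_out = transfer b (merge out preds.(b)) in
    if new_out <> out.(b) then begin
      out.(b) <- new_out;
      List.iter enqueue succs.(b)
    end
  done;
  (out, !visits)

let () =
  let out, visits = worklist_solve () in
  Array.iteri (fun b s -> Printf.printf "OUT[B%d] = %d\n" (b + 1) s) out;
  Printf.printf "block visits = %d\n" visits
```

On this CFG the solver reaches the fixed point after 7 block visits. Without the flag, a block could sit in the queue several times and be reprocessed with no new input.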

Forward vs Backward on the Worklist

The worklist algorithm works for both directions — just swap which neighbors get added when a block changes.

Forward (Reaching Defs)
• IN[B] = ⊔ { OUT[p] | p ∈ preds(B) }
• OUT[B] = transfer(B, IN[B])
• If OUT changed → add successors to WL
Backward (Live Variables)
• OUT[B] = ⊔ { IN[s] | s ∈ succs(B) }
• IN[B] = transfer(B, OUT[B])
• If IN changed → add predecessors to WL
(* Forward: add succs *)
if new_out <> out.(b) then begin
  out.(b) <- new_out;
  succs b |> add_to_worklist
end

(* Backward: add preds *)
if new_in <> in_.(b) then begin
  in_.(b) <- new_in;
  preds b |> add_to_worklist
end

Worked Example: Reaching Defs Worklist

Watch the worklist algorithm compute reaching definitions. Compare the work done vs round-robin.


Tracking Convergence

How many blocks does each approach process? The worklist avoids wasted work on stable blocks.

Comparison on a typical 6-block CFG with 1 loop:

Metric               Round-Robin   Worklist
Blocks processed     18            8
Unchanged (wasted)   10            0
Rounds               3             n/a
Wasted work          56%           0%
Key Insight: The worklist only processes blocks whose inputs may have changed, so stable regions of the CFG are skipped. On large CFGs (thousands of blocks), this difference is dramatic.
Complexity:
Round-robin: O(h × n²) worst case
Worklist: O(h × |E|) where |E| = edges
h = lattice height, n = blocks

🎯 Challenge A: Predict the Worklist

Given the CFG below, answer each question about what happens during worklist iteration.

Forward analysis. Edges: B1→B2, B1→B3, B2→B4, B3→B4, B4→B2 (back)
Q1: OUT[B1] changes. Which blocks get added to the worklist?
Q2: OUT[B4] changes. Which blocks get added?
Q3: We process B3 and its OUT does NOT change. What happens?

Worklist Order Matters

Same algorithm, same result — but different processing orders lead to different amounts of work. Compare FIFO vs LIFO.

FIFO (Queue) — process in order added
LIFO (Stack) — process most recent first
Observation: FIFO tends to process blocks in a breadth-first order — natural for forward analysis. LIFO goes depth-first — can propagate information deeper faster but may revisit blocks. Neither is universally better — the optimal order depends on the CFG shape.

Reverse Postorder (RPO)

A strong default order for forward analysis. RPO visits each node after all of its predecessors, ignoring back edges, so a block's inputs are usually up to date before the block itself is processed.

How to compute RPO:
  1. Run DFS from entry node
  2. Record post-order: when a node finishes (all children done)
  3. Reverse the post-order list
  4. Use this order for the worklist
Why RPO? For acyclic parts of the CFG, RPO processes a block only after all its inputs are computed — one pass suffices. Only loops require re-iteration.
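The four steps above can be sketched in a few lines. This is a minimal version, assuming the CFG is given as an array of successor lists indexed by block number; the function name reverse_postorder and the example CFG (edges B1→B2, B1→B3, B2→B4, B3→B4, B4→B2) are illustrative choices.

```ocaml
(* Hypothetical CFG, blocks B1..B4 at indices 0..3:
   B1 -> B2, B1 -> B3, B2 -> B4, B3 -> B4, B4 -> B2 (back edge). *)
let succs = [| [1; 2]; [3]; [3]; [1] |]

(* DFS from the entry node. A node is prepended to the list when it
   finishes, so the list ends up in reverse postorder. *)
let reverse_postorder succs entry =
  let n = Array.length succs in
  let visited = Array.make n false in
  let order = ref [] in
  let rec dfs b =
    if not visited.(b) then begin
      visited.(b) <- true;
      List.iter dfs succs.(b);
      order := b :: !order
    end in
  dfs entry;
  !order

let () =
  let rpo = reverse_postorder succs 0 in
  let name b = Printf.sprintf "B%d" (b + 1) in
  Printf.printf "Post-order: %s\n"
    (String.concat ", " (List.map name (List.rev rpo)));
  Printf.printf "RPO:        %s\n"
    (String.concat ", " (List.map name rpo))
```

For this CFG the post-order is B4, B2, B3, B1 and the RPO is B1, B3, B2, B4. The exact order depends on the order in which successors are listed, so other valid RPOs exist for the same graph.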

RPO vs FIFO: Same CFG, Less Work

Same reaching defs analysis, same result. RPO needs fewer steps because it processes blocks in dependency order.

FIFO Order: B1, B2, B3, B4
RPO Order: B1, B3, B2, B4
RPO advantage: On acyclic CFGs, RPO computes the fixed point in one pass. With loops, RPO still tends to reduce re-processing, because every block is processed after its non-back-edge predecessors, so only information flowing along back edges forces a revisit.

Chaotic Iteration

The theoretical foundation: any "fair" ordering converges to the same fixed point — even random! RPO is just the smartest choice.

Chaotic Iteration Theorem:
For monotone transfer functions on a lattice that satisfies the ascending chain condition (ACC: no infinite strictly ascending chains), any iteration strategy that is fair (every block on the worklist is eventually processed) will converge to the same least fixed point.

Fair = don't starve any block forever.
All strategies find the same answer. The difference is only in how many steps it takes. RPO typically needs few steps; a random order usually needs more.
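A small experiment illustrates the claim. The sketch below runs the same reaching-definitions analysis with different worklist disciplines and initial orders; the CFG (edges B1→B2, B1→B3, B2→B4, B3→B4, B4→B2), the gen/kill bitmasks, and the strategy type are assumptions made for this example.

```ocaml
(* Hypothetical CFG, blocks B1..B4 at indices 0..3. *)
let succs = [| [1; 2]; [3]; [3]; [1] |]
let preds = [| []; [0; 3]; [0]; [1; 2] |]

(* Bit i stands for definition d(i+1); d1 and d3 both define x. *)
let gen  = [| 0b0001; 0b0010; 0b0100; 0b1000 |]
let kill = [| 0b0100; 0b0000; 0b0001; 0b0000 |]

let merge out ps = List.fold_left (fun acc p -> acc lor out.(p)) 0 ps
let transfer b in_set = gen.(b) lor (in_set land (lnot kill.(b)))

type strategy = Fifo | Lifo

(* The worklist is a plain list processed from the head. Fifo appends
   new blocks at the end, Lifo pushes them at the front. Returns the
   final OUT array and the number of blocks processed. *)
let solve strategy init =
  let out = Array.make (Array.length succs) 0 in
  let steps = ref 0 in
  let add wl b =
    if List.mem b wl then wl
    else match strategy with
      | Fifo -> wl @ [b]
      | Lifo -> b :: wl
  in
  let rec loop = function
    | [] -> ()
    | b :: rest ->
      incr steps;
      let new_out = transfer b (merge out preds.(b)) in
      if new_out <> out.(b) then begin
        out.(b) <- new_out;
        loop (List.fold_left add rest succs.(b))
      end else loop rest
  in
  loop init;
  (out, !steps)

let () =
  List.iter (fun (name, strategy, init) ->
    let out, steps = solve strategy init in
    Printf.printf "%-14s steps = %2d, OUT = [%s]\n" name steps
      (String.concat "; " (Array.to_list (Array.map string_of_int out))))
    [ ("FIFO, RPO", Fifo, [0; 2; 1; 3]);
      ("FIFO, reversed", Fifo, [3; 2; 1; 0]);
      ("LIFO, reversed", Lifo, [3; 2; 1; 0]) ]
```

All three runs end with the same OUT sets; only the step counts differ, and the reversed initial order takes more steps on this CFG.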

Handling Loops (Back Edges)

Loops cause back edges in the CFG. These are the only edges that require re-processing — and where widening connects.

Back Edge Detection:
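An edge A → B is a back edge if B is an ancestor of A in the DFS tree, that is, B is still on the DFS stack when the edge A → B is explored. In a reducible CFG this is exactly the case where B dominates A.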
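A common way to detect back edges is to track which nodes are still on the DFS stack: an edge into such a node is a back edge. The following is a minimal sketch, assuming the CFG is an array of successor lists; the three-state marking and the function name back_edges are illustrative.

```ocaml
(* Hypothetical CFG, blocks B1..B4 at indices 0..3:
   B1 -> B2, B1 -> B3, B2 -> B4, B3 -> B4, B4 -> B2. *)
let succs = [| [1; 2]; [3]; [3]; [1] |]

(* Returns the list of back edges (a, b) found by a DFS from entry. *)
let back_edges succs entry =
  let n = Array.length succs in
  (* 0 = unvisited, 1 = on the DFS stack, 2 = finished *)
  let state = Array.make n 0 in
  let found = ref [] in
  let rec dfs a =
    state.(a) <- 1;
    List.iter (fun b ->
      if state.(b) = 1 then found := (a, b) :: !found
      else if state.(b) = 0 then dfs b) succs.(a);
    state.(a) <- 2 in
  dfs entry;
  List.rev !found

let () =
  List.iter (fun (a, b) -> Printf.printf "back edge: B%d -> B%d\n" (a + 1) (b + 1))
    (back_edges succs 0)
```

On this CFG the only back edge reported is B4 → B2. The edge B3 → B4 is not reported even though B4 was visited earlier, because B4 has already finished when that edge is explored.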

Impact on worklist:
• Back edges put the loop header back on the worklist
• Each loop iteration grows the analysis state
• With a finite-height lattice: terminates after ≤ height iterations per loop
• With an infinite-height lattice: needs widening at loop headers
Finite-height lattice: powerset height = |defs|, so a loop re-iterates at most |defs| times. No widening needed.
Infinite-height lattice: interval height = ∞, so a loop may re-iterate forever. Apply widening at the header.
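The effect of widening can be seen on a one-variable loop. The sketch below analyzes the assumed program i = 0; while true do i = i + 1 with a small interval domain; the type itv, the functions join, widen, and analyze, and the step budget are hypothetical names chosen for this example, and the widening shown is the standard one that sends an unstable bound to infinity.

```ocaml
(* Intervals over floats so that infinity is directly representable. *)
type itv = Bot | Itv of float * float

let join a b = match a, b with
  | Bot, x | x, Bot -> x
  | Itv (l1, h1), Itv (l2, h2) -> Itv (min l1 l2, max h1 h2)

(* Standard interval widening: a bound that moved jumps to infinity. *)
let widen prev next = match prev, next with
  | Bot, x | x, Bot -> x
  | Itv (l1, h1), Itv (l2, h2) ->
    Itv ((if l2 < l1 then neg_infinity else l1),
         (if h2 > h1 then infinity else h1))

(* Transfer function of the loop body: i = i + 1. *)
let add1 = function
  | Bot -> Bot
  | Itv (l, h) -> Itv (l +. 1., h +. 1.)

let show = function
  | Bot -> "bot"
  | Itv (l, h) -> Printf.sprintf "[%g, %g]" l h

(* Iterates the loop header state: join of the entry value [0, 0] and
   the value flowing around the back edge. Stops when stable or when
   the step budget runs out. Returns (header, stable, steps). *)
let analyze ~use_widening ~max_steps =
  let header = ref Bot in
  let steps = ref 0 in
  let stable = ref false in
  while not !stable && !steps < max_steps do
    incr steps;
    let joined = join (Itv (0., 0.)) (add1 !header) in
    let next = if use_widening then widen !header joined else joined in
    if next = !header then stable := true else header := next
  done;
  (!header, !stable, !steps)

let () =
  let report label (h, stable, steps) =
    Printf.printf "%s: header = %s, stable = %b, steps = %d\n"
      label (show h) stable steps in
  report "with widening" (analyze ~use_widening:true ~max_steps:50);
  report "without widening" (analyze ~use_widening:false ~max_steps:50)
```

With widening the header stabilizes at [0, inf] after three steps. Without it the upper bound grows by one per step and the run only stops because the budget of 50 steps is exhausted.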

Complexity Analysis

How much work does the worklist algorithm do? It depends on lattice height, CFG edges, and traversal order.

Complexity Formulas:
Strategy          Complexity
Round-Robin       O(h × n²)
Worklist (FIFO)   O(h × |E|)
Worklist (RPO)    O(h × |E|), with a smaller constant factor
h = lattice height, n = blocks, |E| = edges

🎯 Challenge B: Which Order Is Best?

For each CFG shape, pick the best worklist strategy.

CFG 1: Linear Chain
B1 → B2 → B3 → B4 → B5 (no loops)
CFG 2: Diamond with Back Edge
B1→{B2,B3}→B4→B2 (loop on left branch)
CFG 3: Nested Loops
Outer loop (B1→B2→B1) with inner loop (B2→B3→B2)
CFG 4: Backward Analysis (Live Variables)
Same CFG, but propagating information backwards

Applying Worklist to Live Variables (Backward)

Worklist works for backward analyses too — just swap successors ↔ predecessors and IN ↔ OUT.

Key difference: In backward analysis, when a block's IN changes, we add its predecessors to the worklist (they need to recompute their OUT).
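A backward worklist solver for live variables might look like the sketch below. The CFG (B1 → B2 → B3 → B4 → B2), the use/def sets, and the bitmask encoding of variables are assumptions made for illustration; the structure mirrors the forward solver with successors and predecessors swapped.

```ocaml
(* Hypothetical CFG, blocks B1..B4 at indices 0..3:
   B1 -> B2 -> B3 -> B4, plus the back edge B4 -> B2. *)
let succs = [| [1]; [2]; [3]; [1] |]
let preds = [| []; [0; 3]; [1]; [2] |]

(* Variable sets as bitmasks: x = 1, y = 2, z = 4.
   Assumed program: B1: x = 5, B2: y = x + 1, B3: x = y, B4: z = x. *)
let use = [| 0; 1; 2; 1 |]
let def = [| 1; 2; 1; 4 |]

(* OUT[B] is the join of the successors' IN sets. *)
let merge live_in ss = List.fold_left (fun acc s -> acc lor live_in.(s)) 0 ss

(* Live variables: IN = use + (OUT - def). *)
let transfer b out_set = use.(b) lor (out_set land (lnot def.(b)))

(* Returns the IN (live-on-entry) set of every block. *)
let live_variables () =
  let n = Array.length succs in
  let live_in = Array.make n 0 in
  let wl = Queue.create () in
  let on_list = Array.make n false in
  let enqueue b =
    if not on_list.(b) then begin
      on_list.(b) <- true;
      Queue.push b wl
    end in
  (* Seed in reverse block order, a natural start for backward flow. *)
  for b = n - 1 downto 0 do enqueue b done;
  while not (Queue.is_empty wl) do
    let b = Queue.pop wl in
    on_list.(b) <- false;
    let new_in = transfer b (merge live_in succs.(b)) in
    if new_in <> live_in.(b) then begin
      live_in.(b) <- new_in;
      (* IN changed: predecessors must recompute their OUT *)
      List.iter enqueue preds.(b)
    end
  done;
  live_in

let () =
  Array.iteri (fun b s -> Printf.printf "IN[B%d] = %d\n" (b + 1) s)
    (live_variables ())
```

With this encoding the result reads: nothing is live on entry to B1, x is live on entry to B2 and B4, and y is live on entry to B3.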

Real-World Worklist Implementations

How production tools implement worklist iteration.

Key Takeaways

1. Worklist = Targeted Iteration
Instead of blindly re-analyzing every block, only re-analyze blocks whose inputs changed. This transforms O(n) wasted work per round into O(changed) work.
2. Order Matters — A Lot
Reverse Postorder processes blocks in dependency order, so information flows "downhill" in one pass. For acyclic CFGs, RPO converges in a single pass.
3. Loops Are the Hard Part
Back edges create circular dependencies. Widening at loop headers forces convergence for infinite-height domains. Without it, iteration may never terminate.
4. Same Algorithm, Many Analyses
The worklist skeleton is domain-agnostic — plug in any transfer function and lattice. Reaching defs, live vars, taint, intervals — all use the same engine.
Analogy: Think of worklist iteration like a ripple in a pond. A change at one block creates a "ripple" that propagates to neighbors. RPO ensures ripples flow naturally downstream, and widening prevents infinite rippling in loops.

Worklist Algorithms Across the Bootcamp

You'll use worklist iteration throughout the PA Bootcamp. Here's where it appears.

Module 3: Dataflow Foundations
Module 4: Abstract Interpretation
Module 5: Security Analysis
Module 6: Tools Integration
Labs: Hands-On Implementation
Worklist iteration is the engine that powers every analysis you'll build.

Challenge C: Debug the Worklist

Each implementation has a bug. Identify what's wrong.

Bug 1: Never terminates
while worklist ≠ ∅:
    b = worklist.dequeue()
    new_out = transfer(b, IN[b])
    if new_out ≠ OUT[b]:
        OUT[b] = new_out
        for s in succs(b): worklist.add(s)
    worklist.add(b) // re-add self
Bug 2: Misses some facts
IN[b] = ∅
for p in preds(b):
    IN[b] = OUT[p] // overwrite
new_out = transfer(b, IN[b])
Bug 3: Wrong answer for loops
// Interval analysis, loop header
// No widening applied
IN[b] = ⊔ OUT[p] for p in preds(b)
OUT[b] = transfer(b, IN[b])
Bug 4: Backward analysis wrong
// Live variables (backward)
OUT[b] = ⊔ IN[s] for s in succs(b)
new_in = transfer(b, OUT[b])
if new_in ≠ IN[b]:
    for s in succs(b): worklist.add(s)

Quiz 1: Concept Check

Q1: When does a block get added to the worklist?
Q2: Why is RPO better than FIFO for forward analysis?
Q3: What guarantees convergence for infinite-height lattices?

Quiz 2: Predict the Next 3 Steps

Given this CFG and worklist state, predict what happens next.

B1: x = 5      gen={d1}  kill={d3}
B2: y = x + 1  gen={d2}  kill={}
B3: x = y      gen={d3}  kill={d1}
B4: z = x      gen={d4}  kill={}
Current State:
Worklist: [B3]
OUT[B1]={d1}, OUT[B2]={d1,d2}, OUT[B3]={d2,d3}, OUT[B4]={d2,d3,d4}
(Reaching definitions, forward, FIFO)
Your Predictions:
Step 1: Process B3 → OUT[B3] becomes
Which block(s) added to worklist?
Step 2: Process next → changed?

Quiz 3: Choose the Right Strategy

For each scenario, pick the best worklist optimization. Think about the CFG shape and analysis type.

Scenario 1
Forward constant propagation on a large function (200 blocks, 5 nested loops). Lattice: flat with ⊤/⊥ + constants.
Scenario 2
Backward live variable analysis on straight-line code (no loops, 50 blocks in sequence).
Scenario 3
Interval analysis (domain: [lo, hi]) on a program with a while(true) loop incrementing a counter.
Scenario 4
Taint analysis on a web app: 1000 functions, interprocedural, powerset lattice over source labels.