CS305 -- Formal Language Theory
Use ← → arrows to navigate · 24 slides
A Pushdown Automaton extends an NFA with a stack -- an infinite, last-in-first-out memory.
The stack lets the PDA "remember" unbounded information in LIFO order. This is exactly the power needed to match nested structures like anbn and balanced parentheses.
A DFA has finite memory (just its state). It cannot count beyond a fixed bound.
Consider L = { anbn | n ≥ 0 }:
Solution: Add a stack! Push each a, pop one per b. Accept if stack is empty.
A spring-loaded tray dispenser. You only see the top tray. Push on, pop off. Never reach into the middle.
A PDA has three components: an input tape, a finite control (state), and a stack (unbounded LIFO).
A PDA is a 7-tuple:
The range is a set of (state, string) pairs -- PDAs are nondeterministic. The PDA "chooses" among moves. Accepts if any choice leads to acceptance.
Σ is the input alphabet (e.g. {a,b}). Γ is the stack alphabet (e.g. {a,b,$}). They can overlap, but Γ often has extra symbols like bottom marker $.
Each transition:
Read as: "In state q, reading a, with X on top, move to p and replace X with Y."
String Y is pushed so its leftmost symbol ends on top. If Y = ABC, then A is on top.
Every transition pops the stack top and pushes a replacement. We encode push, pop, and no-change via this single mechanism.
Pop X, push BX. Net: B appears on top, X stays below.
Pop X, push nothing. X is gone.
Pop X, push X back. Stack unchanged.
Pop X, push ABC. A on top (leftmost = top).
a: push ab: pop one aThe stack counts a's. Each b cancels one a. If counts match, stack returns to $.
There are two equivalent ways a PDA can accept a string.
Accept if ALL input consumed and PDA is in state q ∈ F. Stack contents do not matter.
Accept if ALL input consumed and stack is completely EMPTY. Current state does not matter.
These two modes are equivalent in power. Any language accepted by final state can also be accepted by empty stack (via a different PDA), and vice versa. We can mechanically convert between them.
1. Add new start q0', push new bottom marker Z0' under Z0. 2. From every final state, ε-transition to q_drain. 3. q_drain pops everything.
1. Add new start q0', push Z0' under Z0. 2. From EVERY state, ε-transition to q_f when Z0' is on top. 3. q_f is only accept state.
The new bottom marker Z0' is essential! Without it, the original PDA might empty its stack prematurely and the conversion would accept strings it should not.
Language: { w ∈ {(, )}* | w has properly nested, balanced parentheses }
()Seeing ) when stack has only $ means unmatched close paren -- machine gets stuck (reject).
Consider the anbn PDA from Slide 8. After processing the input aaab (3 a's then 1 b), what is the current configuration?
Even-length palindromes! The second half is the reverse of the first.
0110: first half 01, second half 10 = reverse of 01.
But how does the PDA know where the middle is?
The PDA guesses the midpoint! At every position it branches:
Accepts if any guess leads to acceptance.
Clone yourself at every step. One clone keeps pushing, the other switches to matching. Most clones die. The one who guessed right survives.
For finite automata: deterministic = nondeterministic in power.
A deterministic PDA has at most one move per configuration. No branching, no guessing.
{ wwR } is context-free but NOT deterministic context-free. The "guess the middle" trick requires nondeterminism.
Deterministic CFLs matter in practice: most programming languages are DCFLs so parsers run efficiently.
| Language | DPDA? | PDA? |
|---|---|---|
| anbn | Yes | Yes |
| wwR | No | Yes |
| wcwR | Yes | Yes |
This PDA is supposed to accept { anbn | n ≥ 1 } but has a bug. Which transition is wrong?
What is wrong with Transition 5?
Given any CFG G, we build a PDA M with L(M) = L(G). Uses top-down parsing.
The stack holds a "prediction" of remaining input. Variables get expanded (replaced by rule RHS). Terminals get matched and consumed. This simulates a leftmost derivation.
Grammar: S → aSb | ε (generates anbn)
Every PDA can be converted to an equivalent CFG — the reverse direction of the equivalence.
For each pair of states (p, q), create a variable Apq that generates all strings taking the PDA from state p with empty stack to state q with empty stack.
The grammar variable Apq represents "what the PDA does between states p and q." This captures every possible computation path.
This conversion can produce exponentially many rules! The resulting CFG is correct but usually not practical — it's a theoretical tool.
This is one of the most important results in formal language theory.
A language is context-free if and only if some PDA recognizes it.
CFL = {L | some CFG generates L} = {L | some PDA recognizes L}
Like how DFAs, NFAs, and regex are three equivalent ways to describe regular languages, CFGs and PDAs are two equivalent ways to describe context-free languages.
This means: any language you can describe with recursive production rules, you can also recognize with a stack-based machine, and vice versa.
The stack gives power beyond DFAs, but it's still limited.
{anbncn} — needs to match THREE counts{ww} — needs to compare non-nested halves{anbmcndm} — cross-serial dependenciesA stack is LIFO — you can only match in nested (palindromic) patterns. Matching in crossing patterns requires more memory (like a Turing machine tape).
DPDA ⊊ NPDA: Unlike DFA=NFA, deterministic PDAs are strictly weaker than nondeterministic ones! {wwR} needs nondeterminism to guess the middle.
For each language, decide: PDA can recognize, or PDA cannot?
1. {anb2n | n ≥ 0}
2. {anbncn | n ≥ 0}
3. {wwR | w ∈ {0,1}*}
4. {ww | w ∈ {0,1}*}
| Concept | Key Point |
|---|---|
| PDA | NFA + Stack = CFL recognizer |
| 7-tuple | (Q, Σ, Γ, δ, q0, Z0, F) |
| Transitions | δ(q, a, X) = {(p, Y)} |
| Acceptance | Final state or empty stack |
| Equivalence | PDA ≡ CFG (both define CFLs) |
| DPDA ⊊ NPDA | Nondeterminism adds power! |
| Limits | Can't do anbncn or ww |
Q1: What does a PDA add to an NFA?
Q2: Which language can a PDA recognize?
Q3: Are DPDA and NPDA equivalent?
PDA for {anbn}: start in q0 with stack [$]. Input: aabbb
Fill in the missing stack contents and final result:
Fill in the missing parts of these PDA transitions:
1. δ(q0, 0, $) = {(q0, )}
First 0: push onto empty stack
2. δ(q0, 0, 0) = {(q0, )}
More 0's: keep pushing
3. δ(q0, 1, 0) = {(, ε)}
Switch to reading 1's
4. δ(q1, ε, $) = {(qacc, $)}
Accept when stack empty