Proving Languages Are NOT Regular (or Not Context-Free)
Use ← → arrows to navigate
The pumping lemma is a tool for proving negative results.
Every regular language has a "pumping" property. If a language lacks this property, it cannot be regular.
The pumping lemma is a necessary condition for regularity, NOT sufficient. It's a one-way test.
The Pigeonhole Principle meets finite automata.
A DFA with p states reading a string of length ≥ p must visit some state twice — creating a loop.
5 pigeonholes, 6 pigeons ⇒ at least one hole has 2 pigeons. p states, p+1 visits ⇒ at least one state is visited twice.
If L is regular, then ∃ pumping length p such that ∀ s ∈ L with |s| ≥ p, s = xyz satisfying:
Play the adversarial game — you are the prover!
Step 1: Adversary picks p =
Step 2: You pick s = apbp (length ≥ p) ✓
Step 3: Adversary splits s = xyz (|xy|≤p, |y|>0)
Step 4: You pick i to break it!
Every pumping lemma proof follows this exact structure.
Adjust the split and pump value to see why EVERY split fails.
p = 4
|x| = 0
|y| = 2
pump i = 0
Since |xy| ≤ p, y consists entirely of a's. Pumping changes only the count of a's, breaking an=bn.
Prove that the language of doubled strings is not regular.
A language where string lengths are perfect squares.
Perfect squares get further apart: the gap between n² and (n+1)² is 2n+1. Pumping adds at most p, which is too small to reach the next square.
Avoid these pitfalls that trip up most students!
You can't choose how xyz is split — the adversary picks the split. Your proof must work for ALL valid splits.
You need to find at least one value of i that breaks it. But the adversary gets to split, so you must handle any split.
If a language passes the PL, that tells you nothing. The PL can only prove non-regularity.
You choose: the string s and the pump value i.
Adversary chooses: p and the split xyz.
Your proof must work for ANY p and ANY valid split.
Some non-regular languages satisfy the pumping lemma anyway!
The PL is like a metal detector: if it beeps, there's metal. But if it doesn't beep, maybe the metal is too deep. Not beeping doesn't prove "no metal."
The classic example:
This language is not regular, but it satisfies the pumping lemma! For any string of length ≥ p, you can always pump it and stay in L (because most strings have i ≠ 1, so the j=k constraint doesn't apply).
PL passing ⇒ inconclusive
PL failing ⇒ definitely not regular
Language: L = {0n12n | n ≥ 0}. We pick s = 0p12p. The adversary splits with x = ε, y = 02, z = 0p-212p.
If we pump with i = 0 (delete y), what is xy°z?
A more powerful version for proving languages are not context-free.
If L is context-free, then ∃ p such that ∀ s ∈ L with |s| ≥ p, s = uvxyz satisfying:
In a sufficiently long string, the parse tree must repeat a variable.
If string s is long enough, the parse tree is tall enough that some variable A must appear twice on some root-to-leaf path.
Like the DFA pigeonhole, but for parse trees: if the tree is tall enough, a variable must repeat. The subtree between the two A's gives us the pumpable portion (v and y).
u = before outer A
v = outer A's left yield minus inner A
x = inner A's yield
y = outer A's right yield minus inner A
z = after outer A
A student writes this proof that {anbn} is not regular:
What's wrong with line 2?
The classic example requiring the CFL pumping lemma.
The doubled-string language needs even more power than a PDA.
{ww} is the "poster child" of non-CFL languages. It requires matching across non-nested boundaries, which a PDA's stack can't handle.
Side-by-side comparison of regular vs CFL pumping.
| Regular PL | CFL PL | |
|---|---|---|
| Decomposition | s = xyz (3 parts) | s = uvxyz (5 parts) |
| Non-empty | |y| > 0 | |vy| > 0 |
| Length bound | |xy| ≤ p | |vxy| ≤ p |
| Pumping | xyiz ∈ L | uvixyiz ∈ L |
| Pump pieces | 1 piece (y) | 2 pieces (v, y) pumped together |
| Intuition | DFA loop (pigeonhole) | Parse tree repeated variable |
| Proves | NOT regular | NOT context-free |
Regular PL: finding a loop in a circle (DFA). CFL PL: finding a repeated node in a tree (parse tree). More structure = more pieces.
The pumping lemma isn't the only tool. Here are stronger alternatives.
A language L is regular iff it has a finite number of equivalence classes under the distinguishability relation.
If you can find infinitely many strings that are pairwise distinguishable (each pair needs a different "suffix" to tell them apart), then L is not regular. This is both necessary AND sufficient!
A stronger version of the CFL pumping lemma where you mark certain positions, and v and y must contain marked positions.
Ogden's gives you more control over WHERE the pumping happens. Useful when the basic CFL PL fails for a particular language.
For each language, decide: Regular PL, CFL PL, or neither (language IS regular/CFL)?
1. L = {an | n is prime}
2. L = {anbm | n ≤ m}
3. L = {anbncn}
4. L = (ab)*
Q1: In a PL proof, who picks the string s?
Q2: In the CFL PL, how many pieces is the string split into?
Q3: If a language passes the PL, what can we conclude?
Language: L = {anb2n | n ≥ 0}. Fill in the blanks:
Drag the proof steps into the correct order: