The Pumping Lemma

Proving Languages Are NOT Regular (or Not Context-Free)

Use ← → arrows to navigate

The Big Picture

The pumping lemma is a tool for proving negative results.

What it tells you

  • A language is NOT regular
  • A language is NOT context-free

Key Idea

Every regular language has a "pumping" property. If a language lacks this property, it cannot be regular.

What it does NOT tell you

  • It does not prove a language IS regular
  • Passing the pumping lemma ≠ regular

Warning

The pumping lemma is a necessary condition for regularity, NOT sufficient. It's a one-way test.

Intuition: Why Pumping Works

The Pigeonhole Principle meets finite automata.

A DFA with p states reading a string of length ≥ p must visit some state twice — creating a loop.

Analogy

5 pigeonholes, 6 pigeons ⇒ at least one hole has 2 pigeons. p states, p+1 visits ⇒ at least one state is visited twice.

The Pumping Lemma for Regular Languages

Pumping Lemma (Formal Statement)

If L is regular, then ∃ pumping length p such that ∀ s ∈ L with |s| ≥ p, s = xyz satisfying:

1. |y| > 0       (y is not empty)
2. |xy| ≤ p      (loop in first p chars)
3. xyiz ∈ L     ∀ i ≥ 0   (pump y any number of times)

The Pumping Game: Prove {anbn} is NOT Regular

Play the adversarial game — you are the prover!

Step 1: Adversary picks p =

Step 2: You pick s = apbp (length ≥ p) ✓

Step 3: Adversary splits s = xyz (|xy|≤p, |y|>0)

Step 4: You pick i to break it!

0

The Proof as a Flowchart

Every pumping lemma proof follows this exact structure.

Explore: Pumping apbp

Adjust the split and pump value to see why EVERY split fails.

p = 4

|x| = 0

|y| = 2

pump i = 0

Key Insight

Since |xy| ≤ p, y consists entirely of a's. Pumping changes only the count of a's, breaking an=bn.

Example 2: {ww | w ∈ {0,1}*} is NOT Regular

Prove that the language of doubled strings is not regular.

1. Assume L = {ww} is regular
2. Let p be the pumping length
3. Choose s = 0ᵖ1ᵖ0ᵖ1ᵖ ∈ L
4. s = xyz with |xy| ≤ p, |y| > 0
5. y = 0ᵏ (y is all 0's, within first p chars)
6. Pump i=2: xy²z = 0ᵖ⁺ᵏ1ᵖ0ᵖ1ᵖ
7. First half ≠ second half ⇒ NOT ww
8. Contradiction! L is NOT regular. ✓

Example 3: {1 | n ≥ 0} is NOT Regular

A language where string lengths are perfect squares.

1. Assume L = {1^(n²)} is regular
2. Let p be the pumping length
3. Choose s = 1^(p²) ∈ L
4. s = xyz, |xy| ≤ p, |y| > 0
5. |y| = k where 1 ≤ k ≤ p
6. Pump i=2: |xy²z| = p² + k
7. p² < p²+k ≤ p²+p < (p+1)²
8. p²+k is NOT a perfect square!
9. Contradiction! L is NOT regular.

Key Insight

Perfect squares get further apart: the gap between n² and (n+1)² is 2n+1. Pumping adds at most p, which is too small to reach the next square.

Common Mistakes in Pumping Proofs

Avoid these pitfalls that trip up most students!

Mistake 1: Fixing the split

You can't choose how xyz is split — the adversary picks the split. Your proof must work for ALL valid splits.

Mistake 2: Only checking i=0

You need to find at least one value of i that breaks it. But the adversary gets to split, so you must handle any split.

Mistake 3: Claiming PL proves regularity

If a language passes the PL, that tells you nothing. The PL can only prove non-regularity.

Correct Approach

You choose: the string s and the pump value i.
Adversary chooses: p and the split xyz.
Your proof must work for ANY p and ANY valid split.

When the Pumping Lemma Fails

Some non-regular languages satisfy the pumping lemma anyway!

Analogy

The PL is like a metal detector: if it beeps, there's metal. But if it doesn't beep, maybe the metal is too deep. Not beeping doesn't prove "no metal."

The classic example:

L = {aibjck | i,j,k ≥ 0 and if i=1 then j=k}

This language is not regular, but it satisfies the pumping lemma! For any string of length ≥ p, you can always pump it and stay in L (because most strings have i ≠ 1, so the j=k constraint doesn't apply).

Takeaway

PL passing ⇒ inconclusive
PL failing ⇒ definitely not regular

Challenge A: Predict the Pump

Challenge

Language: L = {0n12n | n ≥ 0}. We pick s = 0p12p. The adversary splits with x = ε, y = 02, z = 0p-212p.

If we pump with i = 0 (delete y), what is xy°z?

The Pumping Lemma for Context-Free Languages

A more powerful version for proving languages are not context-free.

CFL Pumping Lemma

If L is context-free, then ∃ p such that ∀ s ∈ L with |s| ≥ p, s = uvxyz satisfying:

1. |vy| > 0       (v and y not both empty)
2. |vxy| ≤ p      (middle portion bounded)
3. uvixyiz ∈ L    ∀ i ≥ 0   (pump v and y together)

Why CFL Pumping Works: Parse Trees

In a sufficiently long string, the parse tree must repeat a variable.

If string s is long enough, the parse tree is tall enough that some variable A must appear twice on some root-to-leaf path.

Analogy

Like the DFA pigeonhole, but for parse trees: if the tree is tall enough, a variable must repeat. The subtree between the two A's gives us the pumpable portion (v and y).

The Five Pieces

u = before outer A
v = outer A's left yield minus inner A
x = inner A's yield
y = outer A's right yield minus inner A
z = after outer A

Challenge B: Fix the Proof Bug

Spot the Error

A student writes this proof that {anbn} is not regular:

1. Assume L = {a^n b^n} is regular with pumping length p
2. Let s = a^p b^p. Pick x=ε, y=ab, z=a^(p-1)b^(p-1)
3. Pump i=0: xz = a^(p-1)b^(p-1) ∈ L. No contradiction...
4. Therefore the proof fails??

What's wrong with line 2?

CFL Example: {anbncn} is NOT Context-Free

The classic example requiring the CFL pumping lemma.

1. Assume L = {a^n b^n c^n} is CFL
2. Let p be CFL pumping length
3. Pick s = a^p b^p c^p
4. s = uvxyz, |vxy|≤p, |vy|>0
5. vxy can't span all 3 symbols
6. Case: vxy in a's and b's only
7. Pump i=2: more a's+b's, same c's
8. #a or #b ≠ #c ⇒ NOT in L!
9. ALL cases lead to contradiction!

CFL Example: {ww | w ∈ {0,1}*} is NOT Context-Free

The doubled-string language needs even more power than a PDA.

1. Assume L = {ww} is CFL
2. Pick s = 0^p 1^p 0^p 1^p
3. s = uvxyz, |vxy|≤p
4. vxy spans at most 2 adjacent blocks
5. Pumping disrupts the symmetry
6. First half ≠ second half
7. Contradiction! L not CFL.

Key Insight

{ww} is the "poster child" of non-CFL languages. It requires matching across non-nested boundaries, which a PDA's stack can't handle.

Comparing the Two Pumping Lemmas

Side-by-side comparison of regular vs CFL pumping.

Regular PLCFL PL
Decompositions = xyz (3 parts)s = uvxyz (5 parts)
Non-empty|y| > 0|vy| > 0
Length bound|xy| ≤ p|vxy| ≤ p
Pumpingxyiz ∈ Luvixyiz ∈ L
Pump pieces1 piece (y)2 pieces (v, y) pumped together
IntuitionDFA loop (pigeonhole)Parse tree repeated variable
ProvesNOT regularNOT context-free

Analogy

Regular PL: finding a loop in a circle (DFA). CFL PL: finding a repeated node in a tree (parse tree). More structure = more pieces.

Beyond Pumping

The pumping lemma isn't the only tool. Here are stronger alternatives.

Myhill-Nerode Theorem

A language L is regular iff it has a finite number of equivalence classes under the distinguishability relation.

Key Idea

If you can find infinitely many strings that are pairwise distinguishable (each pair needs a different "suffix" to tell them apart), then L is not regular. This is both necessary AND sufficient!

Ogden's Lemma

A stronger version of the CFL pumping lemma where you mark certain positions, and v and y must contain marked positions.

Key Idea

Ogden's gives you more control over WHERE the pumping happens. Useful when the basic CFL PL fails for a particular language.

Challenge C: Classify the Language

Which pumping lemma do you need?

For each language, decide: Regular PL, CFL PL, or neither (language IS regular/CFL)?

1. L = {an | n is prime}

2. L = {anbm | n ≤ m}

3. L = {anbncn}

4. L = (ab)*

Summary & Cheat Sheet

Regular PL Template

1. Assume L is regular (for contradiction)
2. Let p be the pumping length
3. Choose s = [string in L, |s|≥p]
4. For ANY split s=xyz with |xy|≤p, |y|>0:
5. Show xy^i z ∉ L for some i
6. Contradiction! L is NOT regular.

CFL PL Template

1. Assume L is CFL (for contradiction)
2. Let p be the CFL pumping length
3. Choose s = [string in L, |s|≥p]
4. For ANY split s=uvxyz, |vxy|≤p, |vy|>0:
5. Show uv^i xy^i z ∉ L for some i
6. Contradiction! L is NOT CFL.

Quiz: Multiple Choice

Q1: In a PL proof, who picks the string s?

Q2: In the CFL PL, how many pieces is the string split into?

Q3: If a language passes the PL, what can we conclude?

Quiz: Trace the Proof

Complete the Pumping Proof

Language: L = {anb2n | n ≥ 0}. Fill in the blanks:

1. Assume L is regular with pumping length p
2. Choose s = a^p b^(2p) ∈ L, |s| = 3p ≥ p ✓
3. s = xyz, |xy| ≤ p, so y = a^k for some k ≥ 1
4. Pump with i =
5. xy^i z = a^() b^(2p)
6. For this to be in L, need #b = 2 × #a
7. But 2p ≠ 2×() since k ≥ 1
8. Contradiction! L is NOT regular.

Quiz: Build Your Own Proof

Prove {0n1n2n} is not context-free

Drag the proof steps into the correct order:

Pump i=2: increases count of at most 2 symbols
Let p be the CFL pumping length
Third symbol count unchanged ⇒ not in L
Assume L = {0^n 1^n 2^n} is CFL
|vxy| ≤ p, so vxy spans at most 2 of {0,1,2}
Contradiction! L is NOT context-free.
Pick s = 0^p 1^p 2^p
Your order: