A context-free formal language or a Type 2 language
is produced by the rules where is a variable, and .

Proposition

For a context free grammar and any word
there is a (not necessarily unique) Parse Tree such that .

Theorem

Every context free grammar
has a Chomsky Normal Form grammar
such that

Proof (sketch)

The product rules we need to kick out are:

  1. Unit productions
  2. bad productions where but
  3. very bad productions where but contains terminal symbols.
    First kick the very bad productions out easily.
    Then add unit closure rules (for every and add rule )
    and prove that removing unit productions from a unit closed grammar
    doesn’t change the language.
    Then remove the bad productions
    by introducing a bunch of new variables for each of the bad productions
    and sticking them together.

Lemma

Context-Free Pumping Lemma