Let be a Discrete Memoryless Source with values in
Let be an Optimal Code for
Then its Expected word length satisfies:

where ,
and is the Mathematical Entropy.

Proof

The lower bound is given by combining Gibbs’ inequality and Kraft’s inequality.
Take where is just the normalizing factor.

Where by Kraft’s inequality.
Hence:

We get equality iff for some integers .

For the upper bound, let
We have

So by McMillan’s Theorem
there is a prefix-free code with word lengths .
Also