Let $X$ be a discrete random variable taking values in a finite set $\mathcal{X}$,
following a distribution $p$.
Let $p(x) = \Pr[X = x]$ for $x \in \mathcal{X}$.
Mathematical entropy is defined as:
$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x).$$
In other words:
$$H(X) = \mathbb{E}\left[\log \frac{1}{p(X)}\right].$$
By convention, we take $0 \log 0 = 0$, justified by $\lim_{t \to 0^+} t \log t = 0$.
We usually only care about logarithms of base $2$, in which case entropy is measured in bits.
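As a quick sanity check, the definition can be evaluated directly; here is a minimal sketch in Python (the function name `entropy` is ours, not from the notes):

```python
import math

def entropy(p):
    """Shannon entropy, in bits, of a distribution given as a list of
    probabilities. Terms with p(x) = 0 contribute nothing, matching the
    convention 0 log 0 = 0."""
    return -sum(px * math.log2(px) for px in p if px > 0)

# A fair coin has exactly 1 bit of entropy.
print(entropy([0.5, 0.5]))   # → 1.0
# A deterministic outcome has none.
print(entropy([1.0]))        # → -0.0
```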
Joint Entropy

The joint entropy of a pair $(X, Y)$ with joint distribution $p(x, y)$ is
$$H(X, Y) = -\sum_{x, y} p(x, y) \log p(x, y).$$
Conditional Entropy

The conditional entropy of $Y$ given $X$ is
$$H(Y \mid X) = \sum_{x} p(x)\, H(Y \mid X = x) = -\sum_{x, y} p(x, y) \log p(y \mid x).$$
It satisfies the chain rule $H(X, Y) = H(X) + H(Y \mid X)$.
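Joint and conditional entropy, and the chain rule relating them, can be checked numerically; a sketch using an arbitrary toy joint distribution (the numbers are an illustrative assumption):

```python
import math

def H(ps):
    # Shannon entropy in bits; zero-probability terms are skipped.
    return -sum(p * math.log2(p) for p in ps if p > 0)

# Toy joint distribution p(x, y) over x, y in {0, 1} (any valid joint works).
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

H_XY = H(pxy.values())  # joint entropy H(X, Y)
px = {x: sum(p for (a, _), p in pxy.items() if a == x) for x in (0, 1)}
H_X = H(px.values())    # marginal entropy H(X)

# Conditional entropy H(Y | X) = -sum_{x,y} p(x, y) log p(y | x),
# where p(y | x) = p(x, y) / p(x).
H_Y_given_X = -sum(p * math.log2(p / px[x]) for (x, _), p in pxy.items())

# Chain rule: H(X, Y) = H(X) + H(Y | X).
assert abs(H_XY - (H_X + H_Y_given_X)) < 1e-12
```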
Fano’s Inequality

If we estimate $X$ from $Y$ via any estimator $\hat{X} = g(Y)$, with error probability $P_e = \Pr[\hat{X} \neq X]$, then
$$H(X \mid Y) \le H(P_e) + P_e \log\left(|\mathcal{X}| - 1\right),$$
where $H(P_e)$ denotes the binary entropy of $P_e$.
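The inequality can be verified on a concrete case; the joint distribution below is a made-up illustration, and the estimator is the MAP rule $g(y) = \arg\max_x p(x \mid y)$:

```python
import math

# Hypothetical joint distribution p(x, y) over x in {0,1,2}, y in {0,1}.
pxy = {(0, 0): 0.3, (1, 0): 0.1, (2, 0): 0.1,
       (0, 1): 0.05, (1, 1): 0.35, (2, 1): 0.1}
py = {y: sum(p for (_, b), p in pxy.items() if b == y) for y in (0, 1)}

# H(X | Y) = -sum_{x,y} p(x, y) log p(x | y).
H_X_given_Y = -sum(p * math.log2(p / py[y]) for (_, y), p in pxy.items())

# MAP estimator g(y) = argmax_x p(x, y), and its error probability P_e.
g = {y: max((0, 1, 2), key=lambda x: pxy[(x, y)]) for y in (0, 1)}
P_e = sum(p for (x, y), p in pxy.items() if x != g[y])

# Binary entropy H(P_e) (here 0 < P_e < 1, so the formula applies directly).
h2 = -P_e * math.log2(P_e) - (1 - P_e) * math.log2(1 - P_e)

# Fano: H(X | Y) <= H(P_e) + P_e * log(|X| - 1), with |X| = 3.
assert H_X_given_Y <= h2 + P_e * math.log2(3 - 1)
```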
Mutual Information

The mutual information between $X$ and $Y$ is
$$I(X; Y) = H(X) - H(X \mid Y) = H(X) + H(Y) - H(X, Y).$$
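The two expressions for $I(X; Y)$ agree for any joint distribution; a quick numerical check with an assumed toy distribution:

```python
import math

def H(ps):
    # Shannon entropy in bits; zero-probability terms are skipped.
    return -sum(p * math.log2(p) for p in ps if p > 0)

# Illustrative joint distribution (an assumption, not from the notes).
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px = {x: sum(p for (a, _), p in pxy.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (_, b), p in pxy.items() if b == y) for y in (0, 1)}

H_X, H_Y, H_XY = H(px.values()), H(py.values()), H(pxy.values())
H_X_given_Y = -sum(p * math.log2(p / py[y]) for (_, y), p in pxy.items())

I1 = H_X - H_X_given_Y       # I(X; Y) = H(X) - H(X | Y)
I2 = H_X + H_Y - H_XY        # I(X; Y) = H(X) + H(Y) - H(X, Y)
assert abs(I1 - I2) < 1e-12  # both forms give the same value
```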
Lemma

For $X$ taking values in a finite set $\mathcal{X}$,
$$0 \le H(X) \le \log |\mathcal{X}|.$$
Additionally, $H(X) = 0$ if and only if $X$ is deterministic, and $H(X) = \log |\mathcal{X}|$ if and only if $X$ is uniform on $\mathcal{X}$.

Proof

Since $0 \le p(x) \le 1$, each term $-p(x) \log p(x)$ is nonnegative, so $H(X) \ge 0$, with equality iff every $p(x) \in \{0, 1\}$, i.e. $X$ is deterministic. For the upper bound, take $f(t) = \log t$, which is concave; by Jensen’s inequality,
$$H(X) = \mathbb{E}\left[\log \frac{1}{p(X)}\right] \le \log \mathbb{E}\left[\frac{1}{p(X)}\right] \le \log |\mathcal{X}|,$$
with equality throughout iff $1/p(X)$ is constant, i.e. $X$ is uniform. $\square$
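The bounds in the lemma are easy to probe numerically; a sketch (the distribution choices below are ours):

```python
import math
import random

def H(ps):
    # Shannon entropy in bits; zero-probability terms are skipped.
    return -sum(p * math.log2(p) for p in ps if p > 0)

n = 8
# The uniform distribution attains the maximum, log2(n) = 3 bits...
assert abs(H([1 / n] * n) - math.log2(n)) < 1e-12
# ...a deterministic distribution attains the minimum, 0 bits...
assert H([1.0] + [0.0] * (n - 1)) == 0.0
# ...and a generic random distribution falls strictly in between.
random.seed(0)
w = [random.random() for _ in range(n)]
p = [x / sum(w) for x in w]
assert 0 < H(p) < math.log2(n)
```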
Intuition

Entropy is a measure of ‘randomness’ or ‘uncertainty’ in a random variable: the more unpredictable $X$ is, the larger $H(X)$.
The entropy of a fair coin flip is $1$ bit
(it’s a two-sided coin, and because we use base-$2$ logarithms, $-\tfrac{1}{2} \log \tfrac{1}{2} - \tfrac{1}{2} \log \tfrac{1}{2} = 1$).
Example 1

Suppose $X$ is uniform over $8$ outcomes, so $p(x) = \tfrac{1}{8}$ for each $x$;
we identify each outcome with a distinct $3$-bit string,
so
$$H(X) = -\sum_{x} \tfrac{1}{8} \log \tfrac{1}{8} = 3 \text{ bits}.$$
Example 2

Now suppose $X$ takes four values with probabilities $\left(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{8}, \tfrac{1}{8}\right)$. Then
$$H(X) = \tfrac{1}{2} \cdot 1 + \tfrac{1}{4} \cdot 2 + \tfrac{1}{8} \cdot 3 + \tfrac{1}{8} \cdot 3 = 1.75 \text{ bits}$$
(to get this, think of a binary tree! Splitting off the most likely outcome at each level takes $1.75$ yes/no questions on average.)
So Example 1 is more random than Example 2.
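Assuming the two examples are the classic pair, a uniform distribution on eight outcomes versus the skewed $(1/2, 1/4, 1/8, 1/8)$ distribution, the comparison can be computed directly:

```python
import math

def H(ps):
    # Shannon entropy in bits; zero-probability terms are skipped.
    return -sum(p * math.log2(p) for p in ps if p > 0)

uniform = [1 / 8] * 8                 # uniform over 8 outcomes
skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 8]  # one heavily favored outcome

print(H(uniform))  # → 3.0
print(H(skewed))   # → 1.75
```

The uniform distribution is maximally unpredictable for its support size, so it carries more entropy than the skewed one.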