Gradient descent
Intuition:
Locally, f decreases fastest along the negative gradient, so by repeatedly taking a small step in the direction -∇f(x) we reduce the function value at every iteration.
The Algorithm:
- Start at some initial point x_0.
- For t = 0, 1, 2, ... do:
  - Identify a descent direction d_t (for gradient descent, d_t = -∇f(x_t)).
  - Set a step-size η_t.
  - Make a step: x_{t+1} = x_t + η_t d_t.
- Stop when some stopping criterion holds (e.g. ||∇f(x_t)|| is small).
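The steps above can be sketched in Python with NumPy; the function names, the fixed step-size, and the gradient-norm stopping rule are illustrative choices, not prescribed by the notes:

```python
import numpy as np

def gradient_descent(grad, x0, step_size=0.1, tol=1e-8, max_iters=10_000):
    """Plain gradient descent: x_{t+1} = x_t - step_size * grad(x_t).

    Stops when the gradient norm falls below `tol`
    (one common stopping criterion).
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) <= tol:   # stopping criterion
            break
        x = x - step_size * g          # step along the descent direction -grad f
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3); the minimizer is x = 3.
x_star = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=[0.0])
```

Here the step-size is constant; in general η_t may vary per iteration (e.g. via a line search).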
Usual assumptions in the literature are that f is β-smooth (i.e. ∇f is β-Lipschitz) and α-strongly convex.
Now we analyse gradient descent with the update x_{t+1} = x_t - (1/β) ∇f(x_t).
Theorem:
For the above version of gradient descent, ||x_t - x*||² ≤ (1 - α/β)^t ||x_0 - x*||², where x* is the minimizer of f.
Proof:
A one-step argument using β-smoothness and α-strong convexity gives the contraction ||x_{t+1} - x*||² ≤ (1 - α/β) ||x_t - x*||². Using induction over t, we conclude the result.
The quantity κ = β/α is called the condition number; the larger κ is, the slower the guaranteed convergence.
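The theorem's rate can be checked numerically. The sketch below (my own illustrative instance, not from the notes) runs the update with η = 1/β on the quadratic f(x) = ½ xᵀAx with A = diag(α, β), for which x* = 0, and verifies the bound (1 - α/β)^t ||x_0||² at every iteration:

```python
import numpy as np

# f(x) = 0.5 * x^T A x with A = diag(alpha, beta):
# alpha-strongly convex, beta-smooth, minimizer x* = 0.
alpha, beta = 1.0, 10.0
A = np.diag([alpha, beta])

x = np.array([1.0, 1.0])
d0 = np.linalg.norm(x) ** 2          # ||x_0 - x*||^2
for t in range(1, 51):
    x = x - (1.0 / beta) * (A @ x)   # update with step-size 1/beta
    bound = (1 - alpha / beta) ** t * d0
    assert np.linalg.norm(x) ** 2 <= bound + 1e-12   # theorem's guarantee
```

With κ = β/α = 10, the iterates contract by a factor of at most (1 - 1/κ) = 0.9 per step, matching the slow convergence predicted for large κ.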
Example:
Take f(x) = ½ xᵀ A x for a positive definite matrix A. Here β = λ_max(A) and α = λ_min(A), so the condition number of the problem is κ = λ_max(A)/λ_min(A).
To make gradient descent faster we can do a change of variables y = A^{1/2} x, which turns f into g(y) = ½ ||y||².
Then the condition number is 1.
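A small sketch of this change of variables, assuming a diagonal positive definite A so that A^{1/2} is easy to form; with condition number 1 and step-size 1/β = 1, a single gradient step on g(y) = ½||y||² lands exactly on the minimizer:

```python
import numpy as np

A = np.array([[10.0, 0.0],
              [0.0,  1.0]])            # ill-conditioned: kappa = 10
A_sqrt = np.diag(np.sqrt(np.diag(A)))  # A^{1/2} for diagonal A

x0 = np.array([1.0, 1.0])
y = A_sqrt @ x0                        # change of variables y = A^{1/2} x
# In y-coordinates, g(y) = 0.5 ||y||^2 has beta = alpha = 1 (kappa = 1),
# so one step with eta = 1 reaches the minimum exactly: y - grad g(y) = 0.
y = y - 1.0 * y
x = np.linalg.solve(A_sqrt, y)         # map back to the original variables
```

For general A the transform A^{1/2} requires an eigendecomposition; this is the idea behind preconditioning.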