Gradient descent

Intuition:

Suppose f is differentiable and we move from x to x − η∇f(x) for a small η > 0. A first-order Taylor expansion gives f(x − η∇f(x)) ≈ f(x) − η‖∇f(x)‖², so we reduced the objective value, provided ∇f(x) ≠ 0 and η is small enough.

The Algorithm:

  1. Start at some initial point x_0.
    For all t = 0, 1, 2, … do:
  2. Make a step x_{t+1} = x_t − η_t·d_t:
    • Identify a descent direction d_t (for gradient descent, d_t = ∇f(x_t))
    • Set a step-size η_t > 0
  3. Stop when some criteria hold (e.g. ‖∇f(x_t)‖ ≤ ε).
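The steps above can be sketched as follows; this is a minimal illustration, where the example function, the fixed step size, and the stopping tolerance are all assumptions made for the sketch:

```python
import numpy as np

def gradient_descent(grad, x0, step_size, tol=1e-8, max_iters=10_000):
    """Minimal gradient descent: repeat x <- x - eta * grad(x) until the
    gradient norm falls below tol (the stopping criterion of step 3)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iters):
        g = grad(x)                      # descent direction: the gradient
        if np.linalg.norm(g) <= tol:     # stop when the criterion holds
            break
        x = x - step_size * g            # make a step
    return x

# Example: f(x) = 0.5 * ||x||^2, whose gradient is x; the minimizer is 0.
x_min = gradient_descent(grad=lambda x: x, x0=[3.0, -4.0], step_size=0.5)
```

Here each step halves the iterate, so after a few dozen iterations the gradient norm drops below the tolerance and the loop exits near the minimizer 0.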

Usual assumptions in the literature are that f is β-smooth, i.e. ‖∇f(x) − ∇f(y)‖ ≤ β‖x − y‖ for all x, y, and α-strongly convex, i.e. f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (α/2)‖y − x‖² for all x, y.

Now we analyse gradient descent with the update x_{t+1} = x_t − (1/β)·∇f(x_t).

Theorem:

For the above version of gradient descent, if f is β-smooth and α-strongly convex with minimizer x*, then for all t ≥ 0:

  ‖x_t − x*‖² ≤ (1 − α/β)^t · ‖x_0 − x*‖².

Proof:

Expanding the update with η = 1/β,

  ‖x_{t+1} − x*‖² = ‖x_t − x*‖² − 2η⟨∇f(x_t), x_t − x*⟩ + η²‖∇f(x_t)‖².

By α-strong convexity, ⟨∇f(x_t), x_t − x*⟩ ≥ f(x_t) − f(x*) + (α/2)‖x_t − x*‖², and by β-smoothness, ‖∇f(x_t)‖² ≤ 2β(f(x_t) − f(x*)). Substituting both bounds,

  ‖x_{t+1} − x*‖² ≤ (1 − α/β)‖x_t − x*‖² + (2βη² − 2η)(f(x_t) − f(x*)) = (1 − α/β)‖x_t − x*‖²,

since 2βη² − 2η = 0 for η = 1/β. Using induction, we conclude the result.
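The contraction in the theorem can be checked numerically on a small quadratic; the particular f below (with Hessian diag(2, 10), so α = 2 and β = 10) is an assumption chosen for the check:

```python
import numpy as np

# Verify ||x_t - x*||^2 <= (1 - alpha/beta)^t * ||x_0 - x*||^2 on
# f(x) = 0.5 * x^T diag(2, 10) x, which is alpha=2 strongly convex
# and beta=10 smooth; its minimizer is x* = 0.
alpha, beta = 2.0, 10.0
H = np.diag([alpha, beta])
x = np.array([1.0, 1.0])
x0_dist_sq = np.dot(x, x)

for t in range(1, 51):
    x = x - (1.0 / beta) * (H @ x)        # x_{t+1} = x_t - (1/beta) * grad f(x_t)
    bound = (1 - alpha / beta) ** t * x0_dist_sq
    assert np.dot(x, x) <= bound + 1e-12  # the theorem's bound holds at step t
```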

The quantity α/β ∈ (0, 1] is called the “condition number” of f.

Example:

Take f(x) = (1/2)·(x_1² + 0.01·x_2²). Its Hessian is diag(1, 0.01), so f is β-smooth with β = 1 and α-strongly convex with α = 0.01.

The condition number of f is now α/β = 0.01, which is slow: the contraction factor per step is 1 − 0.01 = 0.99.
To make it faster we can do a change of variables y = (x_1, 0.1·x_2), so that in the new coordinates f becomes (1/2)·(y_1² + y_2²).
Then the condition number is 1 and gradient descent with step 1/β reaches the minimizer in a single step.
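The effect of the change of variables can be seen directly by running gradient descent on the quadratic before and after rescaling; the specific function f(x) = (1/2)(x_1² + 0.01·x_2²) and the step count are assumptions for this sketch:

```python
import numpy as np

def run_gd(hessian_diag, x0, num_steps):
    """Gradient descent with step 1/beta on f(x) = 0.5 * sum(d_i * x_i^2)."""
    d = np.asarray(hessian_diag, dtype=float)
    beta = d.max()                       # smoothness constant of this f
    x = np.asarray(x0, dtype=float)
    for _ in range(num_steps):
        x = x - (1.0 / beta) * (d * x)   # gradient of f is d * x, componentwise
    return x

x0 = np.array([1.0, 1.0])
x_slow = run_gd([1.0, 0.01], x0, num_steps=100)  # condition number 0.01
x_fast = run_gd([1.0, 1.0], x0, num_steps=100)   # after rescaling: condition number 1
# The ill-conditioned run shrinks the flat coordinate by only 0.99 per step,
# while the well-conditioned run jumps straight to the minimizer.
```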