It uses a descent with the following step:

$$x_{t+1} = x_t - \left[\nabla^2 f(x_t)\right]^{-1} \nabla f(x_t)$$

This is computationally heavy, as you need to compute (and invert) the Hessian at every step.
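As a concrete illustration (not from the notes), one Newton step can be sketched in Python; the quadratic objective, `grad`, and `hess` below are illustrative choices:

```python
import numpy as np

def newton_step(x, grad, hess):
    """One Newton update: x - [Hessian f(x)]^{-1} grad f(x)."""
    # Solve H d = grad f(x) rather than forming the inverse explicitly.
    d = np.linalg.solve(hess(x), grad(x))
    return x - d

# Illustrative quadratic: f(x) = 0.5 x^T A x - b^T x, minimised where A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
grad = lambda x: A @ x - b
hess = lambda x: A  # constant Hessian for a quadratic

x = newton_step(np.zeros(2), grad, hess)
# For a quadratic, a single Newton step lands exactly on the minimiser.
print(x)
```

For a quadratic objective the second-order Taylor model is exact, which is why one step suffices here; on a general function each step only solves the local quadratic model.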

Why that step:

Around the current point $x_t$, approximate $f$ by its second-order Taylor expansion:

$$f(x) \approx f(x_t) + \nabla f(x_t)^\top (x - x_t) + \tfrac{1}{2}(x - x_t)^\top \nabla^2 f(x_t)(x - x_t)$$

This is some quadratic in $x$ - we find the minimum of it.
Take the gradient and set it to zero to find the minimum:

$$\nabla f(x_t) + \nabla^2 f(x_t)(x - x_t) = 0$$

Setting this to zero and solving for $x$ gives the Newton step $x_{t+1} = x_t - \left[\nabla^2 f(x_t)\right]^{-1} \nabla f(x_t)$.
In 1 dimension, Newton's method is a way of solving $g(x) = 0$:

$$x_{t+1} = x_t - \frac{g(x_t)}{g'(x_t)}$$

Let $g = f'$. Then the iteration finds a stationary point of $f$, which for convex $f$ is the minimiser.
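The 1-dimensional root-finding iteration above can be sketched as follows (the example function $g(x) = x^2 - 2$ is an illustrative choice, with positive root $\sqrt{2}$):

```python
def newton_1d(g, g_prime, x, steps=10):
    """Iterate x <- x - g(x)/g'(x) to solve g(x) = 0."""
    for _ in range(steps):
        x = x - g(x) / g_prime(x)
    return x

# Solve x^2 - 2 = 0 starting from x = 1.
root = newton_1d(lambda x: x * x - 2.0, lambda x: 2.0 * x, x=1.0)
print(root)  # converges to sqrt(2) ~ 1.41421356...
```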

Theorem:

Let $f$ be an $\alpha$-strongly convex function.
Let $\nabla^2 f$ be $\beta$-Lipschitz.
Then Newton's method satisfies the bound:

$$\|x_{t+1} - x^*\| \le \frac{\beta}{2\alpha} \|x_t - x^*\|^2$$

Remark:

This is exponentially faster than gradient descent: the error is squared at every step, so the number of correct digits roughly doubles per iteration, whereas gradient descent only shrinks the error by a constant factor per iteration.
But each step needs $O(d^2)$ memory to store the Hessian (and $O(d^3)$ time to invert it), so it is rarely used in high dimensions.
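The gap in convergence speed is easy to see numerically. This is a small sketch (not from the notes) comparing both methods on $f(x) = \cosh(x)$, which is strongly convex on bounded sets with minimum at $0$; the gradient-descent step size $0.5$ is an illustrative choice:

```python
import math

x_newton, x_gd = 1.0, 1.0
for t in range(5):
    # Newton: x - f'(x)/f''(x) = x - sinh(x)/cosh(x) = x - tanh(x)
    x_newton = x_newton - math.tanh(x_newton)
    # Gradient descent: x - eta * f'(x) with eta = 0.5
    x_gd = x_gd - 0.5 * math.sinh(x_gd)
    print(t, abs(x_newton), abs(x_gd))
```

After five iterations the Newton iterate is at machine precision, while gradient descent has only shrunk the error by a constant factor each step.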

Proof:

NON EXAMINABLE!!!