Understanding the optima of non-convex functions is a key problem in optimization, of both theoretical and practical importance. In general, optimizing a non-convex function is NP-hard. The difficulty arises for two reasons: finding the global solution among many local solutions, and the presence of saddle points. First-order methods (which rely only on first-order derivatives) have been employed to tackle the problem, but they are prone to getting stuck at saddle points, since $\nabla f(x^*) = 0$ when $x^*$ is a saddle point. In this thesis we review recent work on this problem, in particular by Jason D. Lee et al. (“First-order Methods Almost Always Avoid Saddle Points”). The authors study non-convex functions satisfying a mild condition called the “strict saddle property” and show that first-order methods applied to such functions almost always escape saddle points. Along the way, we will see how the Stable Manifold Theorem guarantees that the set of starting points (for first-order methods) that get trapped near saddle points has measure zero, which strengthens the intuition behind the strict saddle property. Finally, we discuss the problem of tensor decomposition, which is inherently non-convex, and show how it satisfies the strict saddle property.
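As a minimal illustrative sketch (not taken from the thesis), the classic quadratic $f(x, y) = x^2 - y^2$ has a strict saddle at the origin: the gradient vanishes there, yet the Hessian has a negative eigenvalue. Running plain gradient descent on it shows both behaviors described above: an initialization lying exactly on the stable manifold $\{y = 0\}$ converges to the saddle, while a generic (slightly perturbed) initialization moves away from it. The helper names `grad_f` and `gradient_descent` are hypothetical, chosen only for this toy example.

```python
import numpy as np

# Toy example: f(x, y) = x^2 - y^2 has a strict saddle at the origin,
# since its Hessian diag(2, -2) has a negative eigenvalue.
def grad_f(p):
    x, y = p
    return np.array([2.0 * x, -2.0 * y])

def gradient_descent(p0, step=0.1, iters=100):
    """Plain gradient descent with a fixed step size."""
    p = np.array(p0, dtype=float)
    for _ in range(iters):
        p = p - step * grad_f(p)
    return p

# Starting exactly on the stable manifold {y = 0}: iterates converge to the saddle (0, 0).
print(gradient_descent([1.0, 0.0]))

# A generic start (measure-zero set avoided): the y-coordinate grows and the iterates escape the saddle.
print(gradient_descent([1.0, 1e-6]))
```

The second run illustrates the measure-zero statement: only the lower-dimensional stable manifold leads gradient descent into the saddle, so a randomly chosen starting point avoids it almost surely.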

Master’s Thesis PDF, Slides