
Sharp Analysis for Nonconvex SGD Escaping from Saddle Points

In this paper, we prove that the simplest Stochastic Gradient Descent (SGD) algorithm is able to efficiently escape from saddle points and find an (ε, O(ε^0.5))-approximate second-order stationary point in Õ(ε^-3.5) stochastic gradient …
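Informally, an (ε, O(ε^0.5))-approximate second-order stationary point is a point x where the gradient is small, ‖∇f(x)‖ ≤ ε, and the Hessian has no strongly negative curvature, λ_min(∇²f(x)) ≥ -O(ε^0.5). Below is a minimal Python sketch (not the paper's analysis) illustrating the idea on a toy nonconvex function with a strict saddle at the origin: plain SGD with additive gradient noise is run until it reaches a point satisfying this condition. The test function, step size, noise level, and stopping rule are illustrative assumptions, not quantities from the paper.

```python
# A minimal sketch: plain SGD with bounded gradient noise on
# f(x) = (x1^2 - 1)^2 + x2^2, which has a strict saddle at the origin
# and minima at (+-1, 0). All hyperparameters below are assumptions
# chosen for illustration only.
import numpy as np

def grad(x):
    return np.array([4.0 * x[0] * (x[0] ** 2 - 1.0), 2.0 * x[1]])

def hess(x):
    return np.array([[12.0 * x[0] ** 2 - 4.0, 0.0],
                     [0.0, 2.0]])

def is_second_order_stationary(x, eps):
    # (eps, O(sqrt(eps)))-approximate second-order stationarity:
    # small gradient and smallest Hessian eigenvalue above -sqrt(eps).
    grad_small = np.linalg.norm(grad(x)) <= eps
    lam_min = np.linalg.eigvalsh(hess(x))[0]
    return grad_small and lam_min >= -np.sqrt(eps)

def sgd(x0, eta=0.05, sigma=0.05, eps=1e-2, max_iters=20000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for t in range(max_iters):
        # Stochastic gradient = exact gradient + bounded-variance noise;
        # the noise is what pushes the iterate off the saddle.
        g = grad(x) + sigma * rng.standard_normal(2)
        x = x - eta * g
        if is_second_order_stationary(x, eps):
            return x, t
    return x, max_iters

x_star, iters = sgd(x0=[0.0, 0.0])  # start exactly at the saddle
print(f"reached {x_star} after {iters} stochastic gradient steps")
```

Starting exactly at the saddle, the gradient is zero, but the noise in the stochastic gradients perturbs the iterate into the negative-curvature direction, after which ordinary descent carries it to a neighborhood of one of the minima, where the approximate second-order condition holds.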