On the Lower Bound of Minimizing Polyak-Łojasiewicz Functions

Abstract

The Polyak-Łojasiewicz (PL) condition [Polyak, 1963] is weaker than strong convexity but suffices to ensure global convergence of the Gradient Descent algorithm. In this paper, we study the lower bound for algorithms that use first-order oracles to find an approximate optimal solution. We show that any first-order algorithm requires at least $\Omega\left(\frac{L}{\mu}\log\frac{1}{\varepsilon}\right)$ gradient queries to find an $\varepsilon$-approximate optimal solution of a general $L$-smooth function with PL constant $\mu$. This result demonstrates the optimality of the Gradient Descent algorithm for minimizing smooth PL functions, in the sense that there exists a "hard" PL function on which no first-order algorithm can be faster than Gradient Descent, up to a numerical constant. In contrast, it is well known that the momentum technique, e.g. [Nesterov, 2003, chap. 2], provably accelerates Gradient Descent to $O\left(\sqrt{\frac{L}{\hat{\mu}}}\log\frac{1}{\varepsilon}\right)$ gradient queries for functions that are $L$-smooth and $\hat{\mu}$-strongly convex. Therefore, our result separates the hardness of minimizing a smooth PL function from that of a smooth strongly convex function: the complexity of the former cannot be improved by any polynomial order in general.
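As context for the upper bound that this lower bound matches, here is a short LaTeX sketch of the classical argument, going back to [Polyak, 1963], that Gradient Descent converges linearly under the PL condition. The step size $1/L$ and the constants below are the standard textbook choices, included for illustration; they are not claims specific to this paper.

% Standard linear rate of Gradient Descent under the PL condition.
\begin{align*}
  &\text{PL condition with constant } \mu > 0:\quad
    \tfrac{1}{2}\,\|\nabla f(x)\|^2 \ \ge\ \mu\,\bigl(f(x) - f^*\bigr)
    \quad \text{for all } x, \\
  &\text{where } f^* \text{ is the minimum value of } f. \text{ For Gradient Descent, }
    x_{k+1} = x_k - \tfrac{1}{L}\,\nabla f(x_k), \\
  &\text{the } L\text{-smoothness descent lemma and then the PL condition give} \\
  &\qquad f(x_{k+1}) - f^*
    \ \le\ f(x_k) - f^* - \tfrac{1}{2L}\,\|\nabla f(x_k)\|^2
    \ \le\ \Bigl(1 - \tfrac{\mu}{L}\Bigr)\bigl(f(x_k) - f^*\bigr), \\
  &\text{so } f(x_k) - f^* \le \varepsilon \text{ after }
    O\!\left(\tfrac{L}{\mu}\log\tfrac{1}{\varepsilon}\right) \text{ iterations,}
    \text{ matching the lower bound above.}
\end{align*}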

Publication
Conference on Learning Theory