Cong Fang

fangcong AT pku.edu.cn

Assistant Professor

Peking University


I will be an Assistant Professor at Peking University soon. I am currently a postdoctoral researcher at the University of Pennsylvania, hosted by Prof. Weijie Su and Prof. Qi Long. Before that, I was a postdoctoral researcher at Princeton University in 2019, hosted by Prof. Jason D. Lee. I also work closely with Prof. Tong Zhang at HKUST. I received my Ph.D. from Peking University, advised by Prof. Zhouchen Lin.

I work on the foundations of machine learning. My research interests are broadly in machine learning algorithms and theory. I am hiring self-motivated Ph.D. students and interns with a strong interest in machine learning theory to work with me (at PKU). Please find more information about me and the positions on my homepage: https://congfang-ml.github.io/.


  • Machine Learning
  • Optimization


  • Ph.D. in Computer Engineering, 2014-2019

    Peking University

Publications @ZERO Lab

Training Neural Networks by Lifted Proximal Operator Machines. TPAMI, 2020.

We present the lifted proximal operator machine (LPOM) to train fully-connected feed-forward neural networks. LPOM represents the …

Decentralized Accelerated Gradient Methods With Increasing Penalty Parameters. IEEE T. Signal Processing, 2020.

In this paper, we study the communication and (sub)gradient computation costs in distributed optimization and give a sharp complexity …

Accelerated First-Order Optimization Algorithms for Machine Learning. Proceedings of the IEEE, 2020.

Numerical optimization serves as one of the pillars of machine learning. To meet the demands of big data applications, lots of efforts …

Lifted Proximal Operator Machines. AAAI, 2019.

By rewriting the activation function as an equivalent proximal operator, we approximate a feed-forward neural network by adding the …
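As a concrete instance of this rewriting, the ReLU activation coincides with the proximal operator of the indicator function of the nonnegative orthant (i.e., Euclidean projection onto it). A minimal numerical check of this standard fact (not code from the paper):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def prox_nonneg(x):
    # prox of the indicator of {u >= 0}: argmin_{u >= 0} 0.5*(u - x)^2,
    # i.e., the Euclidean projection onto the nonnegative orthant.
    return np.clip(x, 0.0, None)

x = np.linspace(-3.0, 3.0, 13)
print(np.allclose(relu(x), prox_nonneg(x)))  # the two maps coincide
```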

Sharp Analysis for Nonconvex SGD Escaping from Saddle Points. COLT, 2019.

In this paper, we prove that the simplest Stochastic Gradient Descent (SGD) algorithm is able to efficiently escape from saddle points …

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator. NIPS, 2018.

We propose a new technique named Stochastic Path-Integrated Differential EstimatoR (Spider), which can be used to track many …
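The core of SPIDER is a recursive gradient estimator, v_k = ∇f_S(x_k) − ∇f_S(x_{k−1}) + v_{k−1}, refreshed periodically with a full gradient. A minimal sketch on a least-squares finite sum (the step size, epoch length, and batch size below are illustrative choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad(x, idx):
    # Gradient of (1/|idx|) * sum_i 0.5 * (a_i^T x - b_i)^2 over a sample idx.
    r = A[idx] @ x - b[idx]
    return A[idx].T @ r / len(idx)

x = np.zeros(d)
v = grad(x, np.arange(n))          # full gradient to start the epoch
eta, q, batch = 0.05, 20, 10       # illustrative step size / epoch length / batch

for k in range(1, 201):
    x_prev, x = x, x - eta * v
    if k % q == 0:
        v = grad(x, np.arange(n))  # periodic full-gradient refresh
    else:
        S = rng.choice(n, size=batch, replace=False)
        # SPIDER recursion: v_k = grad_S(x_k) - grad_S(x_{k-1}) + v_{k-1}
        v = grad(x, S) - grad(x_prev, S) + v

full = grad(x, np.arange(n))
print(np.linalg.norm(full))        # near-stationary point
```

Because consecutive iterates are close, the minibatch gradient difference has small variance, which is what drives the estimator's near-optimal complexity.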

Faster and Non-ergodic O(1/K) Stochastic Alternating Direction Method of Multipliers. NIPS, 2017.

We propose a new stochastic ADMM that integrates Nesterov's extrapolation and variance reduction (VR) techniques.

Feature Learning via Partial Differential Equation with Applications to Face Recognition. PR, 2017.

We propose a novel Partial Differential Equation (PDE) based method for feature learning. The feature learned by our PDE is …

Parallel Asynchronous Stochastic Variance Reduction for Nonconvex Optimization. AAAI, 2017.

We propose the Asynchronous Stochastic Variance Reduced Gradient (ASVRG) algorithm for nonconvex finite-sum problems.

A Robust Hybrid Method for Text Detection in Natural Scenes by Learning-based Partial Differential Equations. Neurocomputing, 2015.

We present a robust hybrid method that uses learning-based PDEs for detecting texts from natural scene images.