Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance. However, constructing equivariant neural networks typically requires prior knowledge of data types and symmetries, which is …
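To make the symmetry requirement concrete, here is a minimal NumPy sketch (with a hypothetical toy map `f`, not any model from the paper) that checks rotation equivariance numerically: a map f is equivariant if acting with a group element g on the input and then applying f equals applying f first and g after.

```python
import numpy as np

def rotation_2d(theta):
    """2-D rotation matrix for angle theta (a group action of SO(2))."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def f(points):
    """Toy equivariant map: scale each point by its distance to the origin.
    Norms are rotation-invariant, so f(Rx) = R f(x) holds exactly."""
    norms = np.linalg.norm(points, axis=-1, keepdims=True)
    return points * norms

points = np.random.randn(5, 2)   # a small 2-D point cloud
R = rotation_2d(np.pi / 3)

lhs = f(points @ R.T)            # act on the input, then apply f
rhs = f(points) @ R.T            # apply f, then act on the output
print(np.allclose(lhs, rhs))     # True: f is SO(2)-equivariant
```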
Irreducible Cartesian tensors (ICTs) play a crucial role in the design of equivariant graph neural networks, as well as in theoretical chemistry and chemical physics. Meanwhile, the design space of available linear operations on tensors that preserve …
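As a concrete illustration of the ICT concept (standard tensor theory, not the paper's construction): a rank-2 Cartesian tensor in 3-D decomposes into three irreducible parts under rotations, namely an isotropic (trace) part, an antisymmetric part, and a symmetric traceless part, with 9 = 1 + 3 + 5 degrees of freedom. A minimal NumPy sketch:

```python
import numpy as np

T = np.random.randn(3, 3)                      # a generic rank-2 Cartesian tensor

iso = np.trace(T) / 3 * np.eye(3)              # l = 0: isotropic (trace) part, 1 dof
antisym = (T - T.T) / 2                        # l = 1: antisymmetric part, 3 dof
sym_traceless = (T + T.T) / 2 - iso            # l = 2: symmetric traceless part, 5 dof

# The three parts transform irreducibly under rotations and recover T exactly.
print(np.allclose(T, iso + antisym + sym_traceless))   # True
```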
This paper studies accelerated gradient methods for nonconvex optimization with Lipschitz continuous gradient and Hessian. We propose two simple accelerated gradient methods, restarted accelerated gradient descent (AGD) and restarted heavy ball (HB) …
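A minimal sketch of the restarted-AGD idea, assuming a simplified restart rule (reset the momentum whenever the iterates have drifted farther than a radius B from the last restart point; the paper's exact condition and parameter choices may differ):

```python
import numpy as np

def restarted_agd(grad, x0, eta=0.1, theta=0.9, B=1.0, iters=1000):
    """Sketch of restarted accelerated gradient descent for nonconvex problems.
    grad: gradient oracle; eta: step size; theta: momentum; B: restart radius.
    Momentum is reset whenever the iterates drift more than B from the last
    restart point (a simplified stand-in for the paper's restart condition)."""
    x, x_prev, anchor = x0.copy(), x0.copy(), x0.copy()
    for _ in range(iters):
        y = x + theta * (x - x_prev)        # momentum (extrapolation) step
        x_prev, x = x, y - eta * grad(y)    # gradient step at the extrapolated point
        if np.linalg.norm(x - anchor) > B:  # moved too far: restart momentum
            x_prev, anchor = x.copy(), x.copy()
    return x

# Example on a simple nonconvex objective f(x) = sum(x^2) + sum(sin(x)):
x_star = restarted_agd(lambda x: 2 * x + np.cos(x), np.ones(10))
```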
Spiking Neural Networks (SNNs) are a promising energy-efficient neural architecture when implemented on neuromorphic hardware. The Artificial Neural Network (ANN)-to-SNN conversion method, which is the most effective SNN training method, has …
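To sketch the conversion idea (a generic rate-based, threshold-balancing scheme, not necessarily this paper's exact method): ReLU activations of a trained ANN are mapped to firing rates of integrate-and-fire neurons, with the firing threshold `v_th` scaled to the layer's maximum pre-activation so that rates stay in [0, 1].

```python
import numpy as np

def if_layer(inp_rates, W, v_th, T=100):
    """Simulate one integrate-and-fire layer for T timesteps and return
    output spike rates approximating ReLU(W @ inp_rates) / v_th."""
    v = np.zeros(W.shape[0])          # membrane potentials
    spikes = np.zeros(W.shape[0])
    for _ in range(T):
        v += W @ inp_rates            # integrate weighted input current
        fired = v >= v_th             # fire where the threshold is reached
        v[fired] -= v_th              # soft reset (subtract threshold)
        spikes += fired
    return spikes / T                 # firing rates over the time window

# Threshold balancing: set v_th to the maximum pre-activation observed on
# held-out data, so rates approximate the ANN's (normalized) ReLU outputs.
```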
We study stochastic decentralized optimization for the problem of training machine learning models with large-scale distributed data. We extend the widely used EXTRA and DIGing methods with variance reduction (VR), and propose two methods: VR-EXTRA …
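As background for the VR component, here is an SVRG-style estimator, one standard instantiation of variance reduction (the paper's estimator and its coupling with the decentralized updates may differ): each agent replaces its full local gradient with a cheap stochastic estimate corrected by a periodically refreshed snapshot.

```python
import numpy as np

def svrg_gradient(grads, x, x_snap, full_grad_snap, rng):
    """SVRG-style variance-reduced gradient estimator.
    grads: per-sample gradient functions on one agent's local data;
    x_snap, full_grad_snap: snapshot point and its full local gradient.
    The estimator is unbiased, and its variance vanishes as x -> x_snap."""
    i = rng.integers(len(grads))      # sample one local data point
    return grads[i](x) - grads[i](x_snap) + full_grad_snap

rng = np.random.default_rng(0)
grads = [lambda x, a=a: 2 * (x - a) for a in np.linspace(-1, 1, 20)]
x_snap = np.zeros(3)
full = np.mean([g(x_snap) for g in grads], axis=0)
g_vr = svrg_gradient(grads, np.ones(3), x_snap, full, rng)
```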
We present the lifted proximal operator machine (LPOM) to train fully-connected feed-forward neural networks. LPOM represents the activation function as an equivalent proximal operator and adds the proximal operators to the objective function of a …
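The key identity behind this reformulation can be stated as follows (for a non-decreasing, invertible activation $\varphi$; this is the standard proximal rewriting that the abstract refers to, written from the general definition of the proximal operator):

\[
\varphi(x) \;=\; \operatorname{prox}_f(x) \;=\; \arg\min_{u}\, f(u) + \tfrac{1}{2}(u - x)^2,
\qquad\text{where } f(u) = \int_0^{u} \big(\varphi^{-1}(t) - t\big)\, dt,
\]

since the optimality condition $f'(u) + u - x = 0$ gives $\varphi^{-1}(u) = x$, i.e. $u = \varphi(x)$. Adding such terms to the training objective turns each layer's activation constraint into a penalty that can be minimized blockwise.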
In this paper, we study the communication and (sub)gradient computation costs in distributed optimization and give a sharp complexity analysis for the proposed distributed accelerated gradient methods. We present two algorithms based on the framework …
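To illustrate the two costs being traded off (a generic multi-consensus pattern, not the paper's algorithms): each iteration performs one local (sub)gradient computation and several rounds of gossip communication with the mixing matrix W, so the two complexities can be tuned separately.

```python
import numpy as np

def decentralized_gd(grads, W, x, alpha=0.1, gossip_rounds=3, iters=200):
    """Sketch of decentralized gradient descent with multiple gossip rounds.
    grads: per-agent gradient oracles; W: doubly stochastic mixing matrix;
    x: (n_agents, dim) local iterates. Each iteration costs one gradient
    computation per agent and `gossip_rounds` communications."""
    for _ in range(iters):
        g = np.stack([grads[i](x[i]) for i in range(len(grads))])
        x = x - alpha * g                 # one local (sub)gradient step
        for _ in range(gossip_rounds):    # several cheap averaging steps
            x = W @ x                     # mix with neighbors only
    return x
```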
Numerical optimization serves as one of the pillars of machine learning. To meet the demands of big data applications, much effort has been devoted to designing theoretically and practically fast algorithms. This paper provides a comprehensive …
EXTRA is a popular method for decentralized distributed optimization and has broad applications. This paper revisits EXTRA. First, we give a sharp complexity analysis for EXTRA with the improved $O\left(\left(\frac{L}{\mu}+\frac{1}{1-\sigma_2(W)}\right)\log\frac{1}{\epsilon(1-\sigma_2(W))}\right)$ communication and …
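For reference, the EXTRA iteration itself (as introduced by Shi et al. and analyzed here) uses two mixing matrices, typically $W$ and $\tilde{W} = (I + W)/2$. A minimal NumPy sketch:

```python
import numpy as np

def extra(grads, W, x0, alpha=0.1, iters=500):
    """EXTRA: x^{k+2} = (I + W) x^{k+1} - W~ x^k - alpha (g(x^{k+1}) - g(x^k)),
    with W~ = (I + W) / 2 and first step x^1 = W x^0 - alpha g(x^0).
    grads: per-agent gradient oracles; x0 has shape (n_agents, dim)."""
    n = W.shape[0]
    W_tilde = (np.eye(n) + W) / 2
    g = lambda x: np.stack([grads[i](x[i]) for i in range(n)])
    x_prev = x0
    x = W @ x0 - alpha * g(x0)                 # first step: plain decentralized GD
    for _ in range(iters):
        x_next = (np.eye(n) + W) @ x - W_tilde @ x_prev \
                 - alpha * (g(x) - g(x_prev))  # gradient-difference correction
        x_prev, x = x, x_next
    return x
```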