Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Alon Brutzkus, Amir Globerson
ICML (2017)

Abstract

Deep learning models are often trained successfully using gradient descent, despite the worst-case hardness of the underlying non-convex optimization problem. The key question is then under what conditions one can prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer, a convolutional structure with no overlap, and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.
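To make the setup described in the abstract concrete, the following is a minimal sketch (not code from the paper) of the no-overlap architecture: a single filter applied to disjoint patches of a Gaussian input, followed by a ReLU and average pooling, trained by plain gradient descent to recover a teacher filter w*. The dimensions, step size, and restart count are illustrative assumptions, and the finite Gaussian sample here stands in for the population loss analyzed in the paper.

```python
import numpy as np

def forward(X, w):
    """No-overlap ConvNet: one filter w applied to disjoint patches, ReLU, then average."""
    patches = X.reshape(X.shape[0], -1, len(w))      # (n, k, d) non-overlapping patches
    return np.maximum(patches @ w, 0.0).mean(axis=1)

def loss_and_grad(X, y, w):
    """Mean squared loss against targets y and its (sub)gradient with respect to w."""
    patches = X.reshape(X.shape[0], -1, len(w))      # (n, k, d)
    pre = patches @ w                                # pre-activations, shape (n, k)
    err = np.maximum(pre, 0.0).mean(axis=1) - y      # prediction errors, shape (n,)
    # ReLU subgradient: only patches with a positive pre-activation contribute.
    per_sample = ((pre > 0)[..., None] * patches).mean(axis=1)    # (n, d)
    return np.mean(err ** 2), 2.0 * (err[:, None] * per_sample).mean(axis=0)

def train(X, y, d, lr=0.2, steps=2000, seed=0):
    """Plain gradient descent with a fixed step size from a random initialization."""
    rng = np.random.default_rng(seed)
    w = 0.1 * rng.normal(size=d)
    for _ in range(steps):
        loss, grad = loss_and_grad(X, y, w)
        w -= lr * grad
    return w, loss

rng = np.random.default_rng(0)
d, k, n = 5, 8, 5000                    # filter size, number of patches, sample size (assumed)
w_star = rng.normal(size=d)             # ground-truth "teacher" filter
X = rng.normal(size=(n, k * d))         # Gaussian inputs, as in the paper's setting
y = forward(X, w_star)                  # noiseless targets produced by the teacher network

# The paper's guarantee is probabilistic over the random initialization, so a
# practical run restarts from a few seeds and keeps the best result.
w, loss = min((train(X, y, d, seed=s) for s in range(3)), key=lambda t: t[1])
print(f"final loss {loss:.2e}, distance to w*: {np.linalg.norm(w - w_star):.3f}")
```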
