Neural conditional gradients

Schramowski, Patrick; Bauckhage, Christian; Kersting, Kristian

2018

Paper (Preprint, Research Paper, Review Paper, White Paper, etc.)

Abstract

The move from hand-designed to learned optimizers in machine learning has been quite successful for both gradient-based and gradient-free optimizers. When facing a constrained problem, however, maintaining feasibility typically requires a projection step, which might be computationally expensive and not differentiable. We show how the design of projection-free convex optimization algorithms can be cast as a learning problem based on Frank-Wolfe Networks: recurrent networks implementing the Frank-Wolfe algorithm, also known as conditional gradients. This allows them to learn to exploit structure when, e.g., optimizing over rank-1 matrices. Our LSTM-learned optimizers outperform hand-designed as well as learned but unconstrained ones. We demonstrate this for training support vector machines and softmax classifiers.
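For readers unfamiliar with the projection-free idea, the following is a minimal sketch of the classical, hand-designed Frank-Wolfe iteration, not the learned Frank-Wolfe Network described in the abstract. The feasible set (the probability simplex), the toy quadratic objective, the step size 2/(t+2), and the names `frank_wolfe_simplex`, `b`, and `x_star` are all assumptions chosen for illustration; none of them come from the paper.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, n_steps=200):
    """Classical Frank-Wolfe over the probability simplex (illustrative only)."""
    x = np.asarray(x0, dtype=float).copy()
    for t in range(n_steps):
        g = grad(x)
        # Linear minimization oracle: over the simplex, <g, s> is minimized
        # at the vertex belonging to the smallest gradient coordinate.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        # Standard diminishing step size (an assumption, not a learned step).
        gamma = 2.0 / (t + 2.0)
        # A convex combination of feasible points stays feasible,
        # so no projection step is needed.
        x = (1.0 - gamma) * x + gamma * s
    return x

# Toy problem (hypothetical): minimize 0.5 * ||x - b||^2 over the simplex.
b = np.array([0.1, 0.6, 0.3])
x_star = frank_wolfe_simplex(lambda x: x - b, np.array([1.0, 0.0, 0.0]))
```

Each iterate is a convex combination of simplex vertices, which is why feasibility holds without projecting. A learned variant would, roughly speaking, replace the fixed update rule with the output of a recurrent network.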

Author(s)

Schramowski, Patrick

TU Darmstadt

Bauckhage, Christian

Fraunhofer-Institut für Intelligente Analyse- und Informationssysteme IAIS

Kersting, Kristian

TU Darmstadt
