We consider adversarial online learning in a non-convex setting
under the assumption that the learner has access to an offline
optimization oracle. In the most general unstructured setting of
prediction with expert advice, Hazan and Koren (2016)...
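The truncation above omits the construction itself; for concreteness, here is a minimal Follow-the-Perturbed-Leader sketch in the spirit of such oracle-based online learners. The offline_oracle interface, the exponential perturbation, and all names are illustrative assumptions, not the abstract's actual algorithm.

```python
import numpy as np

def ftpl(offline_oracle, losses, dim, eta=1.0, seed=0):
    """Follow-the-Perturbed-Leader via an offline optimization oracle.

    offline_oracle(f) is assumed to return an (approximate) minimizer of
    the objective f over the decision set; losses is the list of per-round
    loss functions, each revealed only after that round's play.
    """
    rng = np.random.default_rng(seed)
    past, plays = [], []
    for l_t in losses:
        sigma = rng.exponential(scale=eta, size=dim)  # fresh perturbation
        # Perturbed cumulative loss, handed off to the offline oracle.
        obj = lambda x, fs=tuple(past), s=sigma: sum(f(x) for f in fs) - s @ x
        plays.append(offline_oracle(obj))
        past.append(l_t)  # l_t becomes visible only after the play
    return plays

# Toy usage: decision set {0, 1}, brute-force oracle, linear losses.
oracle = lambda f: min((np.array([v]) for v in (0.0, 1.0)), key=f)
losses = [lambda x, c=c: c * x[0] for c in (1.0, -1.0, 1.0)]
print(ftpl(oracle, losses, dim=1))
```

The point of the template: the learner never solves the online problem directly; each round it perturbs the cumulative loss with fresh random noise and hands the resulting offline problem to the oracle.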
Deep learning builds upon the mysterious ability of
gradient-based methods to solve related non-convex optimization
problems. However, a complete theoretical understanding is missing
even in the simpler setting of training a deep linear neural...
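To make the "simpler setting" concrete: a deep linear network composes weight matrices with no nonlinearities, so the function it computes is linear in its input, yet the training objective is non-convex in the weights. A minimal sketch, with dimensions, step size, and data as arbitrary illustrative choices:

```python
import numpy as np

# Depth-2 linear network W2 @ W1 trained by plain gradient descent on
# squared loss; all sizes and constants are illustrative.
rng = np.random.default_rng(0)
d, n = 5, 100
X = rng.standard_normal((d, n))
Y = rng.standard_normal((d, d)) @ X           # targets from a linear map

W1 = 0.1 * rng.standard_normal((d, d))
W2 = 0.1 * rng.standard_normal((d, d))
lr = 1e-2
for _ in range(3000):
    R = W2 @ W1 @ X - Y                       # residual
    g2 = R @ (W1 @ X).T / n                   # dL/dW2 by the chain rule
    g1 = W2.T @ R @ X.T / n                   # dL/dW1
    W2, W1 = W2 - lr * g2, W1 - lr * g1
print("loss:", 0.5 * np.mean((W2 @ W1 @ X - Y) ** 2))
```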
In modern “Big Data” applications, structured learning is the
most widely employed methodology. Within this paradigm, the
fundamental challenge lies in developing practical, effective
algorithmic inference methods. Often (e.g., deep learning)...
We revisit the question of reducing online learning to
approximate optimization of the offline problem. In this setting,
we give two algorithms with near-optimal performance under full
information: they guarantee optimal regret and
require...
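The guarantee itself is cut off above; for orientation, the standard benchmark in this line of work, where the learner only has an approximate offline oracle, is alpha-regret. In generic notation (the symbols below are ours, not the paper's):

```latex
% alpha-regret: compete with alpha times the best fixed decision,
% since the offline oracle itself is only an alpha-approximation.
\[
  \mathrm{Regret}_{\alpha}(T)
    = \sum_{t=1}^{T} \ell_t(x_t)
      - \alpha \cdot \min_{x \in \mathcal{X}} \sum_{t=1}^{T} \ell_t(x),
\]
% with alpha = 1 recovering the usual notion of regret.
```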
Datasets are often used multiple times, with each successive
analysis depending on the outcomes of previous analyses on the same
dataset. Standard techniques for ensuring generalization and
statistical validity do not account for this adaptive...
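A well-known answer in this literature is the reusable holdout (Thresholdout) of Dwork et al. (2015): release a training-set estimate whenever it agrees with the holdout, and spend holdout information only on disagreements. A minimal sketch of that idea; the threshold and noise scale here are illustrative, not the calibrated values from the paper.

```python
import numpy as np

def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01, seed=0):
    """Answer a sequence of statistic queries while protecting the holdout.

    train_vals[i] / holdout_vals[i] are the i-th statistic evaluated on
    the training and holdout sets.  When the two agree (up to noise), only
    the training estimate is released, so the holdout is barely consumed.
    """
    rng = np.random.default_rng(seed)
    answers = []
    for tr, ho in zip(train_vals, holdout_vals):
        if abs(tr - ho) > threshold + rng.laplace(scale=sigma):
            # Disagreement: answer with a noisy holdout estimate.
            answers.append(ho + rng.laplace(scale=sigma))
        else:
            # Agreement: the training estimate already generalizes.
            answers.append(tr)
    return answers
```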
Three fundamental factors determine the quality of a statistical
learning algorithm: expressiveness, optimization and
generalization. The classic strategy for handling these factors is
relatively well understood. In contrast, the radically
different...
We consider the problem of adversarial (non-stochastic) online
learning with partial information feedback, where, at each round, a
decision maker selects an action from a finite set of alternatives.
We develop a black-box approach for such problems...
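For orientation, the classical baseline in this setting is EXP3, which handles adversarial partial feedback over K actions through importance-weighted loss estimates. A minimal sketch; the get_loss interface and the fixed learning rate are illustrative assumptions.

```python
import numpy as np

def exp3(get_loss, K, T, eta=0.1, seed=0):
    """EXP3 for adversarial bandits: only the chosen action's loss is seen.

    get_loss(t, a) returns the loss in [0, 1] of action a at round t and
    stands in for the environment -- an assumed interface for this sketch.
    """
    rng = np.random.default_rng(seed)
    weights = np.ones(K)
    actions = []
    for t in range(T):
        probs = weights / weights.sum()
        a = rng.choice(K, p=probs)
        loss = get_loss(t, a)
        # Importance-weighted estimate: unbiased for every action's loss.
        est = np.zeros(K)
        est[a] = loss / probs[a]
        weights *= np.exp(-eta * est)
        actions.append(a)
    return actions
```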
Conventional wisdom in deep learning states that increasing
depth improves expressiveness but complicates optimization. In this
talk I will argue that, sometimes, increasing depth can speed up
optimization.
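A toy way to see why this is even possible (an illustrative experiment, not the talk's actual analysis): overparameterize a one-dimensional linear model w as a product w1 * w2. Gradient descent on the product follows different dynamics, because each factor's gradient is rescaled by its partner, and depending on initialization and loss this rescaling can speed up or slow down convergence; that dependence is precisely what the talk examines.

```python
import numpy as np

# Fit y = 2x by gradient descent, once on w directly and once on the
# depth-2 overparameterization w = w1 * w2 with the same initial product.
lr, T = 0.1, 50
x, y = 1.0, 2.0

w = 0.1
w1 = w2 = np.sqrt(0.1)
for _ in range(T):
    w -= lr * (w * x - y) * x                     # depth-1 gradient step
    r = (w1 * w2 * x - y) * x                     # shared residual term
    w1, w2 = w1 - lr * r * w2, w2 - lr * r * w1   # product-rule gradients
print("depth-1:", w, " depth-2 product:", w1 * w2)
```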