Workshop on New Directions in Optimization, Statistics and Machine Learning
Deep equilibrium models via monotone operators
In this talk, I will first introduce our recent work on the Deep Equilibrium Model (DEQ). Instead of stacking nonlinear layers, as is common in deep learning, this approach finds the equilibrium point of the repeated iteration of a single nonlinear layer, then backpropagates through the layer directly using the implicit function theorem. The resulting method matches state-of-the-art performance in many domains while consuming much less memory, and can in theory express any "traditional" deep network with just a single layer. However, existing work on DEQs leaves open the question of whether these fixed points exist and are unique, and computes them largely through heuristic methods. In the second part of the talk, I will therefore introduce a new class of DEQ models based on monotone operator theory. I will illustrate how we can parameterize these networks so that they are guaranteed to have a unique equilibrium point, and show how to apply operator splitting methods to find these fixed points efficiently. Finally, I will close by highlighting some open challenges in these areas.
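To make the second part concrete, here is a minimal NumPy sketch of a single equilibrium layer z* = relu(W z* + U x + b) solved by forward-backward splitting. The abstract does not spell out a parameterization or a particular splitting scheme, so everything below is an assumption made for illustration: the weight form W = (1 - m) I - AᵀA + B - Bᵀ is one choice from the monotone operator DEQ literature (it makes I - W strongly monotone with parameter m > 0), and the names `monotone_weight`, `forward_backward_solve`, `A`, `B`, `m`, and `alpha` are hypothetical.

```python
import numpy as np


def monotone_weight(A, B, m):
    """Build W = (1 - m) I - A^T A + B - B^T (assumed parameterization).

    The symmetric part of I - W is m I + A^T A, whose eigenvalues are >= m,
    so I - W is strongly monotone for any real matrices A and B.
    """
    n = A.shape[1]
    return (1.0 - m) * np.eye(n) - A.T @ A + B - B.T


def forward_backward_solve(W, U, b, x, m, z0=None, tol=1e-10, max_iter=20000):
    """Find z with z = relu(W z + U x + b) by forward-backward splitting."""
    n = W.shape[0]
    # Lipschitz constant of the linear map z -> (I - W) z (spectral norm).
    L = np.linalg.norm(np.eye(n) - W, 2)
    # Any step size in (0, 2 m / L^2) gives convergence; pick the midpoint.
    alpha = m / L**2
    c = U @ x + b  # input injection, constant during the solve
    z = np.zeros(n) if z0 is None else np.asarray(z0, dtype=float).copy()
    for k in range(max_iter):
        # Forward step on the linear part, backward (proximal) step = ReLU.
        z_next = np.maximum(0.0, (1.0 - alpha) * z + alpha * (W @ z + c))
        if np.linalg.norm(z_next - z) <= tol:
            return z_next, k + 1
        z = z_next
    return z, max_iter


# Small usage example with random (hypothetical) parameters.
rng = np.random.default_rng(0)
n, d, m = 5, 3, 1.0
A = 0.3 * rng.standard_normal((n, n))
B = 0.3 * rng.standard_normal((n, n))
U = rng.standard_normal((n, d))
b = rng.standard_normal(n)
x = rng.standard_normal(d)

W = monotone_weight(A, B, m)
z_star, iters = forward_backward_solve(W, U, b, x, m)
residual = np.linalg.norm(z_star - np.maximum(0.0, W @ z_star + U @ x + b))
print(f"iterations: {iters}, fixed-point residual: {residual:.2e}")
```

Under this assumed parameterization the equilibrium is unique, so starting the iteration from a different `z0` should reach the same `z_star` up to the tolerance. Backpropagation through the equilibrium via the implicit function theorem is not shown here.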