Theoretical Machine Learning Seminar
Prediction with a Short Memory
We consider the problem of predicting the next observation given a sequence of past observations, and consider the extent to which accurate prediction requires complex algorithms that explicitly leverage long-range dependencies. Perhaps surprisingly, our positive results show that for a broad class of sequences, there is an algorithm that predicts well on average, and bases its predictions only on the most recent few observation together with a set of simple summary statistics of the past observations. This result applies to both sequences generated by HMMs and those generated by a general class of distributions, where the distribution over the sequences have long-range dependencies
This is joint work with: Vatsal Sharan, Percy Liang and Greg Valiant.