Computer Science/Discrete Mathematics Seminar I
Recent advances in high dimensional robust statistics
It is classically understood how to learn the parameters of a Gaussian even in high dimensions from independent samples. However, estimators like the sample mean are very fragile to noise. In particular, a single corrupted sample can arbitrarily distort the sample mean. More generally we would like to be able to estimate the parameters of a distribution even if a small fraction of the samples are corrupted, potentially adversarially. Classically algorithms for this problem have all fallen into one of two categories: those whose error depends polynomially on the dimension (like the coordinate-wise median), and those whose runtimes are exponential in the dimension (like the Tukey median). We discuss recent work to overcome this barrier, achieving dimension independent error in polynomial time.