Approximating large frequency moments with pick-and-drop sampling

Given data stream D={p1,p2,...,pm} of size m of numbers from {1,...,n}, the frequency of i is defined as fi=|{j:pj=i}|. The k-th frequency moment of D is defined as Fk=ni=1fki. We consider the problem of approximating frequency moments for k3. For any constant c we show an O(n12/klog(n)log(c)(n)) upper bound on the space complexity of the problem. Here log(c)(n) is the iterative log function. We make the following assumptions: n and m are polynomially far; approximation error ϵ and parameter k are constants. We observe a natural bijection between streams and special matrices. Our main technical contribution is a non-uniform sampling method on matrices. We call our method a *pick-and-drop sampling*; it samples a heavy element (i.e., element i with frequency Ω(Fk)) with probability Ω(1/n12/k) and gives approximation ~fi(1ϵ)fi. In addition, the estimations never exceed the real values, that is ~fjfj for all j. As a result, we reduce the space complexity of finding a heavy element to O(n12/klog(n)) bits. We apply our method of recursive sketches and resolve the problem with O(n12/klog(n)log(c)(n)) bits. This is joint work with Rafail Ostrovsky.

Date

Speakers

Vladimir Braverman

Affiliation

Johns Hopkins University