AMS Without 4-Wise Independence on Product Domains
Vladimir Braverman, Kai-Min Chung, Zhenming Liu, Michael Mitzenmacher, Rafail Ostrovsky
In their seminal work, Alon, Matias, and Szegedy introduced several sketching techniques, including showing that 4-wise independence is sufficient to obtain good approximations of the second frequency moment. In this work, we show that their sketching technique can be extended to product domains [n]^k by using the product of 4-wise independent functions on [n]. Our work extends that of Indyk and McGregor, who showed the result for k = 2. Their primary motivation was the problem of identifying correlations in data streams. In their model, a stream of pairs (i,j) \in [n]^2 arrive, giving a joint distribution (X,Y), and they find approximation algorithms for how close the joint distribution is to the product of the marginal distributions under various metrics, which naturally corresponds to how close X and Y are to being independent. By using our technique, we obtain a new result for the problem of approximating the \ell_2 distance between the joint distribution and the product of the marginal distributions for k-ary vectors, instead of just pairs, in a single pass. Our analysis gives a randomized algorithm that is a (1 \pm \epsilon) approximation (with probability 1-\delta) that requires space logarithmic in $n$ and m and proportional to 3^k.
Preliminary version in STACS 2010.
(This paper is the result of a merge. For historical reasons, and for slightly different proofs, see:
Vladimir Braverman, Rafail Ostrovsky Meassuring k-Wise Indepedence of Streaming Data, June 29, 2008; and
Vladimir Braverman, Rafail Ostrovsky AMS Without 4-Wise Independence on Product Domains, September 17, 2009.)
Fetch PostScript file of the paper Fetch PDF file of the paper
|Back to Publications List|