Rafail Ostrovsky - Publications

Weighted sampling without replacement from data streams

Vladimir Braverman, Rafail Ostrovsky, Gregory Vorsanger


Weighted sampling without replacement has proved to be a very important tool in designing new algorithms. Efraimidis and Spirakis [5] presented an algorithms for weighted sampling without replacement from data streams. Their algorithm works under the assumption of precise computations over the interval[0,1] [0,1]. Cohen and Kaplan [3] used similar methods for their bottom-k sketches.

Efraimidis and Spirakis ask as an open question whether using finite precision arithmetic impacts the accuracy of their algorithm. In this paper we show a method to avoid this problem by providing a precise reduction fro K-sampling without replacement to K-sampling with replacement. We call the resulting method Cascade Sampling.

comment: Inf. Process. Lett. 115(12): 923-926 (2015)

Fetch PDF file of the paper

Back to Publications List