Wednesday, May 28, 2008

making a multiple hash of it

so just re-reading this really nice classic paper
Probabilistic Counting Algorithms for Data Base Applications (1985)
Philippe Flajolet, G. N. Martin, and realized that one of the tricks they use
is to do multiple hashes of the key (they then count the occurrences of bits
in a neat way to get the cardinality - very cool - so how does this relate to
Bloom Filters which count the approximate set membership, and to CAN's, which use multiple hashes to distribute keys in a search space as uniformly as possible, or even to coordinate estimation systems? Seems like there's a small monograph in this...

No comments: