Skip to content

Latest commit

 

History

History
21 lines (16 loc) · 775 Bytes

README.md

File metadata and controls

21 lines (16 loc) · 775 Bytes

FlajoletMartin

Implements the Flajolet-Martin algorithm for approximately counting unique items in a data stream,
• Flajolet and Martin, Probabilistic Counting Algorithms for Data Base Applications, Journal of Computer and System Sciences 31, 182-209 (1985).

Required libraries

Boost
http://www.boost.org

FarmHash
https://github.com/google/farmhash

Other files (for tests)

main.cpp
Test main that approximately counts the number of unique words in a specified text file.

constitution-words-only.txt
Text of the United States Constitution, with punctuation removed. This contains 925 unique words.

declaration-words-only.txt
Text of the Declaration of Independence, with punctuation removed. This contains 585 unique words.