.\" Automatically generated by Pod::Man 4.14 (Pod::Simple 3.40) .\" .\" Standard preamble: .\" ======================================================================== .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. \*(C+ will .\" give a nicer C++. Capital omega is used to do unbreakable dashes and .\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, .\" nothing in troff, for use with C<>. .tr \(*W- .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' . ds C` . ds C' 'br\} .\" .\" Escape single quotes in literal strings from groff's Unicode transform. .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" .\" If the F register is >0, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .\" .\" Avoid warning from groff about undefined register 'F'. .de IX .. .nr rF 0 .if \n(.g .if rF .nr rF 1 .if (\n(rF:(\n(.g==0)) \{\ . if \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . if !\nF==2 \{\ . nr % 0 . nr F 2 . \} . \} .\} .rr rF .\" ======================================================================== .\" .IX Title "Bloom::Filter 3pm" .TH Bloom::Filter 3pm "2021-01-09" "perl v5.32.0" "User Contributed Perl Documentation" .\" For nroff, turn off justification. Always turn off hyphenation; it makes .\" way too many mistakes in technical documents. .if n .ad l .nh .SH "NAME" Bloom::Filter \- Sample Perl Bloom filter implementation .SH "DESCRIPTION" .IX Header "DESCRIPTION" A Bloom filter is a probabilistic algorithm for doing existence tests in less memory than a full list of keys would require. The tradeoff to using Bloom filters is a certain configurable risk of false positives. This module implements a simple Bloom filter with configurable capacity and false positive rate. Bloom filters were first described in a 1970 paper by Burton Bloom, see . .SH "SYNOPSIS" .IX Header "SYNOPSIS" .Vb 1 \& use Bloom::Filter \& \& my $bf = Bloom::Filter\->new( capacity => 10, error_rate => .001 ); \& \& $bf\->add( @keys ); \& \& while ( <> ) { \& chomp; \& print "Found $_\en" if $bf\->check( $_ ); \& } .Ve .SH "CONSTRUCTORS" .IX Header "CONSTRUCTORS" .ie n .IP "new %PARAMS" 4 .el .IP "new \f(CW%PARAMS\fR" 4 .IX Item "new %PARAMS" Create a brand new instance. Allowable params are \f(CW\*(C`error_rate\*(C'\fR, \f(CW\*(C`capacity\*(C'\fR. .IP "init" 4 .IX Item "init" Calculates the best number of hash functions and optimum filter length, creates some random salts, and generates a blank bit vector. Called automatically by constructor. .SH "ACCESSORS" .IX Header "ACCESSORS" .IP "capacity" 4 .IX Item "capacity" Returns the total capacity of the Bloom filter .IP "error_rate" 4 .IX Item "error_rate" Returns the configured maximum error rate .IP "length" 4 .IX Item "length" Returns the length of the Bloom filter in bits .IP "key_count" 4 .IX Item "key_count" Returns the number of items currently stored in the filter .IP "on_bits" 4 .IX Item "on_bits" Returns the number of 'on' bits in the filter .IP "salts" 4 .IX Item "salts" Returns the list of salts used to create the hash functions .SH "PUBLIC METHODS" .IX Header "PUBLIC METHODS" .ie n .IP "add @KEYS" 4 .el .IP "add \f(CW@KEYS\fR" 4 .IX Item "add @KEYS" Adds the list of keys to the filter. Will fail, return \f(CW\*(C`undef\*(C'\fR and complain if the number of keys in the filter exceeds the configured capacity. .ie n .IP "check @KEYS" 4 .el .IP "check \f(CW@KEYS\fR" 4 .IX Item "check @KEYS" Checks the provided key list against the Bloom filter, and returns a list of equivalent length, with true or false values depending on whether there was a match. .SH "INTERNAL METHODS" .IX Header "INTERNAL METHODS" .IP "_calculate_shortest_filter_length \s-1CAPACITY ERR_RATE\s0" 4 .IX Item "_calculate_shortest_filter_length CAPACITY ERR_RATE" Given a desired error rate and maximum capacity, returns the optimum combination of vector length (in bits) and number of hash functions to use in building the filter, where \*(L"optimum\*(R" means shortest vector length. .IP "_get_cells \s-1KEY\s0" 4 .IX Item "_get_cells KEY" Given a key, hashes it using the list of salts and returns an array of cell indexes corresponding to the key. .SH "AUTHOR" .IX Header "AUTHOR" Originally written by Maciej Ceglowski . Currently maintained by Grzegorz Rożniecki . .SH "CONTRIBUTORS" .IX Header "CONTRIBUTORS" Dmitriy Ryaboy (big speedup in February 2007, thanks!) .SH "COPYRIGHT AND LICENSE" .IX Header "COPYRIGHT AND LICENSE" (c) 2004 Maciej Ceglowski .PP This is free software, distributed under version 2 of the \s-1GNU\s0 Public License (\s-1GPL\s0).