Bloom filter¶
A Bloom filter is a space-efficient probabilistic data structure used to solve the problem of cache penetration.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]
Core Characteristics¶
The Bloom filter is designed to test whether an element is a member of a set.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md] It is characterized by its high space efficiency and speed, though it operates with a degree of probability, meaning it allows for a configurable rate of false positives.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]
Implementation¶
In Java development, the Guava library provides a widely used implementation.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md] To utilize it, one must define the expected number of inserted elements and the desired false positive probability during creation.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]
The primary operations involve adding elements to the filter and checking for their potential existence.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md] Queries return a boolean result indicating if an element might be in the set, acknowledging the possibility of false positives.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]
Code Example¶
The following example demonstrates creating a filter, adding strings, and checking for membership using the put and mightContain methods:^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]
import java.nio.charset.Charset;
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
public class Bloom {
public static void main(String[] args) {
// Create a filter expecting 1000 elements with a low false positive probability
BloomFilter<String> b = BloomFilter.create(Funnels.stringFunnel(Charset.forName("utf-8")), 1000, 0.000001);
// Add elements
b.put("121");
b.put("122");
b.put("123");
// Check if element might exist (returns boolean)
System.out.println(b.mightContain("12321"));
}
}
Merging Filters¶
Bloom filters support the putAll operation, which allows the contents of one filter to be merged into another, provided they were created with identical parameters.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]
Sources¶
600-developer-principle-bloomfilter.md600-developer__principle__bloomfilter.md