Skip to content

Bloom filter

A Bloom filter is a space-efficient probabilistic data structure used to solve the problem of cache penetration.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]

Core Characteristics

The Bloom filter is designed to test whether an element is a member of a set.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md] It is characterized by its high space efficiency and speed, though it operates with a degree of probability, meaning it allows for a configurable rate of false positives.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]

Implementation

In Java development, the Guava library provides a widely used implementation.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md] To utilize it, one must define the expected number of inserted elements and the desired false positive probability during creation.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]

The primary operations involve adding elements to the filter and checking for their potential existence.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md] Queries return a boolean result indicating if an element might be in the set, acknowledging the possibility of false positives.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]

Code Example

The following example demonstrates creating a filter, adding strings, and checking for membership using the put and mightContain methods:^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]

import java.nio.charset.Charset;
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class Bloom {
    public static void main(String[] args) {
        // Create a filter expecting 1000 elements with a low false positive probability
        BloomFilter<String> b = BloomFilter.create(Funnels.stringFunnel(Charset.forName("utf-8")), 1000, 0.000001);

        // Add elements
        b.put("121");
        b.put("122");
        b.put("123");

        // Check if element might exist (returns boolean)
        System.out.println(b.mightContain("12321"));
    }
}

Merging Filters

Bloom filters support the putAll operation, which allows the contents of one filter to be merged into another, provided they were created with identical parameters.^[600-developer-principle-bloomfilter.md, 600-developer__principle__bloomfilter.md]

Sources

  • 600-developer-principle-bloomfilter.md
  • 600-developer__principle__bloomfilter.md