Cache penetration

Cache penetration is a scenario in system design in which requests bypass the data cache and place their load directly on the backend database.^[600-developer-principle-bloomfilter.md]

Cause

This phenomenon typically occurs when an application requests data that does not exist in the system.^[600-developer-principle-bloomfilter.md]

Impact

Because the data is not present, the cache registers a miss and the request is forwarded to the database.^[600-developer-principle-bloomfilter.md] The database has nothing to return, so nothing is written back to the cache and every repeat of the same request misses again. If a high volume of requests targets non-existent data, the cache is rendered ineffective (penetrated), and the resulting pressure on the database can cause performance degradation or outages.^[600-developer-principle-bloomfilter.md]
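To make the failure mode concrete, here is a minimal sketch of a cache-aside read path in Java. Everything in it is an assumption made for illustration: the class name `CacheAsideStore`, the `databaseReads` counter, and the in-memory `HashMap` instances that stand in for a real cache and a real database.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative cache-aside lookup with no protection against missing keys. */
class CacheAsideStore {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database; // in-memory stand-in for a real database
    int databaseReads = 0;                      // counts how often the "database" is hit

    CacheAsideStore(Map<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;                      // cache hit: the database is not touched
        }
        databaseReads++;                        // cache miss: fall through to the database
        String value = database.get(key);
        if (value != null) {
            cache.put(key, value);              // only keys that exist are ever cached
        }
        return value;                           // null for a key that does not exist
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        db.put("user:1", "alice");
        CacheAsideStore store = new CacheAsideStore(db);

        store.get("user:1");
        store.get("user:1");                    // second call is served from the cache
        System.out.println("reads after 2 lookups of an existing key: " + store.databaseReads);

        for (int i = 0; i < 1000; i++) {
            store.get("user:404");              // never cached, so every call reaches the database
        }
        System.out.println("reads after 1000 lookups of a missing key: " + store.databaseReads);
    }
}
```

In this sketch the existing key costs one database read no matter how often it is requested, while the missing key costs one read per request.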

Solutions

One effective mitigation strategy is to use a Bloom filter.^[600-developer-principle-bloomfilter.md]

Bloom filter

A Bloom filter is a probabilistic data structure used to test whether an element is a member of a set^[600-developer-principle-bloomfilter.md] (in this case, the set of valid keys in the database). It lets the system quickly rule out keys that cannot exist before querying the cache or database.^[600-developer-principle-bloomfilter.md]

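  • If the filter indicates the key does not exist, the request can be rejected immediately, preventing database access. This answer is reliable as long as every valid key has been added to the filter, because a Bloom filter produces no false negatives.
  • If the filter indicates the key might exist, the system proceeds to check the cache and then the database. The answer is only "might" because false positives are possible, so a small share of requests for non-existent keys can still reach the database.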

In Java development, the Guava library provides a BloomFilter implementation commonly used for this purpose^[600-developer-principle-bloomfilter.md].
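The check order described above can be sketched in plain Java. This is an illustration under stated assumptions, not a production implementation: `SimpleBloomFilter` is a toy stand-in written only so the example is self-contained, and `GuardedLookup`, `databaseReads`, the bit-array size, and the hash count are all hypothetical choices. Its `put` and `mightContain` methods are named after the corresponding methods on Guava's `BloomFilter`, which would take its place in real code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy Bloom filter: each key sets hashCount bit positions derived by double hashing. */
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    private static int fnv1a(byte[] data) {     // 32-bit FNV-1a, used as the second hash
        int hash = 0x811C9DC5;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }

    private int index(byte[] data, int i) {
        int h1 = Arrays.hashCode(data);
        int h2 = fnv1a(data);
        return Math.floorMod(h1 + i * h2, size); // always within [0, size)
    }

    void put(String key) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < hashCount; i++) bits.set(index(data, i));
    }

    boolean mightContain(String key) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(data, i))) return false; // definitely never added
        }
        return true;                                     // added, or a false positive
    }
}

/** Lookup path: Bloom filter first, then cache, then database. */
class GuardedLookup {
    private final SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16, 3);
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> database; // in-memory stand-in for a real database
    int databaseReads = 0;

    GuardedLookup(Map<String, String> database) {
        this.database = database;
        for (String key : database.keySet()) filter.put(key); // register every valid key
    }

    Optional<String> get(String key) {
        if (!filter.mightContain(key)) return Optional.empty(); // rejected before cache and database
        String cached = cache.get(key);
        if (cached != null) return Optional.of(cached);
        databaseReads++;
        String value = database.get(key);       // still null if the filter gave a false positive
        if (value != null) cache.put(key, value);
        return Optional.ofNullable(value);
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        db.put("user:1", "alice");
        GuardedLookup lookup = new GuardedLookup(db);
        System.out.println("user:1 -> " + lookup.get("user:1").orElse("(none)"));
        System.out.println("user:404 -> " + lookup.get("user:404").orElse("(none)"));
        System.out.println("database reads: " + lookup.databaseReads);
    }
}
```

One design point the sketch leaves open is how the filter stays in sync with the database: keys written after construction would also need a `put` call, otherwise the filter would wrongly reject them.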

Sources

^[600-developer-principle-bloomfilter.md]