Cache penetration¶
Cache penetration is a failure scenario in system design in which requests bypass the data cache and land directly on the backend database.^[600-developer-principle-bloomfilter.md]
Cause¶
This phenomenon typically occurs when an application requests data that does not exist in the system.^[600-developer-principle-bloomfilter.md]
Impact¶
Because the data is not present, the cache generates a "miss," and the request is forwarded to the database.^[600-developer-principle-bloomfilter.md] In a typical cache-aside setup the database finds nothing either, so no value is written back to the cache and every repeat of the request misses again. If a high volume of requests targets non-existent data, the cache is rendered ineffective (penetrated), and the resulting pressure on the database can cause performance degradation or outages.^[600-developer-principle-bloomfilter.md]
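The effect can be illustrated with a minimal sketch of a cache-aside read path. The class names (`CountingDatabase`, `CacheAsideReader`, `PenetrationDemo`) and the in-memory maps are hypothetical stand-ins, not taken from the source; a real system would use an external cache and database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a backend database; counts how often it is queried.
class CountingDatabase {
    private final Map<String, String> rows = new HashMap<>();
    int queryCount = 0;

    void insert(String key, String value) {
        rows.put(key, value);
    }

    Optional<String> query(String key) {
        queryCount++;
        return Optional.ofNullable(rows.get(key));
    }
}

// Plain cache-aside reader: only values that were actually found get cached.
class CacheAsideReader {
    private final Map<String, String> cache = new HashMap<>();
    private final CountingDatabase db;

    CacheAsideReader(CountingDatabase db) {
        this.db = db;
    }

    Optional<String> get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached); // cache hit, database untouched
        }
        Optional<String> fromDb = db.query(key); // cache miss, go to the database
        fromDb.ifPresent(value -> cache.put(key, value));
        return fromDb;
    }
}

class PenetrationDemo {
    public static void main(String[] args) {
        CountingDatabase db = new CountingDatabase();
        db.insert("user:1", "Alice");
        CacheAsideReader reader = new CacheAsideReader(db);

        // Existing key: the first read loads the cache, the rest are hits.
        for (int i = 0; i < 1000; i++) {
            reader.get("user:1");
        }
        System.out.println(db.queryCount); // prints 1

        // Non-existent key: nothing is ever cached, so every read reaches the database.
        for (int i = 0; i < 1000; i++) {
            reader.get("user:missing");
        }
        System.out.println(db.queryCount); // prints 1001
    }
}
```

In this sketch, a thousand reads of an existing key cost one database query, while a thousand reads of a missing key cost a thousand.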
Solutions¶
One effective mitigation strategy is the use of a Bloom filter.^[600-developer-principle-bloomfilter.md]
Bloom filter¶
A Bloom filter is a probabilistic data structure that tests whether an element is a member of a set (in this case, the set of valid keys in the database).^[600-developer-principle-bloomfilter.md] It lets the system check quickly whether a requested key can exist before querying the cache or database.^[600-developer-principle-bloomfilter.md] The filter can report false positives but never false negatives, so its two possible answers are handled differently:
- If the filter indicates the key does not exist, the request can be rejected immediately, preventing database access. This assumes every valid key has been added to the filter.
- If the filter indicates the key might exist, the system proceeds to check the cache and then the database.
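The two branches above can be sketched with a small hand-rolled Bloom filter placed in front of a cache-aside read. This is an illustrative sketch only: `SimpleBloomFilter`, `BloomGuardedReader`, the double-hashing scheme (`String.hashCode` combined with FNV-1a), and the chosen sizes are assumptions made for the example, not the source's implementation, and a production system would more likely rely on a library implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical minimal Bloom filter: a bit set plus hashCount derived hash positions.
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void put(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    // false means "definitely never added"; true means "possibly added".
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: position i is (h1 + i * h2) mod size.
    private int index(String key, int i) {
        long combined = (long) key.hashCode() + (long) i * fnv1a(key);
        return (int) Math.floorMod(combined, (long) size);
    }

    private static int fnv1a(String key) {
        int hash = 0x811C9DC5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }
}

// Cache-aside reader that consults the filter before touching cache or database.
class BloomGuardedReader {
    private final SimpleBloomFilter filter;
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, Optional<String>> database; // hypothetical database lookup

    BloomGuardedReader(SimpleBloomFilter filter, Function<String, Optional<String>> database) {
        this.filter = filter;
        this.database = database;
    }

    Optional<String> get(String key) {
        if (!filter.mightContain(key)) {
            return Optional.empty(); // definitely absent: reject without any lookup
        }
        String cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached);
        }
        Optional<String> fromDb = database.apply(key); // may still be empty on a false positive
        fromDb.ifPresent(value -> cache.put(key, value));
        return fromDb;
    }
}

class BloomGuardDemo {
    public static void main(String[] args) {
        Map<String, String> rows = new HashMap<>();
        rows.put("user:1", "Alice");
        AtomicInteger dbQueries = new AtomicInteger();

        SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16, 3);
        rows.keySet().forEach(filter::put); // every valid key must be in the filter

        BloomGuardedReader reader = new BloomGuardedReader(filter, key -> {
            dbQueries.incrementAndGet();
            return Optional.ofNullable(rows.get(key));
        });

        reader.get("user:1");
        for (int i = 0; i < 1000; i++) {
            reader.get("user:missing-" + i);
        }
        System.out.println("database queries: " + dbQueries.get());
    }
}
```

With a sparsely filled filter like this one, nearly all of the missing-key reads should be rejected before they reach the database; the exact count depends on how many false positives occur. Keys added to the database later must also be added to the filter, otherwise valid requests would be rejected.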
In Java development, the Guava library provides a BloomFilter implementation commonly used for this purpose.^[600-developer-principle-bloomfilter.md]
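A minimal sketch of how Guava's `BloomFilter` might be used for this check, assuming Guava is on the classpath. The class `GuavaBloomFilterSketch`, the helper `buildFilter`, the sample keys, and the 1% false-positive target are all assumptions chosen for illustration; the source does not specify any of them.

```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class GuavaBloomFilterSketch {
    // Builds a filter sized for expectedKeys entries with an assumed 1% false-positive target.
    static BloomFilter<String> buildFilter(Iterable<String> validKeys, long expectedKeys) {
        BloomFilter<String> filter = BloomFilter.create(
                Funnels.stringFunnel(StandardCharsets.UTF_8), expectedKeys, 0.01);
        for (String key : validKeys) {
            filter.put(key);
        }
        return filter;
    }

    public static void main(String[] args) {
        // Hypothetical set of valid keys, e.g. loaded from the database at startup.
        List<String> validKeys = Arrays.asList("user:1", "user:2", "user:3");
        BloomFilter<String> filter = buildFilter(validKeys, 1_000);

        String requested = "user:999";
        if (!filter.mightContain(requested)) {
            System.out.println("rejected without touching cache or database");
        } else {
            System.out.println("possibly present: check the cache, then the database");
        }
    }
}
```

Here `mightContain` returning `false` is definitive for any key that was never `put`, while `true` still requires the normal cache and database lookup.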
Sources¶
^[600-developer-principle-bloomfilter.md]