Redis Connection Retry Pattern
The Redis Connection Retry Pattern is a resilience mechanism for handling transient connection failures when talking to Redis, particularly in high-availability setups that use Redis Sentinel^[400-devops__09-Scripting-Language__python__introduction__part-5.database.redis__README.md].
Context
In a distributed Redis environment managed by Sentinel, the topology of the database can change dynamically. For example, if the current master node fails, Sentinel will elect a new master^[400-devops__09-Scripting-Language__python__introduction__part-5.database.redis__README.md]. During this failover process, attempts to write to the old master or read from nodes that are transitioning states may result in a redis.exceptions.ConnectionError^[400-devops__09-Scripting-Language__python__introduction__part-5.database.redis__README.md].
Implementation
To ensure application stability, commands sent to Redis should be wrapped in a function that implements a retry loop^[400-devops__09-Scripting-Language__python__introduction__part-5.database.redis__README.md].
Logic Structure
A standard retry function typically includes the following logic^[400-devops__09-Scripting-Language__python__introduction__part-5.database.redis__README.md]:
- Loop: A while True loop is used to attempt the operation repeatedly.
- Try/Except: The specific Redis command (e.g., get or set) is executed inside a try block.
- Exception Handling: The except block catches only redis.exceptions.ConnectionError and redis.exceptions.TimeoutError, so unrelated errors propagate immediately.
- Retry Counter & Limit: A counter tracks the number of retry attempts. If the count exceeds a defined max_retries, the exception is re-raised to halt execution and prevent an infinite loop.
- Backoff: If an error occurs and the limit hasn't been reached, the function pauses for a fixed duration (e.g., 5 seconds) using time.sleep before the next iteration.
Example Code
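import time

import redis


def redis_command(command, *args):
    """Run a Redis command, retrying on connection or timeout errors."""
    max_retries = 3
    count = 0
    backoff_seconds = 5
    while True:
        try:
            return command(*args)
        except (redis.exceptions.ConnectionError, redis.exceptions.TimeoutError):
            count += 1
            # Retry budget spent: re-raise the original error to the caller.
            if count > max_retries:
                raise
            print('Retrying in {} seconds'.format(backoff_seconds))
            time.sleep(backoff_seconds)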
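To observe the loop's behaviour without a live Redis server, the sketch below re-implements the same logic with the retry limit, delay, and sleep function passed in as parameters. The names retry_call, FlakyCommand, and the sleep parameter are hypothetical, and Python's built-in ConnectionError stands in for redis.exceptions.ConnectionError (the two classes are unrelated; a real caller would pass the redis-py exception types through retry_on).

```python
import time


def retry_call(command, *args, max_retries=3, backoff_seconds=5,
               retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call command(*args), retrying on the exception types in retry_on.

    Hypothetical variant of redis_command: same loop, but configurable.
    """
    count = 0
    while True:
        try:
            return command(*args)
        except retry_on:
            count += 1
            if count > max_retries:
                raise  # retry budget spent: surface the original error
            sleep(backoff_seconds)


class FlakyCommand:
    """Simulated Redis call that fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self, key, value):
        self.calls += 1
        if self.calls <= self.failures:
            raise ConnectionError("simulated failover")
        return True


delays = []  # record the requested pauses instead of actually sleeping

# Two failures fit inside the budget of three retries: the third call succeeds.
recovering = FlakyCommand(failures=2)
result = retry_call(recovering, "foo", "bar", sleep=delays.append)

# Ten failures exceed the budget: the error from the fourth call is re-raised.
failing = FlakyCommand(failures=10)
try:
    retry_call(failing, "foo", "bar", sleep=delays.append)
    gave_up = False
except ConnectionError:
    gave_up = True
```

With these defaults a persistently failing command is attempted four times (one initial call plus three retries), and the wrapper pauses for at most 15 seconds in total before giving up.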
Usage
Instead of calling the Redis client directly (e.g., master.set('key', 'value')), the application calls the wrapper function with the client method as an argument^[400-devops__09-Scripting-Language__python__introduction__part-5.database.redis__README.md].
# Direct call (fragile)
# master.set('foo', 'bar')
# Wrapped call (resilient): pass the bound method and its arguments separately
redis_command(master.set, 'foo', 'bar')
This abstraction allows the application to wait out the failover process, provided the failover completes within the retry window (roughly 15 seconds with three retries at 5-second intervals). Once Sentinel finishes promoting a new master, a subsequent retry in the loop can connect to the new master and succeed^[400-devops__09-Scripting-Language__python__introduction__part-5.database.redis__README.md].
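The usage snippet assumes that a client object for the current master already exists. A minimal sketch of how such a client might be obtained through redis-py's Sentinel support follows; the Sentinel address localhost:26379, the service name mymaster, the timeout values, and the helper name make_master are all assumptions for illustration and do not come from the source.

```python
def make_master(sentinels=(("localhost", 26379),), service_name="mymaster"):
    """Return a client that follows whichever node Sentinel reports as master.

    Assumed values: the Sentinel address and service name are placeholders.
    """
    # Imported here so this module still loads where redis-py is not installed.
    from redis.sentinel import Sentinel

    sentinel = Sentinel(list(sentinels), socket_timeout=0.5)
    # master_for returns a Redis client backed by a Sentinel-aware connection
    # pool, which asks Sentinel for the current master address when it connects.
    return sentinel.master_for(service_name, socket_timeout=0.5)


# Usage with the wrapper from this page (needs a running Sentinel deployment):
# master = make_master()
# redis_command(master.set, "foo", "bar")
```

Because the client looks up the master through Sentinel rather than holding a fixed address, a retried command has a chance of reaching the newly promoted node once failover has finished.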
Related Concepts
- [[Redis Sentinel]]
- [[High Availability]]
- [[Circuit Breaker Pattern]]
Sources
400-devops__09-Scripting-Language__python__introduction__part-5.database.redis__README.md