Skip to content

Latest commit

 

History

History
39 lines (17 loc) · 1.61 KB

Timeout and Retries.md

File metadata and controls

39 lines (17 loc) · 1.61 KB

Timeout Pattern

1. Context

In a distributed environment, operations may hang indefinitely due to network issues, server overloads, or other unpredictable factors. This can lead to resource exhaustion and cascading failures across the system.

2. Problem

How can you ensure that operations do not hang indefinitely, thereby avoiding resource exhaustion and improving system resilience?

3. Solution

The Timeout Pattern sets a maximum time limit for an operation to be completed. If the operation exceeds this time, it is aborted, and an exception is raised or a fallback mechanism is invoked. This ensures that resources are not indefinitely blocked.

4. Benefits

  • Resource Efficiency: Helps avoid resource exhaustion by not letting operations hang indefinitely.

  • Enhanced Resilience: Prevents one slow or hanging operation from causing a ripple effect of delays or failures across the system.

  • Predictability: Makes the system behavior more predictable under failure conditions.

5. Drawbacks

  • False Positives: If the timeout is too short, the operation may be aborted even when there's no real issue, leading to unnecessary retries or failures.

  • Complexity: Needs to be fine-tuned for different types of operations and network conditions, adding complexity to the system.

6. Alternative Solutions

  • Circuit Breaker: Can halt all calls to a particular service if it’s identified as a failing component, allowing it to recover.

  • Bulkhead: Isolates failures but doesn’t provide the time-bound execution that a timeout ensures.