feat: adding circuit breaker feature #8266
    return cached

try:
    record = self._get_record(name)
Does `_get_record` really need to throw an exception here? Can we not just check whether the returned value is `None`?
    opened_at=opened_at,
    expiry_timestamp=self._durable_ttl(),
)
self._update_record(record)
What happens if this fails? Does it leave the DynamoDB row permanently stale?
    raise

def _update_record(self, record: CircuitStateRecord) -> None:
    update_expression = "SET #state = :state, #failure_count = :failure_count"
I think this query creation logic could be a separate private method
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #8266 +/- ##
===========================================
- Coverage 96.72% 96.64% -0.08%
===========================================
Files 286 296 +10
Lines 14347 14700 +353
Branches 1201 1231 +30
===========================================
+ Hits 13877 14207 +330
- Misses 341 353 +12
- Partials 129 140 +11
☔ View full report in Codecov by Harness.

Issue number: closes #8257
Summary
Changes
This PR adds a Circuit Breaker utility (under the `circuit_breaker_alpha` namespace) so a Lambda function can stop calling an unhealthy downstream and let it recover, instead of piling on retries.
It ships as alpha on purpose: I want about a month of feedback before we lock the public API and promote it to GA.
The failure counter lives in memory per execution environment, so a healthy circuit writes nothing; we only persist state transitions. State is shared via DynamoDB. The breaker fails open if the store is unreachable and uses a conditional write to elect a single probe during recovery (no thundering herd). You handle rejected requests with an `on_circuit_open` callback and observe state changes with an `on_transition` hook.
User experience
Before, you had to build the state machine, shared storage, and recovery logic yourself. Now you wrap the function that makes the downstream call:
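A minimal, self-contained sketch of that wrapping pattern is shown here. It is not the utility's actual API: the `sketch_circuit_breaker` decorator, its parameter names, the `clock` argument, and the purely in-memory state are assumptions for illustration (the real utility shares state via DynamoDB). Only `on_circuit_open`, `CircuitBreakerOpenError`, and the default thresholds are taken from this description.

```python
import time
from functools import wraps
from typing import Any, Callable, Optional


class CircuitBreakerOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


def sketch_circuit_breaker(
    failure_threshold: int = 5,       # open after this many failures
    recovery_timeout: float = 30.0,   # seconds to wait before probing
    success_threshold: int = 3,       # probe successes needed to close
    on_circuit_open: Optional[Callable[[], Any]] = None,
    clock: Callable[[], float] = time.monotonic,  # hypothetical, eases testing
):
    def decorator(func):
        # In-memory state only; a real implementation would share it.
        state = {"name": "CLOSED", "failures": 0, "successes": 0, "opened_at": 0.0}

        @wraps(func)
        def wrapper(*args, **kwargs):
            if state["name"] == "OPEN":
                if clock() - state["opened_at"] < recovery_timeout:
                    # Still open: skip the call entirely.
                    if on_circuit_open is not None:
                        return on_circuit_open()
                    raise CircuitBreakerOpenError(func.__name__)
                state.update(name="HALF_OPEN", successes=0)
            try:
                result = func(*args, **kwargs)
            except Exception:
                state["failures"] += 1
                # A failed probe, or too many failures, opens the circuit.
                if state["name"] == "HALF_OPEN" or state["failures"] >= failure_threshold:
                    state.update(name="OPEN", opened_at=clock(), failures=0)
                raise
            if state["name"] == "HALF_OPEN":
                state["successes"] += 1
                if state["successes"] >= success_threshold:
                    state.update(name="CLOSED", failures=0)
            else:
                # Assumption: failures are counted consecutively.
                state["failures"] = 0
            return result

        wrapper.state = state  # exposed for inspection in this sketch
        return wrapper

    return decorator


@sketch_circuit_breaker(on_circuit_open=lambda: {"status": "fallback"})
def call_downstream():
    raise ConnectionError("downstream unavailable")
```

In this sketch, the first five calls to `call_downstream` propagate the `ConnectionError`; after that the circuit is open and calls return the callback's value without reaching the downstream.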
With no config, sensible defaults apply (open after 5 failures, probe after 30s, close after 3 successes). When the circuit is open, the call is skipped and the callback's value is returned, or a `CircuitBreakerOpenError` is raised.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.