Skip to content

fix: Honor x-ld-fd-fallback header in fdv2 initializer phase (SDK-2203)#365

Open
keelerm84 wants to merge 6 commits intov7from
mk/sdk-2203/fallback-from-init
Open

fix: Honor x-ld-fd-fallback header in fdv2 initializer phase (SDK-2203)#365
keelerm84 wants to merge 6 commits intov7from
mk/sdk-2203/fallback-from-init

Conversation

@keelerm84
Copy link
Copy Markdown
Member

@keelerm84 keelerm84 commented Apr 17, 2026

Prior to this change, the Go SDK only inspected the x-ld-fd-fallback
response header on FDv2 synchronizer responses (streaming and polling).
If an initializer received the header, the signal was silently dropped
and the SDK would continue attempting subsequent initializers and FDv2
synchronizers rather than reverting to FDv1. This diverged from the spec
and from the Node SDK's behavior.

DataInitializer.Fetch now returns (basis, fallbackToFDv1, err). The
FDv2 data system branches on the bool, applying any accompanying Basis
before swapping the synchronizer list for the FDv1 fallback builder --
so evaluations can serve the server-provided payload while FDv1 spins
up. When no FDv1 fallback is configured, the data system logs and
clears the synchronizer list, mirroring the synchronizer-triggered path.

A shared isFDv1FallbackRequested helper and fdv1FallbackHeader constant
replace the duplicated header-string checks across streaming and polling
data sources.


Note

Medium Risk
Changes the DataInitializer interface and FDv2 startup control flow to trigger protocol fallback, which can affect initialization and data source state transitions. Risk is mitigated by extensive new unit/e2e tests covering success+fallback and error+fallback paths.

Overview
Ensures FDv2 honors X-LD-FD-Fallback during the initializer phase, allowing the SDK to switch to FDv1 when the server requests it (including when the initializer response is otherwise successful).

Updates subsystems.DataInitializer.Fetch to return (basis, fallbackToFDv1, err) and threads this signal through PollingDataSourceV2, StreamingDataSourceV2, file-data v2, and test data sources. The FDv2 data system now applies any accompanying Basis before reverting to FDv1, and transitions to Off (with a preserved error) if fallback is requested but no FDv1 fallback synchronizer is configured.

Refactors repeated header checks into a shared isFDv1FallbackRequested helper, and adds targeted unit + end-to-end tests for fallback on 200 responses, initializer-triggered fallback skipping FDv2 synchronizers, and the no-fallback-configured shutdown behavior.

Reviewed by Cursor Bugbot for commit 1aec421. Bugbot is set up for automated code reviews on this repo. Configure here.

Prior to this change, the Go SDK only inspected the x-ld-fd-fallback
response header on FDv2 synchronizer responses (streaming and polling).
If an initializer received the header, the signal was silently dropped
and the SDK would continue attempting subsequent initializers and FDv2
synchronizers rather than reverting to FDv1. This diverged from the spec
and from the Node SDK's behavior.

DataInitializer.Fetch now returns (basis, fallbackToFDv1, err). The
FDv2 data system branches on the bool, applying any accompanying Basis
before swapping the synchronizer list for the FDv1 fallback builder --
so evaluations can serve the server-provided payload while FDv1 spins
up. When no FDv1 fallback is configured, the data system logs and
clears the synchronizer list, mirroring the synchronizer-triggered path.

A shared isFDv1FallbackRequested helper and fdv1FallbackHeader constant
replace the duplicated header-string checks across streaming and polling
data sources.
@keelerm84 keelerm84 requested a review from a team as a code owner April 17, 2026 20:12
Comment thread internal/datasystem/fdv2_datasystem.go
…ack configured

If an initializer requested FDv1 fallback but no fdv1FallbackBuilder was
configured, the run loop cleared the synchronizer list and logged a
warning but never updated the data source status. runSynchronizers then
closed closeWhenReady and returned without a status update, leaving the
status permanently stuck at Initializing.

Mirror the synchronizer-triggered path (syncFDv1 with nil fallback
builder) by calling UpdateStatus(DataSourceStateOff, ...) in the
initializer-triggered branch as well. Add a regression test that
configures a custom data system with only an initializer (no FDv1
fallback) and asserts the status transitions to Off.
@tanderson-ld tanderson-ld self-requested a review April 20, 2026 14:45
}
if basis != nil {
f.environmentIDProvider.SetEnvironmentID(basis.EnvironmentID)
f.store.Apply(basis.ChangeSet, basis.Persist)
Copy link
Copy Markdown
Contributor

@tanderson-ld tanderson-ld Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feels like there could be a way to set this up such that the same apply block down below could be reused, but also not gonna hold up the PR on this. This block of code will probably go away long term.

Do you want f.loggers.Infof("Initialized via %s", initializer.Name())?

Prior to this change, when an initializer returned a valid payload
alongside the FDv1 fallback signal, the payload was applied to the
store silently. Add an Info log so operators can confirm which
initializer's data was applied before the protocol switch.
Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit b62ac91. Configure here.

Comment thread internal/datasystem/fdv2_datasystem.go
The header was checked on initializer responses and on synchronizer
error responses, but not when the streaming or polling synchronizer got
a 200 with a valid payload. In that case the SDK applied the payload
and then kept the FDv2 connection open indefinitely, ignoring the
server's request to revert to FDv1.

Streaming: consumeStream now tracks whether the response headers
carried x-ld-fd-fallback: true, and after any event that emits a Valid
result (IntentNone or EventPayloadTransferred) it emits an Off /
RevertToFDv1 result and closes the stream. Extracted reportMalformedEvent
out of the closure to keep consumeStream within cyclomatic threshold.

Polling: the poll() helper now returns (fallback, err). On a successful
response whose headers request fallback, it emits the Valid result with
any accompanying ChangeSet and then an Off / RevertToFDv1 result; Sync
returns from the goroutine.

Unit tests cover both the streaming and polling synchronizer
success-with-fallback paths. End-to-end test
TestFDV2CanFallBackToV1FromStreamingSuccess verifies the full path:
streaming sync returns 200 + valid SSE + fallback header, payload is
applied, client reverts to FDv1, FDv1 data wins.
…utine

Previously poll() emitted both the Valid result and the Off/RevertToFDv1
result internally, so the Sync goroutine's exit on fallback looked like
a bare `return` with no visible status signal. That made the emission
easy to miss when reading the diff.

Move the Off/RevertToFDv1 emission out of poll() and into the Sync
goroutine, next to where the error-path fallback already emits the same
signal at line 130. poll() now returns (environmentID, fallback, err):
it still emits the Valid result on success, but the fallback emission
happens at the Sync call site alongside the return. No behavior change.
When an initializer requested FDv1 fallback but no fdv1FallbackBuilder
was configured, the Off status was published with a zero-value
DataSourceErrorInfo because the initializer's underlying error was
only logged via Warnf, never recorded in status. The synchronizer
path does not have this problem: consumeSynchronizerResults calls
UpdateStatus(Interrupted, result.Error) before hitting the same
UpdateStatus(Off, LastError) call, so LastError is populated.

Thread the error info back from runInitializers to the caller so it
can be passed directly to UpdateStatus when transitioning to Off. When
fallback accompanies a successful response (no HTTP error), errorInfo
stays empty, matching the synchronizer path's behavior for that case.

Extend TestFDV2InitializerFallbackWithoutFDv1FallbackTransitionsToOff
to assert LastError is non-empty so programmatic status monitors can
see why the data source shut down.
// the response headers carried x-ld-fd-fallback: true; in that case the caller is responsible
// for emitting the Off/RevertToFDv1 signal and exiting the sync goroutine. A non-nil err means
// the request failed and no Valid result was emitted; the caller handles it per the existing
// error path.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the pros/cons of returning the bool from this function (and having the polling data source deal with it) vs sending the bool through resultChan via DataSynchronizerResult.RevertToFDv1?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants