fix(controller): return error instead of panicking on unknown auth mode#1791
fix(controller): return error instead of panicking on unknown auth mode#1791SarthakB11 wants to merge 2 commits intokagent-dev:mainfrom
Conversation
getAuthenticator panicked when authCfg.Mode was anything other than trusted-proxy or unsecure. a panic at startup bypasses defer-driven cleanup (e.g. the tracing shutdown registered earlier in app.Start), produces a stack trace where a typed error message would be clearer, and surfaces to kubernetes only as a CrashLoopBackOff with no reason set on the pod. return a wrapped fmt.Errorf instead and propagate it up through the ExtensionConfig closure so app.Start can log and exit cleanly. swap the panic-recover test for one that asserts the typed error path. closes the panic-on-startup behavior at go/core/cmd/controller/main.go:51. Signed-off-by: SarthakB11 <sarthak.bhardwaj21b@iiitg.ac.in>
There was a problem hiding this comment.
Pull request overview
This PR changes the controller startup path so unknown auth-mode values return an error from getAuthenticator instead of panicking, and updates the unit test to cover the error-returning behavior. In the controller codebase, this affects early bootstrap/auth initialization before the HTTP server is wired up.
Changes:
- Change
getAuthenticatorto return(AuthProvider, error)and return an error for unsupported auth modes. - Propagate authenticator construction failures through the controller's
app.Startextension-config callback. - Replace the panic-based test with an error-path test for unknown auth modes.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
go/core/cmd/controller/main.go |
Updates controller bootstrap to propagate invalid auth-mode failures as errors from getAuthenticator. |
go/core/cmd/controller/auth_mode_test.go |
Adjusts unit tests to match the new error-returning authenticator behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| authenticator, err := getAuthenticator(struct{ Mode, UserIDClaim string }{"proxy", ""}) | ||
| if err == nil { | ||
| t.Fatal("expected error for unknown auth mode, got nil") |
There was a problem hiding this comment.
good catch. pushed 4431347f which asserts the error message includes both the invalid mode name and the list of supported modes, so a regression that drops either signal would fail the test now.
| authenticator := getAuthenticator(bootstrap.Config.Auth) | ||
| authenticator, err := getAuthenticator(bootstrap.Config.Auth) | ||
| if err != nil { | ||
| return nil, err |
There was a problem hiding this comment.
you're right, the defer-skip is real. app.Start calls os.Exit(1) at app.go:496 when the extension-config callback returns err, skipping the tracing-shutdown defer registered at app.go:320 that the previous panic would have unwound through.
practical flush impact at this specific call site is small (very few spans are buffered between tracing init and getAuthenticator), but the defer-skip applies to every config-failure path in startup, not just this one. the underlying issue is app.Start using os.Exit rather than returning errors.
the suggested panic(err) keeps getAuthenticator returning a typed error (so the test assertions still hold) and panics at the call site so defers run. happy to apply it. otherwise i can refactor app.Start to thread errors back to main as a separate PR. which would you prefer?
…nd valid options per copilot review feedback: the prior assertion only checked that some error was returned, so a regression that dropped the invalid mode name or the supported-values hint would still pass. assert both pieces of the user-facing message explicitly. Signed-off-by: SarthakB11 <sarthak.bhardwaj21b@iiitg.ac.in>
summary
getAuthenticatoratgo/core/cmd/controller/main.go:51panicked whenauthCfg.Modewas anything other thantrusted-proxyorunsecure. a panic at startup bypasses defer-driven cleanup (e.g. the tracing shutdown registered earlier inapp.Start), surfaces to kubernetes only as aCrashLoopBackOffwith no reason set on the pod, and prints a stack trace where a typed error message would be clearer.fmt.Errorf("unknown auth mode %q (valid modes: unsecure, trusted-proxy)", authCfg.Mode)instead and propagate the error up through theExtensionConfigclosure soapp.Startcan log and exit cleanly.ai model disclosure
used claude code (claude opus 4.7) to triage and draft the diff. self-reviewed
go/core/cmd/controller/main.goand the existingauth_mode_test.goto confirmgetAuthenticatoris only called from a closure that already returnserror, so propagation is straightforward. verified locally via: