refactor: add NodeBuilder to simplify container.Node creation #198
chatton wants to merge 3 commits into
Replace the 3-step container.Node setup (NewNode + SetContainerLifecycle + CreateAndSetupVolume) with a fluent NodeBuilder API. Old functions are preserved but marked deprecated for backwards compatibility.
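As a rough illustration of the fluent style described above, here is a minimal, self-contained sketch. It is not the code from this PR: `Node` is a stripped-down stand-in for `container.Node`, and `NewNodeBuilder`, `WithImage`, `WithHomeDir`, and `WithNetworkID` are hypothetical names chosen for the example. Only `WithVolumeName`, `WithHostNetwork`, and `Build` are named in the PR itself, and their real signatures may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a simplified stand-in for container.Node (assumption: the real
// type also carries a Docker client, a logger, and a lifecycle).
type Node struct {
	Name        string
	Image       string
	HomeDir     string
	NetworkID   string
	HostNetwork bool
	VolumeName  string
}

// NodeBuilder accumulates settings through chained calls.
type NodeBuilder struct {
	node Node
}

// NewNodeBuilder is a hypothetical constructor for this sketch.
func NewNodeBuilder() *NodeBuilder { return &NodeBuilder{} }

// The setters below each return the builder so calls can be chained.
func (b *NodeBuilder) WithImage(image string) *NodeBuilder   { b.node.Image = image; return b }
func (b *NodeBuilder) WithHomeDir(dir string) *NodeBuilder   { b.node.HomeDir = dir; return b }
func (b *NodeBuilder) WithNetworkID(id string) *NodeBuilder  { b.node.NetworkID = id; return b }
func (b *NodeBuilder) WithVolumeName(v string) *NodeBuilder  { b.node.VolumeName = v; return b }
func (b *NodeBuilder) WithHostNetwork(on bool) *NodeBuilder  { b.node.HostNetwork = on; return b }

// Build performs in one call what previously took three steps. Here it only
// copies the settings and defaults the volume name to the container name.
func (b *NodeBuilder) Build(containerName string) (*Node, error) {
	if containerName == "" {
		return nil, errors.New("container name cannot be empty")
	}
	n := b.node // copy, so the builder can be reused for another node
	n.Name = containerName
	if n.VolumeName == "" {
		n.VolumeName = containerName
	}
	return &n, nil
}

func main() {
	n, err := NewNodeBuilder().
		WithImage("example/image:latest").
		WithHomeDir("/home/app").
		WithNetworkID("net-1").
		Build("node-0")
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	fmt.Printf("%s uses volume %s\n", n.Name, n.VolumeName)
}
```

The point of the pattern is that callers cannot forget one of the setup steps, because `Build` is the only way to obtain a node.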
📝 Walkthrough
Adds a fluent NodeBuilder API and migrates many Node/agent/relayer constructors to use it, centralizes container naming helpers, suppresses staticcheck in tests that still use deprecated constructors, and improves docker keyring exec/cleanup behavior.
Changes
Container Node Builder Refactoring
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@framework/docker/container/node.go`:
- Around line 119-134: The Build method on NodeBuilder should validate required
inputs before creating the Node: check that containerName and b.homeDir are
non-empty, and if b.hostNetwork is false ensure b.networkID is non-empty; return
a clear error (with context mentioning NodeBuilder.Build and containerName) if
any check fails. Do these checks at the top of NodeBuilder.Build (before calling
NewNode/NewLifecycle/CreateAndSetupVolume) so invalid config is rejected fast
and later calls like NewNode or CreateAndSetupVolume are not invoked with
missing values.
In `@framework/docker/cosmos/chain_builder.go`:
- Around line 493-501: The fallback branch incorrectly re-checks
nodeConfig.nodeType causing the validator path to be unreachable; when
nodeType==0, derive it from the validator flag instead of re-reading
nodeConfig.nodeType. Update the branch in chain_builder.go to: if nodeType == 0
{ if nodeConfig.Validator { nodeType = types.NodeTypeValidator } else { nodeType
= types.NodeTypeConsensusFull } }, referencing nodeConfig, nodeType,
types.NodeTypeValidator and types.NodeTypeConsensusFull to locate the change.
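The fallback described in the `chain_builder.go` comment can be sketched in isolation as follows. This is an illustration of the reviewer's suggestion only, not the fix that was eventually pushed. The `NodeType` constants and the `nodeConfig` struct here are local stand-ins for `types.NodeTypeValidator`, `types.NodeTypeConsensusFull`, and the real config type, and `resolveNodeType` is a hypothetical helper name.

```go
package main

import "fmt"

// NodeType mirrors the idea of types.NodeType; the numeric values of the
// real constants are an assumption, except that zero means "not set".
type NodeType int

const (
	NodeTypeUnset NodeType = iota // zero value: caller did not choose a type
	NodeTypeValidator
	NodeTypeConsensusFull
)

// nodeConfig is a stand-in for the per-node config used by the chain builder.
type nodeConfig struct {
	nodeType  NodeType
	Validator bool
}

// resolveNodeType keeps an explicit node type and otherwise derives one
// from the Validator flag, so both fallback outcomes are reachable.
func resolveNodeType(cfg nodeConfig) NodeType {
	if cfg.nodeType != NodeTypeUnset {
		return cfg.nodeType
	}
	if cfg.Validator {
		return NodeTypeValidator
	}
	return NodeTypeConsensusFull
}

func main() {
	fmt.Println(resolveNodeType(nodeConfig{Validator: true}) == NodeTypeValidator)
	fmt.Println(resolveNodeType(nodeConfig{}) == NodeTypeConsensusFull)
}
```

Re-reading `cfg.nodeType` inside the `NodeTypeUnset` branch, as the flagged code did, can never select the validator path because the value is already known to be zero there.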
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4b7ac380-e844-4471-beb9-1f5f9491c78d
📒 Files selected for processing (18)
framework/docker/chain_node_test.go
framework/docker/container/node.go
framework/docker/cosmos/chain_builder.go
framework/docker/cosmos/node.go
framework/docker/dataavailability/network_builder.go
framework/docker/dataavailability/node.go
framework/docker/evstack/chain_builder.go
framework/docker/evstack/evmsingle/node.go
framework/docker/evstack/node.go
framework/docker/evstack/reth/node.go
framework/docker/evstack/spamoor/node.go
framework/docker/hyperlane/agent.go
framework/docker/hyperlane/forward_relayer.go
framework/docker/hyperlane/hyperlane.go
framework/docker/ibc/relayer/hermes.go
framework/docker/jaeger/node.go
framework/docker/node_volume_test.go
framework/docker/victoriatraces/node.go
func (b *NodeBuilder) Build(ctx context.Context, containerName string) (*Node, error) {
	n := NewNode(b.networkID, b.dockerClient, b.testName, b.image, b.homeDir, b.index, b.nodeType, b.logger)
	lc := NewLifecycle(b.logger, b.dockerClient, containerName)
	if b.hostNetwork {
		lc.SetHostNetwork(b.hostNetwork)
	}
	n.ContainerLifecycle = lc
	volName := containerName
	if b.volumeName != "" {
		volName = b.volumeName
	}
	if err := n.CreateAndSetupVolume(ctx, volName); err != nil {
		return nil, fmt.Errorf("setup node %q: %w", containerName, err)
	}
	return n, nil
}
Validate required builder inputs in Build before creating the node.
Build currently allows invalid state (e.g., empty containerName/homeDir, or empty networkID when host mode is off), which can fail later with less actionable Docker errors. Add upfront validation here to fail fast.
Suggested patch
 func (b *NodeBuilder) Build(ctx context.Context, containerName string) (*Node, error) {
+	if containerName == "" {
+		return nil, fmt.Errorf("containerName cannot be empty")
+	}
+	if b.homeDir == "" {
+		return nil, fmt.Errorf("homeDir cannot be empty")
+	}
+	if !b.hostNetwork && b.networkID == "" {
+		return nil, fmt.Errorf("networkID cannot be empty when host network is disabled")
+	}
+
 	n := NewNode(b.networkID, b.dockerClient, b.testName, b.image, b.homeDir, b.index, b.nodeType, b.logger)
 	lc := NewLifecycle(b.logger, b.dockerClient, containerName)
 	if b.hostNetwork {
 		lc.SetHostNetwork(b.hostNetwork)
 	}
📝 Committable suggestion
func (b *NodeBuilder) Build(ctx context.Context, containerName string) (*Node, error) {
	if containerName == "" {
		return nil, fmt.Errorf("containerName cannot be empty")
	}
	if b.homeDir == "" {
		return nil, fmt.Errorf("homeDir cannot be empty")
	}
	if !b.hostNetwork && b.networkID == "" {
		return nil, fmt.Errorf("networkID cannot be empty when host network is disabled")
	}
	n := NewNode(b.networkID, b.dockerClient, b.testName, b.image, b.homeDir, b.index, b.nodeType, b.logger)
	lc := NewLifecycle(b.logger, b.dockerClient, containerName)
	if b.hostNetwork {
		lc.SetHostNetwork(b.hostNetwork)
	}
	n.ContainerLifecycle = lc
	volName := containerName
	if b.volumeName != "" {
		volName = b.volumeName
	}
	if err := n.CreateAndSetupVolume(ctx, volName); err != nil {
		return nil, fmt.Errorf("setup node %q: %w", containerName, err)
	}
	return n, nil
}
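The review prompt also asks for errors that mention `NodeBuilder.Build` and the container name, which the suggested patch does not include. One possible way to do that, shown here as a standalone sketch, is to collect every failed check and wrap them once. `validateBuild` is a hypothetical helper, not a function from the PR, and it takes plain arguments instead of the builder's fields so the example runs on its own. It relies on `errors.Join`, available since Go 1.20.

```go
package main

import (
	"errors"
	"fmt"
)

// validateBuild checks the inputs Build needs before any Docker call is made.
// It reports all problems at once and prefixes them with the call context.
func validateBuild(containerName, homeDir, networkID string, hostNetwork bool) error {
	var errs []error
	if containerName == "" {
		errs = append(errs, errors.New("containerName cannot be empty"))
	}
	if homeDir == "" {
		errs = append(errs, errors.New("homeDir cannot be empty"))
	}
	// A network ID is only required when the container joins a Docker network.
	if !hostNetwork && networkID == "" {
		errs = append(errs, errors.New("networkID cannot be empty when host network is disabled"))
	}
	if len(errs) == 0 {
		return nil
	}
	return fmt.Errorf("NodeBuilder.Build(%q): %w", containerName, errors.Join(errs...))
}

func main() {
	// All three checks fail here, so the error lists all three problems.
	fmt.Println(validateBuild("", "", "", false))
	// Host networking makes the empty network ID acceptable.
	fmt.Println(validateBuild("node-0", "/home/app", "", true))
}
```

Whether reporting every problem at once is preferable to returning on the first failure is a design choice; the patch in the review takes the simpler early-return route.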
…ainer-node-creation
- Remove dead branch in cosmos ChainBuilder that checked nodeType after confirming it was zero. Use derived nodeType for Validator field.
- Poll ExecInspect until exec completes in docker keyring execCommand to prevent race between rm and subsequent container reads.
- Resolve key address before local delete so the .address file can be cleaned up from the container.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
framework/docker/internal/docker_keyring.go (1)
440-445: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Don't swallow `.address` cleanup errors.
Line 444 ignores `execCommand` errors, which can leave stale address files silently and break consistency after successful key deletion. Return (or at least wrap) this error.
Suggested fix
 if record != nil {
 	addr, err := record.GetAddress()
 	if err == nil {
 		addrFilePath := filepath.Join(d.containerKeyringDir, addr.String()+".address")
-		_ = d.execCommand(context.TODO(), []string{"rm", "-f", addrFilePath})
+		if err := d.execCommand(context.TODO(), []string{"rm", "-f", addrFilePath}); err != nil {
+			return fmt.Errorf("failed to delete address file %s: %w", addrFilePath, err)
+		}
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@framework/docker/internal/docker_keyring.go` around lines 440 - 445, The cleanup of the ".address" file swallows errors from d.execCommand when removing addrFilePath: after obtaining addr via record.GetAddress() and building addrFilePath (using d.containerKeyringDir and addr.String()+".address"), check the error returned by d.execCommand(context.TODO(), []string{"rm","-f", addrFilePath}) and propagate it (or wrap it with context) instead of discarding; update the surrounding function (where record, GetAddress, addrFilePath and d.execCommand are used) to return that error so callers can handle cleanup failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@framework/docker/internal/docker_keyring.go`:
- Around line 316-328: The exec polling loop that calls
d.dockerClient.ExecInspect with exec.ID currently blocks forever; change it to
respect context cancellation or a bounded timeout by either requiring a context
with deadline or wrapping the loop with a select that checks ctx.Done() (and/or
a timeout timer) on each iteration, and return a descriptive error if cancelled
or timed out; update the polling block around ExecInspect, inspect.Running, cmd
and ctx to abort when ctx.Done() triggers and propagate the context error
instead of looping indefinitely.
---
Outside diff comments:
In `@framework/docker/internal/docker_keyring.go`:
- Around line 440-445: The cleanup of the ".address" file swallows errors from
d.execCommand when removing addrFilePath: after obtaining addr via
record.GetAddress() and building addrFilePath (using d.containerKeyringDir and
addr.String()+".address"), check the error returned by
d.execCommand(context.TODO(), []string{"rm","-f", addrFilePath}) and propagate
it (or wrap it with context) instead of discarding; update the surrounding
function (where record, GetAddress, addrFilePath and d.execCommand are used) to
return that error so callers can handle cleanup failures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 107b1110-d893-4046-a543-f125e8a7dc19
⛔ Files ignored due to path filters (1)
go.sum is excluded by !**/*.sum
📒 Files selected for processing (5)
framework/docker/cosmos/chain_builder.go
framework/docker/cosmos/node.go
framework/docker/evstack/node.go
framework/docker/evstack/spamoor/node.go
framework/docker/internal/docker_keyring.go
for {
	inspect, err := d.dockerClient.ExecInspect(ctx, exec.ID, client.ExecInspectOptions{})
	if err != nil {
		return fmt.Errorf("failed to inspect exec result: %w", err)
	}
	if !inspect.Running {
		if inspect.ExitCode != 0 {
			return fmt.Errorf("command %v exited with non-zero status: %d", cmd, inspect.ExitCode)
		}
		return nil
	}
	time.Sleep(50 * time.Millisecond)
}
Add a bounded timeout/cancellation path for exec polling.
Line 316 introduces an unbounded wait loop; with context.TODO() callers (e.g., Line 436/444), this can hang forever if Docker exec never exits. Wrap the poll path with a timeout (or require a deadline-bound ctx) and abort on ctx.Done().
Suggested fix
 func (d *dockerKeyring) execCommand(ctx context.Context, cmd []string) error {
+	if _, hasDeadline := ctx.Deadline(); !hasDeadline {
+		var cancel context.CancelFunc
+		ctx, cancel = context.WithTimeout(ctx, 30*time.Second)
+		defer cancel()
+	}
+
 	exec, err := d.dockerClient.ExecCreate(ctx, d.containerID, client.ExecCreateOptions{
 		Cmd: cmd,
 	})
@@
-	for {
+	ticker := time.NewTicker(50 * time.Millisecond)
+	defer ticker.Stop()
+	for {
+		select {
+		case <-ctx.Done():
+			return fmt.Errorf("exec command timed out/canceled: %w", ctx.Err())
+		case <-ticker.C:
+		}
+
 		inspect, err := d.dockerClient.ExecInspect(ctx, exec.ID, client.ExecInspectOptions{})
 		if err != nil {
 			return fmt.Errorf("failed to inspect exec result: %w", err)
 		}
 		if !inspect.Running {
 			if inspect.ExitCode != 0 {
 				return fmt.Errorf("command %v exited with non-zero status: %d", cmd, inspect.ExitCode)
 			}
 			return nil
 		}
-		time.Sleep(50 * time.Millisecond)
 	}
 }
Summary
- `NodeBuilder` API to `container` package that handles Node + Lifecycle + Volume setup in a single `Build()` call
- `NewNode`, `SetContainerLifecycle`, and `CreateAndSetupVolume` marked deprecated but preserved for backwards compatibility
- `WithVolumeName` (for shared volumes) and `WithHostNetwork` (for host networking)
- `nodeName` helpers in each package to decouple name computation from the embedded Node

Closes #131