Skip to content

add Settings domain with DB persistence and admin CRUD API#939

Open
bitcoin-coder-bob wants to merge 30 commits intomasterfrom
bob/settings-domain-and-repo
Open

add Settings domain with DB persistence and admin CRUD API#939
bitcoin-coder-bob wants to merge 30 commits intomasterfrom
bob/settings-domain-and-repo

Conversation

@bitcoin-coder-bob
Copy link
Copy Markdown
Collaborator

@bitcoin-coder-bob bitcoin-coder-bob commented Feb 27, 2026

closes #934

  • Introduce a Settings domain model that moves 16 runtime-tunable config values (ban threshold/duration, exit delays, VTXO tree expiry, round participant limits, amount limits, etc.) out of environment
    variables and into the database
  • Validate all settings fields (amount bounds, participant limits, exit delays, ban thresholds, update_fields field names) with descriptive errors
  • Add SettingsRepository with implementations for Postgres, SQLite, and Badger, including SQL migrations
  • Add three new Admin gRPC/REST endpoints: GetSettings, UpdateSettings (allows for partial updates), and ClearSettings
  • On startup, seed default settings to the DB if none exist (scheduler-aware: block vs time-based defaults)
  • Settings changes propagate live to the running app service via an UpdateSettings callback — no restart required
  • Remove the corresponding env vars from config, docker-compose, and .env files

Summary by CodeRabbit

  • New Features

    • Admin API to get, update, and clear runtime settings with persisted defaults and live-apply support; runtime service now accepts live setting changes.
  • Documentation

    • Added Admin Settings API docs and comprehensive settings table; removed several legacy environment variables from docs.
  • Chores

    • Simplified local run-simulation startup (fixed session duration) and adjusted compose/dev envs (removed some env vars, added CSV allow flag).
  • Tests

    • Added unit tests for settings repo/service and updated E2E setup to apply admin settings automatically.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a persistent, runtime-configurable Settings domain with Get/Update/Clear admin APIs, DB migrations and repo implementations (badger/sqlite/postgres), wiring to load/apply settings at startup and runtime, plus docs, Makefile, env and e2e test updates to use admin-managed settings.

Changes

Cohort / File(s) Summary
API Spec & Protos
api-spec/openapi/swagger/ark/v1/admin.openapi.json, api-spec/protobuf/ark/v1/admin.proto
Add GET/POST /v1/admin/settings and POST /v1/admin/settings/clear endpoints and corresponding request/response schemas/messages for Settings.
gRPC Handlers & Permissions
internal/interface/grpc/handlers/adminservice.go, internal/interface/grpc/permissions/permissions.go
Add gRPC handlers for GetSettings/UpdateSettings/ClearSettings and map RPCs to EntityManager permissions.
Domain & Repo Contracts
internal/core/domain/settings.go, internal/core/domain/settings_repo.go, internal/core/ports/repo_manager.go
Introduce domain.Settings model, validation and locktime helper; add SettingsRepository interface and RepoManager.Settings() accessor.
Application Layer
internal/core/application/admin.go, internal/core/application/admin_test.go, internal/core/application/service.go, internal/core/application/types.go
Add AdminService Get/Update/Clear methods with onSettingsUpdated callback; add service.UpdateSettings runtime application logic and unit tests.
Config & Wiring
internal/config/config.go
Remove many env-based keys, add defaultSettings load/seed flow, and pass onSettingsUpdated callback into admin service to propagate runtime changes.
DB Migrations & SQLC
internal/infrastructure/db/*/migration/20260227000000_add_settings.*, internal/infrastructure/db/*/sqlc/query.sql, .../sqlc/queries/*
Add settings table migrations; SQLC queries/models for UpsertSettings, SelectLatestSettings, ClearSettings and Setting/UpsertSettingsParams types.
Repo Implementations
internal/infrastructure/db/badger/settings_repo.go, internal/infrastructure/db/postgres/settings_repo.go, internal/infrastructure/db/sqlite/settings_repo.go
Add Badger/Postgres/SQLite SettingsRepository implementations (Get/Upsert/Clear/Close); badger Upsert includes retry logic.
DB Service Integration & Tests
internal/infrastructure/db/service.go, internal/infrastructure/db/service_test.go
Wire settingsStore factory into infra DB service, expose service.Settings(), ensure cleanup, and add tests exercising settings repo lifecycle.
E2E Tests & Helpers
internal/test/e2e/utils_test.go, internal/test/e2e/single_batch_smoke_test.go
Add updateSettings helper invoked during test setup to apply admin settings; remove env-var references from comments.
Docs, Makefile & Env files
README.md, Makefile, docker-compose.regtest.yml, envs/arkd.dev.env, envs/arkd.light.env
README gains Admin Settings docs and settings table; many ARKD_* env-vars removed from docs and env files; Makefile run-simulation simplified and regtest toggles ARKD_ALLOW_CSV_BLOCK_TYPE.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant GRPC as "gRPC Handler"
    participant AdminApp as "Admin Service"
    participant RepoMgr as "Repo Manager"
    participant SettingsRepo as "Settings Repo"
    participant AppSvc as "App Service"

    Client->>GRPC: UpdateSettings(request)
    GRPC->>AdminApp: UpdateSettings(settings)
    AdminApp->>AdminApp: validate & set UpdatedAt
    AdminApp->>RepoMgr: Settings()
    RepoMgr-->>AdminApp: SettingsRepository
    AdminApp->>SettingsRepo: Upsert(settings)
    SettingsRepo->>SettingsRepo: persist to storage
    SettingsRepo-->>AdminApp: OK
    AdminApp->>AppSvc: onSettingsUpdated / UpdateSettings(settings)
    AppSvc->>AppSvc: recalc derived values & tapscript
    AppSvc-->>AdminApp: OK
    AdminApp-->>GRPC: UpdateSettingsResponse
    GRPC-->>Client: Response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • altafan
  • louisinger
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 21.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately summarizes the main objective: introducing a Settings domain model with database persistence and admin CRUD API endpoints.
Linked Issues check ✅ Passed The PR comprehensively implements all coding requirements from issue #934: Settings domain and repository interface with CRUD operations, persistent storage (Postgres/SQLite/Badger), three admin gRPC/REST endpoints (GetSettings/UpdateSettings/ClearSettings), seeding defaults on startup, live propagation of changes, and removal of corresponding environment variables.
Out of Scope Changes check ✅ Passed All changes align with the linked issue scope. Code modifications are limited to implementing the Settings domain, repository, migrations, handlers, and updating configuration workflows. Unrelated refactoring or feature additions are absent.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bob/settings-domain-and-repo

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@bitcoin-coder-bob bitcoin-coder-bob marked this pull request as ready for review March 3, 2026 16:26
@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 18, 2026

🔍 Arkana PR Review — arkd#939: Settings Domain with DB Persistence

Scope: Moves 16 runtime-tunable config values (ban threshold/duration, exit delays, VTXO tree expiry, round participant limits, amount limits, etc.) from environment variables into a DB-persisted Settings domain with admin CRUD API. Touches 38 files, +2413/−199 lines.

Overall: Well-structured domain model following the existing repository pattern. Clean separation across Postgres, SQLite, and Badger backends with proper migrations. Good test coverage. A few concerns below, some security-relevant.


🔴 Security: No Input Validation on UpdateSettings

internal/interface/grpc/handlers/adminservice.goUpdateSettings passes proto values straight through without any validation. Since proto3 uses zero values for unset fields, a partial update attempt (e.g., only setting ban_threshold) would zero out all other fields including critical ones like unilateral_exit_delay, boarding_exit_delay, and vtxo_tree_expiry.

Risks:

  • Setting unilateral_exit_delay = 0 or boarding_exit_delay = 0 would break the unilateral exit guarantee — users could lose the ability to safely exit.
  • Setting vtxo_tree_expiry = 0 would create immediately-expired VTXO trees.
  • Setting max_tx_weight = 0 would reject all transactions.
  • Negative or nonsensical values for delays/amounts aren't rejected.

Suggestion: Add a Validate() method on domain.Settings that enforces minimum safe values, and call it in UpdateSettings before persisting. At minimum: exit delays > 0, vtxo_tree_expiry > unilateral_exit_delay, boarding_exit_delay > vtxo_tree_expiry, etc. The existing validation in config.go:Validate() already checks some of these constraints — factor those checks into a shared function.

Also consider using proto optional fields or a field mask to distinguish "set to zero" from "not provided", enabling safe partial updates.


🟡 Bug: ClearSettings Doesn't Reset Running Service

internal/core/application/admin.goClearSettings calls repo.Clear() and onInfoChange() but does not call onSettingsUpdated or reset the in-memory service settings to defaults. The running arkd keeps the previously loaded settings until restart. This is inconsistent with the "live propagation" behavior of UpdateSettings.

func (a *adminService) ClearSettings(ctx context.Context) error {
    if err := a.repoManager.Settings().Clear(ctx); err != nil {
        return err
    }
    // Missing: reset service to default settings
    a.onInfoChange()
    return nil
}

Should either: (1) re-seed defaults and call onSettingsUpdated with them, or (2) document that ClearSettings requires a restart and update the API docs accordingly.


🟡 Duplicated Locktime Logic

internal/core/application/service.go:toRelativeLocktime() duplicates the block-vs-time threshold logic from config.go:determineLocktimeType() with a hardcoded minAllowedSequence = 512. If this threshold ever changes in one place but not the other, settings loaded at startup vs settings updated via API would interpret the same values differently.

Suggestion: Extract the shared logic into a single function in domain or a shared package.


🟢 Positives

  • Clean architecture: Domain model → repository interface → 3 backend implementations follows the existing patterns well.
  • Live propagation: UpdateSettings callback recalculates derived values (dust amounts, checkpoint tapscript) correctly in service.go:UpdateSettings().
  • Good test coverage: Unit tests for admin service, integration tests for all 3 DB backends, e2e tests updated to use the new API.
  • Proper auth gating: New endpoints use EntityManager read/write permissions consistent with other admin endpoints.
  • Migration files: Present for both Postgres and SQLite with proper up/down scripts.
  • README updated: Documents the new admin settings API with defaults table.

Minor Notes

  • Postgres UpsertSettings always inserts with id=0 (Go zero value). Works because GENERATED BY DEFAULT AS IDENTITY allows explicit values, but it's implicit — a comment would help.
  • The NewAdminService signature now requires an onSettingsUpdated callback — breaking change for any external consumers of that constructor.
  • settings table schema uses BIGINT NOT NULL DEFAULT 0 for all fields. Since -1 is a valid sentinel for amount limits ("unset"/"dust"), the DEFAULT 0 in the schema doesn't match the domain semantics. Not a functional bug (the app always provides explicit values), but could be confusing.

Verdict: The architectural approach is solid and the implementation is thorough. The missing input validation on UpdateSettings is the most critical item — it's a live admin API that directly controls protocol-security parameters (exit delays, VTXO expiry). Would recommend addressing the validation gap before merge.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🧹 Nitpick comments (5)
Makefile (2)

184-200: Remove stale MIN/MAX handling from run-simulation output and shell vars.

Line 197-Line 199 compute/log MIN and MAX, but they are never passed to the test command. In internal/test/e2e/single_batch_smoke_test.go (Line 23-Line 65), only -num-clients is parsed, while min/max participants are hardcoded in test config. This log is misleading.

Proposed cleanup
 	`@bash` -c '\
 		CLIENTS="$${CLIENTS:-5}"; \
-		MIN="$${MIN:-$$CLIENTS}"; \
-		MAX="$${MAX:-128}"; \
-		echo "Running batch settlement test with $$CLIENTS clients (MIN=$$MIN, MAX=$$MAX)..."; \
+		echo "Running batch settlement test with $$CLIENTS clients..."; \
 		go test -v -count=1 -timeout 1200s github.com/arkade-os/arkd/internal/test/e2e -run TestBatchSettleMultipleClients -args -smoke -num-clients=$$CLIENTS; \
 	'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 184 - 200, The run-simulation Makefile target prints
and sets unused MIN and MAX variables which is misleading; remove the MIN and
MAX assignments and their use in the echo so the target only reads CLIENTS and
passes -num-clients to the test. Update the run-simulation recipe's shell block
to drop MIN="$${MIN:-$$CLIENTS}" and MAX="$${MAX:-128}" and change the echo to
only reference $$CLIENTS (leaving the go test invocation that runs
TestBatchSettleMultipleClients unchanged); this aligns the Makefile with the
test in internal/test/e2e/single_batch_smoke_test.go which only parses
-num-clients.

191-192: Make ARKD_SESSION_DURATION overrideable while keeping default 60.

Line 192 hardcodes ARKD_SESSION_DURATION=60. Consider defaulting to 60 but allowing callers to override (make run-simulation ARKD_SESSION_DURATION=120) for troubleshooting slower environments.

Proposed tweak
-	`@ARKD_SESSION_DURATION`=60 docker compose -f docker-compose.regtest.yml up --build -d
+	`@ARKD_SESSION_DURATION`=$${ARKD_SESSION_DURATION:-60} docker compose -f docker-compose.regtest.yml up --build -d
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 191 - 192, The recipe currently hardcodes
ARKD_SESSION_DURATION=60 in the Docker command; make this
default-but-overridable by either (a) adding a Makefile variable declaration
like ARKD_SESSION_DURATION ?= 60 and using ARKD_SESSION_DURATION in the recipe,
or (b) using shell expansion in the recipe
ARKD_SESSION_DURATION=${ARKD_SESSION_DURATION:-60} docker compose ... so callers
can run make run-simulation ARKD_SESSION_DURATION=120 to override while keeping
60 as the default.
internal/core/application/service.go (1)

304-349: Consider thread safety for runtime settings updates.

UpdateSettings modifies multiple service fields that may be read concurrently by round processing goroutines (e.g., banDuration, roundMinParticipantsCount, checkpointExitDelay). Without synchronization, this could cause data races.

Options to consider:

  1. Document that settings changes take effect between rounds (eventual consistency)
  2. Add a mutex to protect settings reads/writes
  3. Use atomic operations for individual fields

If the current behavior (unsynchronized updates) is intentional for simplicity, consider adding a comment documenting when it's safe to call this method.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/application/service.go` around lines 304 - 349, The
UpdateSettings method mutates many runtime fields (e.g., banDuration,
banThreshold, unilateralExitDelay, publicUnilateralExitDelay,
checkpointExitDelay, boardingExitDelay, batchExpiry, roundMinParticipantsCount,
roundMaxParticipantsCount, vtxoMinAmount, vtxoMaxAmount, utxoMinAmount,
utxoMaxAmount, settlementMinExpiryGap, vtxoNoCsvValidationCutoffTime,
maxTxWeight, checkpointTapscript, forfeitPubkey) without synchronization; add a
sync.RWMutex (e.g., settingsMu) to the service struct and wrap the entire
UpdateSettings body with settingsMu.Lock()/Unlock(), and update all readers of
those fields (round processing goroutines, any methods reading these fields) to
use settingsMu.RLock()/RUnlock(); alternatively, if unsynchronized updates are
intentional, add a clear comment on UpdateSettings documenting that it must only
be called between rounds and is not concurrency-safe.
internal/core/domain/settings.go (1)

25-55: Avoid the 16-argument positional constructor.

All of these parameters are int64, so a single swap still compiles and silently assigns the wrong setting. A dedicated input struct (or plain struct literal at call sites) is much safer to review and extend.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/domain/settings.go` around lines 25 - 55, The NewSettings
function takes 16 positional int64 arguments which risks silent swaps; replace
it with a single explicit config struct (e.g. SettingsConfig or SettingsOptions)
containing named fields matching Settings (BanThreshold, BanDuration,
UnilateralExitDelay, PublicUnilateralExitDelay, CheckpointExitDelay,
BoardingExitDelay, VtxoTreeExpiry, RoundMinParticipantsCount,
RoundMaxParticipantsCount, VtxoMinAmount, VtxoMaxAmount, UtxoMinAmount,
UtxoMaxAmount, SettlementMinExpiryGap, VtxoNoCsvValidationCutoffDate,
MaxTxWeight), change NewSettings signature to accept that struct (or a pointer)
and construct &Settings from the struct fields, and update all callers to pass a
struct literal (or use functional options) instead of positional args so each
value is explicit and reviewable.
api-spec/protobuf/ark/v1/admin.proto (1)

381-408: Keep updated_at out of the update payload.

internal/interface/grpc/handlers/adminservice.go ignores req.settings.updated_at, so reusing the same Settings message for reads and writes advertises a writable field the server never honors. A dedicated update message keeps the contract accurate and avoids clients round-tripping server-managed data.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api-spec/protobuf/ark/v1/admin.proto` around lines 381 - 408, The proto
reuses message Settings for both reads and writes while server ignores
req.settings.updated_at; change the write contract so clients cannot send
updated_at by adding a dedicated update message (e.g., message SettingsUpdate or
SettingsWritable) that mirrors Settings but omits the updated_at field, then
change UpdateSettingsRequest to use that new message instead of Settings; update
any generated stubs/clients and ensure adminservice.go handler continues to
operate unchanged (and that server-managed updated_at remains set server-side).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api-spec/openapi/swagger/ark/v1/admin.openapi.json`:
- Around line 1695-1768: The UpdateSettingsRequest/Settings schema must be
updated at the proto/message level to match the handler contract: decide whether
the endpoint is a full-replace or a patch and implement that explicitly—if
full-replace, make UpdateSettingsRequest.settings required and make each
Settings field required (remove writable updatedAt or mark updatedAt as
output/readOnly); if patch, change Settings fields to use nullable wrapper types
(e.g., google.protobuf.Int64Value) or a FieldMask so omitted fields are
distinguishable from zero and won’t overwrite stored values; update the Settings
message to not expose updatedAt as an input field (mark it output-only) and then
regen the OpenAPI/JSON so UpdateSettingsRequest aligns with adminservice's
behavior (refer to UpdateSettingsRequest, Settings, and the adminservice handler
logic).

In `@internal/core/application/admin.go`:
- Around line 634-639: ClearSettings removes persisted settings but doesn't
update the in-memory runtime config; call the settings update handler after
clearing so the node re-seeds defaults and applies the change. Modify
ClearSettings (the function that calls a.repoManager.Settings().Clear) to invoke
a.onSettingsUpdated() (in addition to or instead of a.onInfoChange())
immediately after the clear succeeds so the in-memory settings and any
subscribers are refreshed (Config.loadSettings is the startup equivalent to
emulate).
- Around line 620-630: The UpdateSettings flow currently persists settings
(repoManager.Settings().Upsert) before running live-apply guards, which can lead
to races and invalid persisted values; change UpdateSettings (method
adminService.UpdateSettings) to serialize concurrent updates (add and use a
mutex field on adminService e.g., updateSettingsMu) and perform
validation/live-apply before making durable: acquire the mutex, run the same
validation/startup checks from internal/config/config.go and call
a.onSettingsUpdated(ctx, settings) while still in-memory, only if
validation/onSettingsUpdated succeeds then set settings.UpdatedAt and call
repoManager.Settings().Upsert(ctx, settings), finally release the mutex and call
a.onInfoChange(); propagate errors from validation/apply/upsert unchanged.
- Around line 620-631: UpdateSettings currently writes settings but does not
refresh the cached participant defaults
(roundMinParticipantsCount/roundMaxParticipantsCount), so subsequent
UpdateScheduledSessionConfig calls may still use startup defaults; after
successfully upserting settings and invoking a.onSettingsUpdated, update the
adminService cached fields by assigning a.roundMinParticipantsCount and
a.roundMaxParticipantsCount from the new settings (or call an existing helper
that derives those defaults), ensuring this happens before a.onInfoChange() so
future calls to UpdateScheduledSessionConfig use the new values.

In `@internal/core/application/service.go`:
- Around line 295-302: Add a clarifying doc comment above the toRelativeLocktime
function explaining why values >= minAllowedSequence (512) are treated as
seconds and values < 512 as blocks; mention the BIP68 granularity (512-second
units) and that configuration validation enforces time-based values be multiples
of 512 so this threshold determines whether arklib.RelativeLocktime uses
arklib.LocktimeTypeSecond vs arklib.LocktimeTypeBlock.

In `@internal/infrastructure/db/badger/settings_repo.go`:
- Around line 64-79: The Upsert method in settingsRepository currently retries
on badger.ErrConflict without honoring cancellation; modify Upsert to stop
retrying as soon as ctx is canceled by checking ctx.Done() between attempts and
before sleeping/attempting another r.store.Upsert, returning ctx.Err() if
canceled; preserve existing retry count logic (maxRetries) and continue only
while ctx is active, referencing the Upsert method, settingsKey, maxRetries and
the same badger.ErrConflict check.

In `@internal/infrastructure/db/service.go`:
- Around line 229-232: The settings-store initialization can fail after the DB
connection and earlier repositories are created, leaving those handles open;
wrap resource creation in a cleanup stack or use a defer-until-success pattern:
after creating the DB connection and each repository, push their corresponding
cleanup functions (e.g., db.Close()/Disconnect and repo.Close()/Shutdown) onto a
slice of funcs, and if settingsStoreFactory(...) returns an error, iterate that
slice to call each cleanup before returning the error; apply the same pattern
around the other failing sites noted (the analogous blocks at the later
occurrences referenced) so all partially-initialized resources are closed on
error.

In `@internal/interface/grpc/handlers/adminservice.go`:
- Around line 589-621: The UpdateSettings handler (adminHandler.UpdateSettings)
must validate invariants before calling a.adminService.UpdateSettings: ensure
numeric/duration fields are within acceptable bounds (non-negative for
BanThreshold, BanDuration, UnilateralExitDelay, PublicUnilateralExitDelay,
CheckpointExitDelay, BoardingExitDelay, VtxoTreeExpiry, SettlementMinExpiryGap,
MaxTxWeight), ensure participant counts satisfy RoundMinParticipantsCount <
RoundMaxParticipantsCount, ensure min/max pairs are ordered (VtxoMinAmount <=
VtxoMaxAmount, UtxoMinAmount <= UtxoMaxAmount), and validate any date fields
(VtxoNoCsvValidationCutoffDate) as needed; if any check fails return
status.Error(codes.InvalidArgument, "...") with a clear message. Implement these
checks at the start of adminHandler.UpdateSettings (before calling
a.adminService.UpdateSettings / Upsert) and use the same status codes/messages
pattern as other handlers.

In `@README.md`:
- Around line 102-122: The README admin-settings section is out of sync: update
the documentation to list and describe the newly added endpoints GET
/v1/admin/settings and POST /v1/admin/settings/clear in addition to POST
/v1/admin/settings, and extend the settings table to include the two missing
persisted settings settlement_min_expiry_gap and
vtxo_no_csv_validation_cutoff_date with their descriptions and default values;
ensure the table rows use the same formatting as existing entries (backticked
keys like `settlement_min_expiry_gap`, `vtxo_no_csv_validation_cutoff_date`) and
mention which endpoints manage them.

---

Nitpick comments:
In `@api-spec/protobuf/ark/v1/admin.proto`:
- Around line 381-408: The proto reuses message Settings for both reads and
writes while server ignores req.settings.updated_at; change the write contract
so clients cannot send updated_at by adding a dedicated update message (e.g.,
message SettingsUpdate or SettingsWritable) that mirrors Settings but omits the
updated_at field, then change UpdateSettingsRequest to use that new message
instead of Settings; update any generated stubs/clients and ensure
adminservice.go handler continues to operate unchanged (and that server-managed
updated_at remains set server-side).

In `@internal/core/application/service.go`:
- Around line 304-349: The UpdateSettings method mutates many runtime fields
(e.g., banDuration, banThreshold, unilateralExitDelay,
publicUnilateralExitDelay, checkpointExitDelay, boardingExitDelay, batchExpiry,
roundMinParticipantsCount, roundMaxParticipantsCount, vtxoMinAmount,
vtxoMaxAmount, utxoMinAmount, utxoMaxAmount, settlementMinExpiryGap,
vtxoNoCsvValidationCutoffTime, maxTxWeight, checkpointTapscript, forfeitPubkey)
without synchronization; add a sync.RWMutex (e.g., settingsMu) to the service
struct and wrap the entire UpdateSettings body with settingsMu.Lock()/Unlock(),
and update all readers of those fields (round processing goroutines, any methods
reading these fields) to use settingsMu.RLock()/RUnlock(); alternatively, if
unsynchronized updates are intentional, add a clear comment on UpdateSettings
documenting that it must only be called between rounds and is not
concurrency-safe.

In `@internal/core/domain/settings.go`:
- Around line 25-55: The NewSettings function takes 16 positional int64
arguments which risks silent swaps; replace it with a single explicit config
struct (e.g. SettingsConfig or SettingsOptions) containing named fields matching
Settings (BanThreshold, BanDuration, UnilateralExitDelay,
PublicUnilateralExitDelay, CheckpointExitDelay, BoardingExitDelay,
VtxoTreeExpiry, RoundMinParticipantsCount, RoundMaxParticipantsCount,
VtxoMinAmount, VtxoMaxAmount, UtxoMinAmount, UtxoMaxAmount,
SettlementMinExpiryGap, VtxoNoCsvValidationCutoffDate, MaxTxWeight), change
NewSettings signature to accept that struct (or a pointer) and construct
&Settings from the struct fields, and update all callers to pass a struct
literal (or use functional options) instead of positional args so each value is
explicit and reviewable.

In `@Makefile`:
- Around line 184-200: The run-simulation Makefile target prints and sets unused
MIN and MAX variables which is misleading; remove the MIN and MAX assignments
and their use in the echo so the target only reads CLIENTS and passes
-num-clients to the test. Update the run-simulation recipe's shell block to drop
MIN="$${MIN:-$$CLIENTS}" and MAX="$${MAX:-128}" and change the echo to only
reference $$CLIENTS (leaving the go test invocation that runs
TestBatchSettleMultipleClients unchanged); this aligns the Makefile with the
test in internal/test/e2e/single_batch_smoke_test.go which only parses
-num-clients.
- Around line 191-192: The recipe currently hardcodes ARKD_SESSION_DURATION=60
in the Docker command; make this default-but-overridable by either (a) adding a
Makefile variable declaration like ARKD_SESSION_DURATION ?= 60 and using
ARKD_SESSION_DURATION in the recipe, or (b) using shell expansion in the recipe
ARKD_SESSION_DURATION=${ARKD_SESSION_DURATION:-60} docker compose ... so callers
can run make run-simulation ARKD_SESSION_DURATION=120 to override while keeping
60 as the default.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0ef746c5-f1c4-43ba-8188-e57d581bb56a

📥 Commits

Reviewing files that changed from the base of the PR and between 137daae and da53ecf.

⛔ Files ignored due to path filters (4)
  • api-spec/protobuf/gen/ark/v1/admin.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • api-spec/protobuf/gen/ark/v1/admin.pb.rgw.go is excluded by !**/gen/**
  • api-spec/protobuf/gen/ark/v1/admin_grpc.pb.go is excluded by !**/*.pb.go, !**/gen/**
  • api-spec/protobuf/gen/ark/v1/indexer.pb.rgw.go is excluded by !**/gen/**
📒 Files selected for processing (34)
  • Makefile
  • README.md
  • api-spec/openapi/swagger/ark/v1/admin.openapi.json
  • api-spec/protobuf/ark/v1/admin.proto
  • docker-compose.regtest.yml
  • envs/arkd.dev.env
  • envs/arkd.light.env
  • internal/config/config.go
  • internal/core/application/admin.go
  • internal/core/application/admin_test.go
  • internal/core/application/service.go
  • internal/core/application/types.go
  • internal/core/domain/settings.go
  • internal/core/domain/settings_repo.go
  • internal/core/ports/repo_manager.go
  • internal/infrastructure/db/badger/settings_repo.go
  • internal/infrastructure/db/postgres/migration/20260227000000_add_settings.down.sql
  • internal/infrastructure/db/postgres/migration/20260227000000_add_settings.up.sql
  • internal/infrastructure/db/postgres/settings_repo.go
  • internal/infrastructure/db/postgres/sqlc/queries/models.go
  • internal/infrastructure/db/postgres/sqlc/queries/query.sql.go
  • internal/infrastructure/db/postgres/sqlc/query.sql
  • internal/infrastructure/db/service.go
  • internal/infrastructure/db/service_test.go
  • internal/infrastructure/db/sqlite/migration/20260227000000_add_settings.down.sql
  • internal/infrastructure/db/sqlite/migration/20260227000000_add_settings.up.sql
  • internal/infrastructure/db/sqlite/settings_repo.go
  • internal/infrastructure/db/sqlite/sqlc/queries/models.go
  • internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go
  • internal/infrastructure/db/sqlite/sqlc/query.sql
  • internal/interface/grpc/handlers/adminservice.go
  • internal/interface/grpc/permissions/permissions.go
  • internal/test/e2e/single_batch_smoke_test.go
  • internal/test/e2e/utils_test.go
💤 Files with no reviewable changes (2)
  • envs/arkd.light.env
  • envs/arkd.dev.env

Comment on lines +1695 to +1768
"Settings": {
"title": "Settings",
"type": "object",
"properties": {
"banDuration": {
"type": "integer",
"format": "int64"
},
"banThreshold": {
"type": "integer",
"format": "int64"
},
"boardingExitDelay": {
"type": "integer",
"format": "int64"
},
"checkpointExitDelay": {
"type": "integer",
"format": "int64"
},
"maxTxWeight": {
"type": "integer",
"format": "int64"
},
"publicUnilateralExitDelay": {
"type": "integer",
"format": "int64"
},
"roundMaxParticipantsCount": {
"type": "integer",
"format": "int64"
},
"roundMinParticipantsCount": {
"type": "integer",
"format": "int64"
},
"settlementMinExpiryGap": {
"type": "integer",
"format": "int64"
},
"unilateralExitDelay": {
"type": "integer",
"format": "int64"
},
"updatedAt": {
"type": "integer",
"format": "int64"
},
"utxoMaxAmount": {
"type": "integer",
"format": "int64"
},
"utxoMinAmount": {
"type": "integer",
"format": "int64"
},
"vtxoMaxAmount": {
"type": "integer",
"format": "int64"
},
"vtxoMinAmount": {
"type": "integer",
"format": "int64"
},
"vtxoNoCsvValidationCutoffDate": {
"type": "integer",
"format": "int64"
},
"vtxoTreeExpiry": {
"type": "integer",
"format": "int64"
}
}
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

UpdateSettingsRequest does not match the handler’s real contract.

The schema makes settings optional, makes every nested field optional, and reuses the response model with writable-looking updatedAt. In internal/interface/grpc/handlers/adminservice.go, Lines 592-614, {} is rejected, omitted fields are read as zero and overwrite stored settings, and updatedAt is ignored. Please fix this at the proto/message level before regen so the published contract is either an explicit full-replace shape or a true patch shape.

Based on learnings, the project prioritizes backward compatibility for API endpoints.

Also applies to: 1861-1869

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api-spec/openapi/swagger/ark/v1/admin.openapi.json` around lines 1695 - 1768,
The UpdateSettingsRequest/Settings schema must be updated at the proto/message
level to match the handler contract: decide whether the endpoint is a
full-replace or a patch and implement that explicitly—if full-replace, make
UpdateSettingsRequest.settings required and make each Settings field required
(remove writable updatedAt or mark updatedAt as output/readOnly); if patch,
change Settings fields to use nullable wrapper types (e.g.,
google.protobuf.Int64Value) or a FieldMask so omitted fields are distinguishable
from zero and won’t overwrite stored values; update the Settings message to not
expose updatedAt as an input field (mark it output-only) and then regen the
OpenAPI/JSON so UpdateSettingsRequest aligns with adminservice's behavior (refer
to UpdateSettingsRequest, Settings, and the adminservice handler logic).

Comment thread internal/core/application/admin.go Outdated
Comment thread internal/core/application/admin.go Outdated
Comment thread internal/core/application/admin.go
Comment thread internal/core/application/service.go Outdated
Comment thread internal/infrastructure/db/badger/settings_repo.go
Comment thread internal/infrastructure/db/service.go
Comment thread internal/interface/grpc/handlers/adminservice.go
Comment thread README.md Outdated
@bitcoin-coder-bob
Copy link
Copy Markdown
Collaborator Author

addressed the comments from @arkanaai (#939 (comment)) in: 5271785

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 18, 2026

Thanks for addressing the feedback, the changes look good overall. A few notes on the commit:

✅ Locktime deduplication — clean, / is the right home.

✅ ClearSettings — now correctly re-seeds defaults and calls onSettingsUpdated. The test rename to "resets settings to defaults" reflects the actual behaviour. Good.

✅ Validate() — covers the critical cases (zero exit delays, min/max participants ordering, public ≥ unilateral). Two small follow-up items:

  1. Missing ordering constraint: VtxoTreeExpiry isn't checked against UnilateralExitDelay. Trees should expire before users need to exit, so probably want VtxoTreeExpiry > UnilateralExitDelay (or at least document why it's intentionally omitted).

  2. String-based error detection in the gRPC handler:

    This is fragile — if the error message ever changes, validation errors silently become codes.Internal instead of codes.InvalidArgument. Consider using a sentinel error or errors.As with a typed validation error instead.

Neither is a blocker, but worth a quick fix before merge.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 18, 2026

🔍 PR Review: Settings Domain with DB Persistence and Admin CRUD API

Reviewer: Arkana (automated)
Scope: +2514/-221 across 38 files — migrates operator settings from env vars to DB-persisted Settings domain with admin API


Summary

This PR moves ~15 operator-configurable parameters (exit delays, VTXO tree expiry, round limits, ban thresholds, amount bounds, etc.) out of environment variables into a new Settings domain entity backed by DB persistence (badger/sqlite/postgres). It adds three admin API endpoints (GET/POST /v1/admin/settings, POST /v1/admin/settings/clear) and enables hot-reload of these parameters into the running service without restart.

✅ What looks good

  • Clean domain modeling: domain.Settings with Validate() and SettingsRepository interface is well-structured.
  • Multi-backend support: Proper implementations for badger, sqlite, and postgres with migrations.
  • Default seeding on first startup: loadSettings() in config seeds defaults if none exist — good for fresh deployments.
  • Admin permissions properly gated: Macaroon permissions correctly map to EntityManager read/write.
  • Comprehensive tests: Unit tests for admin service settings CRUD and integration tests across all DB backends.
  • E2E tests updated: setupArkd() now calls updateSettings() to configure test parameters via API.
  • ClearSettings resets to defaults rather than leaving the system in an undefined state.

⚠️ Concerns

1. Race condition in service.UpdateSettings() (medium severity)

internal/core/application/service.goUpdateSettings() mutates ~15 fields on the service struct (s.banDuration, s.batchExpiry, s.unilateralExitDelay, etc.) without any synchronization. If a round is being processed concurrently, it could read a mix of old and new values mid-update. Consider holding the service mutex (or adding one if there isnt one) during the settings update, or using an atomic swap of a settings snapshot.

2. No guard against changing exit delays for live VTXOs (high severity — protocol correctness)

Changing unilateral_exit_delay, boarding_exit_delay, or vtxo_tree_expiry at runtime affects new rounds, but existing VTXOs were committed with the old values baked into their scripts. The system should either:

  • Document clearly that these changes only affect future rounds (and ensure no code path applies new delays to existing VTXOs), or
  • Reject changes to exit delays while there are unsettled VTXOs, or
  • At minimum, log a prominent warning when these critical parameters change.

This is the most security-sensitive aspect of this PR. An operator accidentally reducing unilateral_exit_delay mid-operation could create confusion about exit guarantees.

3. Postgres upsert ID defaults to 0 implicitly

In postgres/settings_repo.go, UpsertSettingsParams.ID is never set (defaults to 0). This works as a singleton pattern via ON CONFLICT(id), but its implicit. Consider either:

  • Using a constant settingsSingletonID = 1 and setting it explicitly, or
  • Adding a comment explaining the singleton pattern.

The same applies to sqlite/settings_repo.go.

4. Validate() could be stricter

domain/settings.go Validate() doesnt check:

  • VtxoTreeExpiry must be reasonable (e.g., > UnilateralExitDelay)
  • CheckpointExitDelay relationship to other delays
  • MaxTxWeight must be > 0
  • BanThreshold must be > 0

Some of these are enforced later in config.Validate(), but since UpdateSettings calls settings.Validate() directly, the API endpoint could accept values that config.Validate() would reject.

5. updated_at stored as unix timestamp loses sub-second precision

The postgres/sqlite schemas store updated_at as BIGINT (unix seconds). The badger store preserves Go time.Time directly. This means time.Unix(settings.UpdatedAt, 0) in the SQL repos truncates to seconds. Not a bug, but a minor inconsistency worth noting.

6. Settings schema uses BIGINT GENERATED BY DEFAULT AS IDENTITY (postgres)

This allows manual ID insertion (the 0 from point 3), but also allows auto-increment if multiple rows are inserted without explicit ID. Since the code always passes 0, this works, but GENERATED ALWAYS would be safer to prevent accidental multi-row scenarios. Or drop auto-generation entirely since this is a singleton table.

📝 Minor notes

  • Docker compose and dev env files cleanly remove the old env vars — good cleanup.
  • The Makefile simulation simplification (removing MIN/MAX params) is a nice side effect.
  • Proto and OpenAPI specs are consistent with the implementation.
  • minAllowedSequence correctly moved to domain.MinAllowedSequence as a public constant.

Verdict

Architecturally sound refactor that improves operability significantly. The main concerns are around race safety during hot-reload and protocol correctness guarantees when exit delays change for live VTXOs. These should be addressed before merge.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
internal/core/application/admin.go (2)

631-637: ⚠️ Potential issue | 🟡 Minor

Cached participant count fields not refreshed after settings update.

After UpdateSettings succeeds, a.roundMinParticipantsCount and a.roundMaxParticipantsCount still hold their startup values. This affects UpdateScheduledSessionConfig which uses these as fallbacks (lines 279-283, 304-308).

Suggested fix
     if a.onSettingsUpdated != nil {
         if err := a.onSettingsUpdated(ctx, settings); err != nil {
             return err
         }
     }
+    a.roundMinParticipantsCount = settings.RoundMinParticipantsCount
+    a.roundMaxParticipantsCount = settings.RoundMaxParticipantsCount
     a.onInfoChange()
     return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/application/admin.go` around lines 631 - 637, After
UpdateSettings succeeds the cached fields a.roundMinParticipantsCount and
a.roundMaxParticipantsCount are not being refreshed, so
UpdateScheduledSessionConfig still sees old startup values; after the successful
call to a.onSettingsUpdated (i.e. inside the same block where you handle the
non-nil callback) update the cached values from the new settings (assign
settings.RoundMinParticipantsCount and settings.RoundMaxParticipantsCount into
a.roundMinParticipantsCount and a.roundMaxParticipantsCount) before calling
a.onInfoChange so subsequent UpdateScheduledSessionConfig calls use the updated
fallbacks.

623-638: ⚠️ Potential issue | 🟠 Major

Persist-then-apply ordering can leave inconsistent state on callback failure.

If onSettingsUpdated fails after Upsert succeeds, the database contains the new settings while the runtime keeps the old configuration. Without transaction support (per the SettingsRepository interface), consider reversing the order: apply first, persist on success.

Suggested reordering
 func (a *adminService) UpdateSettings(ctx context.Context, settings domain.Settings) error {
     if err := settings.Validate(); err != nil {
         return fmt.Errorf("invalid settings: %w", err)
     }
-    settings.UpdatedAt = time.Now()
-    if err := a.repoManager.Settings().Upsert(ctx, settings); err != nil {
-        return err
-    }
     if a.onSettingsUpdated != nil {
         if err := a.onSettingsUpdated(ctx, settings); err != nil {
             return err
         }
     }
+    settings.UpdatedAt = time.Now()
+    if err := a.repoManager.Settings().Upsert(ctx, settings); err != nil {
+        return err
+    }
     a.onInfoChange()
     return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/application/admin.go` around lines 623 - 638, UpdateSettings
currently persists settings via Settings().Upsert before calling the runtime
callback onSettingsUpdated, risking DB/runtime divergence if the callback fails;
change the order so the in-memory/runtime application happens first (call
a.onSettingsUpdated(ctx, settings) if non-nil), only if that succeeds then call
a.repoManager.Settings().Upsert(ctx, settings), and finally call
a.onInfoChange(); ensure you still validate and set settings.UpdatedAt before
applying, and propagate errors from the callback or the Upsert unchanged so
failures abort without leaving the runtime/stored state inconsistent.
🧹 Nitpick comments (2)
internal/core/domain/settings.go (1)

39-69: Consider adding validation for CheckpointExitDelay and MaxTxWeight.

The Validate() method checks UnilateralExitDelay, BoardingExitDelay, and VtxoTreeExpiry for positive values, but omits CheckpointExitDelay and MaxTxWeight. For consistency and to prevent invalid configurations:

Suggested additions
 func (s Settings) Validate() error {
     if s.UnilateralExitDelay <= 0 {
         return fmt.Errorf("unilateral exit delay must be greater than 0")
     }
+    if s.CheckpointExitDelay <= 0 {
+        return fmt.Errorf("checkpoint exit delay must be greater than 0")
+    }
     if s.BoardingExitDelay <= 0 {
         return fmt.Errorf("boarding exit delay must be greater than 0")
     }
     // ... existing checks ...
+    if s.MaxTxWeight <= 0 {
+        return fmt.Errorf("max tx weight must be greater than 0")
+    }
     return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/domain/settings.go` around lines 39 - 69, Add validation in
Settings.Validate to check CheckpointExitDelay and MaxTxWeight are positive,
mirroring existing checks: after the existing
UnilateralExitDelay/BoardingExitDelay/VtxoTreeExpiry checks add an if
s.CheckpointExitDelay <= 0 { return fmt.Errorf("checkpoint exit delay must be
greater than 0") } and an if s.MaxTxWeight <= 0 { return fmt.Errorf("max tx
weight must be greater than 0") }, keeping the checks inside the
Settings.Validate function.
internal/config/config.go (1)

369-371: Consider inlining determineLocktimeType.

This function is now a simple wrapper around domain.ToRelativeLocktime. Consider removing the indirection:

-func determineLocktimeType(locktime int64) arklib.RelativeLocktime {
-    return domain.ToRelativeLocktime(locktime)
-}

Then call domain.ToRelativeLocktime() directly in loadSettings().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/config/config.go` around lines 369 - 371, The helper function
determineLocktimeType is an unnecessary wrapper around
domain.ToRelativeLocktime; remove the determineLocktimeType function and update
callers (e.g., loadSettings) to call domain.ToRelativeLocktime(locktime)
directly, ensuring any imports or references remain correct and tests/usage of
loadSettings continue to compile.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/core/application/admin.go`:
- Around line 640-656: ClearSettings currently does Clear + Upsert before
calling onSettingsUpdated and does not refresh cached participant counts; change
the ordering to match UpdateSettings by: perform
repoManager.Settings().Clear(ctx), set defaults.UpdatedAt, call
a.onSettingsUpdated(ctx, defaults) (if non-nil) before persisting, then Upsert
the defaults, and after successful Upsert refresh the cached participant counts
using the same helper used by UpdateSettings and finally call a.onInfoChange();
ensure you propagate any errors from onSettingsUpdated, the refresh helper, and
Upsert.

In `@internal/core/application/service.go`:
- Around line 295-340: UpdateSettings currently mutates shared service fields
(banDuration, banThreshold, unilateralExitDelay, publicUnilateralExitDelay,
checkpointExitDelay, boardingExitDelay, batchExpiry, roundMinParticipantsCount,
roundMaxParticipantsCount, vtxoMinAmount, vtxoMaxAmount, utxoMinAmount,
utxoMaxAmount, settlementMinExpiryGap, vtxoNoCsvValidationCutoffTime,
maxTxWeight, checkpointTapscript) without synchronization; add a settings mutex
(e.g., s.settingsMu sync.RWMutex) and wrap the entire UpdateSettings body with
s.settingsMu.Lock()/Unlock(), then change readers that access these fields
(examples: startRound, startConfirmation) to use s.settingsMu.RLock()/RUnlock()
when reading to prevent data races and ensure consistent view of settings.

---

Duplicate comments:
In `@internal/core/application/admin.go`:
- Around line 631-637: After UpdateSettings succeeds the cached fields
a.roundMinParticipantsCount and a.roundMaxParticipantsCount are not being
refreshed, so UpdateScheduledSessionConfig still sees old startup values; after
the successful call to a.onSettingsUpdated (i.e. inside the same block where you
handle the non-nil callback) update the cached values from the new settings
(assign settings.RoundMinParticipantsCount and
settings.RoundMaxParticipantsCount into a.roundMinParticipantsCount and
a.roundMaxParticipantsCount) before calling a.onInfoChange so subsequent
UpdateScheduledSessionConfig calls use the updated fallbacks.
- Around line 623-638: UpdateSettings currently persists settings via
Settings().Upsert before calling the runtime callback onSettingsUpdated, risking
DB/runtime divergence if the callback fails; change the order so the
in-memory/runtime application happens first (call a.onSettingsUpdated(ctx,
settings) if non-nil), only if that succeeds then call
a.repoManager.Settings().Upsert(ctx, settings), and finally call
a.onInfoChange(); ensure you still validate and set settings.UpdatedAt before
applying, and propagate errors from the callback or the Upsert unchanged so
failures abort without leaving the runtime/stored state inconsistent.

---

Nitpick comments:
In `@internal/config/config.go`:
- Around line 369-371: The helper function determineLocktimeType is an
unnecessary wrapper around domain.ToRelativeLocktime; remove the
determineLocktimeType function and update callers (e.g., loadSettings) to call
domain.ToRelativeLocktime(locktime) directly, ensuring any imports or references
remain correct and tests/usage of loadSettings continue to compile.

In `@internal/core/domain/settings.go`:
- Around line 39-69: Add validation in Settings.Validate to check
CheckpointExitDelay and MaxTxWeight are positive, mirroring existing checks:
after the existing UnilateralExitDelay/BoardingExitDelay/VtxoTreeExpiry checks
add an if s.CheckpointExitDelay <= 0 { return fmt.Errorf("checkpoint exit delay
must be greater than 0") } and an if s.MaxTxWeight <= 0 { return fmt.Errorf("max
tx weight must be greater than 0") }, keeping the checks inside the
Settings.Validate function.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a1b59888-eb73-40cb-9875-bd5d2eefd762

📥 Commits

Reviewing files that changed from the base of the PR and between da53ecf and 5271785.

📒 Files selected for processing (6)
  • internal/config/config.go
  • internal/core/application/admin.go
  • internal/core/application/admin_test.go
  • internal/core/application/service.go
  • internal/core/domain/settings.go
  • internal/interface/grpc/handlers/adminservice.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/interface/grpc/handlers/adminservice.go
  • internal/core/application/admin_test.go

Comment thread internal/core/application/admin.go
Comment on lines +295 to +340
func (s *service) UpdateSettings(settings domain.Settings) error {
s.banDuration = time.Duration(settings.BanDuration) * time.Second
s.banThreshold = settings.BanThreshold
s.unilateralExitDelay = domain.ToRelativeLocktime(settings.UnilateralExitDelay)
s.publicUnilateralExitDelay = domain.ToRelativeLocktime(settings.PublicUnilateralExitDelay)
s.checkpointExitDelay = domain.ToRelativeLocktime(settings.CheckpointExitDelay)
s.boardingExitDelay = domain.ToRelativeLocktime(settings.BoardingExitDelay)
s.batchExpiry = domain.ToRelativeLocktime(settings.VtxoTreeExpiry)
s.roundMinParticipantsCount = settings.RoundMinParticipantsCount
s.roundMaxParticipantsCount = settings.RoundMaxParticipantsCount
s.vtxoMinAmount = settings.VtxoMinAmount
s.vtxoMaxAmount = settings.VtxoMaxAmount
s.utxoMinAmount = settings.UtxoMinAmount
s.utxoMaxAmount = settings.UtxoMaxAmount
s.settlementMinExpiryGap = time.Duration(settings.SettlementMinExpiryGap) * time.Second
s.vtxoNoCsvValidationCutoffTime = time.Unix(settings.VtxoNoCsvValidationCutoffDate, 0)
s.maxTxWeight = uint64(settings.MaxTxWeight)

// Recalculate dust-derived values.
ctx := context.Background()
dustAmount, err := s.wallet.GetDustAmount(ctx)
if err != nil {
return fmt.Errorf("failed to get dust amount: %s", err)
}

s.vtxoMinAmount, s.utxoMinAmount = resolveMinAmounts(
s.vtxoMinAmount, s.utxoMinAmount, int64(dustAmount),
)

// Recalculate checkpoint tapscript if forfeit pubkey is already set.
if s.forfeitPubkey != nil {
checkpointClosure := &script.CSVMultisigClosure{
Locktime: s.checkpointExitDelay,
MultisigClosure: script.MultisigClosure{
PubKeys: []*btcec.PublicKey{s.forfeitPubkey},
},
}
checkpointTapscript, err := checkpointClosure.Script()
if err != nil {
return fmt.Errorf("failed to encode checkpoint tapscript: %s", err)
}
s.checkpointTapscript = checkpointTapscript
}

return nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Data race: concurrent field updates without synchronization.

UpdateSettings modifies multiple shared service fields (banDuration, vtxoMinAmount, unilateralExitDelay, etc.) that are read concurrently by round execution goroutines (startRound, startConfirmation, etc.). Without mutex protection, this can cause data races and inconsistent state.

Consider adding a dedicated mutex for settings or reusing an existing one:

Suggested approach
 type service struct {
+    settingsMu sync.RWMutex
     // ... existing fields
 }

 func (s *service) UpdateSettings(settings domain.Settings) error {
+    s.settingsMu.Lock()
+    defer s.settingsMu.Unlock()
+
     s.banDuration = time.Duration(settings.BanDuration) * time.Second
     // ... rest of updates
 }

Then use s.settingsMu.RLock() in readers like startConfirmation when accessing these fields.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/application/service.go` around lines 295 - 340, UpdateSettings
currently mutates shared service fields (banDuration, banThreshold,
unilateralExitDelay, publicUnilateralExitDelay, checkpointExitDelay,
boardingExitDelay, batchExpiry, roundMinParticipantsCount,
roundMaxParticipantsCount, vtxoMinAmount, vtxoMaxAmount, utxoMinAmount,
utxoMaxAmount, settlementMinExpiryGap, vtxoNoCsvValidationCutoffTime,
maxTxWeight, checkpointTapscript) without synchronization; add a settings mutex
(e.g., s.settingsMu sync.RWMutex) and wrap the entire UpdateSettings body with
s.settingsMu.Lock()/Unlock(), then change readers that access these fields
(examples: startRound, startConfirmation) to use s.settingsMu.RLock()/RUnlock()
when reading to prevent data races and ensure consistent view of settings.

…x/min participants on ClearSettings, commenting, MaxTxWeight validation, badger settings repo ctx cancellation in retry loop in Upsert
@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 18, 2026

🔍 Follow-up Review — New Commits (da53ecf, a61a9f1)

Thanks for addressing the feedback, @bitcoin-coder-bob. This is a solid round of fixes.

✅ Issues Resolved

  1. Race conditionsettingsMu mutex now guards both UpdateSettings and ClearSettings. The comment on service.UpdateSettings correctly documents the caller-holds-lock contract. Fixed.

  2. ClearSettings inconsistency — Now properly resets roundMinParticipantsCount and roundMaxParticipantsCount on the admin service after clearing DB. Fixed.

  3. Apply-before-persist ordering — Settings are now applied to the live service first, then persisted. If onSettingsUpdated fails, the DB retains the previous valid settings. Good defensive pattern. Fixed.

  4. MaxTxWeight validationValidate() now rejects MaxTxWeight <= 0. Fixed.

  5. Badger retry respects contextselect on ctx.Done() prevents infinite retry loops if the context is cancelled. Fixed.

  6. README & comments — Endpoint table, ToRelativeLocktime godoc, and inline comments all improve clarity.

🟡 Remaining Concern (from prior review, still open)

No guard against changing exit delays for live VTXOs. If an operator reduces unilateral_exit_delay or boarding_exit_delay while VTXOs with longer timelocks are in-flight, those VTXOs were committed with the old delay baked into their scripts. The DB settings would diverge from what's actually enforced on-chain. This is a protocol correctness issue rather than a code bug — the validation layer doesn't have visibility into existing VTXO commitments.

This doesn't need to block merge, but a follow-up issue to add "minimum floor" validation (or at least a warning log when reducing exit delays) would be valuable.

🟢 Overall

The mutex, apply-before-persist, and validation additions are all the right calls. This is in much better shape. No new blocking issues.

@bitcoin-coder-bob
Copy link
Copy Markdown
Collaborator Author

bitcoin-coder-bob commented Mar 18, 2026

addressed @arkanaai comment for string based error detction (#939 (comment)) in: 2802d62

however for VtxoTreeExpiry > UnilateralExitDelay while we can normalize the comparison using RelativeLocktime.Seconds(), the block scheduler defaults would fail it (vtxoTreeExpiry=20 blocks ≈ 12,000s vs
unilateralExitDelay=86,400s), and config.go:Validate() doesn't enforce this either. Skipping for now to avoid breaking block scheduler configs. The bit inside defaultSettings() in config.go:

	if c.SchedulerType == "block" {
		vtxoTreeExpiry = 20      // 20 blocks
		checkpointExitDelay = 10 // 10 blocks
	}

this is the case that would break things, since now vtxoTreeExpiry is now not going to exceed UnilaterExitDelay. The Validate() method on Settings doesn't know which scheduler is in use. So it would either need the scheduler type passed in, or the check would have to be done at the config level instead. Either way it would break block scheduler configs as they exist today.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 18, 2026

Makes sense — the block scheduler case is a legitimate exception and you're right that Validate() doesn't have the scheduler context to distinguish them. Adding the check there would either false-positive on block configs or require threading scheduler type through, neither of which is worth it here.

Skipping the VtxoTreeExpiry > UnilateralExitDelay check is reasonable. If it's worth documenting the invariant somewhere for future reference, a comment in Validate() or defaultSettings() noting "block scheduler configs intentionally have vtxoTreeExpiry < unilateralExitDelay" would help the next person not re-raise this.

Otherwise this looks good to me — both original blockers are resolved.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 18, 2026

🔍 Arkana Review — Follow-up (commit 2802d62)

Scope: Single commit since last review — introduces ErrInvalidSettings typed error.

Changes

  1. domain/settings.go — New ErrInvalidSettings struct with Reason field. All Validate() returns now use this typed error instead of raw fmt.Errorf.
  2. application/admin.goUpdateSettings now returns the validation error directly (no wrapping), preserving the error type for callers.
  3. grpc/handlers/adminservice.go — Replaces fragile strings.Contains(err.Error(), "invalid settings:") with proper errors.As(err, &validationErr). This is the right pattern — type assertions are stable across refactors, string matching is not.

Assessment

Clean improvement. The typed error approach:

  • Makes the gRPC handler resilient to message wording changes
  • Allows SDK consumers to programmatically distinguish validation failures
  • Follows Go error handling best practices (errors.As over string inspection)

No security, correctness, or cross-repo concerns with this change.

Prior Review Items Status

All previously flagged issues from earlier commits remain addressed. The one non-blocking concern from the prior review still stands: no guard against reducing exit delays for live VTXOs with existing timelocks. This is a protocol-level design decision, not a code bug.

No further action needed on this commit.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
internal/core/application/admin.go (1)

652-673: Note: ClearSettings applies to the running service after persisting, unlike UpdateSettings.

In UpdateSettings, onSettingsUpdated is called before Upsert to fail-fast on live-apply errors. In ClearSettings, it's called after Upsert. If onSettingsUpdated were to fail (unlikely given context snippet 1 shows it just assigns fields), the DB would have the new defaults while the runtime retains old values.

Since service.UpdateSettings (the concrete callback) only performs field assignments and cannot fail in practice, this ordering difference is low-risk. However, for consistency and defensive coding, consider aligning the ordering with UpdateSettings.

💡 Optional: Align ordering with UpdateSettings
 func (a *adminService) ClearSettings(ctx context.Context) error {
 	a.settingsMu.Lock()
 	defer a.settingsMu.Unlock()

 	if err := a.repoManager.Settings().Clear(ctx); err != nil {
 		return err
 	}
 	defaults := a.defaultSettings
 	defaults.UpdatedAt = time.Now()
+	if a.onSettingsUpdated != nil {
+		if err := a.onSettingsUpdated(ctx, defaults); err != nil {
+			return err
+		}
+	}
 	if err := a.repoManager.Settings().Upsert(ctx, defaults); err != nil {
 		return err
 	}
-	if a.onSettingsUpdated != nil {
-		if err := a.onSettingsUpdated(ctx, defaults); err != nil {
-			return err
-		}
-	}
 	a.roundMinParticipantsCount = defaults.RoundMinParticipantsCount
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/application/admin.go` around lines 652 - 673, The ClearSettings
workflow currently calls repoManager.Settings().Upsert(...) before invoking
a.onSettingsUpdated(...), risking the DB being updated while the runtime fails
to apply live changes; change ClearSettings to mirror UpdateSettings by calling
a.onSettingsUpdated(ctx, defaults) before attempting
a.repoManager.Settings().Upsert(ctx, defaults) so failures in onSettingsUpdated
cause an early return and avoid persisting inconsistent state (keep locking
behavior with a.settingsMu and leave subsequent assignments to
a.roundMinParticipantsCount/a.roundMaxParticipantsCount and a.onInfoChange()
as-is).
internal/core/domain/settings.go (1)

53-88: Consider adding validation for amount min/max ordering when both are set.

The Validate() method checks several constraints but doesn't validate that VtxoMinAmount <= VtxoMaxAmount and UtxoMinAmount <= UtxoMaxAmount when both are explicitly set (i.e., not -1). From the context, -1 appears to mean "unset/unlimited", so validation should only apply when both values are positive.

💡 Optional enhancement
 	if s.MaxTxWeight <= 0 {
 		return &ErrInvalidSettings{"max tx weight must be greater than 0"}
 	}
+	if s.VtxoMinAmount > 0 && s.VtxoMaxAmount > 0 && s.VtxoMinAmount > s.VtxoMaxAmount {
+		return &ErrInvalidSettings{"vtxo min amount must be <= vtxo max amount"}
+	}
+	if s.UtxoMinAmount > 0 && s.UtxoMaxAmount > 0 && s.UtxoMinAmount > s.UtxoMaxAmount {
+		return &ErrInvalidSettings{"utxo min amount must be <= utxo max amount"}
+	}
 	return nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/domain/settings.go` around lines 53 - 88, The Validate() method
on Settings lacks checks ensuring min ≤ max for amount pairs; add validation in
Settings.Validate to verify when both VtxoMinAmount and VtxoMaxAmount are set
(i.e., != -1) that VtxoMinAmount <= VtxoMaxAmount and similarly when both
UtxoMinAmount and UtxoMaxAmount are set that UtxoMinAmount <= UtxoMaxAmount,
returning &ErrInvalidSettings with clear messages like "vtxo min amount must be
<= vtxo max amount" and "utxo min amount must be <= utxo max amount" when the
checks fail.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/core/domain/settings.go`:
- Around line 26-31: Add explicit upper-bound checks in Settings.Validate to
ensure delay fields do not overflow uint32: validate that UnilateralExitDelay,
BoardingExitDelay, and VtxoTreeExpiry are each <= math.MaxUint32 and return a
descriptive ErrInvalidSettings (e.g., "unilateral exit delay exceeds maximum
uint32 value") on violation; this prevents the silent truncation when
ToRelativeLocktime casts an int64 to uint32.

---

Nitpick comments:
In `@internal/core/application/admin.go`:
- Around line 652-673: The ClearSettings workflow currently calls
repoManager.Settings().Upsert(...) before invoking a.onSettingsUpdated(...),
risking the DB being updated while the runtime fails to apply live changes;
change ClearSettings to mirror UpdateSettings by calling
a.onSettingsUpdated(ctx, defaults) before attempting
a.repoManager.Settings().Upsert(ctx, defaults) so failures in onSettingsUpdated
cause an early return and avoid persisting inconsistent state (keep locking
behavior with a.settingsMu and leave subsequent assignments to
a.roundMinParticipantsCount/a.roundMaxParticipantsCount and a.onInfoChange()
as-is).

In `@internal/core/domain/settings.go`:
- Around line 53-88: The Validate() method on Settings lacks checks ensuring min
≤ max for amount pairs; add validation in Settings.Validate to verify when both
VtxoMinAmount and VtxoMaxAmount are set (i.e., != -1) that VtxoMinAmount <=
VtxoMaxAmount and similarly when both UtxoMinAmount and UtxoMaxAmount are set
that UtxoMinAmount <= UtxoMaxAmount, returning &ErrInvalidSettings with clear
messages like "vtxo min amount must be <= vtxo max amount" and "utxo min amount
must be <= utxo max amount" when the checks fail.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a2021358-40fc-4c29-9846-8fa56c78afc1

📥 Commits

Reviewing files that changed from the base of the PR and between a61a9f1 and 2802d62.

📒 Files selected for processing (3)
  • internal/core/application/admin.go
  • internal/core/domain/settings.go
  • internal/interface/grpc/handlers/adminservice.go

Comment thread internal/core/domain/settings.go
@bitcoin-coder-bob
Copy link
Copy Markdown
Collaborator Author

for the commnets by @arkanaai in: #939 (comment) they were addressed in commit: 557ccad

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 18, 2026

Thanks. The changes in 557ccad look good:

  • Upsert comment (postgres + sqlite) — clearly documents the singleton id=0 pattern. Exactly what was needed.
  • Exit delay warnings in UpdateSettings() — the three log.Warnf calls on change cover the critical parameters. Good that they're logged before the values are applied, so the log captures old→new correctly.

One small thing: publicUnilateralExitDelay and checkpointExitDelay changes aren't warned on, only the three main ones. Probably fine since those are less security-critical, but worth being consistent if you want complete auditability.

No blockers from me on this PR.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 18, 2026

🔍 Arkana Review — Commit 557ccad6 (warning logs + comments)

Scope: service.go — warning logs on exit delay changes; postgres/settings_repo.go & sqlite/settings_repo.go — doc comments on Upsert singleton pattern.

✅ Looks good

  • Exit delay change warnings — Exactly what was needed. The log.Warnf calls clearly communicate that changes only affect future rounds, not live VTXOs. Good operator visibility for a parameter that has protocol-level implications.
  • Clean refactor — Computing newUnilateral, newBoarding, newExpiry before comparison avoids redundant ToRelativeLocktime calls and makes the diff-then-assign flow readable.
  • Singleton key documentation — Clarifying the ON CONFLICT(id) / Go zero-value pattern removes a potential maintenance trap.

🟢 No concerns

This commit is additive (logging + comments only), no behavioral changes. LGTM.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 19, 2026

Followup review (new commit: remove unused default config vars)

Clean removal of the orphaned default variables (defaultSettlementMinExpiryGap, defaultMaxTxWeight, defaultVtxoNoCsvValidationCutoffDate) and corresponding viper.SetDefault / viper.Get calls that were left behind after moving these to the DB-backed settings domain.

No issues — this is a straightforward cleanup that aligns config.go with the settings migration. LGTM on this increment.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 25, 2026

🔍 Arkana PR Review — arkd#939 (Settings Domain)

Commit reviewed: 70fb5fe (merge of master into feature branch, 2026-03-25)

This is a substantial and well-structured PR that migrates 16 runtime-tunable config values from environment variables into a persistent Settings domain with an admin CRUD API. Overall quality is high — good separation of concerns, thorough testing, and proper validation. A few items worth discussing:


🟡 Security / Authorization

  1. Admin API auth is correctGetSettings requires read on EntityManager, while UpdateSettings/ClearSettings require write. Good. However, UpdateSettings is a full replace (not a partial patch). This means a caller must send all 16 fields or risk zeroing out fields they didn't intend to change. The proto uses int64 with zero as the default, so omitted fields silently become 0 — which will then fail validation for most fields, but SettlementMinExpiryGap and VtxoNoCsvValidationCutoffDate have 0 as valid values. Consider documenting this behavior prominently or adding a partial-update endpoint.

  2. No rate limiting on settings updates — The settingsMu mutex serializes updates, but there's no throttle. Rapid-fire settings changes could cause log spam from the warning messages and force repeated checkpoint tapscript recalculations. Low risk for an admin-only endpoint but worth noting.


🟡 Correctness / Edge Cases

  1. UpdateSettings applies to service before persisting to DB (admin.go:636-639). The comment says this is intentional — if live-apply fails, invalid settings don't get persisted. But the reverse failure path is concerning: if onSettingsUpdated succeeds but Upsert fails, the running service has new settings that aren't persisted. On restart, it would revert to the old DB settings silently. Consider rolling back the in-memory state if the DB write fails.

  2. ClearSettings has the same issue — defaults are applied to the running service before DB write. If Upsert of defaults fails, you're left with default settings in memory but old settings in DB.

  3. Postgres ON CONFLICT(id) relies on the Go zero-value — The UpsertSettingsParams.ID defaults to 0, which becomes the singleton key. This works but is fragile. The Postgres migration uses GENERATED BY DEFAULT AS IDENTITY for id, so if someone ever manually inserts a row, the auto-generated ID won't be 0 and the upsert will insert a new row instead of updating. Consider using a fixed sentinel (e.g., id = 1) or a UNIQUE constraint on a separate column.

  4. Validate() doesn't check CheckpointExitDelay > 0 — Unlike UnilateralExitDelay and BoardingExitDelay, there's no positivity check for CheckpointExitDelay. A value of 0 would create a RelativeLocktime{Type: Block, Value: 0}, which could cause issues.


🟡 Protocol Safety

  1. Exit delay changes affect only future rounds — This is correctly documented with warning logs in service.go:UpdateSettings. Good. The warnings are important because existing VTXOs have old delay values baked into their scripts. Consider also logging the old vs new values for batchExpiry changes since VTXO tree expiry is a critical security parameter.

  2. PublicUnilateralExitDelay >= UnilateralExitDelay validation — Good constraint. But the validation doesn't enforce that BoardingExitDelay > UnilateralExitDelay, only that they're different. In practice, BoardingExitDelay should always be greater than UnilateralExitDelay for security (boarding utxos need longer timeouts). The current check != would allow BoardingExitDelay < UnilateralExitDelay, which could create a security issue where boarding exits resolve before unilateral exits.


🟢 Good Patterns

  • Scheduler-aware defaults (defaultSettings() in config.go) — Block-based scheduler gets block-based defaults (20 blocks / 10 blocks), time-based gets time-based defaults. Clean.
  • Comprehensive validation with uint32 overflow guards and satoshi upper bounds.
  • Three storage backends (Postgres, SQLite, Badger) all implemented consistently with proper migrations.
  • Unit tests for domain validation (settings_test.go) cover boundary conditions well.
  • Integration tests (service_test.go) exercise the full CRUD lifecycle.
  • E2E test updated to use the new admin API instead of env vars — good migration pattern.
  • Badger retry loop respects context cancellation — correct.

🔵 Nits

  • config.go:398determineLocktimeType is now a one-liner delegating to domain.ToRelativeLocktime. Consider inlining it or making it a direct alias to reduce indirection.
  • admin.go:89onSettingsUpdated callback could be typed as a method on the admin service rather than a stored closure, reducing the coupling surface.
  • settings.go:87PublicUnilateralExitDelay < UnilateralExitDelay check: the error message says "must be >=" but consider if they should also be equal (current check allows equality, which is fine).
  • Missing \n at end of several files (admin.proto, query.sql for both sqlite and postgres).

Summary

Strong PR with solid architecture. The main items to address before merge are:

  1. Rollback in-memory state on DB write failure (items 3-4)
  2. BoardingExitDelay > UnilateralExitDelay validation (item 8)
  3. CheckpointExitDelay > 0 validation (item 6)

Everything else is minor or informational.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 25, 2026

🔍 Arkana PR Review — arkd#939 (updated commits)

Add Settings domain with DB persistence and admin CRUD API by @bitcoin-coder-bob

Overview

Major infrastructure change: moves 16 runtime-tunable config values (ban threshold/duration, exit delays, VTXO tree expiry, round participant limits, amount limits, etc.) from environment variables into the database with admin CRUD API. Settings propagate live to the running service. +2817/-195.

✅ What Looks Good

  • Clean domain model with thorough Validate() — checks overflow (uint32 bounds), min/max consistency, sentinel values (-1), and MaxSatoshis cap. Good defensive programming for a Bitcoin system.
  • Live propagation via UpdateSettings callback — no restart required. The warning logs when exit delay parameters change are a nice touch (existing VTXOs have old values baked in).
  • Scheduler-aware defaults — different defaults for gocron (seconds) vs block (block heights) scheduler types.
  • Three storage backends (Postgres, SQLite, Badger) with migrations and singleton upsert pattern.
  • settingsMu mutex in admin service — prevents concurrent settings updates from racing.
  • E2E test setup updated to apply admin settings, ensuring test environments get correct defaults.
  • Comprehensive unit tests for domain validation, admin service CRUD, and mock repo.
  • Clean env var removal — docker-compose, dev env files, and config defaults properly cleaned up.

⚠️ Items to Consider

  1. UpdateSettings is full-replace, not partial update: The proto UpdateSettingsRequest contains a single Settings message. Clients must send ALL fields — any zero-valued field will be interpreted as "set to 0", which could silently break the server (e.g., VtxoTreeExpiry = 0 fails validation, but BanThreshold = 0 would too). Consider either:

    • Adding a FieldMask to support partial updates (standard protobuf pattern)
    • Documenting clearly that this is full-replace and clients should GET then modify then POST
    • Or at minimum, returning the current settings in the UpdateSettingsResponse so clients can verify
  2. Settings loaded during Validate(): loadSettings() is called inside Config.Validate(), which means the DB connection is established during validation. This couples validation with side effects (DB read). If the DB is unreachable at startup, Validate() fails — which is probably fine operationally, but the function name is misleading. Consider renaming or restructuring.

  3. Postgres migration — singleton pattern: The ON CONFLICT(id) upsert uses the auto-generated id column. Since UpsertSettingsParams.ID defaults to 0 (Go zero value), the singleton key is 0. But the GENERATED BY DEFAULT AS IDENTITY column starts at 1 for new inserts without explicit ID. The first Upsert with ID=0 will insert id=0, which works but is unusual. Consider using GENERATED ALWAYS instead, or using a fixed id=1 sentinel, to avoid confusion.

  4. SelectLatestSettings uses ORDER BY updated_at DESC LIMIT 1: This implies there could be multiple rows. But the upsert uses ON CONFLICT(id) which should keep it as a singleton. The ORDER BY is defensive but misleading — if there is ever more than one row, something is already broken. Consider WHERE id = 0 (or 1) instead for clarity.

  5. Security — exit delay changes affect fund safety:

    • Reducing UnilateralExitDelay or BoardingExitDelay could create a window where existing VTXOs have more time than new ones, potentially confusing users about their exit guarantees.
    • Reducing VtxoTreeExpiry affects how long users have to refresh VTXOs before they expire.
    • The warning logs are good, but consider whether these particular settings should require additional confirmation or have minimum bounds enforced at the domain level (not just > 0).
    • The admin API is protected by macaroon auth, which is appropriate.
  6. ClearSettings resets to defaults then applies live: This is correct, but if the running service has been handling rounds with custom settings, clearing back to defaults mid-operation could cause unexpected behavior in the current round. Consider documenting that ClearSettings should only be called between rounds, or adding a check.

  7. Cross-repo impact with Add opt-in digest validation header for stale parameter detection #957 (digest validation): When settings are updated via this API, the server parameter digest changes (since GetInfo response fields change). But the digest (from Add opt-in digest validation header for stale parameter detection #957) is only recomputed on the next GetInfo call. These two PRs should coordinate: either UpdateSettings should trigger a digest refresh, or this should be documented as a known limitation.

  8. ARKD_ALLOW_CSV_BLOCK_TYPE added to docker-compose: This new env var appears in the regtest compose file. It allows block-type CSV values for the scheduler. Make sure this is documented — it is not in the README env var table.

Security Assessment

  • Admin endpoints are macaroon-protected — good.
  • Exit delay and expiry changes are security-sensitive for fund safety. The validation prevents obviously invalid values (0, negative, overflow), but does not enforce minimum safe values. For production, consider:
    • Minimum UnilateralExitDelay of e.g., 3600 seconds (1 hour)
    • Minimum BoardingExitDelay significantly larger than UnilateralExitDelay
    • Minimum VtxoTreeExpiry of at least 1 day
  • No secrets in settings — good, these are all numeric protocol parameters.

Verdict

Well-architected change that brings significant operational flexibility. The full-replace API semantics and the interaction with the digest validation PR (#957) are the main items to address. The security-sensitive nature of exit delay parameters warrants either documented minimum safe values or enforced bounds in Validate().

@bitcoin-coder-bob bitcoin-coder-bob force-pushed the bob/settings-domain-and-repo branch from e2706bf to b21375a Compare March 25, 2026 14:06
@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 25, 2026

🔍 Iterative Review — Partial Update Support (b21375a)

New commits since last review: merge from master + Validate CheckpointExitDelay + Support partial updates in UpdateSettings

What changed: UpdateSettings now merges caller-provided fields with current DB state via Settings.Merge(), so callers can send only the fields they want to change (zero-valued fields inherit existing values).

Looks good:

  • Merge-then-validate ordering is correct — validation runs on the merged result, not the partial input
  • Lock acquired before the DB read, preventing TOCTOU between Get and Upsert
  • Good test coverage: partial update, full update, merge preserves all set fields

One concern — zero as a sentinel vs. legitimate value:
The Merge method uses == 0 to detect "not provided." This means a caller cannot intentionally set a field to 0 via UpdateSettings — it will always be treated as "keep current." For most of these fields 0 is invalid anyway (and Validate() would reject it), so this is safe in practice. But worth a doc comment on Merge noting this design choice explicitly, since VtxoNoCsvValidationCutoffDate == 0 could arguably be a valid "no cutoff" sentinel.

Minor: The comment // nil means no settings exist yet (first boot) so then skip merge has a trailing whitespace on the line above it — nit.

Overall: solid incremental improvement. Partial updates are the right UX for an admin API.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 25, 2026

🔍 Review — Settings Domain (iterative)

New commits since last review. Focused on the delta.

Architecture & Design ✅

Clean domain-driven approach: Settings struct with Validate(), Merge(), and repo interface across all 3 backends (Postgres, SQLite, Badger). Live-apply via onSettingsUpdated callback before persisting is a good pattern — prevents storing invalid settings.

Security ✅

  • Admin endpoints properly gated behind macaroon permissions (EntityManager read/write)
  • settingsMu protects concurrent updates
  • Validation catches uint32 overflows, amount bounds, and relational constraints (min ≤ max, unilateral ≠ boarding delay)

Concern — Zero-value merge semantics for sentinel fields

Settings.Merge() treats 0 as "not provided" and inherits from other. But several fields use 0 as a valid value:

  • SettlementMinExpiryGap = 0 means disabled
  • VtxoNoCsvValidationCutoffDate = 0 means disabled

With the current merge logic, a caller cannot set these back to 0 via a partial update — the zero will be silently replaced by the current value. Two options:

  1. Document that partial updates can't clear these to 0 (caller must send full settings)
  2. Use proto optional / wrapper types to distinguish "not set" from "set to 0"

This is a usability issue, not a security issue, but worth noting in the API docs.

Minor nits

  • defaultBanDuration = 300 (comment says "10 * 30s") — the session duration default is 30s, so this is correct but the defaultSessionDuration is at top of file as 30. Consider referencing it for clarity.
  • Postgres migration uses BIGINT GENERATED BY DEFAULT AS IDENTITY for id but the upsert always passes 0 as the singleton key. Works but GENERATED BY DEFAULT allows overrides, so this is fine.
  • UpdateSettings logs warnings for exit delay changes — good operational awareness.

Cross-repo impact

The removed env vars (ARKD_VTXO_TREE_EXPIRY, ARKD_UNILATERAL_EXIT_DELAY, etc.) mean existing deployments relying on them will silently fall back to DB defaults on upgrade. E2E tests handle this via updateSettings() in setup — good. Deployment docs should note the migration path.

Tests are thorough — domain validation, admin service CRUD, partial updates, clear idempotency, and repo-level tests across all backends.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented Mar 25, 2026

🔍 Review — arkd#939 Settings domain with DB persistence and admin CRUD API (iterative, post-93010e2)

Reviewing the 2 new commits: 232d5a12 (add update_fields to UpdateSettingsRequest) and 4fd6fbba (validate update_fields, derive valid names from struct).

What changed since last review

The partial-update mechanism was significantly improved. Previously UpdateSettings was full-replace only; now it supports update_fields (a list of snake_case field names) so callers can patch individual settings without needing to read-modify-write.

New code analysis

  1. Merge() method (settings.go). When updateFields is non-empty, only listed fields are applied from the request onto the stored settings. When empty, all fields are written as-is. This is a clean partial-update pattern. Unknown and duplicate field names are rejected — good.

  2. validUpdateFields generation. Built via reflection from the Settings struct, excluding UpdatedAt. The camelToSnake converter handles the Go→proto field name mapping. Potential issue: camelToSnake("VtxoNoCsvValidationCutoffDate")vtxo_no_csv_validation_cutoff_date — verify this matches the proto field name exactly. Abbreviations like "CSV" could produce c_s_v instead of csv depending on the parser. Let me check: CSVMultisig → each uppercase letter gets an underscore... Actually camelToSnake treats each uppercase rune individually, so VtxoNoCsvValidationCutoffDatev_t_x_o_n_o_c_s_v...this is wrong. The function splits on every uppercase letter, not on word boundaries. BanThresholdban_threshold ✓, but VtxoTreeExpiryvtxo_tree_expiry ✓ only because the Go field name uses Vtxo not VTXO. Since all field names use title-case words (not all-caps abbreviations), this actually works correctly. But adding a field like HTTPTimeout would break it. Worth a comment noting the limitation.

  3. Security: permissions. GetSettings requires EntityManager read, UpdateSettings/ClearSettings require EntityManager write. These are admin-only macaroon permissions — correct. No public API exposure.

  4. Live-apply ordering in UpdateSettings. The code applies to the running service before persisting to DB. If the DB write fails after live-apply succeeds, the service has settings that aren't persisted. On restart, the old DB settings would be loaded. This is a deliberate choice (commented in code) to avoid persisting invalid settings, but the failure mode means a DB-write error leaves the running service with ephemeral settings. Consider: if Upsert fails, should we roll back the live-apply? The settingsMu lock prevents concurrent mutations, so at least there's no race.

  5. ClearSettings doesn't delete — it upserts defaults. This means there's always a row in the settings table after first boot. Clean approach, avoids NULL handling.

  6. Exit delay change warnings. service.UpdateSettings() logs warnings when exit delay parameters change, noting they only affect future rounds. This is important — existing VTXOs have the old values baked into their scripts. The warnings are good operational hygiene.

  7. Config validation order change. loadSettings() is now called inside Validate(), which means the DB must be reachable during config validation. This couples config validation to DB availability — a reasonable tradeoff since the server can't start without the DB anyway, but it's a behavior change worth noting in the PR description.

  8. Migration files. Both Postgres and SQLite have matching up/down migrations. The settings table has a single-row design (no explicit PK constraint visible in the migration — check that UPSERT logic handles this correctly in the repo implementations).

Cross-repo impact

  • SDKs that display server info (e.g., ts-sdk, go-sdk) will see the same values via /v1/info since onInfoChange() is called after settings updates.
  • No breaking API changes for clients — the new endpoints are admin-only additions.

Verdict

The partial-update mechanism is well-designed. The camelToSnake function works for the current field naming convention but is fragile for future all-caps abbreviations. The live-apply-before-persist pattern has a minor inconsistency risk on DB-write failure. Overall solid — ready for human approval.

Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Arkana Review — Iterative (commits 1fc333e, 2f660ca)

Delta since last review: Remove unused defaultAllowCSVBlockType config var + seed DB-backed settings from env vars on first boot.


🔴 CRITICAL — Env-var seeded defaults bypass domain.Settings.Validate()

config.go:515-518loadSettings() calls defaultSettings()Upsert() without calling settings.Validate() first. The admin UpdateSettings API correctly calls Validate() before persisting, but the first-boot seed path does not.

Config.Validate() (called after loadSettings()) checks locktime type consistency and BanDuration >= 1, but misses these domain-level constraints:

Constraint Example bad env config
PublicUnilateralExitDelay >= UnilateralExitDelay PUBLIC=5, UNILATERAL=100 → persisted, fund safety issue
BoardingExitDelay != UnilateralExitDelay Both set to 86400 → persisted, identical script paths
RoundMaxParticipantsCount >= RoundMinParticipantsCount MIN=128, MAX=1 → persisted
CheckpointExitDelay > 0 Set to 0 → persisted
uint32 overflow guards Values > 2^32 → persisted
Satoshi amount bounds VtxoMinAmount=99999999999 → persisted

Fix: Add settings.Validate() before the Upsert call:

if settings == nil {
    settings = c.defaultSettings()
    if err := settings.Validate(); err != nil {
        return fmt.Errorf("invalid default settings: %w", err)
    }
    if err := c.repo.Settings().Upsert(ctx, *settings); err != nil {
        return fmt.Errorf("failed to seed default settings: %w", err)
    }
}

This is protocol-critical — invalid exit delays affect VTXO script safety and forfeit path timing. 🚨 Requires human review.


🟡 E2E test behavioral change — exit delay values differ

The removed updateSettings() helper used unilateral_exit_delay: 512 and boarding_exit_delay: 1024. The new docker-compose env vars use 20 and 30 respectively. Since the scheduler is block type and these are block counts, the new values are valid and arguably more appropriate for e2e tests (shorter blocks = faster tests). But it's a behavioral change — any e2e test that depends on specific delay arithmetic (e.g., waiting for a certain number of blocks for an exit) will see different behavior. Verify the full e2e suite passes with these new values.


🟢 Good

  • First-boot env var seeding is the right UX — operators can configure via env vars on initial deploy without needing a second admin API call.
  • viper.IsSet() guards correctly distinguish "env var explicitly set" from "use scheduler-aware default." The two scheduler-dependent fields (VtxoTreeExpiry, CheckpointExitDelay) fall through to inline defaults when no env var is set.
  • Comment quality is good — the doc comment on defaultSettings() clearly explains the first-boot-only semantics.
  • Removing updateSettings() from e2e simplifies test setup and makes tests more representative of real deployments.

Summary

One blocker: call settings.Validate() on seeded defaults before persisting (security-critical for exit delays). Everything else in this delta is solid.

Reviewer: Arkana (automated)
Requires: Human sign-off (protocol-critical: exit delay handling on first boot)

@bitcoin-coder-bob
Copy link
Copy Markdown
Collaborator Author

🔍 Arkana Review — Iterative (commits 1fc333e, 2f660ca)

Delta since last review: Remove unused defaultAllowCSVBlockType config var + seed DB-backed settings from env vars on first boot.

🔴 CRITICAL — Env-var seeded defaults bypass domain.Settings.Validate()

config.go:515-518loadSettings() calls defaultSettings()Upsert() without calling settings.Validate() first. The admin UpdateSettings API correctly calls Validate() before persisting, but the first-boot seed path does not.

Config.Validate() (called after loadSettings()) checks locktime type consistency and BanDuration >= 1, but misses these domain-level constraints:

Constraint Example bad env config
PublicUnilateralExitDelay >= UnilateralExitDelay PUBLIC=5, UNILATERAL=100 → persisted, fund safety issue
BoardingExitDelay != UnilateralExitDelay Both set to 86400 → persisted, identical script paths
RoundMaxParticipantsCount >= RoundMinParticipantsCount MIN=128, MAX=1 → persisted
CheckpointExitDelay > 0 Set to 0 → persisted
uint32 overflow guards Values > 2^32 → persisted
Satoshi amount bounds VtxoMinAmount=99999999999 → persisted
Fix: Add settings.Validate() before the Upsert call:

if settings == nil {
    settings = c.defaultSettings()
    if err := settings.Validate(); err != nil {
        return fmt.Errorf("invalid default settings: %w", err)
    }
    if err := c.repo.Settings().Upsert(ctx, *settings); err != nil {
        return fmt.Errorf("failed to seed default settings: %w", err)
    }
}

This is protocol-critical — invalid exit delays affect VTXO script safety and forfeit path timing. 🚨 Requires human review.

🟡 E2E test behavioral change — exit delay values differ

The removed updateSettings() helper used unilateral_exit_delay: 512 and boarding_exit_delay: 1024. The new docker-compose env vars use 20 and 30 respectively. Since the scheduler is block type and these are block counts, the new values are valid and arguably more appropriate for e2e tests (shorter blocks = faster tests). But it's a behavioral change — any e2e test that depends on specific delay arithmetic (e.g., waiting for a certain number of blocks for an exit) will see different behavior. Verify the full e2e suite passes with these new values.

🟢 Good

  • First-boot env var seeding is the right UX — operators can configure via env vars on initial deploy without needing a second admin API call.
  • viper.IsSet() guards correctly distinguish "env var explicitly set" from "use scheduler-aware default." The two scheduler-dependent fields (VtxoTreeExpiry, CheckpointExitDelay) fall through to inline defaults when no env var is set.
  • Comment quality is good — the doc comment on defaultSettings() clearly explains the first-boot-only semantics.
  • Removing updateSettings() from e2e simplifies test setup and makes tests more representative of real deployments.

Summary

One blocker: call settings.Validate() on seeded defaults before persisting (security-critical for exit delays). Everything else in this delta is solid.

Reviewer: Arkana (automated) Requires: Human sign-off (protocol-critical: exit delay handling on first boot)

i added the settings.Validate() per suggestion. done in: 29e4f2b

Per the 512/1024 seconds the change i made was intentional to remove the out of band updateSettings() can and let env vars be single source of truth.

Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Arkana Review — Commit 29e4f2b (validate seeded defaults)

Delta since last review: Single commit addressing the critical finding from my previous review.

✅ Fix confirmed — config.go:519-522

loadSettings() now calls settings.Validate() before Upsert() on the first-boot seed path. This closes the gap where a bad env-var combination (e.g. PublicUnilateral < Unilateral, RoundMin > RoundMax) could be silently persisted to the DB without validation.

The fix is correct:

  • Same Validate() call used by the admin UpdateSettings path
  • Error is returned with descriptive wrapping ("invalid default settings: %w")
  • Placement is right: after defaultSettings(), before Upsert()

No new findings. All prior critical/high issues have been addressed across the review iterations.

⚠️ Requires: Human sign-off (protocol-critical: settings control exit delays, locktime values, and round lifecycle parameters)

Reviewer: Arkana (automated)

Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 Arkana Review — Iterative (commits 0f734ec, 5142125, ece9dd1)

Reviewer: Arkana (automated)

Changes since last review (29e4f2b):

  • Two merge-master commits (conflict resolution)
  • 5142125: Removes 3 unused default constants (defaultRoundMaxParticipantsCount, defaultRoundMinParticipantsCount, defaultVtxoNoCsvValidationCutoffDate) that leaked in from the merge

Assessment: Clean housekeeping. Verified via git grep that the removed constants have zero remaining references on this branch. No new logic, no new risk surface.

Status: No new issues. My previous approvals and protocol-critical flag (exit delay handling requiring human sign-off) still stand.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add settings domain and repo

1 participant