feat: add CEL-based batch_trigger gate for round start #1046

Open
Kukks wants to merge 4 commits into master from feat/batch-trigger-cel

@Kukks Kukks commented Apr 23, 2026

Summary

Closes #1045.

Adds an optional ARKD_BATCH_TRIGGER CEL formula that decides whether the server should start a new batch round. The gate is evaluated at the top of startRound(). If the formula returns false, the round is skipped and rechecked one registration window (session_duration / 6) later. When unset (default), the legacy "always start a round" behaviour is preserved.

The new pkg/ark-lib/batchtrigger package mirrors the design of pkg/ark-lib/arkfee — CEL programs are compiled once at startup and reused per evaluation. Misconfigured programs fail validation in Config.Validate() so they never make it past server startup. Runtime eval errors fail open (logged at warn level) so a buggy formula cannot permanently halt rounds.
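
The gate semantics described above (unset means always start, evaluation errors fail open, a denied round is rechecked one registration window later) can be sketched in plain Go. This is only an illustration: `evalFunc`, `shouldStartRound`, `recheckDelay`, and the 30-second `defaultSession` are hypothetical names and values, not the PR's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// evalFunc stands in for a compiled trigger's evaluation. It is a
// hypothetical placeholder, not the batchtrigger package API.
type evalFunc func() (bool, error)

// defaultSession is an illustrative session duration, chosen so that
// one registration window is 5 seconds.
const defaultSession = 30 * time.Second

// shouldStartRound applies the gate: no formula means always start,
// and an evaluation error fails open.
func shouldStartRound(eval evalFunc) bool {
	if eval == nil {
		return true // unset trigger: legacy behaviour
	}
	ok, err := eval()
	if err != nil {
		fmt.Printf("warn: batch trigger eval failed, starting round anyway: %v\n", err)
		return true // fail open
	}
	return ok
}

// recheckDelay returns one registration window (session_duration / 6).
func recheckDelay(sessionDuration time.Duration) time.Duration {
	return sessionDuration / 6
}

func denyGate() (bool, error)   { return false, nil }
func brokenGate() (bool, error) { return false, errors.New("no such overload") }

func main() {
	fmt.Println(shouldStartRound(nil))        // unset formula
	fmt.Println(shouldStartRound(denyGate))   // formula returned false
	fmt.Println(shouldStartRound(brokenGate)) // formula failed at runtime
	fmt.Println(recheckDelay(defaultSession))
}
```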

CEL environment

| Variable | Type | Description |
|---|---|---|
| intents_count | double | Number of pending intents queued |
| current_feerate | double | Current mempool fee rate (sat/kvbyte) |
| time_since_last_batch | double | Seconds since the last finalized batch (0 until first batch) |
| boarding_inputs_count | double | Total pending boarding UTXOs across queued intents |
| total_boarding_amount | double | Total satoshis across pending boarding UTXOs |
| total_intent_fees | double | Sum of (inputs - outputs) across pending intents |

now() -> double is also available.

Example

intents_count > 1.0 && (current_feerate <= 2.0 || time_since_last_batch >= 3600.0)
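
To make the example formula concrete, here is a hand-written Go equivalent evaluated against a few input combinations. It does not use CEL at all; the `triggerInputs` struct and `exampleTrigger` function are hypothetical and only mirror the three variables the formula references.

```go
package main

import "fmt"

// triggerInputs mirrors the CEL variables used by the example as float64
// fields. The struct and field names are illustrative only.
type triggerInputs struct {
	IntentsCount       float64
	CurrentFeerate     float64
	TimeSinceLastBatch float64
}

// exampleTrigger is a plain Go rendering of:
//
//	intents_count > 1.0 && (current_feerate <= 2.0 || time_since_last_batch >= 3600.0)
func exampleTrigger(in triggerInputs) bool {
	return in.IntentsCount > 1.0 &&
		(in.CurrentFeerate <= 2.0 || in.TimeSinceLastBatch >= 3600.0)
}

func main() {
	cases := []triggerInputs{
		{IntentsCount: 3, CurrentFeerate: 1.5, TimeSinceLastBatch: 60},   // cheap fees
		{IntentsCount: 3, CurrentFeerate: 25, TimeSinceLastBatch: 60},    // expensive and recent
		{IntentsCount: 3, CurrentFeerate: 25, TimeSinceLastBatch: 4000},  // expensive but stale
		{IntentsCount: 1, CurrentFeerate: 1.5, TimeSinceLastBatch: 4000}, // only one intent
	}
	for _, c := range cases {
		fmt.Printf("%+v -> start=%v\n", c, exampleTrigger(c))
	}
}
```

Read this way, the formula batches whenever at least two intents are queued and either fees are low or an hour has passed since the last batch.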

Test plan

  • go test ./pkg/ark-lib/batchtrigger/... -v -count=1 — all subtests pass (compile/parse + eval against valid & invalid fixtures, all variables accessible)
  • go test ./internal/core/application/ -run TestAggregateIntentTriggerData -v — pure aggregation helper covered (5 cases)
  • go test ./internal/core/application/ -run TestEvalBatchTrigger -v (gate evaluation incl. issue #1045 examples, 9 cases)
  • GOOS=linux go vet ./... clean
  • GOOS=linux go build ./... clean
  • CI green on PR

Notes on review feedback

CodeRabbit flagged the following, all addressed in 5f12bb66:

  • collectTriggerContext previously called Intents().Len() and Intents().ViewAll() separately, which could disagree under concurrent registrations. Both now derive from the same ViewAll snapshot.
  • The CEL eval failure path now logs at warn (not error) — matches the "fail open" wording in the README.
  • batchtrigger.Trigger keeps cel.Program in a named (unexported) field so the package's public API is just Eval/Source — callers can no longer bypass Trigger.Eval's nil guard.
  • batchtrigger.Parse now validates the output type via Type.IsExactType(cel.BoolType) instead of pointer equality.
  • The now() overload doesn't need LateFunctionBinding because BatchTriggerEnv does not enable the constant folding optimizer (cel.OptOptimize); a comment now documents this.

The two duplicate-compilation nitpicks (compile in Validate() then again in NewService) and the "skip ViewAll when the formula doesn't reference aggregates" optimization were left as future work — the cost is microseconds at startup / per recheck, and the alternatives meaningfully grow the API surface.

Summary by CodeRabbit

  • New Features
    • Optional conditional batch-round triggering: operators can set ARKD_BATCH_TRIGGER with a CEL boolean expression to control when rounds start; unset preserves automatic starts.
  • Documentation
    • README and package docs describe the new config, available input variables, typing rules, and a now() helper for expressions.
  • Tests
    • Added test suites and fixtures validating trigger parsing and evaluation across scenarios.

Add an optional ARKD_BATCH_TRIGGER CEL formula that decides whether
the server should start a new batch round. When unset (default), the
legacy "always start a round" behaviour is preserved.

Exposes intents_count, current_feerate, time_since_last_batch,
boarding_inputs_count, total_boarding_amount, total_intent_fees and a
now() helper to the program. Programs are compiled once at startup
and re-used per round; eval errors fail open so a buggy formula does
not permanently halt rounds.

Closes #1045

coderabbitai Bot commented Apr 23, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 769401d5-5b90-432e-ae6b-25a715c969c1

📥 Commits

Reviewing files that changed from the base of the PR and between bf70018 and 5f12bb6.

📒 Files selected for processing (5)
  • internal/core/application/batch_trigger_test.go
  • internal/core/application/service.go
  • pkg/ark-lib/batchtrigger/celenv/functions.go
  • pkg/ark-lib/batchtrigger/trigger.go
  • pkg/ark-lib/batchtrigger/trigger_test.go
✅ Files skipped from review due to trivial changes (1)
  • internal/core/application/service.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/ark-lib/batchtrigger/celenv/functions.go
  • pkg/ark-lib/batchtrigger/trigger.go
  • pkg/ark-lib/batchtrigger/trigger_test.go
  • internal/core/application/batch_trigger_test.go

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 18.75% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a CEL-based batch trigger gate to control batch round startup, which is the core objective of PR #1046. |
| Linked Issues check | ✅ Passed | The PR fulfills all acceptance criteria: implements batch_trigger CEL program in config, exposes required variables (intents_count, current_feerate, time_since_last_batch, plus additional ones), maintains sane default behavior (unset = always start), includes documentation and config examples. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing the CEL-based batch trigger gate as specified in issue #1045; no unrelated modifications detected. |


@arkanaai arkanaai Bot left a comment

🔍 Arkana Code Review — arkd#1046

feat: add CEL-based batch_trigger gate for round start

Overall: Well-structured PR that follows the established arkfee CEL pattern. The package decomposition, compile-once-eval-many design, fail-open semantics, and test coverage are solid. No cross-repo breakage — NewService is internal and all callers are updated.

That said, this touches round lifecycle (when rounds start). Flagging for mandatory human review per protocol.


🟡 Issues

1. Cold-start footgun: time_since_last_batch == 0 is ambiguous (Medium)

pkg/ark-lib/batchtrigger/types.go:18, internal/core/application/service.go:2434-2438

After server restart, lastBatchAt is 0 (not persisted), so time_since_last_batch is 0. This makes 0 mean both "never batched since boot" and "batched 0 seconds ago" — semantically different states.

Practical consequence: An operator writes:

intents_count > 5.0 || time_since_last_batch >= 3600.0

After a restart with 3 pending intents, the server waits up to an hour before the first batch despite having work to do, because time_since_last_batch reads as 0 (recently batched) instead of ∞ (never batched).

Suggested fix: Initialize lastBatchAt to 0 but treat 0 as "infinitely stale" — i.e., when last == 0, set TimeSinceLastBatch to math.MaxInt64 (or a large sentinel like 999999). Alternatively, persist lastBatchAt to the repo/cache so it survives restarts. Document whichever choice you make prominently in the README.
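
One way to express the "treat 0 as infinitely stale" option is sketched below. It is a minimal illustration under assumptions: the `lastBatchAt` name follows the review text, while `timeSinceLastBatch` and the choice of `float64(math.MaxInt64)` as the sentinel are hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"math"
	"sync/atomic"
)

// lastBatchAt holds the Unix time of the last finalized batch.
// Zero means "no batch finalized since boot".
var lastBatchAt atomic.Int64

// timeSinceLastBatch returns the seconds elapsed since the last batch.
// When no batch has been recorded it returns a very large sentinel, so
// time-based escape hatches such as `time_since_last_batch >= 3600.0`
// fire on a fresh start instead of waiting a full threshold.
func timeSinceLastBatch(nowUnix int64) float64 {
	last := lastBatchAt.Load()
	if last == 0 {
		return float64(math.MaxInt64) // never batched: infinitely stale
	}
	return float64(nowUnix - last)
}

func main() {
	fmt.Println(timeSinceLastBatch(1000) >= 3600) // fresh boot

	lastBatchAt.Store(940)
	fmt.Println(timeSinceLastBatch(1000)) // 60 seconds after a batch
}
```

The trade-off is that formulas can no longer distinguish "never batched" from "batched a very long time ago", which is why the review also suggests persisting the timestamp as an alternative.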

2. Double cache read creates minor inconsistency (Low)

internal/core/application/service.go:2420-2447

collectTriggerContext calls s.cache.Intents().Len(ctx) for IntentsCount, then separately calls s.cache.Intents().ViewAll(ctx, nil) for aggregation. Between the two calls, intents can be pushed or popped. Result: intents_count in the CEL context may not match the data backing boarding_inputs_count/total_intent_fees.

Suggested fix: Drop the Len() call. Use int64(len(intents)) from the ViewAll result instead. Single read, always consistent.
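
The single-snapshot idea can be sketched as follows: every trigger field, including the count, is derived from one slice. The `pendingIntent` and `triggerData` types and the `aggregate` function are simplified, hypothetical stand-ins, not the types in service.go.

```go
package main

import "fmt"

// pendingIntent is a simplified, hypothetical stand-in for a queued intent.
type pendingIntent struct {
	InputAmount     uint64   // non-boarding input total, in sats
	OutputAmount    uint64   // output total, in sats
	BoardingAmounts []uint64 // one entry per boarding UTXO, in sats
}

// triggerData carries the values exposed to the formula as doubles.
type triggerData struct {
	IntentsCount        float64
	BoardingInputsCount float64
	TotalBoardingAmount float64
	TotalIntentFees     float64
}

// aggregate derives every field from the same snapshot, so the count can
// never disagree with the aggregates computed from it.
func aggregate(snapshot []pendingIntent) triggerData {
	td := triggerData{IntentsCount: float64(len(snapshot))}
	for _, it := range snapshot {
		var boarding uint64
		for _, amt := range it.BoardingAmounts {
			boarding += amt
		}
		td.BoardingInputsCount += float64(len(it.BoardingAmounts))
		td.TotalBoardingAmount += float64(boarding)
		// Fees are inputs (including boarding) minus outputs.
		td.TotalIntentFees += float64(it.InputAmount+boarding) - float64(it.OutputAmount)
	}
	return td
}

func main() {
	snapshot := []pendingIntent{
		{InputAmount: 1000, OutputAmount: 1400, BoardingAmounts: []uint64{500}},
		{InputAmount: 2000, OutputAmount: 1900},
	}
	fmt.Printf("%+v\n", aggregate(snapshot))
}
```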

3. Redundant cache cleanup on retry (Low)

internal/core/application/service.go:2299-2333 vs 2528-2543

When the gate denies and retries, startRound re-enters and hits the existingRound != nil branch again, running the full cleanup (DeleteVtxos, Reset forfeit txs, etc.) on already-cleaned data. These ops are idempotent so it's not a bug, but it's wasted work every sessionDuration/6 seconds while the gate denies.

Suggested fix: Move the gate check above the cleanup block, or add a guard (e.g., if existingRound.Id != lastCleanedRoundId). Not blocking.
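
The guard variant could look roughly like the sketch below. `roundCleaner` and `cleanupOnce` are hypothetical names; the real cleanup steps (DeleteVtxos, forfeit tx reset, and so on) are represented only by a counter.

```go
package main

import "fmt"

// roundCleaner remembers which round was cleaned last so that repeated
// gate rechecks do not redo the same cleanup. Hypothetical sketch.
type roundCleaner struct {
	lastCleanedRoundID string
	cleanups           int // counts how often the cleanup actually ran
}

// cleanupOnce runs the cleanup for roundID unless it already ran for that
// round. It reports whether any work was done. An empty roundID is treated
// as "nothing to clean".
func (c *roundCleaner) cleanupOnce(roundID string) bool {
	if roundID == "" || roundID == c.lastCleanedRoundID {
		return false
	}
	c.cleanups++ // the real cleanup steps would go here
	c.lastCleanedRoundID = roundID
	return true
}

func main() {
	c := &roundCleaner{}
	fmt.Println(c.cleanupOnce("round-1")) // first entry: cleans
	fmt.Println(c.cleanupOnce("round-1")) // gate recheck: skipped
	fmt.Println(c.cleanupOnce("round-2")) // next round: cleans again
	fmt.Println(c.cleanups)
}
```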


🟢 Things Done Well

  • Compile-time validation: Invalid CEL programs are rejected in Config.Validate() before the server boots. Operators get fast feedback.
  • Fail-open on eval errors: A runtime eval failure starts the batch rather than halting it. Correct safety direction for a financial system.
  • Atomic lastBatchAt: No mutex needed, correct concurrent access pattern.
  • Test coverage: 5 aggregation cases + 9 eval cases + nil/empty/all-variables tests + JSON fixture-driven tests for the package itself. Solid.
  • DeleteVtxos doesn't affect intent data: Confirmed — it operates on a separate vtxo tracking set. Pending intents survive cleanup and remain readable by the trigger.

❓ Questions for Author

  1. Was the cold-start time_since_last_batch == 0 behavior intentional? The README says "0 if no batch finalized since boot" but this makes time-based escape hatches fail on fresh starts.
  2. Any plans to persist lastBatchAt to survive restarts? For long time_since_last_batch thresholds (hours), losing this on restart is operationally painful.

🔴 Human Review Required

This PR modifies round lifecycle — specifically when startRound() proceeds vs. defers. While the gate can only delay rounds (never corrupt them), and the fail-open design limits blast radius, this is protocol-adjacent code on a Bitcoin L2. A human must review and approve before merge.

cc @Kukks — nice clean PR overall. The cold-start semantics (#1) are the main thing I'd want addressed before shipping.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (5)
pkg/ark-lib/batchtrigger/trigger.go (2)

19-22: Embedded cel.Program promotes the entire CEL API onto *Trigger.

With an anonymous cel.Program field, callers can bypass Trigger.Eval and call trigger.Program.Eval(map[string]any{...}) (the embedded cel.Program method) directly, sidestepping the nil-receiver guard and native-bool conversion. Making the program a named field prevents accidental misuse and keeps the public surface minimal.

♻️ Proposed change
 type Trigger struct {
-	cel.Program
+	program cel.Program
 	txt     string
 }

Then update Eval to use t.program.Eval(...) (line 68). Parse at line 51 becomes &Trigger{program: prg, txt: txt}.
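
The difference between the two layouts can be reproduced without cel-go. In the sketch below, `Program` is a hypothetical stand-in for cel.Program, and the nil guard returning true is an assumption based on "unset means always start"; none of these types are the package's real ones.

```go
package main

import "fmt"

// Program is a hypothetical stand-in for cel.Program.
type Program interface {
	Eval(vars map[string]any) (any, error)
}

// constProgram always evaluates to its own value.
type constProgram bool

func (p constProgram) Eval(map[string]any) (any, error) { return bool(p), nil }

// guardedEval holds the nil guard and the conversion to a native bool.
func guardedEval(p Program, vars map[string]any) (bool, error) {
	if p == nil {
		return true, nil // assumption: no program means "start the round"
	}
	out, err := p.Eval(vars)
	if err != nil {
		return false, err
	}
	b, ok := out.(bool)
	if !ok {
		return false, fmt.Errorf("expected bool, got %T", out)
	}
	return b, nil
}

// EmbeddedTrigger embeds the interface, so the raw program stays reachable
// through the exported field t.Program.
type EmbeddedTrigger struct{ Program }

func (t *EmbeddedTrigger) Eval(vars map[string]any) (bool, error) {
	if t == nil {
		return true, nil
	}
	return guardedEval(t.Program, vars)
}

// NamedTrigger keeps the program unexported: Eval is the only way in.
type NamedTrigger struct{ program Program }

func (t *NamedTrigger) Eval(vars map[string]any) (bool, error) {
	if t == nil {
		return true, nil
	}
	return guardedEval(t.program, vars)
}

func main() {
	e := &EmbeddedTrigger{Program: constProgram(false)}
	raw, _ := e.Program.Eval(nil) // bypasses the wrapper and returns any
	fmt.Printf("%T %v\n", raw, raw)

	var n *NamedTrigger // nil trigger is still safe to evaluate
	ok, err := n.Eval(nil)
	fmt.Println(ok, err)
}
```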

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/ark-lib/batchtrigger/trigger.go` around lines 19 - 22, The Trigger struct
currently embeds cel.Program which exposes the whole CEL API on *Trigger; change
the anonymous field to a named field (e.g., program cel.Program) in the Trigger
type, update Parse (where prg and txt are returned) to construct the trigger as
&Trigger{program: prg, txt: txt}, and modify Trigger.Eval to call
t.program.Eval(...) (and retain the nil-receiver/native-bool conversion logic)
so callers cannot bypass Trigger.Eval by calling the embedded cel.Program
directly.

43-45: Use IsExactType instead of pointer comparison for CEL type equality.

ast.OutputType() != cel.BoolType relies on pointer equality with the singleton cel.BoolType instance. While this works for primitives in cel-go, the semantically correct and idiomatic approach is Type.IsExactType(). CEL-go's own type checker uses IsExactType for type comparisons. Using pointer inequality is fragile and could silently fail if cel-go's implementation wraps or caches types differently in the future.

Suggested fix
-	if ast.OutputType() != cel.BoolType {
+	if !ast.OutputType().IsExactType(cel.BoolType) {
 		return nil, fmt.Errorf("expected return type bool, got %v", ast.OutputType())
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/ark-lib/batchtrigger/trigger.go` around lines 43 - 45, The comparison
using pointer equality on CEL types is fragile: replace the check in the trigger
validation that currently does ast.OutputType() != cel.BoolType with a semantic
type check using IsExactType (e.g.,
ast.OutputType().IsExactType(cel.BoolType)) so the code uses CEL's
Type.IsExactType API; update the error path in the same function (where the
fmt.Errorf is returned) to be triggered when IsExactType returns false.
internal/core/application/service.go (2)

148-151: Duplicate CEL compilation — see comment in config.go.

The program is already compiled in Config.Validate() for validation purposes; the same text is compiled again here. Not incorrect, but easy to dedupe by plumbing the compiled *batchtrigger.Trigger through instead of the raw string. Downstream impact of the root cause in internal/config/config.go (Validate).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/application/service.go` around lines 148 - 151, The code
recompiles the CEL program here using batchtrigger.New(batchTriggerProgram) even
though Config.Validate() already compiles it; instead, plumb the compiled
*batchtrigger.Trigger through the config into the service so you reuse the
existing compiled trigger rather than the raw string. Modify the Config to
expose the compiled *batchtrigger.Trigger produced in Config.Validate() (or add
a getter), update the service constructor to accept that compiled trigger
instead of batchTriggerProgram, and replace the call to batchtrigger.New in the
service (the batchTrigger variable) with the passed-in compiled
*batchtrigger.Trigger to avoid duplicate compilation.

2435-2442: ViewAll on every gate recheck can be costly with large intent queues.

When the gate denies, the recheck cadence is sessionDuration/6 (default 5s). Each recheck materializes every pending intent via ViewAll and walks them to aggregate boarding/fee data. For deployments with a growing intent backlog (which is precisely the scenario where a trigger like intents_count > N is useful), this O(N) pass per 5s is wasted work whenever the formula doesn't actually reference the aggregates.

Consider either:

  • only calling ViewAll when the formula references a boarding/fee variable (cel-go exposes the reference set via cel.AstToCheckedExpr / checked references), or
  • caching the aggregates for a short TTL.

Not a blocker, but worth a note for future optimization.
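
The short-TTL option could be sketched as below. This is an illustration only: `aggregateCache`, `newAggregateCache`, and the `at` helper are hypothetical, the cache is not safe for concurrent use, and the real aggregation is replaced by an injected `compute` function.

```go
package main

import (
	"fmt"
	"time"
)

// aggregates holds the values that require walking every pending intent.
type aggregates struct {
	BoardingInputsCount float64
	TotalBoardingAmount float64
	TotalIntentFees     float64
}

// aggregateCache recomputes the aggregates at most once per TTL.
// Hypothetical sketch, not safe for concurrent use.
type aggregateCache struct {
	ttl     time.Duration
	compute func() aggregates
	value   aggregates
	expires time.Time
}

func newAggregateCache(ttlSeconds int64, compute func() aggregates) *aggregateCache {
	return &aggregateCache{ttl: time.Duration(ttlSeconds) * time.Second, compute: compute}
}

// get returns the cached value while it is fresh and recomputes otherwise.
func (c *aggregateCache) get(now time.Time) aggregates {
	if now.Before(c.expires) {
		return c.value
	}
	c.value = c.compute()
	c.expires = now.Add(c.ttl)
	return c.value
}

// at is a small helper that builds a timestamp from Unix seconds.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	walks := 0
	cache := newAggregateCache(2, func() aggregates {
		walks++ // stands in for the O(N) pass over all intents
		return aggregates{TotalIntentFees: float64(walks)}
	})
	cache.get(at(100))
	cache.get(at(101)) // served from cache
	cache.get(at(102)) // TTL elapsed, recomputed
	fmt.Println("full walks:", walks)
}
```

A TTL shorter than the recheck interval gives no benefit, so this only helps if the gate is evaluated more often than once per registration window.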

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/core/application/service.go` around lines 2435 - 2442, The code
always calls s.cache.Intents().ViewAll and aggregateIntentTriggerData during
every gate recheck which is costly; modify the gate recheck path to avoid the
O(N) pass unless needed by (a) detecting whether the configured CEL trigger
expression references the boarding/fee fields (e.g., inspect the CEL checked AST
or reference set for names like BoardingInputsCount, TotalBoardingAmount,
TotalIntentFees) and only call ViewAll + aggregateIntentTriggerData when those
names are present, or (b) compute and store the aggregates once and reuse them
for a short TTL before re-materializing; update the logic around
s.cache.Intents().ViewAll, aggregateIntentTriggerData, and the trigger
evaluation to implement one of these two approaches.
internal/config/config.go (1)

664-666: Consider caching the compiled trigger on Config to avoid double-compilation.

batchtrigger.New(c.BatchTrigger) is invoked here purely for validation, and the compiled *batchtrigger.Trigger is discarded. application.NewService (line 148 in service.go) then recompiles the exact same program text. For a simple program this is cheap, but CEL compilation is not free and the duplication means any divergence between the two call sites (e.g. someone adds a wrapper or option) silently breaks the "validated at startup" guarantee.

Optional: store the compiled trigger on *Config and plumb it through NewService instead of the raw string.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/config/config.go` around lines 664 - 666, The code currently calls
batchtrigger.New(c.BatchTrigger) only to validate and discards the compiled
*batchtrigger.Trigger, causing NewService to recompile the same program; change
Config to store the compiled trigger (add a field like CompiledBatchTrigger
*batchtrigger.Trigger), have the config initialization call
batchtrigger.New(c.BatchTrigger) once and assign the result to that field,
update any callers (notably application.NewService) to accept and use the
compiled *batchtrigger.Trigger instead of recompiling from c.BatchTrigger
string, and ensure error paths and JSON/serialization handling for the new
Config field are addressed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/core/application/service.go`:
- Around line 2456-2462: The code logs CEL evaluation failures at error level,
contradicting the README/PR which says such runtime evaluation errors are "warn
level and fail open"; update the log call in the batch trigger evaluation block
(the s.batchTrigger.Eval error handling that currently uses
log.WithError(err).Errorf and references s.batchTrigger.Source()) to use
log.WithError(err).Warnf(...) so failures are warned and the code continues to
"fail open" as intended.
- Around line 2413-2445: In collectTriggerContext, remove the separate
s.cache.Intents().Len(ctx) call (and its log) to avoid a non-atomic race between
Intents().Len and Intents().ViewAll; instead call Intents().ViewAll(ctx, nil)
once, then set tc.IntentsCount from the returned intents slice length (convert
to the same numeric type as tc.IntentsCount if needed), and keep the aggregation
via aggregateIntentTriggerData(intents) to populate
BoardingInputsCount/TotalBoardingAmount/TotalIntentFees so all fields derive
from the same snapshot.

In `@pkg/ark-lib/batchtrigger/celenv/functions.go`:
- Around line 12-20: The nowFunction binding is non-deterministic but uses
cel.FunctionBinding which gets constant-folded at compile time; change the
implementation for the "now" overload to avoid folding by either returning
types.Unknown from the FunctionBinding or (preferably) switching to
cel.LateFunctionBinding for the now overload so it is evaluated at runtime, and
remove the redundant float64 cast (use types.Double with the int64
time.Now().Unix() directly or let types.Double handle conversion); update the
overload binding in nowFunction accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7fdbeaaf-eb87-4bbf-b362-82dbb44a7a48

📥 Commits

Reviewing files that changed from the base of the PR and between 21fbae5 and bf70018.

📒 Files selected for processing (15)
  • README.md
  • envs/arkd.dev.env
  • envs/arkd.light.env
  • internal/config/config.go
  • internal/core/application/batch_trigger_test.go
  • internal/core/application/service.go
  • pkg/ark-lib/batchtrigger/README.md
  • pkg/ark-lib/batchtrigger/celenv/env.go
  • pkg/ark-lib/batchtrigger/celenv/functions.go
  • pkg/ark-lib/batchtrigger/celenv/variables.go
  • pkg/ark-lib/batchtrigger/testdata/invalid.json
  • pkg/ark-lib/batchtrigger/testdata/valid.json
  • pkg/ark-lib/batchtrigger/trigger.go
  • pkg/ark-lib/batchtrigger/trigger_test.go
  • pkg/ark-lib/batchtrigger/types.go

Comment thread internal/core/application/service.go
Comment thread internal/core/application/service.go
Comment on lines +12 to +20
var nowFunction = cel.Function("now",
	cel.Overload("now_double",
		[]*cel.Type{},
		cel.DoubleType,
		cel.FunctionBinding(func(_ ...ref.Val) ref.Val {
			return types.Double(float64(time.Now().Unix()))
		}),
	),
)
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

cel-go cel.FunctionBinding non-deterministic function constant folding

💡 Result:

In cel-go, the constant folding optimizer evaluates function calls with constant arguments at compile time to replace them with literal results, improving runtime performance. However, it skips folding for non-deterministic or side-effecting functions to avoid incorrect optimization.

Key mechanisms:

  • The optimizer (NewConstantFoldingOptimizer) matches calls where all arguments are constant literals and attempts to evaluate them using a temporary program with NoVars activation.
  • For functions intended to have side effects or be non-deterministic (e.g., random, now, logging), the implementation should return types.Unknown during constant folding evaluation. This prevents folding, as the optimizer previously errored on missing implementations but was fixed to handle Unknown gracefully (PR #1117).
  • Late-bound functions via cel.LateFunctionBinding (added in PR #1117, merged Apr 2025) are explicitly skipped: the isLateBoundFunctionCall check in folding.go prevents folding these calls, as their runtime binding makes them unsuitable for static optimization.

This ensures soundness: deterministic pure functions fold safely; non-deterministic ones do not, preserving correct evaluation behavior even under cel.OptOptimize.

Citations:


🏁 Script executed:

cat -n pkg/ark-lib/batchtrigger/celenv/functions.go | head -40

Repository: arkade-os/arkd

Length of output: 623


🏁 Script executed:

# Check if types.Double accepts int64 directly
rg "func Double" --type go -A 2

Repository: arkade-os/arkd

Length of output: 40


🏁 Script executed:

# Look for usage of FunctionBinding vs LateFunctionBinding in the codebase
rg "FunctionBinding|LateFunctionBinding" --type go

Repository: arkade-os/arkd

Length of output: 246


🏁 Script executed:

# Check the types.Double function signature in cel-go
rg "types\.Double" --type go -B 2 -A 2 | head -30

Repository: arkade-os/arkd

Length of output: 743


🏁 Script executed:

# Check if there's any usage of types.Unknown in the codebase for non-deterministic functions
rg "types\.Unknown" --type go | head -10

Repository: arkade-os/arkd

Length of output: 40


🏁 Script executed:

# Look at other function implementations in the same file
cat -n pkg/ark-lib/batchtrigger/celenv/functions.go

Repository: arkade-os/arkd

Length of output: 623


🏁 Script executed:

# Check cel-go types package to see Double signature
# First, let's look for the types.Double definition in any vendored code or local definitions
fd "types.go" --type f | grep -E "common|types" | head -5

Repository: arkade-os/arkd

Length of output: 236


🏁 Script executed:

# Search for Double function definition with more context
rg "func.*Double\(" --type go -A 3 | head -30

Repository: arkade-os/arkd

Length of output: 40


🏁 Script executed:

# Check if there's any documentation or examples of non-deterministic functions in cel-go usage
rg "now\|random\|uuid" --type go pkg/

Repository: arkade-os/arkd

Length of output: 40


🏁 Script executed:

# Check the cel-go dependency version in go.mod
grep "cel-go" go.mod

Repository: arkade-os/arkd

Length of output: 103


🏁 Script executed:

# Look for how now() is used in tests or actual CEL expressions
rg "now\(\)" --type go -B 2 -A 2

Repository: arkade-os/arkd

Length of output: 1174


🌐 Web query:

cel-go types.Double int64 signature

💡 Result:

In CEL-Go, types.Double (types.DoubleType) and int64 (types.IntType) do not support operator overloads or function calls between them. Both types implement traits like Adder, Comparer, etc., but their method implementations (e.g., Add, Compare) are type-specific and only accept arguments of the exact same type. For example: - types.Double.Add(other ref.Val) checks if other is types.Double, else returns "no such overload". - types.Int.Add(other ref.Val) checks if other is types.Int, else returns "no such overload". Operator signatures like '+' ("+") are registered separately for each type: add_double (double + double -> double), add_int64 (int + int -> int). There are no cross-type overloads like double + int64 or int64 + double. To operate across types, explicit conversion is required, e.g., double(1i) or int(1.0d). This aligns with CEL's no-implicit-type-coercion design. Operator overload names from common/overloads include LessInt64Double ("less_int64_double"), confirming comparison overloads exist for int64-double, but core arithmetic like add does not for built-in types.

Citations:


🌐 Web query:

types.Double function constructor Go cel-go

💡 Result:

In the cel-go library (github.com/google/cel-go), types.Double is a defined type whose underlying type is float64, declared in the github.com/google/cel-go/common/types package. It implements the ref.Val interface to support CEL operations like arithmetic, comparison, and type conversion.

There is no dedicated "function constructor" like types.NewDouble or types.DoubleOf. Instead, create a types.Double value by type conversion:

	import "github.com/google/cel-go/common/types"

	d := types.Double(3.14) // float64 value wrapped as types.Double

This works because Double is declared as type Double float64, allowing direct conversion from float64. The value supports methods like Add, Subtract, Multiply, Divide, Type (returns DoubleType), Value (returns underlying float64), and ConvertToNative. DoubleType is a singleton *types.Type exported from the package, used for type checks and declarations, e.g., cel.Variable("x", types.DoubleType).

CEL expressions parse float literals (e.g., 3.14) directly as Double values during evaluation, without needing explicit construction in Go code. Sources confirm the Double float64 declaration and method implementations, with the DoubleType singleton, but no exported factory function beyond the type conversion.

Citations:


Critical issue: Non-deterministic function vulnerable to constant folding.

The implementation has a correctness bug: cel-go DOES constant-fold cel.FunctionBinding calls at compile time by default, but this function is non-deterministic and lacks protection. The constant folding optimizer will evaluate now() once at compile time and replace all calls with a literal timestamp, breaking the intended semantics of returning the current time on each evaluation.

To prevent folding, either:

  1. Return types.Unknown from the binding to signal non-determinism, or
  2. Use cel.LateFunctionBinding instead of cel.FunctionBinding (explicitly excluded from folding as of v0.26.1).

Secondary note: The float64(time.Now().Unix()) cast is redundant; types.Double accepts int64 directly and will implicitly convert it.

Contributor

@arkanaai arkanaai Bot left a comment

🔍 Arkana — Follow-up review (commit a5945b3)

New commit a5945b3 is cosmetic only: gofmt struct field alignment in two test files (batch_trigger_test.go, trigger_test.go). No logic changes, no new code paths.

Previous findings still stand:

  1. 🟡 Cold-start footgun: time_since_last_batch == 0 is ambiguous (never-batched vs just-batched)
  2. 🟡 Double cache read: Len() + ViewAll() can diverge under concurrent intent churn

No new issues introduced by this commit. Still flagged for human review per protocol (round lifecycle gate).

Contributor

@arkanaai arkanaai Bot left a comment

🔍 Arkana — Follow-up review (commit ae0858e)

New commit ae0858e is a lint-only fix — drops redundant embedded struct selectors in aggregateIntentTriggerData (service.go:2399-2400):

// before (QF1008 lint warning)
inputAmount := it.Intent.TotalInputAmount() + boardingAmount
outputAmount := it.Intent.TotalOutputAmount()

// after
inputAmount := it.TotalInputAmount() + boardingAmount
outputAmount := it.TotalOutputAmount()

TimedIntent embeds domain.Intent (ports/live_store.go:93), so promoted method access is semantically identical. ✅ No logic change.
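
For readers less familiar with method promotion, a minimal sketch of why the two spellings are equivalent. The Intent and TimedIntent types below are simplified, hypothetical stand-ins (the real types in the PR carry more fields); only the embedding relationship is taken from the review.

```go
package main

import "fmt"

// Intent is a simplified, hypothetical stand-in for domain.Intent.
type Intent struct {
	inputs, outputs []uint64
}

func (i Intent) TotalInputAmount() uint64  { return sum(i.inputs) }
func (i Intent) TotalOutputAmount() uint64 { return sum(i.outputs) }

// TimedIntent embeds Intent, so Intent's methods are promoted to it.
type TimedIntent struct {
	Intent
}

func sum(xs []uint64) uint64 {
	var total uint64
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	it := TimedIntent{Intent: Intent{inputs: []uint64{5, 7}, outputs: []uint64{10}}}

	// Explicit selector and promoted method resolve to the same call.
	fmt.Println(it.Intent.TotalInputAmount() == it.TotalInputAmount())
	fmt.Println(it.Intent.TotalOutputAmount() == it.TotalOutputAmount())
}
```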

Previous findings still stand (from initial review):

  1. 🟡 Cold-start footgun: time_since_last_batch == 0 is ambiguous (never-batched vs just-batched)
  2. 🟡 Double cache read: Len() + ViewAll() can diverge under concurrent intent churn

No new issues introduced. Waiting on author response to the above before approving.

- collectTriggerContext: derive IntentsCount from the same ViewAll snapshot
  used for the boarding/fee aggregates, eliminating the race between Len()
  and ViewAll().
- evalBatchTrigger: log eval failures at warn level (matches the README
  description of failing open).
- batchtrigger.Trigger: keep cel.Program in a named (unexported) field so
  the package's public surface is just Eval/Source.
- batchtrigger.Parse: validate output type via Type.IsExactType instead of
  pointer equality.
- celenv: clarify in the now() comment why the binding is safe (no
  constant-folding optimizer is enabled on BatchTriggerEnv).
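
The single-snapshot idea from the first bullet can be sketched as follows. This is an illustration under assumptions, not the PR's code: store, intent, triggerContext and collect are hypothetical names, and the real live store has a different shape. The point is that the count and the aggregates are all derived from one ViewAll() result.

```go
package main

import (
	"fmt"
	"sync"
)

type intent struct{ boardingInputs int }

// store is a hypothetical mutex-guarded cache, not the PR's live store.
type store struct {
	mu      sync.RWMutex
	intents []intent
}

// ViewAll returns a copy so callers work on a stable snapshot.
func (s *store) ViewAll() []intent {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return append([]intent(nil), s.intents...)
}

type triggerContext struct {
	IntentsCount        int
	BoardingInputsCount int
}

// collect derives every field from a single ViewAll() call, so the count
// cannot disagree with the aggregates even if intents change concurrently.
func collect(s *store) triggerContext {
	snapshot := s.ViewAll()
	ctx := triggerContext{IntentsCount: len(snapshot)}
	for _, it := range snapshot {
		ctx.BoardingInputsCount += it.boardingInputs
	}
	return ctx
}

func main() {
	s := &store{intents: []intent{{boardingInputs: 1}, {boardingInputs: 2}}}
	fmt.Printf("%+v\n", collect(s))
}
```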
Contributor

@arkanaai arkanaai Bot left a comment

🔍 Arkana — Follow-up review (commit 5f12bb6)

Commit 5f12bb6 ("refactor: address CodeRabbit review on batch_trigger") — all changes are correct.

What changed

| Change | Verdict |
| --- | --- |
| Double-read race fixed: Len() removed, IntentsCount derived from len(intents) via single ViewAll() call | ✅ Exactly addresses finding #2 from initial review |
| Errorf → Warnf on eval failure | ✅ Consistent with fail-open semantics |
| Unexported program field: cel.Program was embedded, now program cel.Program | ✅ Proper encapsulation, no accidental interface leak |
| !ast.OutputType().IsExactType(cel.BoolType) instead of != | ✅ Safer type comparison (pointer equality on *Type is fragile across CEL versions) |
| Drop redundant float64() cast in now() | int64 → float64 implicit conversion is fine |
| now() constant-folding comment | ✅ Good documentation |

Outstanding

🟡 Cold-start footgun (finding #1) — time_since_last_batch == 0 is still ambiguous between "never batched" and "just batched." This is an operator-facing documentation concern, not a correctness bug. Consider mentioning it in pkg/ark-lib/batchtrigger/README.md so operators know to use time_since_last_batch <= 0.0 or an explicit sentinel to handle the never-batched case.

No new issues. Still flagged for human review per protocol (round lifecycle gate).


Development

Successfully merging this pull request may close these issues.

feat: CEL formula to control batch trigger conditions