feat: implement resource reclamation #398

Open
troian wants to merge 3 commits into main from reclaim
Member

@troian troian commented May 28, 2026

Description

Adds provider support for resource reclamation. Providers can advertise a reclamation window in bids, leases in the reclaiming state continue running during that window, and the cluster service waits for the close event before tearing workloads down.
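
As a rough illustration of "leases in the reclaiming state continue running", the check below treats both the active and the reclaiming state as runnable. The `LeaseState` type, its constants, and `leaseRunnable` are hypothetical stand-ins for this sketch, not the provider's actual types or function names.

```go
package main

import (
	"errors"
	"fmt"
)

// LeaseState is a hypothetical stand-in for the on-chain lease state enum.
type LeaseState int

const (
	LeaseActive LeaseState = iota + 1
	LeaseReclaiming
	LeaseClosed
)

// ErrLeaseNotRunnable marks a lease whose workload should no longer run.
var ErrLeaseNotRunnable = errors.New("lease is neither active nor reclaiming")

// leaseRunnable returns nil when the workload may keep running.
// A reclaiming lease is still inside its reclamation window, so it is
// accepted alongside an active one (assumed behaviour for this sketch).
func leaseRunnable(s LeaseState) error {
	switch s {
	case LeaseActive, LeaseReclaiming:
		return nil
	default:
		return fmt.Errorf("%w (state=%d)", ErrLeaseNotRunnable, s)
	}
}

func main() {
	for _, s := range []LeaseState{LeaseActive, LeaseReclaiming, LeaseClosed} {
		fmt.Printf("state=%d runnable=%t\n", s, leaseRunnable(s) == nil)
	}
}
```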

This also aligns the provider dependency replacement for the Cosmos SDK Akash fork with the declared v0.53.6 requirement and updates the E2E oracle setup to use the oracle v2 types used by pkg.akt.dev/node/v2.

Tests

  • GOTOOLCHAIN=go1.26.2 GOWORK=off go mod tidy
  • GOTOOLCHAIN=go1.26.2 GOWORK=off go list -m -json github.com/cosmos/cosmos-sdk
  • GOTOOLCHAIN=go1.26.2 GOWORK=off go build -mod=readonly -a ./...
  • GOTOOLCHAIN=go1.26.2 GOWORK=off go test -mod=readonly -count=1 -tags "e2e" ./integration/... -run "^$"
  • GOTOOLCHAIN=go1.26.2 GOWORK=off go test -mod=readonly ./cmd/provider-services ./cmd/provider-services/cmd ./testutil/provider -count=1
  • GOTOOLCHAIN=go1.26.2 GOWORK=off go test -mod=readonly ./bidengine -run Test_ScalePricingForIPs -count=1
  • git diff --check

Signed-off-by: Artur Troian <troian@users.noreply.github.com>
@troian troian requested a review from a team as a code owner May 28, 2026 17:41

coderabbitai Bot commented May 28, 2026

Walkthrough

Adds an optional ReclamationWindow duration, wires a CLI flag into provider config, forwards it into bid messages, updates service and lease event handling to account for reclamation state, and bumps dependencies/tests for compatibility.

Changes

Reclamation Window Configuration and Integration

| Layer / File(s) | Summary |
| --- | --- |
| Configuration model additions: `config.go`, `bidengine/config.go` | Provider and bid engine configs add an optional `ReclamationWindow *time.Duration` field. |
| CLI flag definition and configuration reading: `cmd/provider-services/cmd/run.go`, `cmd/provider-services/cmd/flags.go` | New `reclamation-window` CLI flag is registered and bound to Viper, validated (must be >= 0), read at startup, and set into provider config only when > 0. |
| Service configuration plumbing: `service.go` | Provider service passes `cfg.ReclamationWindow` into `bidengine.NewService` configuration. |
| Bid message integration: `bidengine/order.go` | `order.run` sets `MsgCreateBid.ReclamationWindow` from the bidengine config when building bid transactions. |
| Lease state and reclamation event handling: `cluster/manager.go`, `cluster/service.go` | `checkLeaseActive` permits `LeaseReclaiming` alongside `LeaseActive` for deployments; service logs `EventLeaseReclaimStarted` with lease ID, reason, and deadline. |
| Dependency updates and tests: `go.mod`, `integration/e2e_test.go` | Go toolchain and multiple modules updated; integration test updated to oracle v2 imports and message shape. |
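
The plumbing summarized above can be sketched roughly as follows. `Config`, `bidMsg`, `newConfig`, and `buildBid` are hypothetical names for illustration only, and the sketch assumes the bid message carries the window as an optional pointer, which the summary does not confirm.

```go
package main

import (
	"fmt"
	"time"
)

// Config mirrors only the optional field named in the walkthrough.
// A nil pointer means no reclamation window is advertised.
type Config struct {
	ReclamationWindow *time.Duration
}

// bidMsg is a hypothetical stand-in for the bid message; the real
// MsgCreateBid field type is an assumption here.
type bidMsg struct {
	ReclamationWindow *time.Duration
}

// newConfig sets the window only when the flag value is positive,
// so zero keeps the feature disabled.
func newConfig(flagValue time.Duration) Config {
	cfg := Config{}
	if flagValue > 0 {
		cfg.ReclamationWindow = &flagValue
	}
	return cfg
}

// buildBid forwards the configured window into the bid unchanged.
func buildBid(cfg Config) bidMsg {
	return bidMsg{ReclamationWindow: cfg.ReclamationWindow}
}

func main() {
	for _, v := range []time.Duration{0, 30 * time.Minute} {
		bid := buildBid(newConfig(v))
		if bid.ReclamationWindow == nil {
			fmt.Printf("flag=%s: no reclamation window advertised\n", v)
			continue
		}
		fmt.Printf("flag=%s: advertising window %s\n", v, *bid.ReclamationWindow)
	}
}
```

Using a pointer rather than a zero duration keeps "not configured" distinguishable from any explicit value at every layer the setting passes through.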

🎯 3 (Moderate) | ⏱️ ~20 minutes

🐰 A reclamation window hops into view,
Configuration flags and bids all ring true,
Leases may linger while the deadline's near,
Grace threads the flow, so teardown's crystal clear! ✨

sequenceDiagram
  participant CLI
  participant ProviderService
  participant BidEngine
  participant ClusterService

  CLI->>ProviderService: start (reads `reclamation-window` via Viper)
  ProviderService->>BidEngine: NewService(cfg with ReclamationWindow)
  BidEngine->>BidEngine: order.run builds MsgCreateBid (sets ReclamationWindow)
  BidEngine->>ProviderService: broadcast MsgCreateBid (tx)
  ClusterService->>ClusterService: receive EventLeaseReclaimStarted (logs leaseID, reason, deadline)
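
The last step of the diagram, together with the PR description's "waits for the close event before tearing workloads down", might look roughly like the handler below. The event structs, the `cluster` type, and the log format are all hypothetical; this is a sketch of the described behaviour, not the code in `cluster/service.go`.

```go
package main

import (
	"fmt"
	"time"
)

// reclaimStarted and leaseClosed are hypothetical stand-ins for the
// chain events the cluster service subscribes to.
type reclaimStarted struct {
	LeaseID  string
	Reason   string
	Deadline time.Time
}

type leaseClosed struct {
	LeaseID string
}

// cluster tracks which leases still have a running workload.
type cluster struct {
	running map[string]bool
	log     []string
}

// handle logs a reclaim start without touching the workload and only
// tears the workload down once the close event arrives.
func (c *cluster) handle(ev any) {
	switch e := ev.(type) {
	case reclaimStarted:
		c.log = append(c.log, fmt.Sprintf("lease reclaim started lease=%s reason=%s deadline=%s",
			e.LeaseID, e.Reason, e.Deadline.Format(time.RFC3339)))
	case leaseClosed:
		delete(c.running, e.LeaseID)
		c.log = append(c.log, fmt.Sprintf("lease closed, workload removed lease=%s", e.LeaseID))
	}
}

func main() {
	c := &cluster{running: map[string]bool{"lease-1": true}}
	c.handle(reclaimStarted{LeaseID: "lease-1", Reason: "maintenance", Deadline: time.Now().Add(time.Hour)})
	fmt.Println("running after reclaim start:", c.running["lease-1"])
	c.handle(leaseClosed{LeaseID: "lease-1"})
	fmt.Println("running after close:", c.running["lease-1"])
}
```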
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat: implement resource reclamation' accurately and clearly summarizes the main change: implementing resource reclamation support across the provider codebase. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The pull request description clearly describes the changeset: it adds provider support for resource reclamation with specific implementation details (reclamation window in bids, leases continuing during reclamation, cluster service waiting for close events). |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@go.mod`:
- Around line 347-355: Update the vulnerable OTel SDK modules: change the
versions for go.opentelemetry.io/otel/sdk and
go.opentelemetry.io/otel/sdk/metric to at least v1.43.0 (or a newer compatible
release) so they are aligned with the other otel dependencies (e.g.,
go.opentelemetry.io/otel v1.41.0); after editing the go.mod entries for
go.opentelemetry.io/otel/sdk and go.opentelemetry.io/otel/sdk/metric, run the
module update (e.g., go get ./... or go get go.opentelemetry.io/otel/sdk@v1.43.0
&& go get go.opentelemetry.io/otel/sdk/metric@v1.43.0) and then go mod tidy to
refresh go.sum.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 88344d80-f710-443d-913c-02e756f3ddb0

📥 Commits

Reviewing files that changed from the base of the PR and between 4324522 and 611d8c7.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (9)
  • bidengine/config.go
  • bidengine/order.go
  • cluster/manager.go
  • cluster/service.go
  • cmd/provider-services/cmd/flags.go
  • cmd/provider-services/cmd/run.go
  • config.go
  • go.mod
  • service.go

Comment thread cmd/provider-services/cmd/run.go
Comment thread go.mod
chalabi2 added 2 commits May 29, 2026 08:24
Align the E2E oracle setup with the oracle v2 module used by
pkg.akt.dev/node/v2 so genesis interception can unmarshal the oracle state.
Also align the Akash Cosmos SDK fork replacement with the declared v0.53.6
requirement.

Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>

Fail fast when the reclamation window flag is set below zero. Zero
remains the disabled value, and positive values continue to advertise a
reclamation window in bids.

Signed-off-by: Joseph Chalabi <chalabi.joseph@gmail.com>
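
The fail-fast rule in this commit message can be sketched as a small validation helper. `parseReclamationWindow` and `errNegativeWindow` are hypothetical names, and parsing from a raw string is an assumption for the sake of a self-contained example; the PR itself reads the flag through Viper.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNegativeWindow is a hypothetical sentinel for a rejected flag value.
var errNegativeWindow = errors.New("reclamation window must not be negative")

// parseReclamationWindow rejects negative values up front. Zero is
// returned as-is and is treated by the caller as "disabled".
func parseReclamationWindow(raw string) (time.Duration, error) {
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid reclamation window %q: %w", raw, err)
	}
	if d < 0 {
		return 0, fmt.Errorf("%w: got %s", errNegativeWindow, d)
	}
	return d, nil
}

func main() {
	for _, raw := range []string{"0s", "15m", "-5m"} {
		d, err := parseReclamationWindow(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s (enabled=%t)\n", raw, d, d > 0)
	}
}
```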

@chalabi2 chalabi2 left a comment

LGTM. The E2E tests are fixed and passing locally, and the negative reclamation window check suggested by CodeRabbit is implemented.
