Write price data to iceberg with backfill #1181
Merged
impl Settings {
    /// Build a `BucketClient` for the output bucket using the shared
    /// `file_store` credentials.
    pub async fn output_bucket_client(&self) -> file_store::BucketClient {
Contributor
We should probably update `Settings.file_store` to use `file_store::BucketSettings` and use the connect function provided by that.
/// Database settings. Required when running `backfill`; unused by the
/// server path.
#[serde(default)]
pub database: Option<db_store::Settings>,
Contributor
If the database is only used for backfill, should this setting be moved in there?
Collaborator
Author
I can move if you feel strongly, but this is easier and quicker to get things moving
f69ac88 to ae2da3a
Co-authored-by: Michael Jeffrey <michaeldjeffrey@gmail.com>
f70c094 to 6a5f1ea
michaeldjeffrey approved these changes May 6, 2026
Price oracle: Iceberg integration with optional S3, backfill, and pluggable sinks
Branch layered on top of `bbalser/helium-icebeg-batched-writer` and rebased to current `main`. 6 commits, 14 files changed (+664 / −41 LOC).

Overview

Adds an Iceberg sink to the price daemon and a one-shot `backfill` subcommand that seeds historical price data into Iceberg from existing S3 protobuf files, and refactors the per-tick output path into a `PriceSink` trait so that S3 and Iceberg are interchangeable, individually optional sinks. Both the daemon and backfill write to the new `tokens.prices` table.

Commits

- `9abc534f`: `IcebergTable::write_idempotent`
- `bccec1a9`: `Cargo.toml` cleanup
- `59b0160c`: `file_store::BucketSettings`
- `1adfad7f`: `write_idempotent` with the new `BatchedWriter` (size/time-batched, spool-backed)
- `3010b1b4`: `price_usd: Decimal` column derived via `solana::Token::decimals()`
- `6a5f1ead`: `PriceSink` trait; make S3 output optional on the server

Changes by area
1. Iceberg writes (new)

`price/src/iceberg/` with:
- `IcebergPriceReport` row type (`timestamp`, `price: u64`, `price_usd: Decimal`, `token_type: String`).
- `connect_table(&helium_iceberg::Settings) -> IcebergTable<IcebergPriceReport>`, which ensures the namespace and table exist.
- Table `tokens.prices`, partitioned daily on `timestamp`.
- `price_usd` derivation uses `solana::Token::decimals()` (HNT=8, MOBILE/IOT=6, …), removing the hardcoded `HNT_DECIMALS` for that column.
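The `price_usd` derivation above amounts to scaling the raw integer price by a per-token power of ten. The following is a minimal sketch under stated assumptions: that the raw `price` is expressed in units of 10^-decimals USD, that a hypothetical `Token` enum can stand in for `solana::Token` (only HNT=8 and MOBILE/IOT=6 come from the description), and that a hypothetical `format_price_usd` helper may return a string in place of the real `Decimal` column so the example has no dependencies.

```rust
/// Hypothetical stand-in for `solana::Token`; only these decimal counts
/// are stated in the PR description.
#[derive(Clone, Copy, Debug)]
pub enum Token {
    Hnt,
    Mobile,
    Iot,
}

impl Token {
    /// Number of decimal places used to scale the raw integer price.
    pub fn decimals(self) -> u32 {
        match self {
            Token::Hnt => 8,
            Token::Mobile | Token::Iot => 6,
        }
    }
}

/// Render `price / 10^decimals` exactly, using integer arithmetic only.
/// The real column is a `Decimal`; a string keeps this sketch dependency-free.
pub fn format_price_usd(price: u64, token: Token) -> String {
    let decimals = token.decimals();
    let scale = 10u64.pow(decimals);
    let whole = price / scale;
    let frac = price % scale;
    // Left-pad the fractional part with zeros to exactly `decimals` digits.
    format!("{whole}.{frac:0width$}", width = decimals as usize)
}
```

Looking the scale up per token, rather than hardcoding `HNT_DECIMALS`, is what lets the same conversion serve tokens with different decimal counts.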
2. `BatchedWriter` for live and backfill paths

Both the daemon and backfill use `helium_iceberg::BatchedWriter` (introduced upstream in `bbalser/helium-icebeg-batched-writer`), which is size/time-batched and spool-backed.

Knobs in `Settings`:
- `iceberg_batch_size`
- `iceberg_batch_timeout`

The two writer instances use separate spool dirs (`<cache>/iceberg-spool` vs `<cache>/iceberg-spool-backfill`) so they can run on the same host without crossing each other's replay.
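The size and time triggers behind `iceberg_batch_size` and `iceberg_batch_timeout` can be illustrated with a small in-memory batcher. This sketches the general technique only and is not `helium_iceberg::BatchedWriter` itself: the `Batcher` type and its methods are hypothetical, and the spool and replay behaviour is left out.

```rust
use std::time::{Duration, Instant};

/// Hypothetical in-memory batcher: flushes when `max_size` items are
/// buffered or when the oldest buffered item has waited `max_age`.
pub struct Batcher<T> {
    max_size: usize,
    max_age: Duration,
    items: Vec<T>,
    opened_at: Option<Instant>,
}

impl<T> Batcher<T> {
    pub fn new(max_size: usize, max_age: Duration) -> Self {
        Self { max_size, max_age, items: Vec::new(), opened_at: None }
    }

    /// Buffer one item; returns the batch if the size limit was reached.
    pub fn push(&mut self, item: T, now: Instant) -> Option<Vec<T>> {
        if self.items.is_empty() {
            // The timeout clock starts with the first item of a batch.
            self.opened_at = Some(now);
        }
        self.items.push(item);
        if self.items.len() >= self.max_size {
            Some(self.take())
        } else {
            None
        }
    }

    /// Call periodically; returns the batch if the timeout has elapsed.
    pub fn poll(&mut self, now: Instant) -> Option<Vec<T>> {
        match self.opened_at {
            Some(opened) if now.duration_since(opened) >= self.max_age => Some(self.take()),
            _ => None,
        }
    }

    /// Hand the buffered items to the caller and reset the clock.
    fn take(&mut self) -> Vec<T> {
        self.opened_at = None;
        std::mem::take(&mut self.items)
    }
}
```

Passing `now` in explicitly keeps the timeout logic deterministic and easy to test; a real writer would more likely drive `poll` from a timer.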
3. `PriceSink` trait: pluggable destinations

New module `price/src/sinks/` defines the `PriceSink` trait, with two implementations:
- `S3PriceSink` wraps `FileSinkClient<PriceReportV1>` (rolled `.gz` files in S3).
- `IcebergPriceSink` wraps `BatchedWriter<IcebergPriceReport>` and owns the proto→iceberg conversion.

`PriceGenerator` now holds `sinks: Vec<Box<dyn PriceSink>>`, and the per-tick fan-out collapses to a single pass over `sinks`.

4. Optional S3 output on the server

`Settings::output` is now `Option<file_store::BucketSettings>`. `Server::run` is a fan-out builder over two settings:
- `output`
- `iceberg_settings`

Local `<cache>/hnt.latest` cache is always written regardless of sink config (used to seed stale-price fallback after restart).
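As a rough illustration of sections 3 and 4, the sketch below builds a sink list from two optional settings and fans one report out to every configured sink. Everything in it is an assumption rather than the PR's code: the trait is synchronous (the PR adds `async-trait`, so the real one is presumably async), and `PriceReport`, `MemorySink`, `build_sinks`, `fan_out`, and the method names are hypothetical.

```rust
/// Hypothetical, simplified report; the daemon uses the `PriceReportV1` protobuf.
#[derive(Clone, Debug, PartialEq)]
pub struct PriceReport {
    pub timestamp: u64,
    pub price: u64,
    pub token_type: String,
}

/// Assumed shape of the trait: each destination accepts one report at a time.
pub trait PriceSink {
    fn name(&self) -> &'static str;
    fn write(&mut self, report: &PriceReport) -> Result<(), String>;
}

/// In-memory stand-in for an S3- or Iceberg-backed sink.
pub struct MemorySink {
    name: &'static str,
    pub written: Vec<PriceReport>,
}

impl PriceSink for MemorySink {
    fn name(&self) -> &'static str {
        self.name
    }

    fn write(&mut self, report: &PriceReport) -> Result<(), String> {
        self.written.push(report.clone());
        Ok(())
    }
}

/// Each setting that is present contributes one sink; absent settings add none.
/// The `&str` values stand in for the real settings structs.
pub fn build_sinks(output: Option<&str>, iceberg_settings: Option<&str>) -> Vec<Box<dyn PriceSink>> {
    let mut sinks: Vec<Box<dyn PriceSink>> = Vec::new();
    if output.is_some() {
        sinks.push(Box::new(MemorySink { name: "s3", written: Vec::new() }));
    }
    if iceberg_settings.is_some() {
        sinks.push(Box::new(MemorySink { name: "iceberg", written: Vec::new() }));
    }
    sinks
}

/// Per-tick fan-out: hand the same report to every sink, stopping at the first error.
pub fn fan_out(sinks: &mut [Box<dyn PriceSink>], report: &PriceReport) -> Result<(), String> {
    for sink in sinks.iter_mut() {
        sink.write(report)?;
    }
    Ok(())
}
```

Because the generator only sees `Box<dyn PriceSink>`, adding or removing a destination changes the builder and nothing else.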
5. `backfill` subcommand (new)

`price backfill --start-after <RFC3339> --stop-after <RFC3339>` reads existing `PriceReportV1` S3 files via `file_source::Continuous` and queues them into Iceberg through the `BatchedWriter`. Bookkeeping:
- `files_processed` table (new migration `1_files_processed.sql`).
- `output` (S3) is required for backfill because it is the read source.
- `--stop-after` should be set to the date Iceberg was first enabled in production so it does not overlap with the daemon's live writes.
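The bookkeeping above can be sketched as a time-window filter plus a record of files already handled. This is an illustration under assumptions, not the PR's implementation: `FileInfo`, `pending_files`, and `backfill` are hypothetical names, a `HashSet` stands in for the `files_processed` table, timestamps are plain Unix seconds, and the window is taken to be strictly after `start_after` and up to and including `stop_after`.

```rust
use std::collections::HashSet;

/// Hypothetical description of one S3 file: its key and a Unix timestamp in seconds.
#[derive(Clone, Debug, PartialEq)]
pub struct FileInfo {
    pub key: String,
    pub timestamp: i64,
}

/// Select files inside the window that are not yet recorded as processed.
pub fn pending_files<'a>(
    files: &'a [FileInfo],
    start_after: i64,
    stop_after: i64,
    processed: &HashSet<String>,
) -> Vec<&'a FileInfo> {
    files
        .iter()
        // Assumed window semantics: start exclusive, stop inclusive.
        .filter(|f| f.timestamp > start_after && f.timestamp <= stop_after)
        .filter(|f| !processed.contains(&f.key))
        .collect()
}

/// Handle each pending file once and record its key so a rerun skips it.
/// Returns the number of files handled in this run.
pub fn backfill<F>(
    files: &[FileInfo],
    start_after: i64,
    stop_after: i64,
    processed: &mut HashSet<String>,
    mut handle: F,
) -> usize
where
    F: FnMut(&FileInfo),
{
    let pending = pending_files(files, start_after, stop_after, processed);
    for file in &pending {
        handle(file);
        processed.insert(file.key.clone());
    }
    pending.len()
}
```

Recording each key after it is handled is what makes a rerun over the same window a no-op.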
6. Configuration surface

`price/pkg/settings-template.toml` updates:
- `[output]` block now commented out with a note (optional for `server`, required for `backfill`).
- `iceberg_batch_size` / `iceberg_batch_timeout` knobs documented.
- `[iceberg_settings]` block documented (`tokens.prices` table).
- `[database]` block documented (only consulted by `backfill`).

`Cargo.toml` adds: `async-trait`, `solana` (workspace path dep; pulls `helium_lib::Token`), `helium-iceberg`, `trino-rust-client`, `db-store`, `sqlx`.

Files