Skip to content

Optimize and Fix provider price fetches for sold securities and batch queries#1580

Open
wps260 wants to merge 4 commits intowe-promise:mainfrom
wps260:perf/market-data-importer
Open

Optimize and Fix provider price fetches for sold securities and batch queries#1580
wps260 wants to merge 4 commits intowe-promise:mainfrom
wps260:perf/market-data-importer

Conversation

@wps260
Copy link
Copy Markdown

@wps260 wps260 commented Apr 28, 2026

Three distinct bugs caused the price provider API to be called unnecessarily on every investment account sync.

  1. Sold securities triggered a provider call on every sync forever

    import_security_prices passed end_date: Date.current for every security ever traded. Security::Price::Importer short-circuits via all_prices_exist? only when persisted_count == expected_count, where:

    expected_count = (clamped_start_date..Date.current).count

    This range increases daily, so a security closed two years ago would have all historical prices in the DB unnecessarily. This also causes any closed securities to fetch prices daily, forever.

    Fix: separate currently-held securities (end_date: Date.current) from historical-only securities (end_date: last holding date for that security). Once a closed position's price range is complete through its last holding date, all_prices_exist? becomes permanently stable and no further provider calls occur for that security.

    "Currently held" is defined as appearing in account.current_holdings, which returns the most recent holding per security with qty != 0. On the first sync after a sell, the pre-sale holding is still the most recent, so the security correctly receives end_date: Date.current for one final sync before the new qty=0 holding is materialised.

  2. Offline securities were not filtered

    account.trades.map(&:security) returned all securities regardless of the offline flag. This results in fetching of securities even if the provider cannot serve them, or if the user don't want them served for some reason (eg when there are symbol collisions that causes the wrong prices to be returned) The global MarketDataImporter correctly uses Security.online; the account-scoped importer did not.

    Fix: Security.online.where(id: all_security_ids) matches the established contract. Offline IDs still pass through the pluck step but resolve to nil in the securities hash and are skipped by the existing next unless security guard.

  3. N+1 queries for security loading and per-security start dates

    • account.trades.map(&:security): triggered one SQL query per trade to load the security association (N+1).
    • first_required_price_date(security): issued 2 DB queries per security - one MIN(entries.date) and one EXISTS - so S securities = 2S queries.

    Fix: replace with batch queries totalling 4 regardless of security count:

    • account.current_holdings.pluck(:security_id) - current security IDs
    • account.trades.pluck(:security_id).uniq - traded security IDs
    • Security.online.where(id: ...) - load all security records at once
    • batch_first_required_price_dates: one GROUP BY security_id MIN(entries.date) over trades, one pluck for provider-holding security IDs, one GROUP BY security_id MAX(date) over holdings for historical end dates

Summary by CodeRabbit

  • Bug Fixes

    • Price imports now cap end dates for sold positions and treat reopened positions as active, ensuring accurate historical and current valuations.
  • Performance Improvements

    • Date calculations for imports are batched to reduce redundant queries and speed up price syncing.
  • Tests

    • Added unit tests verifying end-date behavior for sold and reopened securities.

Three distinct bugs caused the price provider API to be called unnecessarily
on every investment account sync.

1. Sold securities triggered a provider call on every sync forever

   import_security_prices passed end_date: Date.current for every security
   ever traded. Security::Price::Importer short-circuits via all_prices_exist?
   only when persisted_count == expected_count, where:

     expected_count = (clamped_start_date..Date.current).count

   This range increases daily, so a security closed two years ago would have
   all historical prices in the DB unnecessarily.  This also causes any closed
   securities to fetch prices daily, forever.

   Fix: separate currently-held securities (end_date: Date.current) from
   historical-only securities (end_date: last holding date for that security).
   Once a closed position's price range is complete through its last holding
   date, all_prices_exist? becomes permanently stable and no further provider
   calls occur for that security.

   "Currently held" is defined as appearing in account.current_holdings, which
   returns the most recent holding per security with qty != 0. On the first
   sync after a sell, the pre-sale holding is still the most recent, so the
   security correctly receives end_date: Date.current for one final sync before
   the new qty=0 holding is materialised.

2. Offline securities were not filtered

   account.trades.map(&:security) returned all securities regardless of the
   offline flag. This results in fetching of securities even if the provider
   cannot serve them, or if the user don't want them served for some reason
   (eg when there are symbol collisions that causes the wrong prices to be
   returned) The global MarketDataImporter correctly uses Security.online;
   the account-scoped importer did not.

   Fix: Security.online.where(id: all_security_ids) matches the established
   contract. Offline IDs still pass through the pluck step but resolve to nil
   in the securities hash and are skipped by the existing `next unless security`
   guard.

3. N+1 queries for security loading and per-security start dates

   - account.trades.map(&:security): triggered one SQL query per trade to load
     the security association (N+1).
   - first_required_price_date(security): issued 2 DB queries per security -
     one MIN(entries.date) and one EXISTS - so S securities = 2S queries.

   Fix: replace with batch queries totalling 4 regardless of security count:
   - account.current_holdings.pluck(:security_id) - current security IDs
   - account.trades.pluck(:security_id).uniq - traded security IDs
   - Security.online.where(id: ...) - load all security records at once
   - batch_first_required_price_dates: one GROUP BY security_id MIN(entries.date)
     over trades, one pluck for provider-holding security IDs, one GROUP BY
     security_id MAX(date) over holdings for historical end dates
@brin-security-scanner brin-security-scanner Bot added the contributor:flagged Contributor flagged for review by trust analysis. label Apr 28, 2026
@brin-security-scanner
Copy link
Copy Markdown

brin-security-scanner Bot commented Apr 28, 2026

⚠️ Contributor Trust Check — Review Recommended

This contributor's profile shows patterns that may warrant additional review. This is based on their GitHub activity, not the contents of this PR.

wps260 · Score: 70/100

Dimension breakdown
Dimension Score What it measures
Identity 25 Account age, contribution history, GPG keys, org memberships
Behavior 90 PR patterns, unsolicited contribution ratio, activity cadence
Content 100 PR body substance, issue linkage, contribution quality
Graph 30 Cross-repo trust, co-contributor relationships

Analyzed by Brin · Full profile

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 28, 2026

📝 Walkthrough

Walkthrough

Replaces per-security price import looping with a bulk-by-security_id flow: gathers security_ids from current holdings and trades, filters to Security.online, computes grouped start_dates via batch_first_required_price_dates, and fetches prices per security using per-security end_date logic (current for held/reopened, capped to last holding date otherwise).

Changes

Cohort / File(s) Summary
Market Data Importer
app/models/account/market_data_importer.rb
Reworks import_security_prices to build a security_id set from account.current_holdings and account.trades, restrict to Security.online, replace per-security first_required_price_date calls with batch_first_required_price_dates, and compute per-security end_date (Date.current for held/reopened, last holding date if no current holding).
Importer Tests
test/models/account/market_data_importer_test.rb
Adds tests asserting fetch_security_prices is called with correct per-security end_date behavior (cap to sold holding date; use Date.current if repurchased before rematerialization); stubs provider and exchange-rate interactions to isolate date-range logic.

Sequence Diagram(s)

sequenceDiagram
    participant Importer as Account::MarketDataImporter
    participant DB as Holdings/Trades DB
    participant Sec as Security Model
    participant Helper as batch_first_required_price_dates
    participant Provider as Price Provider

    Importer->>DB: collect security_ids from holdings & trades
    Importer->>Sec: filter ids to Security.online
    Importer->>Helper: request grouped start_dates for security_ids
    Helper-->>Importer: return start_dates map
    loop per security_id
        Importer->>DB: get last holding date / current holding qty
        alt held or reopened
            Importer->>Provider: fetch_security_prices(id, start_date, Date.current)
        else sold and not reopened
            Importer->>Provider: fetch_security_prices(id, start_date, last_holding_date)
        end
        Provider-->>Importer: return prices
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

contributor:verified

Suggested reviewers

  • jjmata

Poem

🐰 I hopped through IDs in one quick run,
Grouped first dates so queries weigh none.
If it's held or rebought, fetch through today,
If sold and stayed gone, I stop where it lay.
🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main changes: optimizing provider price fetches through batch queries and fixing the issue where sold securities trigger repeated API calls.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@brin-security-scanner brin-security-scanner Bot added the pr:verified PR passed security analysis. label Apr 28, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 71e7fc7a02

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread app/models/account/market_data_importer.rb Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/models/account/market_data_importer.rb`:
- Around line 89-93: Cache account.start_date into a local variable before the
security_ids.each_with_object loop and use that variable (e.g.,
account_start_date) in place of account.start_date inside the block; update the
logic that sets holding_date and the fallback min to reference
account_start_date, so Account#start_date (which runs entries.minimum(:date)) is
not invoked per-security.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: de33a739-9041-4fba-b3c0-c3b3b2171e34

📥 Commits

Reviewing files that changed from the base of the PR and between 3960582 and 71e7fc7.

📒 Files selected for processing (2)
  • app/models/account/market_data_importer.rb
  • test/models/account/market_data_importer_test.rb

Comment thread app/models/account/market_data_importer.rb
wps260 added 2 commits April 28, 2026 10:41
…sitions

Account::Syncer runs import_market_data before materialize_balances, so
current_holdings reflects the last materialized snapshot rather than the
current trade state. If a security was previously sold (stale holdings show
qty=0) and then repurchased in the same sync cycle, it landed in
historical_ids and had its end_date capped at the old last_holding_date.
This caused all_prices_exist? to short-circuit, skipping the price fetch
through today, and leaving the forthcoming holding materialization without
a price for the repurchase period.

Fix: compare the latest trade date against the last holding date for each
historical security. If the trade is newer, the position was reopened before
holdings were rematerialized; treat end_date as Date.current for that sync.
The cap still applies on subsequent syncs once materialize_balances has
updated the holdings table.

Adds a regression test covering the repurchase scenario.
Account#start_date issues SELECT MIN(date) FROM entries on every call.
Inside batch_first_required_price_dates it was called up to twice per
security (holding_date assignment + fallback), producing up to 2N extra
queries for an account with N provider-held securities.

Cache the result in account_start_date before the loop.
@wps260
Copy link
Copy Markdown
Author

wps260 commented Apr 28, 2026

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 28, 2026

✅ Actions performed

Full review triggered.

@wps260 wps260 changed the title Performance and bug fixes in provider price fetches Fix provider price fetches for sold securities and batch queries Apr 28, 2026
@wps260 wps260 changed the title Fix provider price fetches for sold securities and batch queries Optimize and Fix provider price fetches for sold securities and batch queries Apr 28, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
app/models/account/market_data_importer.rb (1)

54-57: Filter the aggregate work down to online IDs too.

Security.online.where(id: all_security_ids) correctly prevents provider calls for offline securities, but the batched start_dates, last_holding_date, and latest_trade_date queries still run against the full pre-filtered ID set. Accounts with many offline trades will still pay that grouped-query cost every sync. Consider switching the downstream working set to securities.keys so the DB work matches the provider-eligible universe.

As per coding guidelines: "Optimize database queries with proper indexes and prevent N+1 queries via includes/joins".

Also applies to: 61-76

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/models/account/market_data_importer.rb` around lines 54 - 57, The batched
date queries are still using the original all_security_ids set instead of the
filtered online securities, causing unnecessary DB work; update the downstream
working set to use the online securities' IDs (e.g., use securities.keys or
securities.keys.map(&:to_i) / securities.keys to pass into
batch_first_required_price_dates, last_holding_date, latest_trade_date, and any
subsequent calculations) so that start_dates =
batch_first_required_price_dates(...) and the historical_ids calculation
(traded_security_ids - current_security_ids.to_a) operate only on the
provider-eligible online securities returned by securities (refer to the
securities, start_dates, batch_first_required_price_dates, historical_ids,
traded_security_ids, current_security_ids, last_holding_date, and
latest_trade_date symbols).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/models/account/market_data_importer.rb`:
- Around line 61-64: The aggregate queries over holdings (e.g., where you
compute last_holding_date from
account.holdings.group(:security_id).maximum(:date)) are not scoped to
account.currency and must match the universe used by account.current_holdings;
update these queries to add the currency filter (e.g., where(currency:
account.currency)) so the capped end_date and provider-derived start_date use
the same holding set. Locate all similar aggregates in this file (including the
block that computes last_holding_date and the later aggregates around lines
100–111) and add the same currency scope to each account.holdings query.

In `@test/models/account/market_data_importer_test.rb`:
- Around line 115-210: Add a regression test that creates a Security with
offline: true and ensures Account::MarketDataImporter.import_all never calls
`@provider.fetch_security_prices` for that security; specifically, in
test/models/account/market_data_importer_test.rb create an offline Security
(e.g., Security.create!(ticker: "OFF", exchange_operating_mic: "XNAS", offline:
true)), set up any minimal trades/holdings as needed, then set an expectation
that `@provider.expects`(:fetch_security_prices).with(..., symbol: "OFF",
...).never or assert_no_request for that ticker while keeping normal stubs for
fetch_security_info and fetch_exchange_rates; this verifies the importer uses
Security.online and skips offline securities when
Account::MarketDataImporter.new(account).import_all runs.

---

Nitpick comments:
In `@app/models/account/market_data_importer.rb`:
- Around line 54-57: The batched date queries are still using the original
all_security_ids set instead of the filtered online securities, causing
unnecessary DB work; update the downstream working set to use the online
securities' IDs (e.g., use securities.keys or securities.keys.map(&:to_i) /
securities.keys to pass into batch_first_required_price_dates,
last_holding_date, latest_trade_date, and any subsequent calculations) so that
start_dates = batch_first_required_price_dates(...) and the historical_ids
calculation (traded_security_ids - current_security_ids.to_a) operate only on
the provider-eligible online securities returned by securities (refer to the
securities, start_dates, batch_first_required_price_dates, historical_ids,
traded_security_ids, current_security_ids, last_holding_date, and
latest_trade_date symbols).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: af824559-9dcf-4230-811d-2306f00e5b95

📥 Commits

Reviewing files that changed from the base of the PR and between 3960582 and 11b7048.

📒 Files selected for processing (2)
  • app/models/account/market_data_importer.rb
  • test/models/account/market_data_importer_test.rb

Comment thread app/models/account/market_data_importer.rb
Comment thread test/models/account/market_data_importer_test.rb
Adds a regression test verifying that Account::MarketDataImporter never
calls fetch_security_prices for a security with offline: true, covering
the Security.online filter on line 54 of the importer.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
test/models/account/market_data_importer_test.rb (1)

115-248: Optional: extract common account/security/trade setup into small test helpers.

These new tests are solid, but they duplicate substantial setup. A helper (or two) would reduce maintenance cost and keep intent-focused assertions even clearer.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/models/account/market_data_importer_test.rb` around lines 115 - 248,
Extract the repeated setup in market_data_importer_test.rb into small test
helper methods: add a private helper like build_family_and_account(name:
"Smith", currency: "USD") that returns the family and account, a helper
create_security_with_trade(account, ticker:, exchange_operating_mic:,
trade_date:, qty:, price:, offline: false) that creates the Security and the
associated Trade entry, and a helper create_holdings(account, security,
holdings_array) to add the various holdings (dates, qty, price, amount). Replace
the duplicated inline setup in the tests ("caps end_date...", "fetches
prices...", "skips price fetching...") to call these helpers and keep each test
focused on only the differing pieces (e.g., sold_date, repurchase_date, offline
flag) while still calling Account::MarketDataImporter.new(account).import_all as
before.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@test/models/account/market_data_importer_test.rb`:
- Around line 115-248: Extract the repeated setup in
market_data_importer_test.rb into small test helper methods: add a private
helper like build_family_and_account(name: "Smith", currency: "USD") that
returns the family and account, a helper create_security_with_trade(account,
ticker:, exchange_operating_mic:, trade_date:, qty:, price:, offline: false)
that creates the Security and the associated Trade entry, and a helper
create_holdings(account, security, holdings_array) to add the various holdings
(dates, qty, price, amount). Replace the duplicated inline setup in the tests
("caps end_date...", "fetches prices...", "skips price fetching...") to call
these helpers and keep each test focused on only the differing pieces (e.g.,
sold_date, repurchase_date, offline flag) while still calling
Account::MarketDataImporter.new(account).import_all as before.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a9b0f03c-f198-491d-b4c7-4e9e2ecae97c

📥 Commits

Reviewing files that changed from the base of the PR and between 11b7048 and 430d3f8.

📒 Files selected for processing (1)
  • test/models/account/market_data_importer_test.rb

@jjmata jjmata self-requested a review April 29, 2026 15:38
@jjmata jjmata removed the contributor:flagged Contributor flagged for review by trust analysis. label Apr 29, 2026
@jjmata jjmata added this to the v0.7.0 milestone Apr 29, 2026
@jjmata
Copy link
Copy Markdown
Collaborator

jjmata commented Apr 29, 2026

Nit-picks from Sonnet 4.7:


PR #1580 — Optimize and Fix provider price fetches for sold securities and batch queries

Overview

This PR fixes three real bugs in Account::MarketDataImporter#import_security_prices and replaces an O(2S) query pattern with a fixed set of batch queries. The problems are well-described and the fixes are targeted.


Code Quality

Strengths:

  • The core logic is correct. Capping end_date to the last holding date for sold securities is exactly the right fix — it makes all_prices_exist? permanently stable for closed positions.
  • The reopened detection (trade_date > holding_date) correctly handles the ordering dependency: import_market_data runs before materialize_balances, so stale holdings can hide a new repurchase. This edge case is subtle and the solution is clean.
  • Filtering to Security.online now matches the contract used in the global MarketDataImporter.
  • The next unless security guard already existed; offline IDs gracefully fall through.

Specific Issues

1. Inaccurate query count in method comment

The comment in batch_first_required_price_dates says "3 queries total" which refers only to that method. But import_security_prices runs 2 more grouped queries (last_holding_date, latest_trade_date) on top of that — for a total of 7 fixed queries in the worst case. The PR description says "4 queries" which is also wrong. The comment should either be removed or updated to reflect the full count.

2. historical_ids includes offline security IDs

historical_ids = traded_security_ids - current_security_ids.to_a

traded_security_ids contains all traded IDs including offline ones. The last_holding_date and latest_trade_date queries are run against historical_ids, which may include offline IDs that will be skipped by next unless security anyway. This is a minor inefficiency — consider:

historical_ids = (traded_security_ids - current_security_ids.to_a) & securities.keys

3. batch_first_required_price_dates queries more than needed

trade_start_dates = account.trades.group(:security_id).minimum("entries.date")

This fetches trade dates for all securities ever traded on the account, not just those in security_ids. Functionally correct, but slightly wasteful when called with a filtered set. A .where(security_id: security_ids) filter would be more precise.

4. Tests use create! instead of fixtures

Per CLAUDE.md and project conventions, tests should use Minitest + fixtures and avoid creating objects with Family.create!, Account.create!, etc. All three new tests create families and accounts inline, bypassing fixtures entirely. Existing fixtures should be reused where possible, with test-specific supplemental data created only for the edge cases.

5. Duplicated test setup

All three new tests repeat the same family/account creation boilerplate. A setup block or shared helper would reduce duplication.

6. Hardcoded timezone in test assertion

end_date: Date.current.in_time_zone("America/New_York").to_date

This hardcoded "America/New_York" is fragile — it'll match only because Security::Price::Importer internally converts to that timezone before calling the provider. The assertion implicitly tests that internal conversion rather than the date logic in this file. If the timezone is significant, it should be explicit in the production code, not just implied by the test.


Correctness Assessment

  • The end_date logic covers: currently held, sold-and-stable, and sold-then-reopened. All three branches look correct.
  • Fallback holding_date || Date.current when a security has trades but no holdings (pre-first-sync) is safe.
  • all_security_ids is a Set (result of Set#|), which is fine — it's only iterated.
  • traded_security_ids - current_security_ids.to_a requires the .to_a because Array#- doesn't accept a Set directly. This is correct.

Test Coverage

The three new tests cover the three described bugs well:

  • Sold security end date capping ✅
  • Reopened position before rematerialization ✅
  • Offline security filtering ✅

Missing: no assertion on query count reduction (bug #3). This is acceptable since query count tests are fragile, but worth noting that N+1 elimination is only validated by inspection, not by a test.


Summary

The fixes are correct and the performance improvement is meaningful. The main concerns are: inaccurate query count documentation, minor inefficiencies with historical_ids including offline IDs, and test style that doesn't follow the project's fixture-based convention. None of these are blockers, but the test style issue is worth addressing before merge since it sets a pattern others may follow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr:verified PR passed security analysis.

Development

Successfully merging this pull request may close these issues.

2 participants