Performance improvements in holding calculation pipeline#1579
wps260 wants to merge 4 commits into we-promise:main from
Conversation
| Dimension | Score | What it measures |
|---|---|---|
| Identity | 25 | Account age, contribution history, GPG keys, org memberships |
| Behavior | 90 | PR patterns, unsolicited contribution ratio, activity cadence |
| Content | 100 | PR body substance, issue linkage, contribution quality |
| Graph | 30 | Cross-repo trust, co-contributor relationships |
Analyzed by Brin · Full profile
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration: defaults · Review profile: CHILL · Plan: Pro
📒 Files selected for processing (6)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 Walkthrough

Replace repeated linear scans with indexed lookups, and switch array accumulation from `+=` to in-place `concat`.

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
🧹 Nitpick comments (3)
test/models/holding/reverse_calculator_test.rb (1)
189-190: Consider tolerant numeric assertions for cost-basis checks. Using `assert_in_delta` here would make tests less brittle if value representation/precision changes (e.g., BigDecimal/float conversion paths).

Optional test hardening example:

```diff
- assert_equal 110.0, cost_basis_for(calc, security, second_buy)
- assert_equal 110.0, cost_basis_for(calc, security, Date.current)
+ assert_in_delta 110.0, cost_basis_for(calc, security, second_buy).to_f, 1e-6
+ assert_in_delta 110.0, cost_basis_for(calc, security, Date.current).to_f, 1e-6
```

Also applies to: 203-209, 221-222, 235-237
🤖 Prompt for AI Agents: verify the finding against the current code and only fix it if needed. In `test/models/holding/reverse_calculator_test.rb` (lines 189-190 and the similar assertions), replace strict `assert_equal` cost-basis checks with `assert_in_delta(expected, actual, delta)` using a small, consistent delta (e.g., 0.01) so tests tolerate minor float/BigDecimal precision differences.

app/models/holding/materializer.rb (1)
173-185: Keep the dedupe in SQL. `pluck(:security_id).uniq` still materializes every duplicate ID in Ruby; `distinct.pluck(:security_id)` preserves the behavior with less memory/GC churn.

Proposed tweak:

```diff
- portfolio_security_ids = account.trades.pluck(:security_id).uniq
+ portfolio_security_ids = account.trades.distinct.pluck(:security_id)
```

🤖 Prompt for AI Agents: verify the finding against the current code and only fix it if needed. In `app/models/holding/materializer.rb` (lines 173-185), replace the in-Ruby dedupe in purge_stale_holdings with a SQL-level distinct so the query returns unique security IDs directly from the database.

app/models/holding/portfolio_cache.rb (1)
17-22: Return a copy of the cached trades. `group_by` stores the arrays inside the memoized hash, so returning them directly makes `get_trades` externally mutable. That is a behavior change from the previous fresh-array return and can poison the cache if callers mutate the result.

Proposed tweak:

```diff
- trades_by_date[date] || []
+ trades_by_date[date]&.dup || []
```

🤖 Prompt for AI Agents: verify the finding against the current code and only fix it if needed. In `app/models/holding/portfolio_cache.rb` (lines 17-22), return fresh copies instead of the cached arrays: when `date` is blank return `trades.to_a` (or `trades.dup`), otherwise return `(trades_by_date[date] || []).dup` so callers cannot mutate the cached arrays, keeping the existing logic.
ℹ️ Review info
⚙️ Run configuration: defaults · Review profile: CHILL · Plan: Pro · Run ID: 1823688f-d2f2-4158-90e1-ccbf02ba6d59
📒 Files selected for processing (6): app/models/holding/forward_calculator.rb, app/models/holding/gapfillable.rb, app/models/holding/materializer.rb, app/models/holding/portfolio_cache.rb, app/models/holding/reverse_calculator.rb, test/models/holding/reverse_calculator_test.rb
Actionable comments posted: 1
🧹 Nitpick comments (1)
app/models/holding/portfolio_cache.rb (1)
89-93: Reuse the new security indexes here too. This path still walks `holdings` directly to derive ids, then `load_prices` groups the same collection again a few lines later. Pulling ids from `holdings_by_security_id.keys` (and similarly from `trades_by_security_id.keys`) would keep initialization closer to single-pass on large histories. As per coding guidelines: "Optimize database queries with proper indexes and prevent N+1 queries via includes/joins".

Possible refactor:

```diff
 def collect_unique_securities
-  unique_securities_from_trades = trades.map(&:entryable).map(&:security).uniq
-  unique_securities_from_trades = unique_securities_from_trades.select { |s| @security_ids.include?(s.id) } if @security_ids
-
-  return unique_securities_from_trades unless use_holdings
-
-  holding_security_ids = holdings.map(&:security_id).uniq
-  holding_security_ids = holding_security_ids.select { |id| @security_ids.include?(id) } if @security_ids
-  unique_securities_from_holdings = Security.where(id: holding_security_ids).to_a
-
-  (unique_securities_from_trades + unique_securities_from_holdings).uniq
+  ids = trades_by_security_id.keys
+  ids |= holdings_by_security_id.keys if use_holdings
+  ids &= @security_ids if @security_ids
+
+  Security.where(id: ids).to_a
 end
```

🤖 Prompt for AI Agents: verify the finding against the current code and only fix it if needed. In `app/models/holding/portfolio_cache.rb` (lines 89-93), derive security ids from the precomputed index keys (`holdings_by_security_id.keys`, `trades_by_security_id.keys`) instead of iterating holdings/trades, apply the existing `@security_ids` filter and uniq logic, then query `Security.where(id: ...)` as before so initialization stays single-pass and avoids N+1 behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `app/models/holding/portfolio_cache.rb`:
- Around lines 29-30: The RuboCop error is caused by the nested array literal used as a hash key in `security[:prices_by_date_and_source][[date, source]]`; add spaces inside the inner array so it reads `[[ date, source ]]`, then re-run the linter to confirm the nested-array spacing rule is satisfied.
ℹ️ Review info
⚙️ Run configuration: defaults · Review profile: CHILL · Plan: Pro · Run ID: 66efb3a5-dc44-4c6a-b516-3fc0b4d567e9
📒 Files selected for processing (1): app/models/holding/portfolio_cache.rb
🧹 Nitpick comments (1)
app/models/holding/portfolio_cache.rb (1)
109-112: Consider selecting only needed DB price columns to reduce memory footprint. Since this path can load many records, narrowing the selected fields can further reduce allocation pressure.
Proposed refinement:

```diff
 db_prices_by_security_id = Security::Price
   .where(security_id: security_ids, date: account.start_date..Date.current)
+  .select(:security_id, :date, :price, :currency)
   .group_by(&:security_id)
```

🤖 Prompt for AI Agents: verify the finding against the current code and only fix it if needed. In `app/models/holding/portfolio_cache.rb` (lines 109-112), the query building `db_prices_by_security_id` currently loads full `Security::Price` records into memory; select only the columns downstream logic needs before grouping, and ensure downstream code still accesses the selected fields (adjust accessors if you switch to arrays/hashes).
ℹ️ Review info
⚙️ Run configuration: defaults · Review profile: CHILL · Plan: Pro · Run ID: 2646bf42-9968-486e-a4a8-67ec900ba7d7
📒 Files selected for processing (1): app/models/holding/portfolio_cache.rb
Pushed a small follow-up commit addressing the open review nits I agreed with:
I also did a quick syntax pass locally. I couldn't run the full Rails test/lint suite in this environment because the local Ruby/Bundler here is 3.1 while the repo expects 3.4, so Bundler aborts before boot.
Force-pushed 1b2d7c1 to 4ba8bd5 · Compare
Investment accounts with large histories were pegging CPU at 100% during
sync. Root cause was a cluster of quadratic and superlinear algorithms in
the inner holding calculation loop. All are replaced with O(1) hash lookups
built from single-pass indexes over the already-loaded data.
Holding::PortfolioCache - load_prices:
Three O(SxN) patterns inside the per-security loop:
1. DB prices: `security.prices.where(...)` fired one SQL query per
security (N+1). Replaced with a single bulk query before the loop:
Security::Price.where(security_id: ..., date: ...).group_by(&:security_id)
70 securities -> 70 queries becomes 1.
2. Trade prices: `trades.select { |t| t.entryable.security_id == id }`
scanned the full trades array for every security - O(SxT). Replaced
with trades_by_security_id, pre-indexed once from the loaded array.
3. Holding prices: `holdings.select { |h| h.security_id == id }` - same
O(SxH) pattern. Replaced with holdings_by_security_id.
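The three replacements above share one pattern: build a hash index in a single pass over the already-loaded array, then probe it per security. A minimal plain-Ruby sketch of the idea (Struct and the sample data are illustrative stand-ins, not the PR's actual models):

```ruby
require "date"

# Illustrative stand-in for the loaded ActiveRecord trade objects.
Trade = Struct.new(:security_id, :date, :qty)

trades = [
  Trade.new(1, Date.new(2024, 1, 2), 10),
  Trade.new(2, Date.new(2024, 1, 2), 5),
  Trade.new(1, Date.new(2024, 1, 3), 3)
]

# Before: O(SxT) - rescan the whole trades array once per security.
slow = ->(id) { trades.select { |t| t.security_id == id } }

# After: O(T) once - build the index in a single pass, then O(1) probes.
trades_by_security_id = trades.group_by(&:security_id)

trades_by_security_id[1].map(&:qty) # => [10, 3]
```

`group_by` preserves insertion order within each bucket, so the indexed lookup returns exactly what the per-security `select` did.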
Prices are now indexed into prices_by_date and prices_by_date_and_source
hashes during load_prices, making get_price O(1) instead of scanning the
flat prices array on every lookup.
Holding::PortfolioCache - get_trades / get_price:
- get_trades(date:): `trades.select { |t| t.date == date }` (O(T) scan)
replaced with trades_by_date hash (O(1)).
- get_price: two `prices.select { p.date == date ... }.min_by` linear
scans replaced with direct hash lookups into prices_by_date and
prices_by_date_and_source.
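The composite-key index can be sketched like this (the Struct and sample prices are illustrative; the hash names mirror the description):

```ruby
require "date"

# Illustrative stand-in for a loaded price record.
Price = Struct.new(:date, :source, :value)

prices = [
  Price.new(Date.new(2024, 1, 2), "db", 100.0),
  Price.new(Date.new(2024, 1, 2), "trade", 101.0)
]

# Index once: a composite [date, source] key plus a per-date bucket.
prices_by_date_and_source = prices.to_h { |p| [[p.date, p.source], p] }
prices_by_date = prices.group_by(&:date)

# get_price becomes a hash probe instead of a select + min_by scan.
price = prices_by_date_and_source[[Date.new(2024, 1, 2), "db"]]
price.value # => 100.0
```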
Holding::PortfolioCache - collect_unique_securities:
`holdings.map(&:security)` traversed the security association on every
holding record (N+1 if not preloaded). Replaced with a pluck of
security_ids followed by a single Security.where(id: ...) batch load.
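A toy simulation of that difference (the "query log" is fake — it just counts what a real ORM would emit per association walk vs. per batch load; all names are illustrative):

```ruby
# Toy simulation: the query log records what a real ORM would emit.
QUERY_LOG = []

# Stand-in for a holding row whose `security` association lazy-loads.
HoldingRow = Struct.new(:security_id) do
  def security
    QUERY_LOG << "SELECT * FROM securities WHERE id = #{security_id}" # one query per row
    security_id
  end
end

holdings = [HoldingRow.new(1), HoldingRow.new(2), HoldingRow.new(1)]

holdings.map(&:security)            # N+1 pattern: one simulated query per holding
n_plus_one_queries = QUERY_LOG.size # => 3

QUERY_LOG.clear
ids = holdings.map(&:security_id).uniq # pure Ruby, no queries
QUERY_LOG << "SELECT * FROM securities WHERE id IN (#{ids.join(', ')})" # one batch load
batch_queries = QUERY_LOG.size # => 1
```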
Holding::ForwardCalculator / ReverseCalculator:
`holdings += build_holdings(...)` allocated a new array copy on every
iteration - O(N) per day x thousands of days = O(D^2) total allocations.
Replaced with holdings.concat(...) which appends in place, O(1).
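The difference is observable directly in plain Ruby: `+=` rebinds the variable to a freshly allocated array each iteration, while `concat` mutates the receiver in place:

```ruby
holdings = [:a]
before_id = holdings.object_id

holdings += [:b]                # rebinding: allocates a brand-new array
holdings.object_id == before_id # => false

holdings = [:a]
before_id = holdings.object_id
holdings.concat([:b])           # appends in place; no copy of the prefix
holdings.object_id == before_id # => true
```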
Holding::ReverseCalculator - precompute_cost_basis:
Old: walked every date from account.start_date to Date.current (O(D)),
writing a cost_basis entry for every security on every date. For an
account with 2 trades over 9,250 days this wrote ~18,500 hash entries
and consumed the full date range in the outer loop regardless of trade
density.
New: walks only buy trades (O(T)), appending one [date, avg_cost]
snapshot per trade. cost_basis_for binary-searches the sparse snapshot
array - O(log T) per lookup. Memory drops from O(DxS) to O(T).
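A minimal sketch of the sparse-snapshot lookup using Ruby's built-in `Range#bsearch` in find-minimum mode (the snapshot data is illustrative and this is not the PR's exact implementation, though `cost_basis_for` is the method name the description uses):

```ruby
require "date"

# Sparse snapshots: one [date, avg_cost] pair per buy trade, sorted by date.
snapshots = [
  [Date.new(2020, 1, 10), 100.0],
  [Date.new(2022, 6, 1), 110.0]
]

# Last snapshot at or before `date`, via binary search - O(log T) per lookup.
def cost_basis_for(snapshots, date)
  # bsearch (find-minimum mode) returns the first index whose date is AFTER `date`.
  idx = (0...snapshots.size).bsearch { |i| snapshots[i][0] > date }
  idx = idx.nil? ? snapshots.size - 1 : idx - 1
  idx >= 0 ? snapshots[idx][1] : 0.0
end

cost_basis_for(snapshots, Date.new(2021, 3, 1)) # => 100.0 (between the two buys)
cost_basis_for(snapshots, Date.new(2023, 1, 1)) # => 110.0 (after the last buy)
cost_basis_for(snapshots, Date.new(2019, 1, 1)) # => 0.0   (before any buy)
```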
Holding::Gapfillable:
`security_holdings.find { |h| h.date == date }` was called on every
date in the gapfill range - O(H) per date, O(HxD) total. Replaced with
security_holdings.index_by(&:date) built once before the loop, making
each date lookup O(1).
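The same date index can be built in plain Ruby with `to_h` (ActiveSupport's `index_by(&:date)` is equivalent; the Struct and data below are illustrative):

```ruby
require "date"

# Illustrative stand-in for a holding record.
Holding = Struct.new(:date, :qty)

security_holdings = [
  Holding.new(Date.new(2024, 1, 2), 10),
  Holding.new(Date.new(2024, 1, 5), 13)
]

# Build the date index once, before the gapfill loop.
holdings_by_date = security_holdings.to_h { |h| [h.date, h] }

# Each date in the gapfill range is now an O(1) probe instead of a find scan.
holdings_by_date[Date.new(2024, 1, 5)].qty # => 13
holdings_by_date[Date.new(2024, 1, 3)]     # => nil (a gap to fill)
```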
Holding::Materializer - purge_stale_holdings:
`account.entries.trades.map { |entry| entry.entryable.security_id }.uniq`
loaded all trade entry records into Ruby then traversed the entryable
association on each (N+1). Replaced with account.trades.pluck(:security_id).uniq
(single SQL query returning only the IDs).
In testing, these changes were able to reduce sync time of an account with
25 years of history and 70 securities from about 90 minutes down to under
3 minutes.
* return dup'd arrays from PortfolioCache#get_trades so callers can't mutate memoized cache state
* use the precomputed security-id indexes in collect_unique_securities
* keep security-id dedupe in SQL via distinct.pluck(:security_id)
* tighten the DB price preload to select only needed columns
* harden cost-basis assertions with assert_in_delta
Force-pushed 4ba8bd5 to 84104a4 · Compare
Pulled the sure-admin changes into my branch. All tests pass.
Ah, perfect! It was a "trigger-happy" Thank you. 🙏
Actually, here's more food for thought: This is a high-quality, well-reasoned performance PR with a dramatic real-world impact. The algorithmic changes are correct and the cost-basis tests are solid. Three items worth addressing before merge:
Full Sonnet 4.6 review: PR #1579 — Performance improvements in holding calculation pipeline

Overview

This PR replaces a cluster of quadratic/superlinear algorithms in the holding sync pipeline with pre-built hash indexes and a single bulk DB query. The author reports 90 min → <3 min for a 25-year, 70-security account. The changes are focused and well-justified.

Code Quality & Correctness
The N+1 elimination (70 SQL queries → 1) is the biggest single win. However, the bulk query uses a partial

```ruby
Security::Price
  .where(security_id: security_ids, date: account.start_date..Date.current)
  .select(:security_id, :date, :price, :currency)
```

This drops
```diff
- portfolio_security_ids = account.entries.trades.map { |entry| entry.entryable.security_id }.uniq
+ portfolio_security_ids = account.trades.distinct.pluck(:security_id)
```

This assumes
The snapshot + binary-search approach is correct, but there's a subtle edge case worth noting: two buy trades on the same date produce two snapshots for that date. The binary search scans

The hand-rolled binary search is fine. An alternative is
```ruby
def get_trades(date: nil)
  if date.blank?
    trades.dup                      # added
  else
    trades_by_date[date]&.dup || [] # added
  end
end
```

Performance

All algorithmic changes are correct and the PR description's O() analysis matches the code. A few secondary notes:
Test Coverage

Good:
Missing:
Existing Review Comments

The RuboCop