Skip to content

fix(record): stop leaking same-origin iframe state on removal#3570

Open
gruessi wants to merge 4 commits into
PostHog:mainfrom
gruessi:fix/iframe-memory-leak
Open

fix(record): stop leaking same-origin iframe state on removal#3570
gruessi wants to merge 4 commits into
PostHog:mainfrom
gruessi:fix/iframe-memory-leak

Conversation

@gruessi
Copy link
Copy Markdown

@gruessi gruessi commented May 12, 2026

Problem

Recording a page that mounts and unmounts same-origin iframes leaks each iframe's Document, every node serialized into the mirror, one MutationObserver per mount, and its Window. End-to-end heap test against a host that cycles 5 blob-URL iframes every 2s for 110s: +118 MB / +390 leaked HTMLDocuments, ~300 KB per mount. Long-running tabs eventually hit the browser's memory limit.

Cross-origin and about:blank iframes go through a different code path and weren't affected.

Changes

Eight independent retention chains in packages/rrweb/. Each was masking the next — fixing any subset still showed leak in heap snapshots.

  • Listener leaks. onceIframeLoaded now returns a disposer (load listener + timer); IframeManager's pagehide handler is named and tracked. Both are removed in removeIframeById.
  • recordCrossOriginIframes cleanup gate dropped so the same-origin path actually runs cleanup. cleanupDetachedIframes() sweeps orphans when an iframe is removed via its parent.
  • iframe.src = 'about:blank' before unmount (typical React cleanup) swaps contentDocument/contentWindow so the mirror walks an empty doc and nestedIframeListeners: Map<Window> misses its entry. New attachedDocuments / attachedWindows WeakMaps capture every doc/window observed across the iframe's lifetime; detach walks all of them.
  • Global arrays accumulated forever. mutationBuffers[] and handlers[] had no iframe-removal path. findAndRemoveIframeBuffer now matches by original doc; new runAndDetachIframeCleanup splices closures out of handlers[].
Per-chain detail (file-by-file)
  1. rrweb-snapshot/src/snapshot.tsonceIframeLoaded. Returns a disposer that removes the listener and clears the pending timer. The readyState === 'complete' branch previously registered a second listener next to setTimeout(listener, 0) — that one is now named (onSubsequentLoad) and removed by the same disposer. serializeNodeWithId/snapshot gain an optional onIframeListenerRegistered?(iframe, disposer) callback so the disposer travels back to IframeManager.
  2. IframeManager.attachIframe. Anonymous pagehide handler became a named function tracked in pageHideHandlers: WeakMap<HTMLIFrameElement, Set<{win, handler}>>. Set-of-handlers because an iframe can run through several Windows during its lifetime.
  3. record/index.ts wrappedMutationEmit. Cleanup gate on recordCrossOriginIframes removed — same-origin leaks too.
  4. IframeManager.cleanupDetachedIframes(). Sweeps attachedIframes for ids no longer in the mirror to catch nested iframe removals (mutation buffer only reports the directly-removed ancestor).
  5. attachedDocuments: WeakMap<HTMLIFrameElement, Set<Document>>. Captures every contentDocument observed across an iframe's lifetime (initial blank → loaded src → cleanup blank). Detach walks the Set and calls mirror.removeNodeFromMap(doc) on each.
  6. mutationBuffers[] cleanup. MutationBuffer.bufferDoc() exposes the doc; findAndRemoveIframeBuffer takes optional knownDocs: Set<Document> to match by original contentDocument. reset() nulls this.doc as defence in depth.
  7. handlers[] cleanup. runAndDetachIframeCleanup(iframeId) runs the cleanup and splices the closure from both iframeObserverCleanups and handlers. iframeObserverCleanups is Map<number, Set<cleanup>> so navigations don't overwrite earlier cleanups.
  8. attachedWindows: WeakMap<HTMLIFrameElement, Set<Window>>. Mirrors attachedDocuments. removeIframeById walks the captured Set and removes each entry from nestedIframeListeners and crossOriginIframeMap. Fixes a stale code comment that incorrectly claimed Map<Window> keys don't prevent GC.

Validation

End-to-end heap snapshot, same browser, same recorder config, 61 cycles per run (≈305 mount/unmount events).

Unpatched After fix
Post-GC heap delta over 110s +118 MB −5 MB
native:HTMLDocument Δ +390 0
native:MutationObserver Δ +391 0
native:MutationObserverRegistration Δ +390 −1
closure:onIframeLoad Δ +195 +1

The negative post-GC delta is because the heap-snapshot operation forces a thorough GC that releases other accumulated state too — the real read is the constructor-level deltas. Top growth in the post-cycle snapshot is V8/JIT internals and transient PerformanceLongTaskTiming entries, not leaked iframe state.

Tests

packages/rrweb/rrweb/test/record/memory-leaks.test.ts30/30 pass (4 new regressions for this fix: same-origin cleanup, nested parent removal, captured-doc walk, multi-doc lifecycle). check-types clean for both @posthog/rrweb and @posthog/rrweb-snapshot.

Release info Sub-libraries affected

Libraries affected

  • All of them
  • posthog-js (web)
  • posthog-js-lite (web lite)
  • posthog-node
  • posthog-react-native
  • @posthog/react
  • @posthog/ai
  • @posthog/convex
  • @posthog/next
  • @posthog/nextjs-config
  • @posthog/nuxt
  • @posthog/rollup-plugin
  • @posthog/webpack-plugin
  • @posthog/types

The changeset bumps @posthog/rrweb and @posthog/rrweb-snapshot (patch). updateInternalDependencies: "patch" cascades the bump through rrweb-recordposthog-js.

Checklist

  • Tests for new code
  • Accounted for the impact of any changes across different platforms
  • Accounted for backwards compatibility of any changes (no breaking changes!)
  • Took care not to unnecessarily increase the bundle size

No public API changes. The new onIframeListenerRegistered? option is optional and additive; everything else is internal to IframeManager / MutationBuffer / record/index.ts. Bundle-size impact is small (~handful of WeakMap declarations + cleanup helpers) — worth confirming via bundle-size CI on this PR.

If releasing new changes

  • Ran pnpm changeset to generate a changeset file

gruessi added 2 commits May 11, 2026 16:53
Same-origin iframes mounted and unmounted while session recording is
active leaked the iframe Document, every node serialized into the
mirror, and one MutationObserver per mount. Per-iframe retained cost
≈300 KB; for hosts that mount/unmount many iframes (e.g. flow editors
that render previews via blob URLs) memory can grow into the hundreds
of MB over an extended session.

Nine retainer chains had to be closed:

- onceIframeLoaded (snapshot.ts) never removed its load listener; the
  closure pinned the iframe via captured `mirror` / `iframeManager`.
  It now returns a disposer the caller can invoke on detach. The
  branch that re-registered a load listener after the iframe was
  already loaded is collapsed into the disposer-tracked path. The
  blank-frame-during-navigation guard from the original code is
  preserved: when win.location.href is about:blank but iframeEl.src
  is a non-empty/non-about:blank URL we register only the load
  listener (no immediate serialization), so the recording does not
  capture the transient about:blank document Chrome assigns before
  the real navigation completes.

- IframeManager.attachIframe registered an anonymous `pagehide`
  handler with no way to remove it. Convert it to a named function
  and track per iframe in `pageHideHandlers`. attachIframe runs once
  per iframe load, so a single iframe can collect handlers across
  several Windows; the map stores a Set so every handler is detached
  on removeIframeById, not just the latest. Same shape applied to
  load-listener disposers via `loadListenerDisposers`.

- The cleanup branch in `record/index.ts`'s wrappedMutationEmit was
  gated on `recordCrossOriginIframes`; same-origin iframes never had
  their iframeObserverCleanups entry called or
  iframeManager.removeIframeById invoked. Drop the gate. Also add
  IframeManager.cleanupDetachedIframes so iframes removed inside a
  removed subtree (where only the ancestor's id appears in
  m.removes) get cleaned up too.

- Hosts can swap `iframe.src` to `about:blank` before removing the
  iframe — typical React unmount pattern. By detach time
  `iframe.contentDocument` and `iframe.contentWindow` no longer match
  what we attached. Capture every (Document, Window) we ever saw per
  iframe in `attachedDocuments` / `attachedWindows` (WeakMaps keyed
  by iframe element, holding a Set of values), and walk the captured
  sets on detach to release `mirror.idNodeMap` entries and to unwind
  `nestedIframeListeners` (a strong Map keyed by Window).

- Iframes can be removed before their first load fires. attachIframe
  has not run yet so attachedIframes has no entry, and by the time
  wrappedMutationEmit calls removeIframeById the mirror entry has
  typically already been cleared by MutationBuffer.emit, so the
  fallback mirror.getNode lookup also returns null. New
  `iframeElementsById` Map (populated in registerLoadListenerDisposer
  and cleared in removeIframeById) lets the cleanup path find the
  iframe and run its disposer in this pre-load case.

- Iframe navigations overwrite per-iframe observer cleanups.
  loadListener fires once per load, but iframeObserverCleanups stored
  only one cleanup per iframe id — repeated navigations pushed every
  cleanup into `handlers[]` while leaving only the most recent one
  reachable for runAndDetachIframeCleanup. Earlier MutationObservers
  and MutationBuffers stayed alive until stopRecording.
  iframeObserverCleanups is now `Map<number, Set<listenerHandler>>`;
  detach iterates the Set, splicing each entry out of `handlers[]`.

- Reparenting an iframe must not be confused with removal.
  MutationBuffer.emit clears the mirror entry before re-serializing
  the add, so a moved iframe can come back with a fresh id — the
  previous same-id check in wrappedMutationEmit missed that case
  and disposed the (still-needed) listener and observers, breaking
  recording for the moved iframe until the next full snapshot.
  Cross-check by element identity: walk m.adds, look up each new id
  in the mirror, and if the removed-id's element matches one of
  those, treat it as a move and only drop the stale id mapping
  (new IframeManager.forgetIframeId helper) without disposing
  anything.

- IframeManager.destroy() reassigned `loadListenerDisposers` and
  `pageHideHandlers` to fresh WeakMaps without iterating them, so
  the registered DOM listeners and pending onceIframeLoaded timers
  outlived stopRecording(). A late iframe load could then call back
  into iframeManager.attachIframe and emit mutations after the user
  thought recording had stopped. destroy() now enumerates tracked
  iframes via the iterable id-keyed Maps and runs disposers/removers
  before dropping the WeakMaps.

- The global `mutationBuffers[]` array (observer.ts) and the local
  `handlers[]` array (record/index.ts) accumulated per-iframe entries
  forever — `findAndRemoveIframeBuffer` was only invoked on
  pagehide, and `handlers[]` was never cleaned up at all. Splice
  both on iframe detach via a new `runAndDetachIframeCleanup` helper.
  `findAndRemoveIframeBuffer` now accepts an optional captured-docs
  set so it can match a buffer by its original document even after
  iframe.contentDocument has been swapped. `MutationBuffer.reset()`
  clears `this.doc` as defence in depth.

Validated end-to-end with a host page that mounts and unmounts five
blob-URL iframes every 2s for 110s. In a clean Chrome profile (no
extensions, no PostHog identity):

- Before: post-GC heap +118 MB, +390 leaked HTMLDocuments,
  +391 MutationObservers.
- After: post-GC heap ~0 MB, +1 leaked HTMLDocument,
  0 MutationObservers.

30/30 unit tests in test/record/memory-leaks.test.ts pass; new tests
cover the captured-doc walk, same-origin cleanup without
recordCrossOriginIframes, nested-removal sweep, pre-load removal,
iframe reparenting, and post-stop disposer execution.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 12, 2026

@gruessi is attempting to deploy a commit to the PostHog Team on Vercel.

A member of the Team first needs to authorize it.

# Conflicts:
#	packages/rrweb/rrweb/src/record/observer.ts
@gruessi gruessi marked this pull request as ready for review May 12, 2026 09:50
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 12, 2026

Comments Outside Diff (1)

  1. packages/rrweb/rrweb/test/record/memory-leaks.test.ts, line 811-815 (link)

    P2 Repeated IframeManager setup violates Once-and-Only-Once

    The four new unit tests all construct an identical IframeManager with the same mock mirror, mutationCb, wrappedEmit, and stylesheetManager stub. A small factory helper (e.g., makeIframeManager(overrides?)) would collapse five copies of the same 15-line block into one, making future option changes a single-line edit and keeping the tests focused on the behaviour under test.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: packages/rrweb/rrweb/test/record/memory-leaks.test.ts
    Line: 811-815
    
    Comment:
    **Repeated `IframeManager` setup violates Once-and-Only-Once**
    
    The four new unit tests all construct an identical `IframeManager` with the same mock `mirror`, `mutationCb`, `wrappedEmit`, and `stylesheetManager` stub. A small factory helper (e.g., `makeIframeManager(overrides?)`) would collapse five copies of the same 15-line block into one, making future option changes a single-line edit and keeping the tests focused on the behaviour under test.
    
    How can I resolve this? If you propose a fix, please make it concise.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/rrweb/rrweb/src/record/mutation.ts:254-263
`bufferDoc()` and `reset()` both reach through `(this as unknown as ...)` to access the `private doc` field. This cast bypasses TypeScript's type system and will silently break if `doc` is renamed. Exposing `doc` via a real getter or using a single consistent cast would be cleaner and safer.

```suggestion
  public reset() {
    this.shadowDomManager.reset();
    this.canvasManager.reset();
    // Defence in depth: clear doc in case a stale closure still pins the buffer.
    this.doc = null as unknown as observerParam['doc'];
  }

  public bufferDoc(): Document | null {
    return this.doc as Document | null;
  }
```

### Issue 2 of 2
packages/rrweb/rrweb/test/record/memory-leaks.test.ts:811-815
**Repeated `IframeManager` setup violates Once-and-Only-Once**

The four new unit tests all construct an identical `IframeManager` with the same mock `mirror`, `mutationCb`, `wrappedEmit`, and `stylesheetManager` stub. A small factory helper (e.g., `makeIframeManager(overrides?)`) would collapse five copies of the same 15-line block into one, making future option changes a single-line edit and keeping the tests focused on the behaviour under test.

Reviews (1): Last reviewed commit: "Merge remote-tracking branch 'origin/mai..." | Re-trigger Greptile

Comment thread packages/rrweb/rrweb/src/record/mutation.ts
- mutation.ts: replace receiver-cast on private `doc` with value-side
  cast using `observerParam['doc']`, so renames stay type-safe.
- memory-leaks.test.ts: extract a `makeIframeManager` factory in the
  `IframeManager.attachedDocuments` describe block; both tests use it.

Pure refactor — no runtime behavior change. 31/31 tests still pass.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 12, 2026

Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/rrweb/rrweb-snapshot/src/snapshot.ts:404-423
`onSubsequentLoad` is a one-liner wrapper that simply forwards to `listener` with no argument transforms. It's a superfluous wrapper per the "no superfluous parts" rule — using `listener` directly as the event-listener reference is equivalent and removes the extra allocation.

```suggestion
  if (
    winLocationHref !== blankUrl ||
    iframeEl.src === blankUrl ||
    iframeEl.src === ''
  ) {
    // Genuinely loaded — fire on the next tick so the host mutation that
    // surfaced this iframe has time to commit; also re-fire on later loads.
    const initialTimer = setTimeout(listener, 0);
    iframeEl.addEventListener('load', listener);
    return () => {
      clearTimeout(initialTimer);
      iframeEl.removeEventListener('load', listener);
    };
  }
  // Transient blank during navigation — wait for the real load.
  iframeEl.addEventListener('load', listener);
  return () => {
    iframeEl.removeEventListener('load', listener);
  };
```

### Issue 2 of 2
packages/rrweb/rrweb/src/record/iframe-manager.ts:496-503
**`findAndRemoveIframeBuffer` not called when `capturedDocs` is absent**

When an iframe is removed before `attachIframe` was ever called (so `capturedDocs` is `undefined`), `findAndRemoveIframeBuffer` is never invoked from `removeIframeById`. In that scenario no `MutationBuffer` exists either (the buffer is created inside `loadListener` which is only called from `attachIframe`), so the omission is currently safe. However, this silent invariant is easy to break: a future path that creates a buffer without going through `attachIframe` would leak silently. A brief comment noting the invariant — or an unconditional `findAndRemoveIframeBuffer(iframe)` call outside the `if (capturedDocs)` block — would make the intent explicit.

Reviews (2): Last reviewed commit: "chore(record): tidy up per Greptile revi..." | Re-trigger Greptile

@marandaneto marandaneto requested review from a team, TueHaulund, arnaudhillen, fasyy612, ksvat and nicowaltz and removed request for a team May 12, 2026 11:33
@ksvat ksvat requested a review from pauldambra May 14, 2026 12:06
Copy link
Copy Markdown
Member

@pauldambra pauldambra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks safe to me
i feel like we're accruing a very similar "disposables" pattern across the code
but that's not the fault of this PR, or blocking here

Copy link
Copy Markdown
Member

@pauldambra pauldambra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

QA Swarm review complete. See inline comments and the summary comment for the full verdict.

}
// Forward-declared; assigned inside the try{} block where `handlers` is in scope.
let runAndDetachIframeCleanup: (iframeId: number) => void = () => {};
let cleanupDetachedIframeObservers: () => void = () => {};
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[convergent: paul + xp + qa-team/frontend] 🟡 MEDIUM

Forward-declaring runAndDetachIframeCleanup / cleanupDetachedIframeObservers as () => {} and reassigning inside the try{} block makes construction order load-bearing in a quiet way. Anything that ends up invoking wrappedMutationEmit before the try-block executes gets a silent no-op cleanup — there's no warning, the leak just comes back.

Three refactors that would tighten this:

  • Extract handlers + iframeObserverCleanups into a small IframeObserverRegistry class with register(id, handler) / detach(id) / cleanupDetached() methods. The class owns its own state, no forward-declared no-ops needed.
  • Or: let runAndDetachIframeCleanup: ((id: number) => void) | undefined and call runAndDetachIframeCleanup?.(id) — silent misuse becomes a no-op only when intentionally unset, and a future caller that fires too early visibly does nothing rather than swallowing a leak.
  • Or: lift handlers to outer scope so the closures don't need to be reassigned at all.

const addedIframeElements = new Set<HTMLIFrameElement>();
if (m.adds.length > 0) {
for (const add of m.adds) {
const node = mirror.getNode(add.node.id);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[paul] 🟠 HIGH

the reparent detection uses mirror.getNode(add.node.id) to map the new id back to the element. that only works because MutationBuffer.emit drains mapRemoves before pushAdd serialises the new id. that's a deep cross-file coupling — a change to emit's ordering would silently corrupt reparent detection and we'd start tearing down observers on every iframe move.

could we add a comment over in MutationBuffer.emit pointing at this expectation, or a test that fails when emit's drain order changes? a silent regression here would look like "iframes occasionally stop recording after a move" — exactly the sort of thing that's painful to track down later. ship as you see fit, but i'd want a tripwire.

});

// Safety net: cleanup any iframes that have become detached or inaccessible
// Catch iframes removed inside a removed subtree (only the ancestor's
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[qa-team/reliability] 🟡 MEDIUM

The two cleanup orderings disagree. On the explicit m.removes path (around lines 282-284) runAndDetachIframeCleanup(id) runs before iframeManager.removeIframeById(id), so the MutationObserver is disconnected before findAndRemoveIframeBuffer resets the buffer and nulls this.doc. But the safety-net path here runs iframeManager.cleanupDetachedIframes() (which transitively calls buf.reset() -> this.doc = null) before cleanupDetachedIframeObservers() disconnects the observer.

Combined with the new reset() nulling this.doc, a mutation queued in the microtask window between those two calls would land on a buffer whose doc is null and throw inside pushAdd's serializeNodeWithId({ doc: this.doc, ... }). The race window is narrow (browsers deliver MutationObserver callbacks via microtasks) but real.

Suggested fix: swap the order to cleanupDetachedIframeObservers() first, then iframeManager.cleanupDetachedIframes(). That matches the explicit-remove path and closes the race.

this.shadowDomManager.reset();
this.canvasManager.reset();
// Defence in depth: clear doc in case a stale closure still pins the buffer.
this.doc = null as unknown as observerParam['doc'];
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[convergent: paul + xp + qa-team/reliability] 🟡 MEDIUM

this.doc = null as unknown as observerParam['doc'] — three reviewers independently snagged on this. Either:

  • doc really can be null after reset, in which case make the field's type honest (Document | null) and fix the few call sites that read it (pushAdd's serializeNodeWithId({ doc: this.doc, ... }), addShadowRoot(..., this.doc)), or
  • The buffer simply isn't used post-reset and the null-assignment isn't needed at all — let the GC collect the buffer naturally now that it's spliced from mutationBuffers[].

The "defence in depth" comment + the double-cast suggests neither path was fully committed to. Pick a story. Bonus: public bufferDoc(): Document | null on the next line claims nullability but the field is typed non-null — that inconsistency is the same smell.

If you keep the null-assignment, please also harden processMutation/emit with a if (!this.doc) return; guard so a late MutationObserver callback during the cleanup race (see the related comment on wrappedMutationEmit) is harmless instead of an exception.

private attachedWindows: WeakMap<HTMLIFrameElement, Set<Window>> =
new WeakMap();
private attachedDocuments: WeakMap<HTMLIFrameElement, Set<Document>> =
new WeakMap();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[convergent: xp + paul + qa-team/performance + qa-team/compatibility] 🟡 MEDIUM

Four-plus WeakMaps/Maps keyed by HTMLIFrameElement (attachedWindows, attachedDocuments, loadListenerDisposers, pageHideHandlers, plus the strong-keyed iframeElementsById) are a textbook Data Clump. They're populated together in attachIframe and cleared together in removeIframeById / destroy. Every retainer chain ends up with its own let bucket = …; if (!bucket) … boilerplate.

One WeakMap<HTMLIFrameElement, IframeAttachmentState> where IframeAttachmentState = { windows: Set<Window>; documents: Set<Document>; loadDisposers: Set<() => void>; pageHide: Set<{ win: Window; handler: () => void }> } would name the concept once. Cleanup becomes one delete; the "one bucket per element" pattern lives in a single helper.

Secondary observation: iframeElementsById is a strong Map, while the others are WeakMap — that's deliberate (you need to look up by id) but worth a comment that this map MUST be cleared on every detach pathway or it pins iframe elements forever.

Observability nudge while you're in here — five separate retainer chains is a lot to keep right forever. A debug-only counter that logs attachedDocuments.size / loadDisposers.size etc. crossing a threshold would let us see the next regression before users do.

fired = true;
let fired = false;
let timer: ReturnType<typeof setTimeout> | null = null;
const onLoad = () => {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[xp] 🟡 MEDIUM

The "still loading" branch's onLoad calls iframeEl.removeEventListener('load', onLoad) after firing, so subsequent loads no longer re-fire the listener. Compare to the "complete" branch right below this, which carefully wires onSubsequentLoad precisely to keep firing on later loads.

This asymmetry is either:

  • Intentional — then please leave a comment explaining why a still-loading iframe doesn't deserve subsequent-load handling, or
  • An oversight — the PR description says "Accumulate every contentDocument across loads", which strongly suggests this branch should also keep the listener attached after the first load (just behind the fired guard) so an iframe.src swap while the iframe is still loading the initial document gets its second contentDocument tracked too.

The latter feels more consistent with the rest of the diff's intent.


callSafely(() =>
iframeEl.contentWindow?.addEventListener('pagehide', () => {
callSafely(() => {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[qa-team/generalist-a] 🟢 LOW

attachIframe adds a pagehide listener on every call without checking for an existing entry. A reparented iframe (which calls attachIframe again from MutationBuffer.emit's onIframeLoad after the move) will accumulate N pagehide listeners on the same Window. The handlers are idempotent (mirror.removeNodeFromMap is a no-op for an unknown id, etc.), so this isn't a correctness bug — but the pageHideHandlers Set grows on every reparent, and every listener fires once per event.

Dedupe before adding:

if (Array.from(bucket).some(({ win }) => win === pageHideWindow)) return;


this.iframeElementsById.delete(iframeId);

if (iframe) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[xp] 🟢 LOW

removeIframeById is now ~60 lines doing seven distinct things: resolve element, clear nested-Window listeners, clear legacy-Window path, clear crossOrigin map, drop iframe, dispose load listeners, remove pagehide, walk captured docs. Each block has a clarifying comment — those comments are extract-method opportunities (detachNestedWindowListeners(iframe), releaseCapturedDocuments(iframe), disposeLoadListeners(iframe), etc.).

Kent Beck's ComposedMethod: keep all operations in a method at the same level of abstraction. Right now the method mixes the "detach observers" abstraction with the "release captured DOM refs" abstraction with the "forget the iframe" abstraction. The next maintainer has to read the whole thing to follow one retainer chain. Not blocking — works today — but the shape is worth pressing on before this file grows another chain.

await new Promise((r) => setTimeout(r, 50));

// Move must not have spliced any per-iframe MutationBuffer entries.
expect(mutationBuffers.length).toBeGreaterThanOrEqual(buffersBeforeMove);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note

🤖 Automated comment by QA Swarm — not written by a human

[paul] 🟢 LOW

the reparent test asserts mutationBuffers.length >= buffersBeforeMove. that's deliberately permissive and fine for catching the bug here — but a future regression that adds buffers per move would still pass.

worth tightening to === if the count is stable across the reparent, or splitting into a separate assertion on the iframe-specific buffer count (e.g. count buffers whose doc matches the moved iframe's contentDocument). otherwise we've shipped a test that's load-bearing for one direction of regression and silent on the other.

@pauldambra
Copy link
Copy Markdown
Member

Note

🤖 Automated comment by QA Swarm — not written by a human

Multi-perspective review: qa-team (specialists + generalists), paul-reviewer, xp-reviewer

Verdict: ⚠️ REQUEST CHANGES

The leak fix itself is sound and the +118 MB / +390 leaked HTMLDocuments -> ~0 validation is a real receipt. The requested changes are about robustness and maintainability: one narrow correctness concern (safety-net cleanup ordering can race against the new MutationBuffer.reset()), a load-bearing reparent invariant that needs a comment or test, and three convergent design smells in the new state plumbing.

Key findings

🟠 High

  • packages/rrweb/rrweb/src/record/index.ts:267 — reparent detection silently couples to MutationBuffer.emit's drain order (mapRemoves before pushAdd). A future change there would tear down observers on every iframe move with no test failure. Add a comment in emit or a regression test.

🟡 Medium

  • packages/rrweb/rrweb/src/record/index.ts:286 (qa-team/reliability) — the safety-net path runs iframeManager.cleanupDetachedIframes() (which nulls MutationBuffer.doc) before cleanupDetachedIframeObservers() disconnects the observer. Swap the order to match the explicit-remove path and close the narrow race window.
  • packages/rrweb/rrweb/src/record/mutation.ts:258 (convergent — paul + xp + qa-team)this.doc = null as unknown as observerParam['doc'] plus bufferDoc(): Document | null on a non-null field. Pick a story: either make the field properly nullable and guard the call sites, or drop the null-assignment entirely.
  • packages/rrweb/rrweb/src/record/index.ts:188 (convergent — paul + xp + qa-team/frontend) — forward-declared runAndDetachIframeCleanup / cleanupDetachedIframeObservers no-ops reassigned inside try{}. Construction order is now load-bearing in a way that fails silently. Extract an IframeObserverRegistry class or use let f: T | undefined; f?.(...).
  • packages/rrweb/rrweb/src/record/iframe-manager.ts:39 (convergent — xp + paul + qa-team/performance + qa-team/compatibility) — Data Clump: 4 WeakMaps + 1 strong Map all keyed by HTMLIFrameElement, set/cleared together. Collapse to one WeakMap<HTMLIFrameElement, IframeAttachmentState> so the concept is named once.
  • packages/rrweb/rrweb-snapshot/src/snapshot.ts:365 (xp) — the "still loading" branch removes its load listener after the first fire; the "complete" branch right below carefully keeps re-firing via onSubsequentLoad. Either intentional (needs a comment) or an oversight that misses contentDocument swaps during initial load.

🟢 Low

  • iframe-manager.ts:200attachIframe adds a pagehide listener every call; reparented iframes accumulate N listeners on the same Window. Dedupe.
  • iframe-manager.ts:459removeIframeById is now ~60 lines doing 7 distinct things. Extract detachNestedWindowListeners, releaseCapturedDocuments, disposeLoadListeners per ComposedMethod.
  • memory-leaks.test.ts:1224 — reparent test uses >= which catches the current regression but is silent on a future one that adds buffers per move. Tighten to === or split the assertion.

Convergence

Three findings were flagged independently by all three reviewers (highest confidence):

Convergent finding Reviewers Location
Forward-declared no-op cleanup pattern paul + xp + qa-team/frontend index.ts:186-188
this.doc = null cast / bufferDoc(): | null inconsistency paul + xp + qa-team/reliability mutation.ts:254-261
Data Clump across iframe state WeakMaps xp + paul + qa-team/performance + qa-team/compatibility iframe-manager.ts:36-49

Reviewer summaries

Reviewer Assessment
🔍 qa-team Substantial, well-motivated fix. Main concern is the cleanup-ordering inconsistency in wrappedMutationEmit between the explicit-remove path and the safety-net path now that MutationBuffer.reset() nulls this.doc. Other findings are LOW/NIT. New tests cover the right regressions. No security, data-integrity, copy, or compatibility concerns.
👤 paul Solid work, +118MB -> ~0 is a real receipt. Biggest concern is the new cross-file coupling between wrappedMutationEmit's reparent detection and MutationBuffer.emit's drain ordering — wants a comment or test. The this.doc = null as unknown as ... cast is a small smell. Everything else is preference. Ship after a brief chat about the reparent invariant and the cast.
📐 xp Closes a real, validated leak with thorough tests. Passes the four rules of simple design overall, but the main smell is the Data Clump in IframeManager — four parallel WeakMaps keyed by the same element want to be one state object. Secondary smells: subtle behaviour change in onceIframeLoaded's "still loading" branch, forward-declared mutable function bindings papering over a scoping issue, and a small encapsulation leak via MutationBuffer.bufferDoc().

Automated by QA Swarm — not a human review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants