Always use topological sort for published pages#22896
Conversation
PR #10476 introduced `MAX_TOPOLOGICAL_PAGE_COUNT = 100` to cap the topological (hierarchical) sort and fall back to sorting by publish date for sites with 100+ published pages. This was done to match Calypso's behavior at the time. The date-based fallback produced a completely different ordering than what users see on the web — pages lost their parent/child hierarchy and indentation, and edited pages could jump to the top of the list due to transient local state during save. This commit: - Removes `MAX_TOPOLOGICAL_PAGE_COUNT` and the date-based fallback so all published pages are sorted hierarchically regardless of count. - Rewrites `topologicalSort` to build a `parentId → children` lookup map in a single `groupBy` pass then walk the tree once, instead of re-scanning the full page list on every recursive call.
|
|
|
|
🤖 Build Failure AnalysisThis build has failures. Claude has analyzed them - check the build annotations for details. |
The `sorts 100 or more pages by date DESC` test validated the date-based fallback that no longer exists. The remaining topological sort test is renamed since it now applies to all page counts.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## trunk #22896 +/- ##
==========================================
- Coverage 37.32% 37.32% -0.01%
==========================================
Files 2320 2320
Lines 124578 124576 -2
Branches 16926 16925 -1
==========================================
- Hits 46498 46495 -3
Misses 74319 74319
- Partials 3761 3762 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| val childrenByParentId = pages | ||
| .filterNot { isRoot(it) } | ||
| .groupBy { it.parent?.remoteId } | ||
| val roots = pages.filter { isRoot(it) } |
There was a problem hiding this comment.
⛏️ You could avoid iterating pages twice with filter and filterNot to sepparated roots by:
val (roots, nonRoots) = pages.partition { isRoot(it) }
adalpari
left a comment
There was a problem hiding this comment.
Just left a minor nitpick. But feel free to apply it or not.
LGTM!


Description
#10476 introduced
MAX_TOPOLOGICAL_PAGE_COUNT = 100to cap the topological (hierarchical) sort and fall back to sorting by publish date for sites with 100+ published pages. This was done to match Calypso's behavior at the time.The date-based fallback produced a completely different ordering than what users see on the web — pages lost their parent/child hierarchy and indentation, and edited pages could jump to the top of the list due to transient local state during save.
This commit:
MAX_TOPOLOGICAL_PAGE_COUNTand the date-based fallback so all published pages are sorted hierarchically regardless of count.topologicalSortto build aparentId → childrenlookup map in a singlegroupBypass then walk the tree once, instead of re-scanning the full page list on every recursive call.To Test
MAX_TOPOLOGICAL_PAGE_COUNTto something like2, you can verify that the ordering is completely different from the web.Although I don't know why Calypso had the 100 pages limit, maybe due to network request page sizing, I don't believe it makes sense in our case because we fetch all the pages before we show it to the user.
I couldn't find a compelling reason to keep this behavior. The only consideration I could think of was the performance cost, because the current implementation is
O(n^2)but this PR improves that toO(n).Fixes CMM-2065