Performance: Two Missing Indexes on oc_filecache
Summary
After analysing the slow query log on a large production Nextcloud instance (with Claude's assistance), we found that two queries against `oc_filecache` are consistently slow due to missing index coverage. Both were identified via `performance_schema` and `EXPLAIN` analysis, and both are resolved by adding lightweight indexes. The fixes have been validated in production for two weeks with significant, measured improvement.
We are submitting this as a performance enhancement request for the Nextcloud core team to consider adding these indexes to the schema migration, so all users benefit automatically on upgrade.
Environment
| Item | Value |
|---|---|
| Nextcloud | 33.0.2 |
| Database | MariaDB 10.11.16 |
| `oc_filecache` rows | ~815,000 |
| `oc_filecache` data size | 224 MB |
| `oc_filecache` index size | 476 MB |
| Users | ~140 active |
| Files | ~326,000 user files across multiple storages |
Issue 1 — `DELETE FROM oc_filecache WHERE path_hash = ?` performs a full table scan
Affected query
```sql
DELETE FROM oc_filecache WHERE path_hash = ?
```
This query is executed during file deletion and cache invalidation operations.
Root cause
`oc_filecache` has a composite index `fs_storage_path_hash (storage, path_hash)`. Because `storage` is the leading column, the index cannot be used for a `WHERE path_hash = ?` predicate alone — the query planner performs a full table scan instead.
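The leftmost-prefix rule can be illustrated with a toy model of a B-tree index (a Python sketch with made-up data, not Nextcloud code):

```python
from bisect import bisect_left

# Toy model: an index is a sorted list of key tuples.
rows = [(storage, f"hash{h:04d}") for storage in range(3) for h in range(200)]
composite = sorted(rows)                    # like fs_storage_path_hash (storage, path_hash)
by_hash = sorted(rows, key=lambda r: r[1])  # like the proposed idx_path_hash (path_hash)

target = "hash0123"

# Composite index: a path_hash-only predicate gives no leading-column value,
# so every entry must be examined (the full table scan seen in EXPLAIN).
scanned = [r for r in composite if r[1] == target]

# Single-column index: entries are ordered by path_hash, so a B-tree
# descent (binary search here) jumps straight to the matching entries.
keys = [r[1] for r in by_hash]
i = bisect_left(keys, target)
found = []
while i < len(keys) and keys[i] == target:
    found.append(by_hash[i])
    i += 1

assert scanned == found  # same rows; the second lookup is O(log n), not O(n)
```

The same asymmetry is why the planner reports `type: ALL` below: without the leading column it cannot descend the composite tree at all.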
Evidence — EXPLAIN before fix
```
type: ALL  key: NULL  rows: 814801  Extra: Using where
```
With ~815,000 rows, each DELETE took 44+ seconds and held row locks for the entire duration. Concurrent writes on the table caused lock wait timeouts, which cascaded into deadlocks on oc_preferences (which Nextcloud writes during the same request cycle).
What we observed in production
From `performance_schema.events_statements_summary_by_digest` and `SHOW ENGINE INNODB STATUS`:
- `oc_preferences` deadlock cycle occurring multiple times per day
- `SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded` appearing regularly in `nextcloud.log`
- Deadlocks confirmed in InnoDB status between concurrent sessions updating `oc_preferences` while a long-running `oc_filecache` DELETE held locks
Fix
```sql
ALTER TABLE oc_filecache ADD INDEX idx_path_hash (path_hash);
```
EXPLAIN after fix
```
type: ref  key: idx_path_hash  key_len: 97  rows: 1  Extra: Using where
```
Impact in production (measured over 14 days post-fix)
| Metric | Before | After |
|---|---|---|
| DELETE execution time | 44+ seconds | < 1 ms |
| InnoDB deadlocks | Multiple per day | 0 |
| Lock wait timeouts in `nextcloud.log` | Daily | Rare (isolated) |
| Row lock average wait time | Seconds | 0 ms |
| DELETE operations served via index | — | 1,916 in 14 days |
Index size
43.6 MB — modest relative to the existing 476 MB index footprint on this table.
Notes
- `fs_storage_path_hash (storage, path_hash)` cannot substitute for this index when `storage` is not part of the predicate. A single-column index on `path_hash` is required to allow direct lookup by hash alone.
- `path_hash` stores a 32-character MD5 hex string (unique per row), so index cardinality is very high and each lookup resolves with a single B-tree descent to one row.
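For illustration, the key can be reproduced client-side; the assumption here (consistent with the `oc_filecache` schema) is that `path_hash` is the MD5 hex digest of the storage-relative path:

```python
import hashlib

# Assumption for illustration: oc_filecache.path_hash is the MD5 hex
# digest of the storage-relative path, e.g. "files/photos/cat.jpg".
def path_hash(path: str) -> str:
    return hashlib.md5(path.encode("utf-8")).hexdigest()

h = path_hash("files/photos/cat.jpg")
# Always a fixed-width 32-character lowercase hex string; stored as a
# 32-char column in a 3-byte charset this lines up with key_len: 97
# (32 * 3 + 1 length byte) reported by EXPLAIN above.
assert len(h) == 32
```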
Issue 2 — Paginated file-scan query uses wrong index and performs a filesort
Affected query
```sql
SELECT fileid
FROM oc_filecache
WHERE storage = ?
  AND path LIKE ?
  AND fileid > ?
ORDER BY fileid
LIMIT ?
```
This query is the inner loop of the paginated cursor used by `occ files:scan`, Nextcloud background file indexing, and any code path that iterates over a storage's file list in batches (e.g. `OC\Files\Cache\Scanner::scan()`).
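The cursor pattern can be sketched as a client-side loop (a Python sketch with an in-memory stand-in for the SQL; names and data are illustrative, not Nextcloud's actual Scanner code):

```python
# Keyset ("cursor") pagination over a storage's files.
# ROWS stands in for oc_filecache: (fileid, storage, path), sorted by fileid.
ROWS = sorted((fid, 7, f"files/doc{fid}.txt") for fid in range(1, 101))

def run_query(storage, last_fileid, limit):
    # Stand-in for: SELECT fileid FROM oc_filecache
    #   WHERE storage = ? AND path LIKE 'files/%' AND fileid > ?
    #   ORDER BY fileid LIMIT ?
    hits = [fid for fid, st, path in ROWS
            if st == storage and path.startswith("files/") and fid > last_fileid]
    return hits[:limit]

def scan(storage, batch_size=25):
    cursor, seen = 0, []
    while True:
        batch = run_query(storage, cursor, batch_size)
        if not batch:
            break
        seen.extend(batch)
        cursor = batch[-1]  # advance the cursor to the last fileid seen
    return seen

print(len(scan(7)))  # → 100: every file visited exactly once, in 4 batches
```

Because each call re-runs the query, any per-call filesort is paid on every batch, which is what makes the index choice below so costly.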
Root cause
The query planner selects `fs_storage_mimetype (storage, mimetype)` as the access path because `storage` is satisfied as a `ref` lookup and the index appears cheaper than a full scan. However, `mimetype` provides no further filtering, so the engine must:
- Scan all rows matching `storage = ?` via `fs_storage_mimetype`
- Apply the `path LIKE ?` and `fileid > ?` filters in a second pass
- Sort all surviving rows by `fileid` (`Using filesort`)
On a storage with 300,000+ files this sort step is done entirely in memory on each paginated call.
Evidence — EXPLAIN before fix (real storage ID, 302,572 files)
```
type: ref
key: fs_storage_mimetype
key_len: 8
rows: 407583
Extra: Using index condition; Using where; Using filesort
```
Measurements from performance_schema.events_statements_summary_by_digest
| Metric | Value |
|---|---|
| Total calls | 3,744 |
| Average execution time | 121.9 ms |
| Maximum execution time | 491.2 ms |
| Cumulative time | 456 seconds |
Fix
```sql
ALTER TABLE oc_filecache ADD INDEX idx_storage_fileid (storage, fileid);
```
With `(storage, fileid)` the planner can:
- Narrow to `storage = ?` using the leading column
- Apply `fileid > ?` as a range condition on the second column — rows are already ordered by `fileid` within each `storage` partition
- Satisfy `ORDER BY fileid LIMIT ?` directly from the index with no sort step
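Why the filesort disappears can be seen in a toy model of the `(storage, fileid)` B-tree (a Python sketch; storage and file ids are made up):

```python
from bisect import bisect_right

# Toy model of idx_storage_fileid: entries kept sorted by (storage, fileid).
index = sorted((st, fid) for st in (5, 7, 9) for fid in range(1, 1001))

def next_batch(storage, last_fileid, limit):
    # B-tree descent: seek to the first entry after (storage, last_fileid)...
    i = bisect_right(index, (storage, last_fileid))
    # ...then read entries in index order until the storage changes
    # or the limit is reached. No sort is ever performed.
    batch = []
    while i < len(index) and index[i][0] == storage and len(batch) < limit:
        batch.append(index[i][1])
        i += 1
    return batch

print(next_batch(7, 250, 5))  # → [251, 252, 253, 254, 255], already ordered
```

The `ORDER BY fileid LIMIT ?` result falls out of a single seek plus a short ordered scan, which is exactly what the post-fix `EXPLAIN` reflects.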
EXPLAIN after fix (same storage, 302,572 files)
```
type: ref
key: idx_storage_fileid
key_len: 8
rows: 407583
Extra: Using index condition; Using where
```
`Using filesort` is gone. Results are delivered pre-ordered by `fileid` from the index.
Impact in production (direct timing, 14 days post-fix)
| Storage size | Before (avg) | After | Improvement |
|---|---|---|---|
| 302,572 files | ~122 ms | 3.8 ms | 32× faster |
| 238,285 files | ~122 ms | 1.6 ms | 76× faster |
| 151,458 files | ~122 ms | 0.2 ms | 550× faster |
The query has dropped out of the top-10 slowest queries by cumulative time entirely. occ files:scan --all (326,000 files across all storages) completes in 2 minutes 4 seconds, down from several minutes previously.
Index usage confirmation (performance_schema, 14 days)
| Index | Reads |
|---|---|
| `idx_storage_fileid` | 58,814,260 |
| `fs_storage_mimetype` (former choice) | 53,237,066 (used for other queries, not this one) |
Index size
20.6 MB — very lightweight.
Notes
- The existing `fs_storage_path_prefix (storage, path)` index is close but cannot eliminate the filesort because its second column is `path`, not `fileid`. The planner cannot derive `fileid` order from a `path`-ordered index.
- The existing `fs_storage_size (storage, size, fileid)` also covers `(storage, fileid)` as columns 1 and 3, but `size` in the middle means `fileid` is not contiguous in the index and the range condition cannot be satisfied efficiently.
- The fix is only effective for the paginated cursor pattern (`fileid > ? ORDER BY fileid`). Queries filtering by `path LIKE ?` without the `fileid > ?` cursor continue to use `fs_storage_path_prefix` as before.
Proposed schema change
```sql
-- Issue 1: allow direct lookup by path_hash without requiring storage in the predicate
ALTER TABLE oc_filecache ADD INDEX idx_path_hash (path_hash);

-- Issue 2: allow paginated cursor (fileid > ?) to skip the filesort step
ALTER TABLE oc_filecache ADD INDEX idx_storage_fileid (storage, fileid);
```
Both statements are safe to run online (MariaDB and MySQL support online DDL for index additions on InnoDB tables) and are idempotent if wrapped in a conditional migration check.
Suggested migration integration
These indexes can be added to the Nextcloud schema via the existing migration infrastructure, for example by extending or adding a migration class that calls:
```php
$table = $schema->getTable('filecache');
if (!$table->hasIndex('idx_path_hash')) {
    $table->addIndex(['path_hash'], 'idx_path_hash');
}
if (!$table->hasIndex('idx_storage_fileid')) {
    $table->addIndex(['storage', 'fileid'], 'idx_storage_fileid');
}
```
After the migration ships, instances that already added these indexes manually are handled gracefully: the `hasIndex` guard simply skips re-creating them.