Skip to content

New index proposal on oc_filecache for performance enhancing #59711

@stsimb

Description

@stsimb

Performance: Two Missing Indexes on oc_filecache

Summary

After analysing the slow query log on a large production instance of Nextcloud (using Claude), two queries against oc_filecache are consistently slow due to missing index coverage. Both queries have been identified via performance_schema and EXPLAIN analysis, and both are resolved by adding lightweight indexes. The fixes have been validated in production for two weeks with measurable, significant improvement.

We are submitting this as a performance enhancement request for the Nextcloud core team to consider adding these indexes to the schema migration, so all users benefit automatically on upgrade.


Environment

Nextcloud 33.0.2
Database MariaDB 10.11.16
oc_filecache rows ~815,000
oc_filecache data size 224 MB
oc_filecache index size 476 MB
Users ~140 active
Files ~326,000 user files across multiple storages

Issue 1 — DELETE FROM oc_filecache WHERE path_hash = ? performs a full table scan

Affected query

DELETE FROM oc_filecache WHERE path_hash = ?

This query is executed during file deletion and cache invalidation operations.

Root cause

oc_filecache has a composite index fs_storage_path_hash (storage, path_hash). Because storage is the leading column, the index cannot be used for a WHERE path_hash = ? predicate alone — the query planner performs a full table scan instead.

Evidence — EXPLAIN before fix

type: ALL   key: NULL   rows: 814801   Extra: Using where

With ~815,000 rows, each DELETE took 44+ seconds and held row locks for the entire duration. Concurrent writes on the table caused lock wait timeouts, which cascaded into deadlocks on oc_preferences (which Nextcloud writes during the same request cycle).

What we observed in production

From performance_schema.events_statements_summary_by_digest and SHOW ENGINE INNODB STATUS:

  • oc_preferences deadlock cycle occurring multiple times per day
  • SQLSTATE[HY000]: General error: 1205 Lock wait timeout exceeded appearing
    regularly in nextcloud.log
  • Deadlocks confirmed in InnoDB status between concurrent sessions updating
    oc_preferences while a long-running oc_filecache DELETE held locks

Fix

ALTER TABLE oc_filecache ADD INDEX idx_path_hash (path_hash);

EXPLAIN after fix

type: ref   key: idx_path_hash   key_len: 97   rows: 1   Extra: Using where

Impact in production (measured over 14 days post-fix)

Metric Before After
DELETE execution time 44+ seconds < 1ms
InnoDB deadlocks Multiple per day 0
Lock wait timeouts in nextcloud.log Daily Rare (isolated)
Row lock average wait time Seconds 0ms
DELETE operations served via index 1,916 in 14 days

Index size

43.6 MB — modest relative to the existing 476 MB index footprint on this table.

Notes

  • fs_storage_path_hash (storage, path_hash) cannot substitute for this index when storage is not part of the predicate. A single-column index on path_hash is required to allow direct lookup by hash alone.
  • path_hash stores a 32-character MD5 hex string (unique per row) so the index cardinality is very high and lookups resolve in O(1).

Issue 2 — Paginated file-scan query uses wrong index and performs a filesort

Affected query

SELECT fileid
FROM oc_filecache
WHERE storage = ?
  AND path LIKE ?
  AND fileid > ?
ORDER BY fileid
LIMIT ?

This query is the inner loop of the paginated cursor used by occ files:scan, Nextcloud background file indexing, and any code path that iterates over a storage's file list in batches (e.g. OC\Files\Cache\Scanner::scan()).

Root cause

The query planner selects fs_storage_mimetype (storage, mimetype) as the access path because storage is satisfied as a ref lookup and the index appears cheaper than a full scan. However, mimetype provides no further filtering, so the engine must:

  1. Scan all rows matching storage = ? via fs_storage_mimetype
  2. Apply the path LIKE ? and fileid > ? filters in a second pass
  3. Sort all surviving rows by fileid (Using filesort)

On a storage with 300,000+ files this sort step is done entirely in memory on each paginated call.

Evidence — EXPLAIN before fix (real storage ID, 302,572 files)

type: ref
key:  fs_storage_mimetype
key_len: 8
rows: 407583
Extra: Using index condition; Using where; Using filesort

Measurements from performance_schema.events_statements_summary_by_digest

Metric Value
Total calls 3,744
Average execution time 121.9 ms
Maximum execution time 491.2 ms
Cumulative time 456 seconds

Fix

ALTER TABLE oc_filecache ADD INDEX idx_storage_fileid (storage, fileid);

With (storage, fileid) the planner can:

  1. Narrow to storage = ? using the leading column
  2. Apply fileid > ? as a range condition on the second column — rows are already ordered by fileid within each storage partition
  3. Satisfy ORDER BY fileid LIMIT ? directly from the index with no sort step

EXPLAIN after fix (same storage, 302,572 files)

type: ref
key:  idx_storage_fileid
key_len: 8
rows: 407583
Extra: Using index condition; Using where

Using filesort is gone. Results are delivered pre-ordered by fileid from the index.

Impact in production (direct timing, 14 days post-fix)

Storage size Before (avg) After Improvement
302,572 files ~122 ms 3.8 ms 32× faster
238,285 files ~122 ms 1.6 ms 76× faster
151,458 files ~122 ms 0.2 ms 550× faster

The query has dropped out of the top-10 slowest queries by cumulative time entirely. occ files:scan --all (326,000 files across all storages) completes in 2 minutes 4 seconds, down from several minutes previously.

Index usage confirmation (performance_schema, 14 days)

Index Reads
idx_storage_fileid 58,814,260
fs_storage_mimetype (former choice) 53,237,066 (used for other queries, not this one)

Index size

20.6 MB — very lightweight.

Notes

  • The existing fs_storage_path_prefix (storage, path) index is close but cannot eliminate the filesort because its second column is path, not fileid. The planner cannot derive fileid order from a path-ordered index.
  • The existing fs_storage_size (storage, size, fileid) also covers (storage, fileid) as columns 1 and 3, but size in the middle means fileid is not contiguous in the index and the range condition cannot be satisfied efficiently.
  • The fix is only effective for the paginated cursor pattern (fileid > ? ORDER BY fileid). Queries filtering by path LIKE ? without the fileid > ? cursor continue to use fs_storage_path_prefix as before.

Proposed schema change

-- Issue 1: allow direct lookup by path_hash without requiring storage in the predicate
ALTER TABLE oc_filecache ADD INDEX idx_path_hash (path_hash);

-- Issue 2: allow paginated cursor (fileid > ?) to skip the filesort step
ALTER TABLE oc_filecache ADD INDEX idx_storage_fileid (storage, fileid);

Both statements are safe to run online (MariaDB and MySQL support online DDL for index additions on InnoDB tables) and are idempotent if wrapped in a conditional migration check.


Suggested migration integration

These indexes can be added to the Nextcloud schema via the existing migration infrastructure, for example by extending or adding a migration class that calls:

$table = $schema->getTable('filecache');

if (!$table->hasIndex('idx_path_hash')) {
    $table->addIndex(['path_hash'], 'idx_path_hash');
}

if (!$table->hasIndex('idx_storage_fileid')) {
    $table->addIndex(['storage', 'fileid'], 'idx_storage_fileid');
}

After the migration ships, users already running the manually added indexes will have the duplicate removed gracefully by the migration's hasIndex guard.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status

    Triaged

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions