optimize slab quarantine: unify random+FIFO into single array#334

Open
peterlodri-sec wants to merge 1 commit into
GrapheneOS:mainfrom
peterlodri-sec:optimize-slab-quarantine

@peterlodri-sec peterlodri-sec commented May 11, 2026

Summary

Implements #179: unify the two-stage random + FIFO slab quarantine into a single random-replacement array. Steady-state per-free() work drops from 2 pointer swaps to 1.

Background

The previous slab quarantine had two sequential stages:

  1. Random array: SLAB_QUARANTINE_RANDOM_LENGTH slots
  2. FIFO queue: SLAB_QUARANTINE_QUEUE_LENGTH slots

In steady state every free() did 2 swaps: one against the random array, then one against the FIFO ring buffer.

Change

Replace both arrays with a single random-replacement array of length SLAB_QUARANTINE_LENGTH.
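
A minimal sketch of what a single random-replacement array looks like, for readers unfamiliar with the technique. This is not the code from this PR: `struct quarantine`, `toy_random` and `quarantine_swap` are hypothetical names, the fixed length of 2 only mirrors the default, and the xorshift generator is a toy stand-in for the allocator's CSPRNG (its modulo reduction is only approximately uniform).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUARANTINE_LENGTH 2 /* mirrors the default SLAB_QUARANTINE_LENGTH */

/* Hypothetical stand-in for the per-size-class quarantine state. */
struct quarantine {
    void *slots[QUARANTINE_LENGTH];
    uint32_t rng_state; /* toy PRNG state, must be nonzero; NOT a CSPRNG */
};

/* Toy xorshift32 generator, for illustration only. */
static uint32_t toy_random(struct quarantine *q) {
    uint32_t x = q->rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    q->rng_state = x;
    return x;
}

/*
 * Store p in a randomly chosen slot and return the previous occupant
 * (NULL if the slot was empty). The caller releases the returned pointer
 * for real; p itself stays quarantined until a later call evicts it.
 */
static void *quarantine_swap(struct quarantine *q, void *p) {
    size_t i = toy_random(q) % QUARANTINE_LENGTH;
    void *evicted = q->slots[i];
    q->slots[i] = p;
    return evicted;
}
```

Each call performs exactly one swap, which is the single-swap steady state described in the summary.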

Capacity is unchanged

The new default SLAB_QUARANTINE_LENGTH = 2 preserves the old default total capacity (RANDOM=1 + QUEUE=1). The light configuration stays at 0. Android.bp is updated to 2 to match.

Tradeoff

The unified design loses the deterministic FIFO minimum delay before reuse. Eviction is now geometric: a pointer can in principle be evicted on the next free(), with expected dwell time equal to LENGTH frees. For the default LENGTH=2, the expected delay matches the old RANDOM=1 + QUEUE=1. Double-free, invalid-free and write-after-free detection are untouched.
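
A short worked derivation of the geometric claim, under the assumption that every free() to a size class picks one of $N$ slots uniformly at random. The symbols $N$ (the effective array length for that size class) and $D$ (the number of subsequent frees until a given pointer is evicted) are introduced here for illustration.

$$
P(D = k) = \frac{1}{N}\left(1 - \frac{1}{N}\right)^{k-1}, \quad k \ge 1, \qquad \mathbb{E}[D] = \sum_{k \ge 1} k\,P(D = k) = N.
$$

For $N = 2$ this gives $P(D = 1) = 1/2$ and $\mathbb{E}[D] = 2$: the mean delay is 2 frees, but half of the quarantined pointers are evicted by the very next free() to the same size class.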

Key changes

h_malloc.c

  • Single quarantine[] array replaces quarantine_random[] + quarantine_queue[] + quarantine_queue_index.
  • deallocate_small(): 1 random swap (was 2 swaps).
  • h_malloc_trim(): one purge loop (was two).
  • The static_assert upper bound (<= 65536) and the use of get_random_u16_uniform are unchanged.

Configuration

  • CONFIG_SLAB_QUARANTINE_RANDOM_LENGTH + CONFIG_SLAB_QUARANTINE_QUEUE_LENGTH collapsed into CONFIG_SLAB_QUARANTINE_LENGTH.
  • config/default.mk = 2, config/light.mk = 0, Android.bp = 2.

README.md

  • Removed stale "FIFO + randomization" wording for the slab quarantine.
  • Updated the CONFIG_SLAB_QUARANTINE_LENGTH description (a value of 2 scales to 2048 slots for 16-byte allocations with a 16 KiB maximum slab size class, and to 16384 with a 128 KiB maximum).
  • The large-allocation quarantine section is unchanged (still random + FIFO).
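
The scaling figures quoted for the README can be reproduced with a small calculation. This is a sketch under an assumption: the configured length is left-shifted by the difference between log2 of the largest slab size class and log2 of the allocation's size class. That rule is inferred from the numbers above, and `floor_log2` and `scaled_quarantine_length` are hypothetical helper names, not functions from h_malloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* floor(log2(x)) for x > 0. */
static unsigned floor_log2(size_t x) {
    unsigned r = 0;
    while (x >>= 1) {
        r++;
    }
    return r;
}

/*
 * Assumed scaling rule: smaller size classes get proportionally more
 * quarantine slots, so the quarantined byte count per class stays
 * roughly comparable.
 */
static size_t scaled_quarantine_length(size_t base_length, size_t class_size,
                                       size_t max_class_size) {
    assert(class_size > 0 && class_size <= max_class_size);
    unsigned shift = floor_log2(max_class_size) - floor_log2(class_size);
    return base_length << shift;
}
```

With these assumptions, `scaled_quarantine_length(2, 16, 16 * 1024)` evaluates to 2048 and `scaled_quarantine_length(2, 16, 128 * 1024)` to 16384, matching the README figures.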

Tests (5 new files)

| Test | Purpose |
| --- | --- |
| quarantine_double_free_extended | Immediate double-free on a 32 KiB allocation |
| quarantine_double_free_extended_delayed | Double-free after 100 intervening 64 KiB frees (almost certainly evicts; accepts either "double free" or "double free (quarantine)") |
| quarantine_invalid_malloc_usable_size_extended | malloc_usable_size on a quarantined extended-class pointer |
| quarantine_invalid_malloc_object_size_extended | malloc_object_size on a quarantined extended-class pointer |
| quarantine_write_after_free_extended_reuse | Write-after-free detected on reuse for extended-class allocations |

All existing tests pass.

Migration

| Old | New |
| --- | --- |
| RANDOM=1, QUEUE=1 | LENGTH=2 (default; same capacity) |
| RANDOM=2, QUEUE=4 | LENGTH=6 (same capacity) |
| RANDOM=0, QUEUE=0 | LENGTH=0 (disabled) |

There is no longer a deterministic minimum delay. Eviction follows a geometric distribution and higher LENGTH increases the expected delay.

Notes for review

  • Default capacity is preserved at 2; no increase.
  • The pre-existing latent get_random_u16_uniform truncation for very large SLAB_QUARANTINE_LENGTH is not addressed here and is intended as a separate PR. With the default LENGTH=2 the scaled length stays well under U16_MAX.
  • Rebased onto current main; GitHub reports MERGEABLE/CLEAN.
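
To illustrate the truncation concern in the notes above, here is a hypothetical check. It assumes the random bound is passed as a 16-bit value, which is suggested by the name get_random_u16_uniform but not confirmed in this PR. `fits_u16_bound` is an illustrative helper, not part of the codebase.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns nonzero if a scaled quarantine length survives conversion to a
 * 16-bit bound unchanged (assumption: the bound parameter is uint16_t).
 */
static int fits_u16_bound(size_t scaled_length) {
    return (size_t)(uint16_t)scaled_length == scaled_length;
}
```

Under this assumption, the default's largest scaled length of 16384 fits, while a scaled length of 65536 would wrap to 0 when narrowed to 16 bits.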

Closes #179

@thestinger
Member

There are conflicts with the current state of the code. The size of the quarantine should also not be increased. The u16 issue for large quarantine sizes should be fixed separately. I haven't looked at the code itself yet.

Combine the two-stage slab quarantine (random array + FIFO queue) into a
single random-replacement array. This reduces per-free operations from
2 pointer swaps to 1, improving performance while keeping the total
quarantine slot count identical.

- Replace CONFIG_SLAB_QUARANTINE_RANDOM_LENGTH and
  CONFIG_SLAB_QUARANTINE_QUEUE_LENGTH with unified
  CONFIG_SLAB_QUARANTINE_LENGTH
- Default: 2 (same total slots as old 1+1)
- Light: 0 (unchanged)
- Remove stale FIFO references from README
- Add extended-size-class quarantine tests

Fixes GrapheneOS#179
@peterlodri-sec peterlodri-sec force-pushed the optimize-slab-quarantine branch from 938451c to a4e45ad Compare May 11, 2026 10:46
Development

Successfully merging this pull request may close these issues.

Optimizing slab quarantines
