25 commits
2d08b9d plan (badrishc, Mar 26, 2026)
2e95bbf Optimize ParseCommand with hash-based command lookup (badrishc, Mar 26, 2026)
7ce5f19 Optimize ParseCommand with SIMD fast path and hash-based command lookup (badrishc, Mar 26, 2026)
1a24e00 update (badrishc, Mar 26, 2026)
2d4e4b2 Merge remote, resolve conflicts keeping optimized HashLookupCommand; … (badrishc, Mar 26, 2026)
aeaed64 Add CommandParsingBenchmark with 16 commands across all parser tiers (badrishc, Mar 26, 2026)
0d01f9f Add per-session MRU command cache for repeated command optimization (badrishc, Mar 26, 2026)
c38d1e6 update (badrishc, Mar 27, 2026)
80875e0 update (badrishc, Mar 27, 2026)
b4415db Merge branch 'badrishc/fast-parses' of github.com:microsoft/garnet in… (badrishc, Mar 27, 2026)
c2b50da Replace SlowParseCommand with hash-based subcommand dispatch (badrishc, Mar 27, 2026)
e099686 Update add-garnet-command skill and copilot instructions for hash tab… (badrishc, Mar 27, 2026)
5c67063 Production hardening: debug asserts, startup validation, code review … (badrishc, Mar 27, 2026)
4ead4a0 updates (badrishc, Apr 1, 2026)
56d8f3b code review updates (badrishc, Apr 1, 2026)
b813807 Merge branch 'dev' into badrishc/fast-parses (badrishc, Apr 1, 2026)
0b119f1 nits (badrishc, Apr 1, 2026)
52c5474 Merge branch 'badrishc/fast-parses' of github.com:microsoft/garnet in… (badrishc, Apr 1, 2026)
2c2661c updates (badrishc, Apr 1, 2026)
507f5a0 Merge remote-tracking branch 'origin/dev' into badrishc/fast-parses (badrishc, Apr 1, 2026)
b868080 nits (badrishc, Apr 1, 2026)
b95ac6a fixes (badrishc, Apr 1, 2026)
f6845a4 fix flaky test (badrishc, Apr 1, 2026)
5db102b small cleanup (badrishc, Apr 1, 2026)
60bcf4a Merge branch 'dev' into badrishc/fast-parses (badrishc, Apr 2, 2026)
2 changes: 1 addition & 1 deletion .github/copilot-instructions.md
@@ -92,7 +92,7 @@ Full guide: https://microsoft.github.io/garnet/docs/dev/garnet-api
### Steps for a new built-in command:

1. **Define the command**: Add enum value to `RespCommand` in `libs/server/Resp/Parser/RespCommand.cs`. For object commands (List, SortedSet, Hash, Set), also add a value to the `[ObjectName]Operation` enum in `libs/server/Objects/[ObjectName]/[ObjectName]Object.cs`.
-2. **Add parsing logic**: In `libs/server/Resp/Parser/RespCommand.cs`, add to `FastParseCommand` (fixed arg count) or `FastParseArrayCommand` (variable args).
+2. **Add parsing logic**: In `libs/server/Resp/Parser/RespCommandHashLookupData.cs`, add an entry to `PopulatePrimaryTable()` (e.g., `Add("MYNEWCMD", RespCommand.MYNEWCMD)`). For commands with subcommands, set `hasSub: true` and add a subcommand table. The hash table provides O(1) lookup for all command name lengths.
3. **Declare the API method**: Add method signature to `IGarnetReadApi` (read-only) or `IGarnetApi` (read-write) in `libs/server/API/IGarnetApi.cs`.
4. **Implement the network handler**: Add a method to `RespServerSession` (the class is split across ~22 partial `.cs` files; object commands go in `libs/server/Resp/Objects/[ObjectName]Commands.cs`, others in `libs/server/Resp/BasicCommands.cs`, `ArrayCommands.cs`, `AdminCommands.cs`, `KeyAdminCommands.cs`, etc.). The handler parses arguments from the network buffer via `parseState.GetArgSliceByRef(i)` (returns `ref PinnedSpanByte`), calls the storage API, and writes the RESP response using `RespWriteUtils` helper methods, then calls `SendAndReset()` to flush the response buffer.
5. **Add dispatch route**: In `libs/server/Resp/RespServerSession.cs`, add a case to `ProcessBasicCommands` or `ProcessArrayCommands` calling the handler from step 4.
62 changes: 35 additions & 27 deletions .github/skills/add-garnet-command/SKILL.md
@@ -82,17 +82,39 @@ EVALSHA, // Note: Update LastDataCommand if adding new data commands after this

**File:** `libs/server/Resp/Parser/RespCommand.cs`

-Two parsing paths exist:
+Two parsing tiers exist:

-### Fast path: `FastParseCommand()` / `FastParseArrayCommand()`
-Two fast-path methods exist with different constraints:
-- **`FastParseCommand()`**: For commands with a fixed number of arguments and command names up to **9 characters**. Uses `ulong` pointer comparisons on `(count << 4) | length` patterns.
-- **`FastParseArrayCommand()`**: For commands with a variable number of arguments and command names up to **16 characters**. Uses similar `ulong` comparison patterns but accommodates longer names.
+### Hash table path: `RespCommandHashLookup` (primary path for most commands)
+The hash table in `libs/server/Resp/Parser/RespCommandHashLookupData.cs` provides O(1) lookup for all built-in commands. This is the **recommended path for all new commands**.

-Only add here if the command name is a simple word (no dots or special characters).
+**To add a new primary command**, add one line to `PopulatePrimaryTable()` in `RespCommandHashLookupData.cs`:
+```csharp
+Add("DELIFGREATER", RespCommand.DELIFGREATER);
+```

-### Slow path: `SlowParseCommand()`
-For longer names, dot-prefixed names (like `RI.CREATE`), or names that don't fit the fast-path pattern.
+**To add a command with subcommands** (e.g., `MYPARENT SUBCMD`):
+1. Add the parent with `hasSub: true` in `PopulatePrimaryTable()`:
+```csharp
+Add("MYPARENT", RespCommand.MYPARENT, hasSub: true);
+```
+2. Define the subcommand array in `RespCommandHashLookupData.cs`:
+```csharp
+private static readonly (string Name, RespCommand Command)[] MyparentSubcommands =
+[
+    ("SUBCMD1", RespCommand.MYPARENT_SUBCMD1),
+    ("SUBCMD2", RespCommand.MYPARENT_SUBCMD2),
+];
+```
+3. Build the table in the static constructor in `RespCommandHashLookup.cs`:
+```csharp
+myparentSubTable = BuildSubTable(MyparentSubcommands, out myparentSubTableMask);
+```
+4. Wire it into `LookupSubcommand()` in `RespCommandHashLookup.cs`:
+```csharp
+RespCommand.MYPARENT => (myparentSubTable, myparentSubTableMask),
+```
+
+**⚠️ Important:** Use the exact wire-protocol spelling for the hash table name string. Some commands use hyphens (e.g., `"SET-CONFIG-EPOCH"` not `"SETCONFIGEPOCH"`). Check `CmdStrings.cs` for the canonical spelling.

**⚠️ Convention:** Define the command name string in **`libs/server/Resp/CmdStrings.cs`** and reference it from the parser, rather than using inline `"..."u8` literals. This keeps command name strings centralized and reusable (e.g., for error messages).
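
The table-plus-mask shape of `BuildSubTable(..., out mask)` suggests a power-of-two table indexed by `hash & mask`. The following is only a minimal sketch of that general technique, not Garnet's implementation: `DemoCommand`, `DemoLookup`, the FNV-1a hash, linear probing, and exact-case matching are all assumptions made for illustration.

```csharp
using System;

// Hypothetical entries; the names mirror the examples in the diff, not Garnet's real tables.
(string Name, DemoCommand Command)[] primaryEntries =
{
    ("DELIFGREATER", DemoCommand.DelIfGreater),
    ("MYPARENT", DemoCommand.MyParent),
};
(string Name, DemoCommand Command)[] subEntries =
{
    ("SUBCMD1", DemoCommand.MyParentSubCmd1),
    ("SUBCMD2", DemoCommand.MyParentSubCmd2),
};

var primary = DemoLookup.BuildTable(primaryEntries, out int primaryMask);
var sub = DemoLookup.BuildTable(subEntries, out int subMask);

Console.WriteLine(DemoLookup.Find(primary, primaryMask, "DELIFGREATER"));
Console.WriteLine(DemoLookup.Find(sub, subMask, "SUBCMD2"));
Console.WriteLine(DemoLookup.Find(primary, primaryMask, "NOPE"));

// Hypothetical stand-in for the RespCommand enum.
enum DemoCommand { Invalid, DelIfGreater, MyParent, MyParentSubCmd1, MyParentSubCmd2 }

static class DemoLookup
{
    // FNV-1a over the UTF-16 code units; any reasonable hash would do for this sketch.
    static int Hash(string name)
    {
        uint h = 2166136261;
        foreach (char c in name)
        {
            h ^= c;
            h *= 16777619;
        }
        return (int)(h & 0x7FFFFFFF);
    }

    // Builds an open-addressing table whose size is a power of two at least
    // twice the entry count, so (hash & mask) is a valid index and probing
    // always reaches an empty slot.
    public static (string Name, DemoCommand Command)[] BuildTable(
        (string Name, DemoCommand Command)[] entries, out int mask)
    {
        int size = 1;
        while (size < entries.Length * 2) size <<= 1;
        mask = size - 1;
        var table = new (string Name, DemoCommand Command)[size];
        foreach (var entry in entries)
        {
            int slot = Hash(entry.Name) & mask;
            while (table[slot].Name != null) slot = (slot + 1) & mask; // linear probing
            table[slot] = entry;
        }
        return table;
    }

    // Probes from the hashed slot until the name matches or an empty slot is hit.
    public static DemoCommand Find(
        (string Name, DemoCommand Command)[] table, int mask, string name)
    {
        int slot = Hash(name) & mask;
        while (table[slot].Name != null)
        {
            if (string.Equals(table[slot].Name, name, StringComparison.Ordinal))
                return table[slot].Command;
            slot = (slot + 1) & mask;
        }
        return DemoCommand.Invalid;
    }
}
```

Because the comparison in this sketch is an exact string match, a misspelled or differently hyphenated name simply resolves to `Invalid`, which is why the spelling warning above matters.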

@@ -101,26 +123,12 @@ For longer names, dot-prefixed names (like `RI.CREATE`), or names that don't fit
public static ReadOnlySpan<byte> DELIFGREATER => "DELIFGREATER"u8;
```

-**Pattern for slow-path commands:**
-```csharp
-else if (command.SequenceEqual(CmdStrings.DELIFGREATER))
-{
-    return RespCommand.DELIFGREATER;
-}
-```
-
-**Pattern for dot-prefixed commands (e.g., `RI.CREATE`):**
-```csharp
-else if (command.SequenceEqual(CmdStrings.RICREATE))
-{
-    return RespCommand.RICREATE;
-}
-```
-
-Add this before the final `return RespCommand.INVALID;` at the end of `SlowParseCommand`.
+### SIMD fast path: `FastParseCommand()` (optional, for hottest commands only)
+Static `Vector128<byte>` patterns defined in `libs/server/Resp/Parser/RespCommandSimdPatterns.cs` that match the full RESP encoding (`*N\r\n$L\r\nCMD\r\n`) in a single 16-byte comparison. Use the `RespPattern(argCount, "CMD")` helper to create new patterns. Only needed for the most performance-critical commands with:
+- Fixed argument count (single digit)
+- Command names of 3-6 characters (total encoded length must fit in 16 bytes)

-**⚠️ Caveat: Dot-prefixed commands and ACL**
-If your command uses a dot (e.g., `RI.CREATE`), you must also update **`libs/server/ACL/ACLParser.cs`** so that the ACL system can map the dotted wire name to the enum name. Search for how existing dot-handling works (look for `Replace(".", "")` or similar normalization).
+Most new commands should **NOT** be added here — the hash table + MRU cache provide excellent performance for all commands. Only add SIMD patterns if benchmarking shows the command is a bottleneck.
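
The 6-character limit follows from the encoding: with a single-digit argument count and a single-digit name length, `*N\r\n` takes 4 bytes, `$L\r\n` takes 4, the name takes L, and the trailing `\r\n` takes 2, so the header is L + 10 bytes and L = 6 fills a 16-byte vector exactly. The sketch below illustrates a 16-byte pattern comparison under stated assumptions: `DemoSimd`, `MakePattern`, and the byte mask for headers shorter than 16 bytes are hypothetical and may differ from how `RespPattern` and `FastParseCommand()` actually work. It needs .NET 7 or later for the `Vector128.Create` overloads used.

```csharp
using System;
using System.Runtime.Intrinsics;
using System.Text;

var (pattern, mask) = DemoSimd.MakePattern(2, "GET");
byte[] get = Encoding.ASCII.GetBytes("*2\r\n$3\r\nGET\r\n$1\r\na\r\n");
byte[] set = Encoding.ASCII.GetBytes("*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\nb\r\n");
Console.WriteLine(DemoSimd.Matches(get, pattern, mask));
Console.WriteLine(DemoSimd.Matches(set, pattern, mask));

static class DemoSimd
{
    // Hypothetical analogue of a RespPattern(argCount, "CMD") helper: encodes
    // "*N\r\n$L\r\nCMD\r\n" into a zero-padded 16-byte pattern plus a mask that
    // selects only the encoded bytes.
    public static (Vector128<byte> Pattern, Vector128<byte> Mask) MakePattern(int argCount, string name)
    {
        if (argCount < 1 || argCount > 9)
            throw new ArgumentOutOfRangeException(nameof(argCount));
        if (name.Length < 1 || name.Length > 6)
            throw new ArgumentOutOfRangeException(nameof(name));

        byte[] encoded = Encoding.ASCII.GetBytes($"*{argCount}\r\n${name.Length}\r\n{name}\r\n");
        var patternBytes = new byte[16];
        var maskBytes = new byte[16];
        encoded.CopyTo(patternBytes, 0);
        Array.Fill(maskBytes, (byte)0xFF, 0, encoded.Length);
        return (Vector128.Create(patternBytes), Vector128.Create(maskBytes));
    }

    // Compares the first 16 input bytes against the pattern in one vector
    // equality check; bytes beyond the encoded header are masked out.
    public static bool Matches(ReadOnlySpan<byte> input, Vector128<byte> pattern, Vector128<byte> mask)
    {
        if (input.Length < Vector128<byte>.Count)
            return false; // too short for a 16-byte load; a caller would fall back to scalar parsing
        Vector128<byte> head = Vector128.Create(input);
        return (head & mask) == pattern;
    }
}
```

In this sketch the first call should report a match and the second should not, since the argument count byte differs. A bare `PING` request is only 14 bytes, so `Matches` declines it here; how the real parser handles short buffers is not shown in the diff.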

---

260 changes: 260 additions & 0 deletions benchmark/BDN.benchmark/Operations/CommandParsingBenchmark.cs
@@ -0,0 +1,260 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.

using BenchmarkDotNet.Attributes;
using Garnet.server;

namespace BDN.benchmark.Operations
{
    /// <summary>
    /// Benchmark for RESP command parsing only (no storage operations).
    /// Calls ParseRespCommandBuffer directly to measure pure parsing throughput
    /// across all optimization tiers.
    /// </summary>
    [MemoryDiagnoser]
    public unsafe class CommandParsingBenchmark : OperationsBase
    {
        // Tier 0a: SIMD Vector128 fast path (3-6 char commands with fixed arg counts)
        static ReadOnlySpan<byte> CMD_PING => "*1\r\n$4\r\nPING\r\n"u8;
        static ReadOnlySpan<byte> CMD_GET => "*2\r\n$3\r\nGET\r\n$1\r\na\r\n"u8;
        static ReadOnlySpan<byte> CMD_SET => "*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\nb\r\n"u8;
        static ReadOnlySpan<byte> CMD_INCR => "*2\r\n$4\r\nINCR\r\n$1\r\ni\r\n"u8;
        static ReadOnlySpan<byte> CMD_EXISTS => "*2\r\n$6\r\nEXISTS\r\n$1\r\na\r\n"u8;
        static ReadOnlySpan<byte> CMD_SETEX => "*4\r\n$5\r\nSETEX\r\n$1\r\na\r\n$2\r\n60\r\n$1\r\nb\r\n"u8;

        // Tier 0b: Scalar path for hot commands too long for SIMD (name > 6 chars, exceeds 16-byte Vector128)
        static ReadOnlySpan<byte> CMD_PUBLISH => "*3\r\n$7\r\nPUBLISH\r\n$2\r\nch\r\n$5\r\nhello\r\n"u8;

        // Tier 0c: Scalar path for variable-arg hot commands (arg count varies, cannot be SIMD or MRU cached)
        static ReadOnlySpan<byte> CMD_EXPIRE => "*3\r\n$6\r\nEXPIRE\r\n$1\r\na\r\n$2\r\n60\r\n"u8;

        // Tier 1: Hash table lookup via ArrayParseCommand → HashLookupCommand (+ MRU cache on 2nd+ call)
        static ReadOnlySpan<byte> CMD_HSET => "*4\r\n$4\r\nHSET\r\n$1\r\nh\r\n$1\r\nf\r\n$1\r\nv\r\n"u8;
        static ReadOnlySpan<byte> CMD_LPUSH => "*3\r\n$5\r\nLPUSH\r\n$1\r\nl\r\n$1\r\nv\r\n"u8;
        static ReadOnlySpan<byte> CMD_ZADD => "*4\r\n$4\r\nZADD\r\n$1\r\nz\r\n$1\r\n1\r\n$1\r\nm\r\n"u8;

        // Tier 1: Hash table lookup (long command names, double-digit $ header)
        static ReadOnlySpan<byte> CMD_ZRANGEBYSCORE => "*4\r\n$13\r\nZRANGEBYSCORE\r\n$1\r\nz\r\n$1\r\n0\r\n$2\r\n10\r\n"u8;
        static ReadOnlySpan<byte> CMD_ZREMRANGEBYSCORE => "*4\r\n$16\r\nZREMRANGEBYSCORE\r\n$1\r\nz\r\n$1\r\n0\r\n$2\r\n10\r\n"u8;
        static ReadOnlySpan<byte> CMD_HINCRBYFLOAT => "*4\r\n$12\r\nHINCRBYFLOAT\r\n$1\r\nh\r\n$1\r\nf\r\n$3\r\n1.5\r\n"u8;

        // Tier 1: Hash table lookup (commands formerly in SlowParseCommand)
        static ReadOnlySpan<byte> CMD_SUBSCRIBE => "*2\r\n$9\r\nSUBSCRIBE\r\n$2\r\nch\r\n"u8;
        static ReadOnlySpan<byte> CMD_GEORADIUS => "*6\r\n$9\r\nGEORADIUS\r\n$1\r\ng\r\n$1\r\n0\r\n$1\r\n0\r\n$3\r\n100\r\n$2\r\nkm\r\n"u8;
        static ReadOnlySpan<byte> CMD_SETIFMATCH => "*4\r\n$10\r\nSETIFMATCH\r\n$1\r\na\r\n$1\r\nb\r\n$1\r\n0\r\n"u8;

        // Pre-allocated buffers (pinned for pointer stability)
        byte[] bufPing, bufGet, bufSet, bufIncr, bufExists, bufSetex, bufPublish, bufExpire, bufHset, bufLpush, bufZadd, bufSubscribe;
        byte[] bufZrangebyscore, bufZremrangebyscore, bufHincrbyfloat, bufGeoradius, bufSetifmatch;

        public override void GlobalSetup()
        {
            base.GlobalSetup();

            // Pre-seed a key so GET/EXISTS don't return NOT_FOUND
            SlowConsumeMessage("*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\nb\r\n"u8);

            bufPing = GC.AllocateArray<byte>(CMD_PING.Length, pinned: true);
            CMD_PING.CopyTo(bufPing);
            bufGet = GC.AllocateArray<byte>(CMD_GET.Length, pinned: true);
            CMD_GET.CopyTo(bufGet);
            bufSet = GC.AllocateArray<byte>(CMD_SET.Length, pinned: true);
            CMD_SET.CopyTo(bufSet);
            bufIncr = GC.AllocateArray<byte>(CMD_INCR.Length, pinned: true);
            CMD_INCR.CopyTo(bufIncr);
            bufExists = GC.AllocateArray<byte>(CMD_EXISTS.Length, pinned: true);
            CMD_EXISTS.CopyTo(bufExists);
            bufSetex = GC.AllocateArray<byte>(CMD_SETEX.Length, pinned: true);
            CMD_SETEX.CopyTo(bufSetex);
            bufPublish = GC.AllocateArray<byte>(CMD_PUBLISH.Length, pinned: true);
            CMD_PUBLISH.CopyTo(bufPublish);
            bufExpire = GC.AllocateArray<byte>(CMD_EXPIRE.Length, pinned: true);
            CMD_EXPIRE.CopyTo(bufExpire);
            bufHset = GC.AllocateArray<byte>(CMD_HSET.Length, pinned: true);
            CMD_HSET.CopyTo(bufHset);
            bufLpush = GC.AllocateArray<byte>(CMD_LPUSH.Length, pinned: true);
            CMD_LPUSH.CopyTo(bufLpush);
            bufZadd = GC.AllocateArray<byte>(CMD_ZADD.Length, pinned: true);
            CMD_ZADD.CopyTo(bufZadd);
            bufSubscribe = GC.AllocateArray<byte>(CMD_SUBSCRIBE.Length, pinned: true);
            CMD_SUBSCRIBE.CopyTo(bufSubscribe);
            bufZrangebyscore = GC.AllocateArray<byte>(CMD_ZRANGEBYSCORE.Length, pinned: true);
            CMD_ZRANGEBYSCORE.CopyTo(bufZrangebyscore);
            bufZremrangebyscore = GC.AllocateArray<byte>(CMD_ZREMRANGEBYSCORE.Length, pinned: true);
            CMD_ZREMRANGEBYSCORE.CopyTo(bufZremrangebyscore);
            bufHincrbyfloat = GC.AllocateArray<byte>(CMD_HINCRBYFLOAT.Length, pinned: true);
            CMD_HINCRBYFLOAT.CopyTo(bufHincrbyfloat);
            bufGeoradius = GC.AllocateArray<byte>(CMD_GEORADIUS.Length, pinned: true);
            CMD_GEORADIUS.CopyTo(bufGeoradius);
            bufSetifmatch = GC.AllocateArray<byte>(CMD_SETIFMATCH.Length, pinned: true);
            CMD_SETIFMATCH.CopyTo(bufSetifmatch);
        }

        // === Tier 0a: SIMD Vector128 fast path ===

        [Benchmark]
        public RespCommand ParsePING()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufPing);
            return result;
        }

        [Benchmark]
        public RespCommand ParseGET()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufGet);
            return result;
        }

        [Benchmark]
        public RespCommand ParseSET()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufSet);
            return result;
        }

        [Benchmark]
        public RespCommand ParseINCR()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufIncr);
            return result;
        }

        [Benchmark]
        public RespCommand ParseEXISTS()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufExists);
            return result;
        }

        // === Tier 0a: SIMD fast path (SETEX is a 15-byte SIMD pattern) ===

        [Benchmark]
        public RespCommand ParseSETEX()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufSetex);
            return result;
        }

        // === Tier 0b: Scalar path for hot commands too long for SIMD ===

        [Benchmark]
        public RespCommand ParsePUBLISH()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufPublish);
            return result;
        }

        // === Tier 0c: Scalar path for variable-arg hot commands ===

        [Benchmark]
        public RespCommand ParseEXPIRE()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufExpire);
            return result;
        }

        // === Tier 1: Hash table lookup (short names, MRU cache on repeat) ===

        [Benchmark]
        public RespCommand ParseHSET()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufHset);
            return result;
        }

        [Benchmark]
        public RespCommand ParseLPUSH()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufLpush);
            return result;
        }

        [Benchmark]
        public RespCommand ParseZADD()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufZadd);
            return result;
        }

        // === Tier 1: Hash table lookup (long names) ===

        [Benchmark]
        public RespCommand ParseZRANGEBYSCORE()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufZrangebyscore);
            return result;
        }

        [Benchmark]
        public RespCommand ParseZREMRANGEBYSCORE()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufZremrangebyscore);
            return result;
        }

        [Benchmark]
        public RespCommand ParseHINCRBYFLOAT()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufHincrbyfloat);
            return result;
        }

        // === Tier 1: Hash table lookup (formerly in SlowParseCommand) ===

        [Benchmark]
        public RespCommand ParseSUBSCRIBE()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufSubscribe);
            return result;
        }

        [Benchmark]
        public RespCommand ParseGEORADIUS()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufGeoradius);
            return result;
        }

        [Benchmark]
        public RespCommand ParseSETIFMATCH()
        {
            RespCommand result = default;
            for (int i = 0; i < batchSize; i++)
                result = session.ParseRespCommandBuffer(bufSetifmatch);
            return result;
        }
    }
}
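
Each benchmark input above is a RESP array of bulk strings (`*N\r\n`, then `$L\r\n<arg>\r\n` per argument) copied into a pinned array. As a hedged illustration of that layout, the sketch below rebuilds the SET payload and pins it with the same `GC.AllocateArray<byte>(..., pinned: true)` call. `RespDemo`, `Encode`, and `AllocPinned` are hypothetical helpers that are not part of the PR, and ASCII-only arguments are assumed.

```csharp
using System;
using System.Text;

// Rebuild the SET payload used by the benchmark and copy it into a pinned buffer.
byte[] set = RespDemo.AllocPinned(RespDemo.Encode("SET", "a", "b"));
Console.WriteLine(Encoding.ASCII.GetString(set).Replace("\r\n", "\\r\\n"));

static class RespDemo
{
    // Encodes a command as a RESP array of bulk strings:
    // "*N\r\n" followed by "$L\r\n<arg>\r\n" for each argument.
    public static byte[] Encode(params string[] args)
    {
        var sb = new StringBuilder();
        sb.Append('*').Append(args.Length).Append("\r\n");
        foreach (string arg in args)
        {
            // $L is the byte length of the argument (ASCII-only input assumed).
            sb.Append('$').Append(Encoding.ASCII.GetByteCount(arg)).Append("\r\n");
            sb.Append(arg).Append("\r\n");
        }
        return Encoding.ASCII.GetBytes(sb.ToString());
    }

    // Copies the payload into an array on the pinned object heap so its
    // address stays stable while unsafe parsing code holds pointers into it.
    public static byte[] AllocPinned(ReadOnlySpan<byte> source)
    {
        byte[] buffer = GC.AllocateArray<byte>(source.Length, pinned: true);
        source.CopyTo(buffer);
        return buffer;
    }
}
```

A helper along these lines could also collapse the repeated allocate-and-copy pairs in `GlobalSetup()`, though the explicit form in the PR keeps each buffer visible at a glance.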