Open
Changes from 14 commits
Commits
25 commits
2d08b9d  plan  (badrishc, Mar 26, 2026)
2e95bbf  Optimize ParseCommand with hash-based command lookup  (badrishc, Mar 26, 2026)
7ce5f19  Optimize ParseCommand with SIMD fast path and hash-based command lookup  (badrishc, Mar 26, 2026)
1a24e00  update  (badrishc, Mar 26, 2026)
2d4e4b2  Merge remote, resolve conflicts keeping optimized HashLookupCommand; …  (badrishc, Mar 26, 2026)
aeaed64  Add CommandParsingBenchmark with 16 commands across all parser tiers  (badrishc, Mar 26, 2026)
0d01f9f  Add per-session MRU command cache for repeated command optimization  (badrishc, Mar 26, 2026)
c38d1e6  update  (badrishc, Mar 27, 2026)
80875e0  update  (badrishc, Mar 27, 2026)
b4415db  Merge branch 'badrishc/fast-parses' of github.com:microsoft/garnet in…  (badrishc, Mar 27, 2026)
c2b50da  Replace SlowParseCommand with hash-based subcommand dispatch  (badrishc, Mar 27, 2026)
e099686  Update add-garnet-command skill and copilot instructions for hash tab…  (badrishc, Mar 27, 2026)
5c67063  Production hardening: debug asserts, startup validation, code review …  (badrishc, Mar 27, 2026)
4ead4a0  updates  (badrishc, Apr 1, 2026)
56d8f3b  code review updates  (badrishc, Apr 1, 2026)
b813807  Merge branch 'dev' into badrishc/fast-parses  (badrishc, Apr 1, 2026)
0b119f1  nits  (badrishc, Apr 1, 2026)
52c5474  Merge branch 'badrishc/fast-parses' of github.com:microsoft/garnet in…  (badrishc, Apr 1, 2026)
2c2661c  updates  (badrishc, Apr 1, 2026)
507f5a0  Merge remote-tracking branch 'origin/dev' into badrishc/fast-parses  (badrishc, Apr 1, 2026)
b868080  nits  (badrishc, Apr 1, 2026)
b95ac6a  fixes  (badrishc, Apr 1, 2026)
f6845a4  fix flaky test  (badrishc, Apr 1, 2026)
5db102b  small cleanup  (badrishc, Apr 1, 2026)
60bcf4a  Merge branch 'dev' into badrishc/fast-parses  (badrishc, Apr 2, 2026)
2 changes: 1 addition & 1 deletion .github/copilot-instructions.md
@@ -92,7 +92,7 @@ Full guide: https://microsoft.github.io/garnet/docs/dev/garnet-api
### Steps for a new built-in command:

1. **Define the command**: Add enum value to `RespCommand` in `libs/server/Resp/Parser/RespCommand.cs`. For object commands (List, SortedSet, Hash, Set), also add a value to the `[ObjectName]Operation` enum in `libs/server/Objects/[ObjectName]/[ObjectName]Object.cs`.
-2. **Add parsing logic**: In `libs/server/Resp/Parser/RespCommand.cs`, add to `FastParseCommand` (fixed arg count) or `FastParseArrayCommand` (variable args).
+2. **Add parsing logic**: In `libs/server/Resp/Parser/RespCommandHashLookup.cs`, add an entry to `PopulatePrimaryTable()` (e.g., `Add("MYNEWCMD", RespCommand.MYNEWCMD)`). For commands with subcommands, set `hasSub: true` and add a subcommand table. The hash table provides O(1) lookup for all command name lengths.
3. **Declare the API method**: Add method signature to `IGarnetReadApi` (read-only) or `IGarnetApi` (read-write) in `libs/server/API/IGarnetApi.cs`.
4. **Implement the network handler**: Add a method to `RespServerSession` (the class is split across ~22 partial `.cs` files — object commands go in `libs/server/Resp/Objects/[ObjectName]Commands.cs`, others in `libs/server/Resp/BasicCommands.cs`, `ArrayCommands.cs`, `AdminCommands.cs`, `KeyAdminCommands.cs`, etc.). The handler parses arguments from the network buffer via `parseState.GetArgSliceByRef(i)` (returns `ref PinnedSpanByte`), calls the storage API, and writes the RESP response using `RespWriteUtils` helper methods, then calls `SendAndReset()` to flush the response buffer.
5. **Add dispatch route**: In `libs/server/Resp/RespServerSession.cs`, add a case to `ProcessBasicCommands` or `ProcessArrayCommands` calling the handler from step 4.
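
Editor's note on step 2: the body of `RespCommandHashLookup` is not part of this diff, so the following is only a minimal sketch of how a name-to-command hash table with O(1) expected lookup could work. It assumes open addressing with linear probing and an FNV-1a hash; `Cmd`, `CommandTable`, `Slots`, and `Hash` are hypothetical names for illustration, not Garnet's actual types or algorithm.

```csharp
using System;
using System.Text;

// Usage: look up a command name as it would arrive on the wire.
Console.WriteLine(CommandTable.Lookup("MYNEWCMD"u8)); // MyNewCmd
Console.WriteLine(CommandTable.Lookup("NOPE"u8));     // None

// Hypothetical stand-in for Garnet's RespCommand enum.
enum Cmd { None, Get, Set, MyNewCmd }

static class CommandTable
{
    const int Size = 16;        // power of two, so (hash & Mask) replaces a modulo
    const int Mask = Size - 1;
    static readonly (byte[] Name, Cmd Command)[] Slots = new (byte[] Name, Cmd Command)[Size];

    static CommandTable()
    {
        // One line per command, mirroring the Add("MYNEWCMD", ...) style above.
        Add("GET", Cmd.Get);
        Add("SET", Cmd.Set);
        Add("MYNEWCMD", Cmd.MyNewCmd);
    }

    // FNV-1a over the name bytes (an assumption; any well-mixing hash would do).
    static uint Hash(ReadOnlySpan<byte> name)
    {
        uint h = 2166136261;
        foreach (byte b in name) { h ^= b; h *= 16777619; }
        return h;
    }

    static void Add(string name, Cmd command)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(name);
        int i = (int)(Hash(bytes) & Mask);
        while (Slots[i].Name is not null) i = (i + 1) & Mask; // linear probing
        Slots[i] = (bytes, command);
    }

    public static Cmd Lookup(ReadOnlySpan<byte> name)
    {
        int i = (int)(Hash(name) & Mask);
        while (Slots[i].Name is not null)
        {
            if (name.SequenceEqual(Slots[i].Name.AsSpan())) return Slots[i].Command;
            i = (i + 1) & Mask;
        }
        return Cmd.None; // reached an empty slot: the name is not registered
    }
}
```

The lookup cost depends on the probe length rather than on the name length or the position of the command in a switch chain, which is the property the step above relies on. The sketch ignores case folding, which a real RESP parser would need.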
63 changes: 36 additions & 27 deletions .github/skills/add-garnet-command/SKILL.md
@@ -82,17 +82,39 @@ EVALSHA, // Note: Update LastDataCommand if adding new data commands after this

**File:** `libs/server/Resp/Parser/RespCommand.cs`

-Two parsing paths exist:
+Two parsing tiers exist:

-### Fast path: `FastParseCommand()` / `FastParseArrayCommand()`
-Two fast-path methods exist with different constraints:
-- **`FastParseCommand()`**: For commands with a fixed number of arguments and command names up to **9 characters**. Uses `ulong` pointer comparisons on `(count << 4) | length` patterns.
-- **`FastParseArrayCommand()`**: For commands with a variable number of arguments and command names up to **16 characters**. Uses similar `ulong` comparison patterns but accommodates longer names.
+### Hash table path: `RespCommandHashLookup` (primary path for most commands)
+The hash table in `libs/server/Resp/Parser/RespCommandHashLookup.cs` provides O(1) lookup for all built-in commands. This is the **recommended path for all new commands**.

-Only add here if the command name is a simple word (no dots or special characters).
+**To add a new primary command**, add one line to `PopulatePrimaryTable()`:
+```csharp
+Add("DELIFGREATER", RespCommand.DELIFGREATER);
+```

-### Slow path: `SlowParseCommand()`
-For longer names, dot-prefixed names (like `RI.CREATE`), or names that don't fit the fast-path pattern.
+**To add a command with subcommands** (e.g., `MYPARENT SUBCMD`):
+1. Add the parent with `hasSub: true`:
+```csharp
+Add("MYPARENT", RespCommand.MYPARENT, hasSub: true);
+```
+2. Define the subcommand array:
+```csharp
+private static readonly (string Name, RespCommand Command)[] MyparentSubcommands =
+[
+("SUBCMD1", RespCommand.MYPARENT_SUBCMD1),
+("SUBCMD2", RespCommand.MYPARENT_SUBCMD2),
+];
+```
+3. Build the table in the static constructor:
+```csharp
+myparentSubTable = BuildSubTable(MyparentSubcommands, out myparentSubTableMask);
+```
+4. Wire it into `LookupSubcommand()`:
+```csharp
+RespCommand.MYPARENT => (myparentSubTable, myparentSubTableMask),
+```

+**⚠️ Important:** Use the exact wire-protocol spelling for the hash table name string. Some commands use hyphens (e.g., `"SET-CONFIG-EPOCH"` not `"SETCONFIGEPOCH"`). Check `CmdStrings.cs` for the canonical spelling.

**⚠️ Convention:** Define the command name string in **`libs/server/Resp/CmdStrings.cs`** and reference it from the parser, rather than using inline `"..."u8` literals. This keeps command name strings centralized and reusable (e.g., for error messages).
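
Editor's note on the subcommand steps in this hunk: the diff shows the call sites for `BuildSubTable` and `LookupSubcommand()` but not their bodies. The sketch below shows one way a per-parent table and mask could fit together, assuming a power-of-two table with linear probing. The signatures, the `Cmd` enum, and the `SubcommandTables` class are hypothetical; only the field naming pattern (`...SubTable`, `...SubTableMask`) follows the text above.

```csharp
using System;
using System.Numerics;
using System.Text;

Console.WriteLine(SubcommandTables.Lookup(Cmd.MyParent, "SUBCMD2"u8)); // MyParentSubcmd2
Console.WriteLine(SubcommandTables.Lookup(Cmd.MyParent, "OTHER"u8));   // None

// Hypothetical stand-in for Garnet's RespCommand enum.
enum Cmd { None, MyParent, MyParentSubcmd1, MyParentSubcmd2 }

static class SubcommandTables
{
    static readonly (string Name, Cmd Command)[] MyParentSubcommands =
    [
        ("SUBCMD1", Cmd.MyParentSubcmd1),
        ("SUBCMD2", Cmd.MyParentSubcmd2),
    ];

    static readonly (byte[] Name, Cmd Command)[] myParentSubTable;
    static readonly int myParentSubTableMask;

    static SubcommandTables()
    {
        myParentSubTable = BuildSubTable(MyParentSubcommands, out myParentSubTableMask);
    }

    // FNV-1a; an assumption, not necessarily the hash Garnet uses.
    static uint Hash(ReadOnlySpan<byte> s)
    {
        uint h = 2166136261;
        foreach (byte b in s) { h ^= b; h *= 16777619; }
        return h;
    }

    static (byte[] Name, Cmd Command)[] BuildSubTable((string Name, Cmd Command)[] entries, out int mask)
    {
        // At least twice as many slots as entries, rounded up to a power of two.
        int size = (int)BitOperations.RoundUpToPowerOf2((uint)entries.Length * 2);
        mask = size - 1;
        var table = new (byte[] Name, Cmd Command)[size];
        foreach (var (name, command) in entries)
        {
            byte[] bytes = Encoding.ASCII.GetBytes(name);
            int i = (int)(Hash(bytes) & (uint)mask);
            while (table[i].Name is not null) i = (i + 1) & mask;
            table[i] = (bytes, command);
        }
        return table;
    }

    public static Cmd Lookup(Cmd parent, ReadOnlySpan<byte> sub)
    {
        (byte[] Name, Cmd Command)[] table;
        int mask;
        switch (parent) // one case per parent, analogous to the LookupSubcommand() wiring
        {
            case Cmd.MyParent: table = myParentSubTable; mask = myParentSubTableMask; break;
            default: return Cmd.None;
        }
        for (int i = (int)(Hash(sub) & (uint)mask); table[i].Name is not null; i = (i + 1) & mask)
        {
            if (sub.SequenceEqual(table[i].Name.AsSpan())) return table[i].Command;
        }
        return Cmd.None;
    }
}
```

Keeping the mask next to each table lets tables of different sizes share one probe loop, which may be why the steps above pass the mask back through an `out` parameter.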

@@ -101,26 +123,13 @@ For longer names, dot-prefixed names (like `RI.CREATE`), or names that don't fit
public static ReadOnlySpan<byte> DELIFGREATER => "DELIFGREATER"u8;
```

-**Pattern for slow-path commands:**
-```csharp
-else if (command.SequenceEqual(CmdStrings.DELIFGREATER))
-{
-return RespCommand.DELIFGREATER;
-}
-```
-
-**Pattern for dot-prefixed commands (e.g., `RI.CREATE`):**
-```csharp
-else if (command.SequenceEqual(CmdStrings.RICREATE))
-{
-return RespCommand.RICREATE;
-}
-```
-
-Add this before the final `return RespCommand.INVALID;` at the end of `SlowParseCommand`.
+### SIMD fast path: `FastParseCommand()` (optional, for hottest commands only)
+Static `Vector128<byte>` patterns that match the full RESP encoding (`*N\r\n$L\r\nCMD\r\n`) in a single 16-byte comparison. Only needed for the most performance-critical commands with:
+- Fixed argument count
+- Command names of 3-6 characters
+- No dots or special characters

-**⚠️ Caveat: Dot-prefixed commands and ACL**
-If your command uses a dot (e.g., `RI.CREATE`), you must also update **`libs/server/ACL/ACLParser.cs`** so that the ACL system can map the dotted wire name to the enum name. Search for how existing dot-handling works (look for `Replace(".", "")` or similar normalization).
+Most new commands should **NOT** be added here — the hash table + MRU cache provide excellent performance for all commands. Only add SIMD patterns if benchmarking shows the command is a bottleneck.

---
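
Editor's note on the SIMD fast path described in this hunk: the `FastParseCommand()` body is not included in the diff, so this is a rough sketch of the general idea only. It assumes the RESP header for `GET` is compared as one 16-byte vector after masking off the bytes that follow the 13-byte header. `SimdMatch`, `GetPattern`, `GetMask`, and `PrefixMask` are hypothetical names; the `Vector128` calls are standard .NET 7+ APIs.

```csharp
using System;
using System.Runtime.Intrinsics;

Console.WriteLine(SimdMatch.IsGet("*2\r\n$3\r\nGET\r\n$1\r\na\r\n"u8)); // True
Console.WriteLine(SimdMatch.IsGet("*2\r\n$3\r\nSET\r\n$1\r\na\r\n"u8)); // False

static class SimdMatch
{
    // "*2\r\n$3\r\nGET\r\n" is 13 bytes; the last 3 lanes are zero padding.
    static readonly Vector128<byte> GetPattern = Vector128.Create("*2\r\n$3\r\nGET\r\n\0\0\0"u8);

    // 0xFF in the first 13 lanes, 0x00 in the rest, so trailing input bytes are ignored.
    static readonly Vector128<byte> GetMask = PrefixMask(13);

    static Vector128<byte> PrefixMask(int length)
    {
        var bytes = new byte[Vector128<byte>.Count]; // zero-initialized
        bytes.AsSpan(0, length).Fill(0xFF);
        return Vector128.Create(bytes);
    }

    public static bool IsGet(ReadOnlySpan<byte> buffer)
    {
        // Fewer than 16 readable bytes: report no match so a caller can use a slower path.
        if (buffer.Length < Vector128<byte>.Count) return false;

        Vector128<byte> input = Vector128.Create(buffer); // loads the first 16 bytes
        return (input & GetMask) == GetPattern;           // a single 16-byte comparison
    }
}
```

A 6-character command with a single-digit argument count fills all 16 lanes (`*N\r\n$6\r\n` plus the name plus `\r\n`), which is consistent with the 3-6 character limit listed above.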

242 changes: 242 additions & 0 deletions benchmark/BDN.benchmark/Operations/CommandParsingBenchmark.cs
@@ -0,0 +1,242 @@
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT license.

using BenchmarkDotNet.Attributes;
using Garnet.server;

namespace BDN.benchmark.Operations
{
/// <summary>
/// Benchmark for RESP command parsing only (no storage operations).
/// Calls ParseRespCommandBuffer directly to measure pure parsing throughput
/// across all optimization tiers.
/// </summary>
[MemoryDiagnoser]
public unsafe class CommandParsingBenchmark : OperationsBase
{
// Tier 1a: SIMD Vector128 fast path (3-6 char commands with fixed arg counts)
static ReadOnlySpan<byte> CMD_PING => "*1\r\n$4\r\nPING\r\n"u8;
static ReadOnlySpan<byte> CMD_GET => "*2\r\n$3\r\nGET\r\n$1\r\na\r\n"u8;
static ReadOnlySpan<byte> CMD_SET => "*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\nb\r\n"u8;
static ReadOnlySpan<byte> CMD_INCR => "*2\r\n$4\r\nINCR\r\n$1\r\ni\r\n"u8;
static ReadOnlySpan<byte> CMD_EXISTS => "*2\r\n$6\r\nEXISTS\r\n$1\r\na\r\n"u8;

// Tier 1b: Scalar ulong switch (variable-arg commands)
static ReadOnlySpan<byte> CMD_SETEX => "*4\r\n$5\r\nSETEX\r\n$1\r\na\r\n$2\r\n60\r\n$1\r\nb\r\n"u8;
static ReadOnlySpan<byte> CMD_EXPIRE => "*3\r\n$6\r\nEXPIRE\r\n$1\r\na\r\n$2\r\n60\r\n"u8;

// Old Tier 2 (FastParseArrayCommand): near top of switch chains (short names, common first chars)
static ReadOnlySpan<byte> CMD_HSET => "*4\r\n$4\r\nHSET\r\n$1\r\nh\r\n$1\r\nf\r\n$1\r\nv\r\n"u8;
static ReadOnlySpan<byte> CMD_LPUSH => "*3\r\n$5\r\nLPUSH\r\n$1\r\nl\r\n$1\r\nv\r\n"u8;
static ReadOnlySpan<byte> CMD_ZADD => "*4\r\n$4\r\nZADD\r\n$1\r\nz\r\n$1\r\n1\r\n$1\r\nm\r\n"u8;

// Old Tier 2 (FastParseArrayCommand): deep in switch chains (long names, double-digit $ header)
static ReadOnlySpan<byte> CMD_ZRANGEBYSCORE => "*4\r\n$13\r\nZRANGEBYSCORE\r\n$1\r\nz\r\n$1\r\n0\r\n$2\r\n10\r\n"u8;
static ReadOnlySpan<byte> CMD_ZREMRANGEBYSCORE => "*4\r\n$16\r\nZREMRANGEBYSCORE\r\n$1\r\nz\r\n$1\r\n0\r\n$2\r\n10\r\n"u8;
static ReadOnlySpan<byte> CMD_HINCRBYFLOAT => "*4\r\n$12\r\nHINCRBYFLOAT\r\n$1\r\nh\r\n$1\r\nf\r\n$3\r\n1.5\r\n"u8;

// Old Tier 3 (SlowParseCommand): sequential SequenceEqual scan
static ReadOnlySpan<byte> CMD_SUBSCRIBE => "*2\r\n$9\r\nSUBSCRIBE\r\n$2\r\nch\r\n"u8;
static ReadOnlySpan<byte> CMD_GEORADIUS => "*6\r\n$9\r\nGEORADIUS\r\n$1\r\ng\r\n$1\r\n0\r\n$1\r\n0\r\n$3\r\n100\r\n$2\r\nkm\r\n"u8;
static ReadOnlySpan<byte> CMD_SETIFMATCH => "*4\r\n$10\r\nSETIFMATCH\r\n$1\r\na\r\n$1\r\nb\r\n$1\r\n0\r\n"u8;

// Pre-allocated buffers (pinned for pointer stability)
byte[] bufPing, bufGet, bufSet, bufIncr, bufExists, bufSetex, bufExpire, bufHset, bufLpush, bufZadd, bufSubscribe;
byte[] bufZrangebyscore, bufZremrangebyscore, bufHincrbyfloat, bufGeoradius, bufSetifmatch;

public override void GlobalSetup()
{
base.GlobalSetup();

// Pre-seed a key so GET/EXISTS don't return NOT_FOUND
SlowConsumeMessage("*3\r\n$3\r\nSET\r\n$1\r\na\r\n$1\r\nb\r\n"u8);

bufPing = GC.AllocateArray<byte>(CMD_PING.Length, pinned: true);
CMD_PING.CopyTo(bufPing);
bufGet = GC.AllocateArray<byte>(CMD_GET.Length, pinned: true);
CMD_GET.CopyTo(bufGet);
bufSet = GC.AllocateArray<byte>(CMD_SET.Length, pinned: true);
CMD_SET.CopyTo(bufSet);
bufIncr = GC.AllocateArray<byte>(CMD_INCR.Length, pinned: true);
CMD_INCR.CopyTo(bufIncr);
bufExists = GC.AllocateArray<byte>(CMD_EXISTS.Length, pinned: true);
CMD_EXISTS.CopyTo(bufExists);
bufSetex = GC.AllocateArray<byte>(CMD_SETEX.Length, pinned: true);
CMD_SETEX.CopyTo(bufSetex);
bufExpire = GC.AllocateArray<byte>(CMD_EXPIRE.Length, pinned: true);
CMD_EXPIRE.CopyTo(bufExpire);
bufHset = GC.AllocateArray<byte>(CMD_HSET.Length, pinned: true);
CMD_HSET.CopyTo(bufHset);
bufLpush = GC.AllocateArray<byte>(CMD_LPUSH.Length, pinned: true);
CMD_LPUSH.CopyTo(bufLpush);
bufZadd = GC.AllocateArray<byte>(CMD_ZADD.Length, pinned: true);
CMD_ZADD.CopyTo(bufZadd);
bufSubscribe = GC.AllocateArray<byte>(CMD_SUBSCRIBE.Length, pinned: true);
CMD_SUBSCRIBE.CopyTo(bufSubscribe);
bufZrangebyscore = GC.AllocateArray<byte>(CMD_ZRANGEBYSCORE.Length, pinned: true);
CMD_ZRANGEBYSCORE.CopyTo(bufZrangebyscore);
bufZremrangebyscore = GC.AllocateArray<byte>(CMD_ZREMRANGEBYSCORE.Length, pinned: true);
CMD_ZREMRANGEBYSCORE.CopyTo(bufZremrangebyscore);
bufHincrbyfloat = GC.AllocateArray<byte>(CMD_HINCRBYFLOAT.Length, pinned: true);
CMD_HINCRBYFLOAT.CopyTo(bufHincrbyfloat);
bufGeoradius = GC.AllocateArray<byte>(CMD_GEORADIUS.Length, pinned: true);
CMD_GEORADIUS.CopyTo(bufGeoradius);
bufSetifmatch = GC.AllocateArray<byte>(CMD_SETIFMATCH.Length, pinned: true);
CMD_SETIFMATCH.CopyTo(bufSetifmatch);
}

// === Tier 1a: SIMD Vector128 fast path ===

[Benchmark]
public RespCommand ParsePING()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufPing);
return result;
}

[Benchmark]
public RespCommand ParseGET()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufGet);
return result;
}

[Benchmark]
public RespCommand ParseSET()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufSet);
return result;
}

[Benchmark]
public RespCommand ParseINCR()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufIncr);
return result;
}

[Benchmark]
public RespCommand ParseEXISTS()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufExists);
return result;
}

// === Tier 1b: Scalar ulong switch ===

[Benchmark]
public RespCommand ParseSETEX()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufSetex);
return result;
}

[Benchmark]
public RespCommand ParseEXPIRE()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufExpire);
return result;
}

// === Old Tier 2 (FastParseArrayCommand): near top of switch ===

[Benchmark]
public RespCommand ParseHSET()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufHset);
return result;
}

[Benchmark]
public RespCommand ParseLPUSH()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufLpush);
return result;
}

[Benchmark]
public RespCommand ParseZADD()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufZadd);
return result;
}

// === Old Tier 2 (FastParseArrayCommand): deep in switch (long names) ===

[Benchmark]
public RespCommand ParseZRANGEBYSCORE()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufZrangebyscore);
return result;
}

[Benchmark]
public RespCommand ParseZREMRANGEBYSCORE()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufZremrangebyscore);
return result;
}

[Benchmark]
public RespCommand ParseHINCRBYFLOAT()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufHincrbyfloat);
return result;
}

// === Old Tier 3 (SlowParseCommand): sequential SequenceEqual scan ===

[Benchmark]
public RespCommand ParseSUBSCRIBE()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufSubscribe);
return result;
}

[Benchmark]
public RespCommand ParseGEORADIUS()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufGeoradius);
return result;
}

[Benchmark]
public RespCommand ParseSETIFMATCH()
{
RespCommand result = default;
for (int i = 0; i < batchSize; i++)
result = session.ParseRespCommandBuffer(bufSetifmatch);
return result;
}
}
}
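
Editor's note on the benchmark inputs: each `CMD_*` constant above is a RESP array, `*<argc>\r\n` followed by `$<len>\r\n<arg>\r\n` per argument, copied into a pinned array. The helper below is a small sketch that builds the same bytes from string arguments, which can be handy for checking a hand-written literal. `Resp.Encode` is a hypothetical helper, not part of Garnet or of this PR.

```csharp
using System;
using System.Text;

byte[] get = Resp.Encode("GET", "a");
Console.WriteLine(get.AsSpan().SequenceEqual("*2\r\n$3\r\nGET\r\n$1\r\na\r\n"u8)); // True

static class Resp
{
    // Encodes the arguments as a RESP array of bulk strings in a pinned buffer.
    public static byte[] Encode(params string[] args)
    {
        var sb = new StringBuilder();
        sb.Append('*').Append(args.Length).Append("\r\n");
        foreach (string arg in args)
        {
            // The length prefix counts bytes, not characters.
            sb.Append('$').Append(Encoding.UTF8.GetByteCount(arg)).Append("\r\n");
            sb.Append(arg).Append("\r\n");
        }

        byte[] bytes = Encoding.UTF8.GetBytes(sb.ToString());

        // Pinned allocation, as in GlobalSetup above, so the address stays fixed under GC.
        byte[] pinned = GC.AllocateArray<byte>(bytes.Length, pinned: true);
        bytes.CopyTo(pinned, 0);
        return pinned;
    }
}
```

For example, `Resp.Encode("SETEX", "a", "60", "b")` yields the same bytes as `CMD_SETEX`.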