Open
65 changes: 46 additions & 19 deletions content/influxdb3/enterprise/admin/clustering.md
@@ -59,18 +59,23 @@ influxdb3 serve --mode=ingest
# Multiple modes
influxdb3 serve --mode=ingest,query

# All modes (default)
# All modes (default, for single-node Enterprise only)
influxdb3 serve --mode=all
```

Available modes:

- `all`: All capabilities enabled (default)
- `all`: All capabilities enabled (single-node Enterprise deployments only)
- `ingest`: Data ingestion and line protocol parsing
- `query`: Query execution and data retrieval
- `compact`: Background compaction and optimization
- `process`: Data processing and transformations
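
The mode list above can be checked mechanically before a node is started. The following is a minimal sketch, not part of the `influxdb3` CLI: `validate_modes` is a hypothetical helper that splits a comma-separated `--mode` value and rejects any entry that is not one of the five modes listed above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not provided by influxdb3): verifies that every entry
# in a comma-separated --mode value is one of the documented modes.
validate_modes() {
  local mode_list="$1" mode
  local -a modes
  # An empty value is treated as invalid in this sketch.
  [ -n "$mode_list" ] || return 1
  IFS=',' read -r -a modes <<< "$mode_list"
  for mode in "${modes[@]}"; do
    case "$mode" in
      all|ingest|query|compact|process) ;;
      *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
  done
}

validate_modes "ingest,query" && echo "ingest,query: ok"
validate_modes "ingest,qurey" || echo "ingest,qurey: rejected"
```

The check only covers spelling. It does not decide whether a given combination of modes is appropriate for a cluster.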

> [!Warning]
> Use `all` mode for **single-node** Enterprise deployments only.
Contributor

Suggested change
> Use `all` mode for **single-node** Enterprise deployments only.
> #### Don't use all mode in a multi-node cluster
>
> Use `all` mode for **single-node** Enterprise deployments only.

Contributor Author

Applied in 689cacc. The warning now has a heading "Don't use all mode in a multi-node cluster" and the opening line matches your suggestion.

> Avoid using `all` mode in a multi-node cluster—some cluster features such as replication and catalog refresh aren't designed to work with `all`-mode nodes.
Comment thread
jstirnaman marked this conversation as resolved.
Outdated
Contributor

Suggested change
> Avoid using `all` mode in a multi-node cluster—some cluster features such as replication and catalog refresh aren't designed to work with `all`-mode nodes.
> Some cluster features such as replication and catalog refresh aren't designed to work with `all`-mode nodes.

Contributor Author

Applied in 689cacc.

> In a multi-node cluster, use explicit modes (`ingest`, `query`, `compact`, `process`) and assign `compact` to exactly one node.

## Allocate threads by node type

### Critical concept: Thread pools
@@ -194,6 +199,11 @@ influxdb3 serve \

Compactor nodes optimize stored data through background compaction processes.

> [!Warning]
> Only **one** compactor node can run per cluster.
> Multiple compactors writing compacted data to the same location will cause data corruption.
> Any node mode that includes compaction (`compact` or `all`) counts toward this limit.
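
The one-compactor rule in this warning lends itself to a pre-flight check. A minimal sketch follows, assuming you have the planned `--mode` value for each node at hand; `count_compactors` and `check_cluster_plan` are hypothetical names introduced here, not `influxdb3` commands. A node counts toward the limit if its mode list contains `compact` or `all`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: pass one --mode value per planned node.
count_compactors() {
  local count=0 mode_list
  for mode_list in "$@"; do
    # Wrap in commas so "compact" and "all" match only as whole entries.
    case ",$mode_list," in
      *,compact,*|*,all,*) count=$((count + 1)) ;;
    esac
  done
  echo "$count"
}

check_cluster_plan() {
  local n
  n=$(count_compactors "$@")
  if [ "$n" -ne 1 ]; then
    echo "expected exactly 1 compaction-capable node, found $n" >&2
    return 1
  fi
}

# Small-cluster layout from this change: one compactor, two ingest/query nodes.
check_cluster_plan "ingest,query,compact" "ingest,query" "ingest,query" && echo "plan ok"
```

Run against the previous small-cluster layout (three `all` nodes), the same check would report three compaction-capable nodes and fail.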

### Dedicated compactor (32 cores)

```bash
@@ -298,21 +308,25 @@ influxdb3 \

### Small cluster (3 nodes)

> [!Note]
> Only one node per cluster can run compaction.
> In this example, Node 1 handles ingest, query, and compaction while Nodes 2–3 handle ingest and query only.

```yaml
# Node 1: All-in-one primary
mode: all
# Node 1: Ingest, query, and compactor
mode: ingest,query,compact
cores: 32
io_threads: 8
datafusion_threads: 24

# Node 2: All-in-one secondary
mode: all
# Node 2: Ingest and query (no compaction)
mode: ingest,query
cores: 32
io_threads: 8
datafusion_threads: 24

# Node 3: All-in-one tertiary
mode: all
# Node 3: Ingest and query (no compaction)
mode: ingest,query
cores: 32
io_threads: 8
datafusion_threads: 24
@@ -333,8 +347,14 @@ cores: 48
io_threads: 4
datafusion_threads: 44

# Nodes 5-6: Compactor + Process
mode: compact,process
# Node 5: Compactor (only one compactor per cluster)
mode: compact
cores: 32
io_threads: 4
datafusion_threads: 28

# Node 6: Process node
mode: process
cores: 32
io_threads: 4
datafusion_threads: 28
@@ -355,13 +375,13 @@ cores: 64
io_threads: 4
datafusion_threads: 60

# Nodes 9-10: Dedicated compactors
# Node 9: Dedicated compactor (only one compactor per cluster)
mode: compact
cores: 32
io_threads: 2
datafusion_threads: 30

# Nodes 11-12: Process nodes
# Nodes 10-12: Process nodes
mode: process
cores: 32
io_threads: 6
@@ -553,7 +573,8 @@ GROUP BY event_type;
- Growing number of small Parquet files
- Increasing query times due to file fragmentation

**Solution:** Add compactor nodes or increase DataFusion threads (see [Compactor node issues](#compactor-node-issues))
**Solution:** For nodes using the default Parquet-backed storage engine, increase DataFusion threads on your single compactor node (see [Compactor node issues](#compactor-node-issues)).
Contributor

Suggested change
**Solution:** For nodes using the default Parquet-backed storage engine, increase DataFusion threads on your single compactor node (see [Compactor node issues](#compactor-node-issues)).
**Solution:** For nodes using the Parquet-backed storage engine, increase DataFusion threads on your single compactor node (see [Compactor node issues](#compactor-node-issues)).

Contributor Author

Applied in 689cacc — "default" removed.

The Performance Preview with PachaTree storage does not use DataFusion for compaction—refer to the [Performance Preview documentation](/influxdb3/enterprise/performance-preview/) for tuning guidance.

## Troubleshoot node configurations

@@ -602,7 +623,7 @@ free -h

```bash
# Check: Compaction queue length
# Solution: Add more compactor nodes or increase threads
# Solution: Increase threads on the single compactor node (only one compactor is allowed per cluster)
--datafusion-num-threads=30
```
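
In the cluster examples in this diff, `io_threads` plus `datafusion_threads` always equals the node's core count (for example, 2 + 30 = 32 for the dedicated compactor). Assuming that split is the intended rule of thumb, the value for `--datafusion-num-threads` can be derived as in the sketch below. `datafusion_threads` is a hypothetical helper name, and the "all remaining cores" policy is an assumption drawn from those examples rather than a documented requirement.

```bash
#!/usr/bin/env bash
# Hypothetical helper: gives DataFusion every core not reserved for IO threads.
datafusion_threads() {
  local cores="$1" io_threads="$2"
  if [ "$io_threads" -ge "$cores" ]; then
    echo "io threads ($io_threads) must be less than cores ($cores)" >&2
    return 1
  fi
  echo $((cores - io_threads))
}

# Dedicated compactor from the large-cluster example: 32 cores, 2 IO threads.
echo "--datafusion-num-threads=$(datafusion_threads 32 2)"
```

Because only one compactor is allowed per cluster, lowering the IO thread count or moving the compactor to a larger machine are the only ways to raise this number.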

@@ -617,14 +638,20 @@ free -h

## Migrate to specialized nodes

### From all-in-one to specialized
### From single-node to specialized cluster

> [!Note]
> `all` mode is intended for single-node Enterprise deployments.
Contributor

Suggested change
> `all` mode is intended for single-node Enterprise deployments.
> `all` mode is only for single-node Enterprise deployments.

Contributor Author

Applied in 689cacc.

> When scaling to a multi-node cluster, replace `all` with explicit modes and assign `compact` to exactly one node.
Contributor

This needs to say more: when upgrading from a single `all` node to a multi-node cluster, the new `compact` node must have the same node-id as the `all` node being replaced.

Contributor Author

Incorporated in 689cacc. The migration example now shows the compact node in Phase 2 using --node-id=node0 — the same node-id as the all-mode node being replaced — with a comment making this requirement explicit.

Contributor

Suggested change
> When scaling to a multi-node cluster, replace `all` with explicit modes and assign `compact` to exactly one node.
When scaling a single `all` node cluster to a multi-node cluster:
- Replace the `all` node with nodes that have explicit, specialized modes
- Assign `compact` mode to exactly one node that uses the same node-id as the `all` node being replaced

Contributor Author

Applied in 689cacc. The scaling guidance is now outside the callout as bullet points, incorporating the same-node-id requirement from @philjb's comment.
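
The two scaling rules from this thread can be sketched as a dry run that prints the commands rather than executing them. This is an illustration only: `print_serve_command` is a hypothetical helper, `node0` follows the node-id mentioned in the replies above, `node1` is a placeholder, and every flag other than `--node-id` and `--mode` (object store settings, for example) is deliberately left out.

```bash
#!/usr/bin/env bash
# Dry run only: prints commands instead of starting servers.
# Flags other than --node-id and --mode are omitted from this sketch.
ALL_NODE_ID="node0"   # node-id of the existing all-mode node being replaced

print_serve_command() {
  local node_id="$1" mode="$2"
  echo "influxdb3 serve --node-id=$node_id --mode=$mode"
}

# The single compactor reuses the node-id of the all-mode node it replaces.
print_serve_command "$ALL_NODE_ID" "ingest,query,compact"
# Additional nodes get new node-ids (placeholder name) and no compact mode.
print_serve_command "node1" "ingest,query"
```

Keeping the old node-id in one variable makes it harder to start the compactor under a new identity by accident.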


```bash
Contributor

@copilot Fix the notation/syntax in this "bash" example so that it represents real code. Observe the following for all examples scaling from all mode to a multi-node cluster:

  • Replace the all node with nodes that have explicit, specialized modes
  • Assign compact mode to exactly one node that uses the same node-id as the all node being replaced

Contributor Author

Fixed in 689cacc. The pseudo-bash is replaced with real influxdb3 serve commands. Phase 2 now explicitly shows the compact node reusing --node-id=node0 (same as the replaced all-mode node), per @philjb's requirement.

# Phase 1: Baseline (all nodes identical)
all nodes: --mode=all --num-io-threads=8
# Phase 1: Baseline (single-node deployment)
node1: --mode=all --num-io-threads=8

# Phase 2: Identify workload patterns
# Monitor which nodes handle most writes vs queries
# Phase 2: Expand to multi-node — replace all-in-one with explicit modes
# Assign compact to exactly one node
node1: --mode=ingest,query,compact --num-io-threads=8
node2: --mode=ingest,query --num-io-threads=8

# Phase 3: Gradual specialization
node1: --mode=ingest,query --num-io-threads=12