diff --git a/docs/guides/snapshot-storage-migration.mdx b/docs/guides/snapshot-storage-migration.mdx
new file mode 100644
index 00000000..4d0fe0a5
--- /dev/null
+++ b/docs/guides/snapshot-storage-migration.mdx
@@ -0,0 +1,265 @@
+---
+title: "Migrating Snapshot Storage Backend"
+description: "Move a multi-node Restate cluster from one snapshot destination to another."
+tags: ["deployment", "snapshots", "migration"]
+---
+
+This guide describes how to migrate a multi-node Restate cluster from one snapshot storage backend to another (for example, between different buckets/prefixes, or from MinIO to GCS).
+
+The migration temporarily increases partition replication to ensure every node hosts every partition before snapshots are disabled. This prevents trim-gap failures during rolling restarts. The migration leverages Restate's `worker.durability-mode` configuration option to prevent any log trimming during the transition, ensuring no data loss even if the old snapshots become unavailable before new ones are created.
+
+**Prerequisites:**
+- Restate server version **1.6 or later** (required for `worker.durability-mode` configuration)
+- Rolling restart capability for your cluster
+- Access to both old and new snapshot storage backends during migration
+- Capacity to temporarily run each partition on every worker node (partition replication = cluster size)
+- [`restatectl`](/server/clusters#controlling-clusters-with-restatectl) CLI configured to communicate with your cluster
+
+
+
+
+Capture the current cluster replication settings so you can restore them later:
+
+```shell
+restatectl config get
+```
+
+Note the current **Partition replication** value (for example `{node: 2}`).
+
+
+
+
+Set partition replication to your cluster size `N` (the number of worker nodes). This ensures every node has a local copy of every partition before you disable snapshots.
+
+```shell
+restatectl config set --partition-replication N
+```
+
+Do **not** use `--replication` here unless you also want to increase log replication.
+
+:::note[Why increase partition replication?]
+Without this step, when the cluster controller reconfigures partition replica sets during rolling restarts, some nodes may be unable to serve a given partition depending on their prior local partition store state, the log trim point, and available snapshots. With partition replication matching cluster size, every node maintains a warm replica of every partition and can resume on restart without needing a new snapshot, provided the log was not trimmed during its downtime.
+:::
+
+
+
+
+Wait until every partition has a replica on every node and followers have no lag:
+
+```shell
+restatectl partitions list
+```
+
+Verify that:
+- Each partition ID appears `N` times (once per node)
+- All rows show `LSN-LAG` of `0` (or consistently near `0`)
+
+For example, with 8 partitions and 3 nodes, you should see 24 rows total.
+
+
+
+
+Roll out a configuration update to disable automatic snapshots and switch to conservative durability mode:
+
+```toml restate.toml
+[worker.snapshots]
+# Disable automatic snapshots by removing/commenting destination
+# destination = "s3://old-bucket/prefix"
+
+[worker]
+# Use the strictest mode - requires BOTH replicas AND snapshots for trim
+# When snapshot destination is not set, this halts all log trimming
+durability-mode = "snapshot-and-replica-set"
+```
+
+This effectively disables both snapshotting and log trimming. The system will log a warning every 60 seconds: *"Detected cluster environment with no snapshot repository configured. Automatic log trimming is disabled..."* This is expected during the migration.
+
+Perform a rolling restart of all cluster nodes with the new configuration. Restart one node at a time, waiting for it to rejoin and for its partitions to become active before proceeding to the next node.
+
+:::tip[Live traffic during migration]
+With partition replication matching cluster size, rolling restarts have minimal impact on live traffic. Requests in flight on a restarting node may fail; use [idempotency keys](/develop/ts/service-communication#idempotent-invocations) to make retries safe.
+:::
+
+
+
+
+Check the cluster status to confirm all partitions are active:
+
+```shell
+restatectl partitions list
+```
+
+You should see all partitions with the `ARCHIVED` column empty or unchanged:
+
+```
+ID  NODE  MODE    STATUS  EPOCH  APPLIED  DURABLE  ARCHIVED  LSN-LAG  UPDATED
+0   N1:1  Leader  Active  5      1234     1234     -         0        2s ago
+1   N2:1  Leader  Active  5      5678     5678     -         0        1s ago
+...
+```
+
+The `ARCHIVED` column shows `-` because no snapshot is known. This is expected.
+
+The applied LSN should increase over time if there is cluster activity, but the archived LSN should remain `-`.
+
+
+
+
+Roll out a configuration update with the new snapshot destination:
+
+```toml restate.toml
+[worker.snapshots]
+destination = "s3://new-bucket/prefix" # New repository
+
+[worker]
+# Use conservative settings
+durability-mode = "snapshot-and-replica-set"
+trim-delay-interval = "24h"
+```
+
+Perform a rolling restart of all cluster nodes (one at a time, verifying health between each).
+
+
+
+
+Trigger manual snapshots for all partitions to populate the new repository immediately:
+
+```shell
+restatectl snapshot create
+```
+
+You should see output confirming each partition was snapshotted:
+
+```
+Snapshot created for partition 0: snap_15GSJBOfxk3x8k1CfPwfxrb (log 0 @ LSN >= 49622035)
+Snapshot created for partition 1: snap_2xHJKLMnop4y9z2DgQwgAbc (log 1 @ LSN >= 49622040)
+...
+```
+
+
+
+
+Check that snapshots exist in the new storage backend. For S3:
+
+```shell
+aws s3 ls s3://new-bucket/prefix/ --recursive | head -20
+```
+
+Each partition should have a `latest.json` file and a snapshot directory:
+
+```
+prefix/0/latest.json
+prefix/0/lsn_00000000000000860864-snap_13yBpep1H1jKGAzHhqkmCyt/...
+prefix/1/latest.json
+prefix/1/lsn_00000000000000860870-snap_2xHJKLMnop4y9z2DgQwgAbc/...
+...
+```
+
+Confirm the archived LSN column now shows the snapshot LSN values:
+
+```shell
+restatectl partitions list
+```
+
+Expected output:
+
+```
+ID  NODE  MODE    STATUS  EPOCH  APPLIED  DURABLE  ARCHIVED  LSN-LAG  UPDATED
+0   N1:1  Leader  Active  5      1250     1250     1234      0        2s ago
+1   N2:1  Leader  Active  5      5700     5700     5678      0        1s ago
+```
+
+
+
+
+After the new snapshot repository is verified, restore the original partition replication value you recorded earlier:
+
+```shell
+restatectl config set --partition-replication <original-value>
+```
+
+
+
+
+Roll out a configuration update with production settings:
+
+```toml restate.toml
+[worker]
+# Return to balanced mode (recommended for production)
+durability-mode = "balanced"
+```
+
+Perform a rolling restart of all cluster nodes (one at a time, verifying health between each).
+
+
+
+
+Check that the cluster status is healthy:
+
+```shell
+restatectl status --extra
+```
+
+All nodes should be healthy and all partitions active with no warnings.
+
+Confirm log trimming has resumed:
+
+```shell
+restatectl log list
+```
+
+The trim point should gradually increase as durability conditions are met.
+
+
+
+
+After confirming the cluster is migrated to the new snapshot backend:
+
+1. Remove old snapshots
+2. Revoke access to the old storage backend
+
+
+
+
+## Durability mode reference
+
+| Mode | Description | Use case |
+|------|-------------|----------|
+| `balanced` | Requires snapshot AND at least one replica flushed | Production default (when snapshots configured) |
+| `snapshot-and-replica-set` | Requires snapshot AND all replicas flushed | Migration phase (strictest) |
+| `snapshot-only` | Requires only snapshot, ignores replicas | Special cases |
+| `replica-set-only` | Requires all replicas flushed, ignores snapshots | Default without snapshots |
+| `none` | Disables automatic durability tracking | Testing only |
+
+## Rollback plan
+
+If you encounter issues during migration, the rollback procedure depends on how far you've progressed:
+
+**During steps 1-3** (before log trimming is disabled):
+
+No destructive changes have been made. Simply restore partition replication to the original value:
+
+```shell
+restatectl config set --partition-replication <original-value>
+```
+
+**During steps 4-5** (log trimming disabled, no new snapshots yet):
+
+Restore the original configuration pointing to the old snapshot repository, perform a rolling restart, then restore partition replication:
+
+```shell
+restatectl config set --partition-replication <original-value>
+```
+
+**During steps 6-8** (configuring new repository, creating snapshots):
+
+If no log trimming has occurred since the original repository was disabled, you can safely discard the new repository and revert to the original configuration. Restore partition replication after the rollback.
+
+**After step 9** (partition replication restored, normal operations):
+
+If logs have been trimmed based on snapshot LSNs published to the new repository, you must follow the same migration process to return to the original destination: disable log trimming, update snapshot destination, create and verify snapshots, then re-enable log trimming.
+
+## See also
+
+- [Configuring automatic snapshotting](/server/snapshots#configuring-automatic-snapshotting)
+- [Controlling clusters with restatectl](/server/clusters#controlling-clusters-with-restatectl)
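The two manual verification steps in the guide (every partition ID appearing `N` times with an `LSN-LAG` of `0`, and every partition having a `latest.json` in the new repository) could be scripted roughly as follows. This is a minimal sketch and not part of the patch or of `restatectl`: the function names `check_replicas` and `check_latest_json` are hypothetical, the first check assumes the whitespace-separated column layout shown in the sample output (with `LSN-LAG` as the ninth field), and the second assumes partition IDs run from `0` to `P-1` and that the object key is the fourth field of `aws s3 ls --recursive` output. The lag check is strict, so it rejects the "consistently near `0`" case that the guide tolerates.

```shell
# Hypothetical helpers; both read command output on stdin and
# return a non-zero exit status when the check fails.

# check_replicas N
#   Succeeds when every partition ID appears exactly N times and
#   every data row reports an LSN-LAG of 0.
check_replicas() {
  awk -v n="$1" '
    $1 ~ /^[0-9]+$/ {              # data rows start with a numeric partition ID
      rows[$1]++
      total++
      if ($9 != "0") lagging++     # assumption: LSN-LAG is the 9th column
    }
    END {
      for (id in rows) if (rows[id] != n) uneven++
      if (total == 0 || uneven > 0 || lagging > 0) exit 1
    }
  '
}

# check_latest_json PREFIX P
#   Succeeds when PREFIX/<id>/latest.json is listed for every id in 0..P-1.
check_latest_json() {
  awk -v prefix="$1" -v parts="$2" '
    { seen[$4] = 1 }               # 4th field of "aws s3 ls" is the object key
    END {
      for (id = 0; id < parts; id++) {
        key = prefix "/" id "/latest.json"
        if (!(key in seen)) { print "missing: " key; missing++ }
      }
      exit (missing > 0)
    }
  '
}

# Example usage (3 nodes, 8 partitions):
#   restatectl partitions list | check_replicas 3
#   aws s3 ls s3://new-bucket/prefix/ --recursive | check_latest_json prefix 8
```

Reading from stdin keeps the helpers independent of how the commands are invoked, so the same checks can gate each node in a rolling restart or run once after `restatectl snapshot create`.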