Skip to content

[WIP] Cluster modules improvements#34

Open
contrasam wants to merge 77 commits intomainfrom
feature/cluster-improvements
Open

[WIP] Cluster modules improvements#34
contrasam wants to merge 77 commits intomainfrom
feature/cluster-improvements

Conversation

@contrasam
Copy link
Copy Markdown
Contributor

No description provided.

contrasam and others added 30 commits April 11, 2026 09:54
10 phases (22–31): cluster + persistence audit, pluggable serialization
(Kryo/JSON replacing Java native serialization), Redis shared persistence
for cross-node StatefulActor recovery, cluster observability, reliability
hardening, performance optimization, management API, and final validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents the bug where StatefulActor state is lost when a cluster node
fails and the actor is reassigned to another node. The test is @disabled
until Phase 26 (Cluster + Shared Persistence Integration) fixes this.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Critical: StatefulActor state lost on cluster reassignment (no shared persistence).
High: Java native serialization for inter-node messages, MessageTracker resource
leak, no retry/backoff in EtcdMetadataStore. Findings mapped to Phases 23-31.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 22 complete. Findings document covers 11 issues (2 critical, 4 high,
5 medium) across cluster and persistence modules, prioritised for Phases 23-31.
State-loss-on-reassignment bug documented with regression test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…plementations

Defines pluggable byte[] serialize/deserialize contract in cajun-core.
JavaSerializationProvider is the backward-compat fallback. KryoSerializationProvider
(primary) uses ThreadLocal Kryo with registration-not-required + objenesis.
JsonSerializationProvider (secondary) uses Jackson with full type info.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces ObjectOutputStream/ObjectInputStream with provider.serialize/deserialize.
Length-prefixed framing (4-byte int + payload) used for stream-based transport.
RemoteMessage and MessageAcknowledgment no longer require Serializable.
lib copy defaults to KryoSerializationProvider; cajun-core copy defaults to JavaSerializationProvider.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…leSnapshotStore

Replaces ObjectOutputStream/ObjectInputStream with provider.serialize/deserialize.
Journal and snapshot entries written as raw byte files. Default provider is
JavaSerializationProvider for backward compatibility with existing journal files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t store

Removes T extends Serializable constraint from LmdbMessageJournal, LmdbSnapshotStore,
and LmdbBatchedMessageJournal. serializeEntry/deserializeEntry now delegate to
SerializationProvider. Default provider is JavaSerializationProvider for backward compat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d Java providers

Tests cover: non-Serializable records via Kryo, sealed interfaces, JournalEntry
generics, concurrent Kryo isolation, FileMessageJournal integration (Kryo + Java
backward-compat). Non-Serializable types verified to fail with Java provider.
Kryo provider now uses Objenesis StdInstantiatorStrategy for classes without
no-arg constructors (Java records, final classes).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 23 complete. SerializationProvider interface + Kryo/JSON/Java implementations
wired into ReliableMessagingSystem, FileMessageJournal, and LmdbMessageJournal.
Serializable constraint removed from LMDB journals. Tests green (15/15).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Documents Redis key namespace (Hash for journal, String for snapshot),
command mapping for all MessageJournal and SnapshotStore operations,
Redis Cluster hash tag co-location strategy, AOF persistence mode guidance,
tradeoff comparison vs LMDB and file, Lettuce async API design notes,
and 6 known limitations/risks. Basis for Phase 25 implementation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e and lib

Adds io.lettuce:lettuce-core:6.3.2.RELEASE as the Redis client library.
Lettuce chosen over Jedis for async-first API (CompletableFuture native)
compatible with MessageJournal/SnapshotStore async contract. No implementation
yet — RedisPersistenceProvider implemented in Phase 25.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase 24 complete. Schema design covers Hash-based journal, String-based
snapshot, key namespace with Cluster hash tags, AOF durability guidance,
Redis vs LMDB vs file tradeoffs, and Lettuce async API patterns.
Lettuce 6.3.2.RELEASE added to cajun-persistence and lib. All tests green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements MessageJournal<M> on top of Redis using Lettuce. Uses a Lua
script for atomic sequence-counter increment + HSET in a single round-trip.
readFrom filters and sorts hash fields by sequence number, truncateBefore
deletes fields strictly below the threshold, and getHighestSequenceNumber
reads the counter key returning -1 for absent keys. Actor IDs with colons
are sanitized to underscores; hash-tag notation forces Redis Cluster co-location.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…mantics

Implements SnapshotStore<S> backed by Redis. Each actor stores exactly one
snapshot (latest wins) at key {prefix}:snapshot:{sanitized(actorId)}. The
full SnapshotEntry (actorId + state + sequenceNumber) is serialized and
stored as a Redis String. isHealthy() uses a synchronous PING check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s tag

RedisPersistenceProvider creates a single shared Lettuce connection with
StringCodec/ByteArrayCodec and provides factory methods for RedisMessageJournal
and RedisSnapshotStore. Batched journals are wrapped in SimpleBatchedMessageJournal.
The 'requires-redis' JUnit tag is added to excludeTags in both cajun-persistence
and lib test tasks so integration tests are not run without a Redis instance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tuce

- RedisMessageJournalTest: 8 tests covering append, readFrom, truncateBefore,
  getHighestSequenceNumber, and actorId sanitization using mocked async commands
- RedisSnapshotStoreTest: 4 tests covering saveSnapshot, getLatestSnapshot,
  deleteSnapshots using mocked Lettuce connection
- Use concrete RedisFuture anonymous impl to avoid Mockito strict-stubbing issues
  with default interface methods (toCompletableFuture)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ulActor recovery

- RedisIntegrationTest: 7 tests covering journal round-trip, truncation, highest
  sequence number, snapshot save/load/delete/overwrite, and health check
- RedisStatefulActorTest: verifies Phase 22 C1 bug fix — actor recovers state from
  Redis after full ActorSystem restart, then continues processing correctly
- Both test classes tagged @tag("requires-redis") and excluded from default test run

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…startup health check

Registers the given provider in PersistenceProviderRegistry and sets it as
the default during start(), so all StatefulActors on the node use shared
persistence. Logs WARN if provider reports unhealthy at startup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Verifies null-system replacement, existing-system preservation, record
traversal, nested Pid rehydration, and null-state handling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dis cross-node

Replaces MockBatchedMessageJournal/MockSnapshotStore with shared RedisPersistenceProvider
so both system1 and system2 use the same Redis journal/snapshot keys. StatefulActor on
system2 replays 5 messages from Redis and recovers count=5; 1 more increment → count=6.
Original @disabled test kept as historical bug documentation at method level.
Tagged requires-redis, UUID actor IDs prevent cross-run interference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…append and recovery

Measures message throughput and recovery latency for FileMessageJournal
and RedisMessageJournal over N=500 messages. Results printed to stdout.
Tagged performance (+ requires-redis for Redis tests) — excluded from default runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…essagingSystem

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lthCheck()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… update logback.xml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
contrasam and others added 30 commits April 11, 2026 14:33
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Double-counted remote message failures: routeToNode().exceptionally()
  now skips incrementRemoteMessageFailures() when messagingSystem is a
  ReliableMessagingSystem (doSendMessage already counts the failure)
- Silent SerializationException in handleClient(): add explicit catch
  before IOException so deserialization errors are logged rather than
  swallowed by the executor's uncaught handler
- Jackson RCE via DefaultTyping.EVERYTHING: replace allowIfBaseType(Object)
  + EVERYTHING with configurable trusted package prefixes and NON_FINAL;
  default INSTANCE restricts to com.cajunsystems.*, java.*, javax.*

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add 29-2-PLAN.md and 29-2-SUMMARY.md documenting the three P1 fixes
(double-counted metrics, silent SerializationException, Jackson RCE).
Update STATE.md decisions and ROADMAP.md Phase 29 section.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
30-1: ClusterConfiguration builder + ClusterManagementApi interface +
      listNodes/listActors read ops + 10 unit tests
30-2: migrateActor + drainNode implementations + drain-and-rejoin tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…erManagementApi

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e() into ClusterActorSystem

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…d-op tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ad-ops plan

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…agementApi

migrateActor: validates target via metadata store (authoritative, not in-memory
knownNodes cache), updates etcd assignment, invalidates TTL cache, stops local
actor when this node is the source so state is persisted before lazy recovery.

drainNode: queries live node list from etcd, excludes drained node, distributes
actors via RendezvousHashing, runs migrations in parallel, swallows per-actor
failures (best-effort drain), fails fast if no other nodes available.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Moves the ConcurrentHashMap-backed MetadataStore stub from an inner
class in ClusterManagementApiReadTest to a package-private top-level
class, available to all cluster test files without duplication.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…g etcd assignment

ClusterActorSystem.shutdown(actorId) deletes the actor's etcd entry as part of
unregistering from the cluster. This caused migrateActor to write the new target
assignment then immediately delete it when stopping the local mailbox.

Add shutdownLocalOnly() (package-private) that calls super.shutdown() only,
and use it in DefaultClusterManagementApi.migrateActor() so the new assignment
written to etcd is preserved after the local actor stops.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
4 migrate tests: updatesMetadataStore, unknownTarget, notCurrentlyAssigned,
listActors reflects new assignment.
5 drain tests: migratesAll, emptyNode, singleNodeFails, consistentHashing
distribution, drain-and-rejoin cycle.
All use InMemoryMetadataStore + NoopMessagingSystem — no etcd required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…plan

Phase 30 fully complete. Add 30-2-SUMMARY documenting the shutdownLocalOnly
deviation. Update STATE.md decisions and ROADMAP.md Phase 30 to ✅.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
31-1: Shared test helper extraction (WatchableInMemoryMetadataStore +
      InMemoryMessagingSystem) + 3 chaos tests + 3 lifecycle/drain tests
31-2: cluster_mode.md rewrite + cluster-deployment.md + cluster-serialization.md
      + 100-msg Redis state recovery test + ClusterStatefulRecoveryExample

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ssagingSystem as shared test helpers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs fixed while writing the cluster lifecycle integration tests:

1. shutdownLocalOnly() leak: Actor.stop() always calls system.shutdown(actorId)
   which deleted the metadata entry — defeating the purpose of shutdownLocalOnly.
   Fix: track actors in skipMetadataDeleteActors; shutdown(actorId) skips the
   delete when the flag is set.

2. System stop deletes actor metadata: ClusterActorSystem.stop() called
   super.shutdown() which stopped each actor, triggering Actor.stop() →
   system.shutdown(actorId) → metadataStore.delete(). This erased assignments
   before the leader could reassign them to surviving nodes.
   Fix: suppress metadata deletion for all local actors during stop(); node key
   deletion already signals departure — individual actor metadata is left for
   the leader to redistribute.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the original 141-line stub with a ~300-line document covering
ClusterConfiguration builder, ClusterManagementApi, ClusterMetrics,
NodeCircuitBreaker, TTL cache, Redis persistence, and serialization.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Expanded from 141 to ~300 lines. Covers cross-node state recovery
(Phase 26+), ClusterManagementApi with rolling upgrade workflow,
NodeCircuitBreaker, ClusterMetrics, health check, graceful degradation,
and an updated production checklist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New ~200-line guide covering etcd cluster setup, Redis durability
config, JVM startup bootstrap, health monitoring with Prometheus,
rolling upgrade procedure, Kubernetes StatefulSet example, and a
troubleshooting section.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New ~150-line guide covering KryoSerializationProvider vs
JsonSerializationProvider tradeoffs, security model, schema evolution
with Kryo, Kryo-to-JSON migration procedure, and verification steps.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…erStateTest

New testStatefulActorFullLifecycle_100Messages verifies that a StatefulActor
accumulating 100 Redis-journaled messages fully recovers on a new cluster node,
then correctly processes additional messages (expected total: 101).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Self-contained example demonstrating stateful actor recovery across a
simulated cluster failover. Uses inline SimpleMetadataStore and
SimpleMessagingSystem (no real etcd/gRPC needed), RedisPersistenceProvider,
20 increments on system1, drainNode + stop, then recovery and 5 more
increments on system2 verified at count=25.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mark 31-2 as complete in PLAN.md, create 31-2-SUMMARY.md with commit
hashes and deviations, update STATE.md (Phase 31 complete, decisions
logged), mark 31-2 as done in ROADMAP.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant