ATLAS

Active-inference Training with Learned Adaptive Stigmergy

"Don't train on what humans wrote about the world.
Train on what you actually discover about the world.
Validate what you claim. Own what you build."

An open source contribution from OpenHub Research (Thailand)
Website: atlasagi.org · Author: Robin Dey · Institution: https://openhubresearch.org/

What is ATLAS?

ATLAS is a next-generation LLM training framework built in pure Rust with zero external crate dependencies — the SQLite principle applied to AI infrastructure.

It fuses four architectural innovations:

Component	Role	Key property
ASTRA-dev	Live discovery engine	~10s/cycle, NASA/WHO/World Bank APIs, causal inference
GraphPalace	Stigmergic memory	Pheromone-guided curriculum, O(1/√T) convergence
TRM-CausalValidator	Recursive validator	7M params, 0.1% compute, Quality Gate 6
ZK Schnorr proofs	Provenance chain	LLM output → live API, cryptographically verifiable

v4.0.0 — Champagnat n-morphic Framework + OLMo-3-7B Fix:

🧬 InvasionFitnessScorer — morphic fitness f(y) = success − cost − Σcos_sim·n̄ (fixes pheromone saturation)
🌊 CanonicalPheromoneUpdate — principled decay Δρ ∝ μ·σ²·n̄·∂₁s (Champagnat-Méléard 2011)
⚖️ BarBovier2017Constraints — stability gate: explore_ratio × batch_size > 10, temp > 1/√batch
🔀 CognitiveBranching — n-morphic OODA bifurcation on plateau detection
🔆 HJConcentrationPrior — Hopf-Cole sharpening T_eff(s) = T₀/(1+γs) in TRM recursion
🔧 Issue #7 fix — OLMo-3-7B SWA (24/32 sliding layers, window=4096) + YaRN RoPE + config.json auto-patch

v4.0.2 — BF16 GPU Inference Path (Issue #9):

⚡ BF16 W16A32 — weights in BF16 (14 GB) vs f32 (28 GB); GpuBufBf16, GpuBufKind, upload_bf16() in atlas-tensor
🔥 GEMV kernels — sgemv_bf16_kernel + sgemv_f32_kernel: one-warp-per-row for N=1 decode; fixes 32× tiled-GEMM inefficiency
🚀 OLMo-3-7B-Think: 4.1 → 19.9 tok/s (4.8× speedup, A100-SXM4-40GB, W16A32)

v4.0.3 — Math Integrity Fixes (Issue #11):

🧮 CanonicalPheromoneUpdate λ decay — replaced linear formula base_rate × (1 − canonical_term) (went negative when term > 1, dead gradient at clamp boundary) with base_rate × exp(−canonical_term): always positive, smooth, zero-gradient fidelity, hardware-safe for v6 ASIC spec
🏆 InvasionFitnessScorer competition kernel — fixed negative Lotka-Volterra coefficients: raw cosine_sim ∈ [−1, 1] was giving fitness bonuses to anti-correlated strategies (mutualism, not competition); replaced with α_ij = ReLU(cos_sim − 0.2) — threshold at 4σ above noise floor in d=384 embedding space; competition_threshold added to InvasionFitnessConfig
✅ 532/532 tests (+4 new regression tests); GPU validated: 47/47 A100 model tests, OLMo-3-7B-Think still 19.9 tok/s

The result: a self-improving scientific intelligence that trains on what it actually discovers about the world — real causal relationships from live data, validated by recursive architecture, guided by stigmergic memory.

Nobody has built this before. See CHARTER.md for the full architecture.

The Big Idea

Every other LLM is trained on:

What humans wrote on the internet (web scrapes, Wikipedia)
Synthetic data generated by another LLM (GPT-4 distillation)
Human-curated datasets (expensive, frozen at curation time)

atlas-7b is trained on:

What an autonomous science engine actually discovers about the world
Real causal relationships extracted from live NASA, WHO, World Bank APIs
Validated findings with Bayesian confidence scores and PC/FCI causal inference
A corpus that grows every 10 seconds and never contains stale or duplicated information

This is not a better fine-tuning recipe. This is a different paradigm for what training data can be.

Pure Rust — Zero Dependencies

The SQLite principle applied to AI infrastructure.

atlas/
├── Cargo.toml          # workspace root — [dependencies] is empty by design
├── kernels/
│   ├── matmul.cu       # raw CUDA kernel (no cudarc crate)
│   ├── attention.cu    # flash attention from scratch
│   └── quant.cu        # INT4/INT8 quantization
└── crates/
    ├── atlas-core/     # error types, traits, config
    ├── atlas-tensor/   # Tensor + CUDA FFI (the seed of everything)
    ├── atlas-grad/     # autograd tape, backward pass
    ├── atlas-optim/    # AdamW, cosine LR scheduler
    ├── atlas-quant/    # INT4/INT8 quantization, QLoRA
    ├── atlas-model/    # transformer: MultiHeadAttn, FFN, RMSNorm, RoPE
    ├── atlas-tokenize/ # BPE tokenizer (sentencepiece port)
    ├── atlas-palace/   # GraphPalace stigmergic memory: A* search, 5-type pheromones, Active Inference
    ├── atlas-mcp/      # MCP server: 28 palace tools via JSON-RPC 2.0 stdio + connection pool
    ├── atlas-api/      # OpenAI-compatible HTTP endpoint: /v1/chat/completions, SSE streaming
    ├── atlas-trm/      # TRM-CausalValidator (7M params, arXiv:2510.04871)
    ├── atlas-causal/   # PC/FCI causal inference (py-causal port)
    ├── atlas-bayes/    # Bayesian confidence scoring
    ├── atlas-astra/    # ASTRA OODA engine (~8K LOC, full port)
    ├── atlas-corpus/   # LiveDiscoveryCorpus + DeepSupervisionTrainer + quality gates
    ├── atlas-zk/       # ZK Schnorr proofs (asi-build port)
    ├── atlas-http/     # HTTP client via raw libc syscalls
    ├── atlas-json/     # JSON parser from source
    ├── atlas-safety/   # Horn-clause safety constitution, 5-state FSM, CircuitBreaker
    ├── atlas-bridge/   # ZK-attested Rings↔ETH interface (Sepolia-compatible)
    └── atlas-cli/      # CLI: train / discover / eval / prove / mcp / api / bench

21 crates. One coherent system. Zero external Rust dependencies.

CUDA is called via raw extern "C" FFI from build.rs + .cu kernel files — no cudarc, no tch, no candle. The same approach that makes SQLite trustworthy, applied to GPU compute.

// atlas-tensor/src/lib.rs — the first line of ATLAS
pub struct Tensor {
    data:  Vec<f32>,
    shape: Vec<usize>,
}

Every billion-parameter transformer starts here.

Seven Pillars

GraphPalace Memory — pheromone-weighted persistent knowledge; search_by_embedding(), hot_paths(), deposit_pheromones()
Morphic Warm-Start — O(1/√T) cross-run convergence (proven in BUTTERS, R²=0.982, p<10⁻³⁰)
Stigmergic RLVR — r_total = α·r_verifiable + β·r_pheromone; pheromone decay prevents reward hacking
Active Inference Data Gen — palace cold spots direct ASTRA to fill knowledge gaps
ZK Knowledge Claims — Schnorr proof chain from LLM output to raw API data; hallucinations have broken proof trails
LiveDiscoveryCorpus — ASTRA's output as a living training dataset; ~86K quality examples/month
TRM-CausalValidator — 7M-param recursive validator; z = net(x,y,z) × 6 recursions; Quality Gate 6; generates Type 5 training traces

GPU Inference

ATLAS v4.0.0 delivers a fully GPU-resident forward pass — hidden states stay in VRAM between tokens, with pre-pinned weight upload at model load time.

A100-SXM4-40GB Benchmark (sm_80, CUDA 12.9)

Model	Params	GPU tok/s	VRAM	Notes
SmolLM2-135M	135M	37.7	507 MiB	f32, sm_80
SmolLM2-360M	360M	25.4	~1.4 GB	f32
SmolLM2-1.7B	1.7B	12.6	~6.5 GB	f32, 2.4× over CPU
TinyLlama-1.1B	1.1B	20.9	~8.4 GB	f32
OLMo-3-7B-Think	7B	19.9	~14 GB	BF16 W16A32 (v4.0.2+); was 4.1 tok/s CPU

CUDA Kernel Suite

Kernel	What it does
`rmsnorm_forward`	RMSNorm in CUDA — replaces per-token CPU loop
`rope_forward`	RoPE rotation — parallel over heads
`silu_mul_forward`	SwiGLU gate fused — single CUDA pass
`atlas_adamw_step`	AdamW optimizer step entirely on GPU
`sgemm_vec`	Zero-copy matrix×vector; `GpuVec` activation buffer

CUDA portability: all kernels use rsqrtf() (not __rsqrtf()) for cross-platform compatibility.

atlas-api — OpenAI-Compatible Endpoint

ATLAS v4.0.0 adds atlas-api — an OpenAI-compatible HTTP inference server. Drop-in replacement for any OpenAI API client.

# Start the server
./target/release/atlas api serve --model /home/user/models/smollm2-135m --port 8080

Endpoints

Endpoint	Method	Description
`/v1/chat/completions`	POST	Chat completions with SSE streaming
`/v1/completions`	POST	Text completions
`/v1/models`	GET	List available models

Usage Examples

# Chat completion (streaming)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "atlas",
    "messages": [{"role": "user", "content": "What is morphic resonance?"}],
    "stream": true
  }'

# Non-streaming
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "atlas",
    "messages": [{"role": "user", "content": "Explain stigmergic memory"}],
    "stream": false
  }'

# List models
curl http://localhost:8080/v1/models

Features: SSE streaming, CORS headers, echo mode for testing, 40 tests, 0 external dependencies.

DeepSupervisionTrainer

The DeepSupervisionTrainer in atlas-corpus implements multi-pass deep supervision — each training batch runs N_sup=4..16 forward passes, summing loss across all supervision points with pheromone-driven latent carry between passes.

// atlas-corpus/src/deep_supervision.rs
pub struct DeepSupervisionTrainer {
    pub n_sup: usize,          // 4..16 forward passes per batch
    pub latent_carry: bool,    // carry hidden state between passes
    pub pheromone_weight: f32, // pheromone × mutation-selection coupling
    pub loss_trace: Vec<f32>,  // per-pass loss telemetry
}

Theory (TRM arXiv:2510.04871 validated): deep supervision accounts for >75% of TRM's improvement over baseline transformers. The DeepSupervisionTrainer implements this in the training loop — each N_sup pass is one phenotypic morph; latent carry approximates Lotka-Volterra equilibrium n̄ᵢ; pheromone_weight sets the mutation-selection coupling μ.

Convergence prediction: doubling N_sup → √2 speedup in O(1/√T) morphic convergence. Testable via the loss_trace telemetry.

Horn-Clause Safety Constitution

atlas-safety v4.0.0 adds a tractable safety constitution implemented as Horn clauses, alongside the existing 5-state FSM and CircuitBreaker.

8 safety principles across 4 non-overlapping domains:
  ┌─────────────────┬──────────────────────────────────┐
  │ capability      │ scope_limits, capability_bounds  │
  │ data            │ provenance_required, dedup_gate  │
  │ deployment      │ audit_trail, circuit_breaker     │
  │ reasoning       │ causal_grounding, zk_verifiable  │
  └─────────────────┴──────────────────────────────────┘

Why Horn clauses? Young (2026, arXiv:2501.15446) proves NP-hardness of general safety constitution verification. Horn-clause restriction (≤12 principles, 4 non-overlapping domains) ensures polynomial tractability — the safety checker can verify any system state in O(n·m) where n = principles, m = state predicates. No exponential blowup.

PalaceBackend Trait

atlas-palace v4.0.0 extracts a PalaceBackend trait, enabling pluggable storage backends without API changes:

pub trait PalaceBackend: Send + Sync {
    fn search(&self, query: &str, limit: usize) -> Vec<DrawerMatch>;
    fn deposit_pheromones(&mut self, path: &[RoomId], ptype: PheromoneType, intensity: f32);
    fn navigate(&self, from: RoomId, to: RoomId) -> Vec<RoomId>;
    fn hot_paths(&self, limit: usize) -> Vec<Path>;
    // ... 32 additional methods
}

// Palace implements PalaceBackend — fully trait-object safe
pub struct Palace { /* existing implementation */ }
impl PalaceBackend for Palace { /* ... */ }

// Swap backends without changing caller code
let palace: Box<dyn PalaceBackend> = Box::new(Palace::new(config));

This is the prerequisite for LadybugDB migration (Q3 2026) — a drop-in Grafeo/LadybugDB backend can replace the default implementation with zero API changes.

Build Order (7 Stages, ~22 Weeks)

Stage	Weeks	Crates	Milestone
1	1–4	atlas-core → tensor → grad → optim → quant	f32 matmul CPU+GPU, backward pass through 2-layer MLP
2	5–7	atlas-model → tokenize	OLMo 3 7B forward pass in pure Rust, token generation
3	8–9	atlas-palace + atlas-mcp	GraphPalace 36-method engine native, MCP server
4	10–11	atlas-trm	TRM-CausalValidator, <10ms causal graph pass/fail
5	12–16	http → json → bayes → causal → zk → astra	Full ASTRA OODA in Rust, ZK provenance
6	17–20	atlas-corpus + atlas-api	QLoRA SFT, DeepSupervisionTrainer, OpenAI API
7	21–22	atlas-zk (ext) → cli	End-to-end proof chain, atlas-7b release binary

Architecture Diagrams

Eight publication-quality figures are in docs/dashboard/diagrams/. The interactive dashboard (project overview, roadmap, papers, component status) is at docs/dashboard/index.html.

Figure	Description
Fig. 1	Full System Architecture (v3.0, TRM cluster)
Fig. 2	Discovery Flywheel — the self-improving loop
Fig. 3	ASTRA OODA + GraphPalace integration
Fig. 4	Morphic Warm-Start cross-run convergence
Fig. 5	Stigmergic RLVR pheromone reward function
Fig. 6	ZK Provenance Chain
Fig. 7	Training Pipeline phase roadmap
Fig. 8	Hybrid Generative-Recursive Architecture (TRM v3.0)

Paper Strategy

Paper	Venue	Contribution
Paper 1	EMNLP 2026	ATLAS architecture + LiveDiscoveryCorpus
Paper 2	NeurIPS 2026	Discovery Flywheel — closed-loop scientific intelligence
Paper 3	ICML 2027	Stigmergic RLVR — pheromone reward prevents policy collapse
Paper 4	ICLR 2027	O(1/√T) morphic convergence for LLMs (co-author Robin Dey)
Paper 5	IEEE S&P 2027	End-to-end ZK provenance for LLM outputs
Paper 6	ICLR/NeurIPS 2027	Hybrid generative-recursive architecture (TRM integration)

Getting Started

git clone https://github.com/web3guru888/ATLAS.git
cd ATLAS

# Run all tests (excludes CUDA-requiring tensor tests on CPU-only machines)
cargo test --workspace --exclude atlas-tensor

# Build the atlas binary
cargo build --release -p atlas-cli

# Full OODA discovery loop
./target/release/atlas discover --cycles 5 --output corpus.json

# Train on discoveries
./target/release/atlas train --corpus corpus.json --epochs 3

# Start OpenAI-compatible API server
./target/release/atlas api serve --model /path/to/model --port 8080

# ZK-prove a claim
./target/release/atlas prove --claim "Pheromone trails compound information gain" \
    --secret $(openssl rand -hex 16)

# Inspect palace memory
./target/release/atlas palace --stats --hot

# MCP server (connect to Claude Desktop / Cursor)
./target/release/atlas mcp serve --palace my-palace.json

Prerequisites:

Rust 1.75+ (rustup update stable)
CUDA 12.x + nvcc (optional; falls back to CPU if absent)
GPU with sm_75+ (Tesla T4 / A100+) for CUDA training path

Status — v4.0.3

532/532 tests passing · 21 crates · Zero external crate dependencies · CUDA sm_80 on A100-SXM4-40GB · 19.9 tok/s OLMo-3-7B-Think (BF16)

🏔 v4.0.3 is the current release. Math integrity fixes: CanonicalPheromoneUpdate exp decay (λ never negative) + InvasionFitnessScorer ReLU competition threshold (no mutualism). GPU-validated: 532/532 workspace + 47/47 A100 model tests. OLMo-3-7B-Think 19.9 tok/s confirmed.

What Works

✅ Discovery is real — atlas discover --cycles 3 hits NASA POWER, WHO GHO, World Bank, ArXiv live APIs; causal inference via PC algorithm; Bayesian quality gates
✅ Memory is real — 5-type pheromone system (exploitation/exploration/success/traversal/recency), MMAS ceiling, A* semantic pathfinding (α·C_sem + β·C_phe + γ·C_str), Active Inference agents; atlas palace --hot shows pheromone trails
✅ Training is real — SFT with GradTape + AdamW + LoRA (rank=8) + gradient accumulation + safetensors checkpoint; DeepSupervisionTrainer (N_sup=4..16, loss trace, latent carry)
✅ GPU inference is real — SmolLM2-135M at 37.7 tok/s on A100-SXM4-40GB; OLMo-3-7B-Think at 19.9 tok/s (BF16 GPU, W16A32, 14 GB VRAM — Issue #9 fixed); SWA + YaRN RoPE (Issue #7 fixed)
✅ API is real — atlas api serve exposes /v1/chat/completions + /v1/completions + /v1/models; SSE streaming; CORS; 40 tests
✅ Provenance is real — Schnorr proofs + Groth16 stub (HMAC-SHA256, BLS12-381-compatible interface) + ProvenanceChain; atlas prove generates verifiable proofs
✅ Safety is real — Horn-clause constitution (8 principles, 4 domains, Young 2026 NP-hardness validated); 5-state FSM (BOOT→NOMINAL→DEGRADED→SAFE_MODE→EMERGENCY_STOP); CircuitBreaker; append-only audit log
✅ Bridge is real — AtlasBridge with ZK-attested deposit/withdraw, Sepolia chain_id=11155111, Groth16 proof per transaction
✅ MCP is real — atlas mcp serve exposes 28 tools via JSON-RPC 2.0; McpConnectionPool (max 5, 5-min idle eviction); connects to Claude Desktop / Cursor

Version History

Version	Theme	Tests
v0.1.0	Infrastructure: f32 matmul, backward pass, GPU (7 stages)	186
v0.2.0	Real Memory Palace + MCP (28 tools, JSON-RPC 2.0)	236
v0.3.0 + v0.4.0	Real Discovery Engine + Validated Model Loading	260
v0.5.0	Real Training Loop (LoRA, grad-accum, safetensors checkpoint)	353
v0.6.0	Safety FSM + Groth16 stub + ZK Bridge	383
v0.7.0	Benchmarks, CI, CHANGELOG, REPRODUCIBILITY	383
v1.0.0	Production Release — all milestones complete	383
v2.0.0	CAS Decay + OODA Feedback + Stigmergic Sampler + GPU dispatch (37.7 tok/s on A100)	400
v3.0.0-α.1	atlas-api + PalaceBackend + GPU-resident forward pass + DeepSupervisionTrainer + Horn-clause safety	426
v4.0.0	Champagnat n-morphic framework + Issue #7 fix (SWA + YaRN RoPE + config.json auto-patch for OLMo-3-7B)	528
v4.0.1	Docs + test cleanup for v4.0.0 / Issue #7	528
v4.0.2	BF16 GPU inference path (Issue #9): OLMo-3-7B-Think 4.1 → 19.9 tok/s (4.8×), W16A32, GEMV kernels	528
v4.0.3	Math integrity (Issue #11): λ exp decay + ReLU competition threshold. 47/47 GPU model tests.	532

Crate Status

Crate	Stage	Tests	Status
atlas-core	1	2	✅ Error types, Result, traits
atlas-tensor	1	6	✅ CPU+GPU matmul, INT8/INT4, sm_80 kernels (A100); GPU AdamW kernel; sgemm_vec zero-copy; BF16 GEMV (`GpuBufBf16`, `sgemv_bf16_kernel`, W16A32 inference path)
atlas-grad	1	9	✅ GradTape, matmul/relu/add backward
atlas-optim	1	6	✅ AdamW + CosineScheduler, warmup
atlas-quant	1	7	✅ INT8, INT4, symmetric scaling
CUDA kernels	1	—	✅ tiled GEMM, rmsnorm, rope, silu_mul, AdamW, INT8/INT4 — compiled on A100-SXM4-40GB (sm_80)
atlas-json	2	12	✅ Recursive descent parser, surrogate pairs
atlas-tokenize	2	6	✅ GPT-2 byte-level BPE, tokenizer.json
atlas-model	2	27	✅ OLMo 3 / Llama 3, RoPE, GQA, SwiGLU, SWA, YaRN RoPE, config.json auto-patch; GPU-resident forward pass
atlas-palace	3	79	✅ A* search, 5-type pheromones, Active Inference, MMAS, PalaceBackend trait, session_id, PalaceConfig; v4.0.3: `CanonicalPheromoneUpdate` uses `exp(−x)` decay (always positive, smooth, hardware-safe)
atlas-mcp	3	32	✅ 28 MCP tools, JSON-RPC 2.0, live palace dispatch; McpConnectionPool (max 5, 5-min idle eviction)
atlas-api	3	40	✅ OpenAI-compatible HTTP: /v1/chat/completions, /v1/completions, /v1/models; SSE streaming; CORS
atlas-trm	4	12	✅ TRM-CausalValidator depth-6 RNN, Bayesian combining
atlas-http	5	11	✅ HTTP/1.1 TcpStream, chunked decoding, curl HTTPS
atlas-bayes	5	13	✅ BetaPrior, BayesNetwork, QualityGate, Jaccard novelty
atlas-causal	5	10	✅ PC algorithm, Fisher-Z, standard normal CDF, Meek rules
atlas-zk	5	19	✅ Schnorr + Groth16 stub (HMAC-SHA256, BLS12-381 interface)
atlas-astra	5	15	✅ OODA: NASA POWER / WHO GHO / World Bank / ArXiv; OodaFeedback adaptive explore_ratio
atlas-corpus	6	79	✅ SftTrainer, LoRA (rank=8), grad-accum, safetensors checkpoint; DeepSupervisionTrainer (N_sup 4–16, loss_trace); v4.0.3: `InvasionFitnessScorer` uses `ReLU(cos_sim − 0.2)` competition (α_ij ≥ 0, no mutualism)
atlas-safety	6	30	✅ Horn-clause constitution (8 principles, 4 domains); 5-state FSM; CircuitBreaker; append-only audit log
atlas-bridge	6	8	✅ ZK-attested Rings↔ETH interface, Sepolia chain_id=11155111
atlas-cli	7	30	✅ discover / corpus / train / eval / prove / palace / mcp / api / bench / status
TOTAL		532	✅ All passing — v4.0.3

Quick Start

git clone https://github.com/web3guru888/ATLAS.git
cd ATLAS
cargo build --release -p atlas-cli

# Full OODA discovery + training loop
./target/release/atlas discover --cycles 3 --output my-corpus.json
./target/release/atlas train --corpus my-corpus.json --epochs 2
./target/release/atlas prove --claim "CO2 drives warming" --secret deadbeef01020304
./target/release/atlas palace --stats --hot

# OpenAI-compatible API server
./target/release/atlas api serve --model /path/to/model --port 8080

# MCP server (connect to Claude Desktop / Cursor)
./target/release/atlas mcp serve --palace my-palace.json

# Run benchmarks
./target/release/atlas bench --all

atlas-mcp — Model Context Protocol Server

ATLAS exposes its memory palace as 28 MCP tools via stdio JSON-RPC 2.0, ready for Claude Desktop, Cursor, or any MCP client. v4.0.0 adds McpConnectionPool — lazy pool (max 5 connections, 5-min idle eviction) preventing connection leaks across concurrent MCP clients.

# Add to your Claude Desktop config (~/.config/claude/claude_desktop_config.json)
{
  "mcpServers": {
    "atlas-palace": {
      "command": "./target/release/atlas",
      "args": ["mcp", "--palace", "my-palace.json"]
    }
  }
}

Tool categories:

Category	Tools	Examples
Navigation	8	`palace_search`, `palace_navigate`, `palace_find_similar`
Operations	5	`palace_add_wing`, `palace_add_room`, `palace_add_drawer`
Knowledge Graph	7	`palace_kg_add`, `palace_kg_query`, `palace_kg_contradictions`
Stigmergy	5	`palace_deposit_pheromones`, `palace_hot_paths`, `palace_cold_spots`
Agent Diary	3	`palace_create_agent`, `palace_diary_write`, `palace_diary_read`

Every tool call modifies live palace state. Pheromone trails compound across sessions. Knowledge graphs grow with every interaction.

Benchmarks

ATLAS includes a zero-dependency benchmark suite using atlas_core::bench::Bench. Run with:

cargo test --workspace --exclude atlas-tensor -- --ignored --nocapture

Representative results (Ubuntu, Rust 1.95, A100-SXM4-40GB, CUDA 12.9):

Benchmark	Metric	Description
`gpu_inference_smollm2`	37.7 tok/s	SmolLM2-135M GPU inference (f32), A100-SXM4-40GB
`gpu_benchmark_olmo3_7b_think_bf16`	19.9 tok/s	OLMo-3-7B-Think BF16 GPU inference (W16A32), A100-SXM4-40GB
`palace_search_1000`	~50–200 µs/op	TF-IDF semantic search across 1000 drawers
`astar_100_nodes`	~20–100 µs/op	Pheromone-guided A* pathfinding (100-node KG)
`pheromone_deposit_decay_1000`	~5–20 µs/op	10 deposits + full decay cycle per iteration
`kg_query_100_edges`	~0.5–2 µs/op	KG edge lookup from a source node
`rmsnorm_2048`	~1–5 µs/op	RMSNorm on 2048-dim vector
`rope_128dim_apply`	~50–200 ns/op	RoPE rotation on a single attention head
`schnorr_prove_verify`	~200–500 ns/op	Schnorr ZK proof generation + verification
`json_parse_1kb`	~5–20 µs/op	Parse a 1KB JSON document (zero-dep parser)

Note: Numbers vary by hardware. Run benchmarks on your own machine for accurate results.

Key Numbers

37.7 tok/s — GPU inference throughput (SmolLM2-135M on A100-SXM4-40GB, v4.0.0)
19.9 tok/s — GPU inference throughput (OLMo-3-7B-Think, BF16 W16A32, A100-SXM4-40GB, v4.0.2; was 4.1 tok/s CPU = 4.8× speedup)
2.4× — GPU speedup over CPU inference (SmolLM2-1.7B: 12.6 vs 5.2 tok/s)
507 MiB — VRAM for pre-pinned SmolLM2-135M weights
d = 10.6 — Cohen's d for palace-memory vs. no-memory (ASTRA experiments)
34.4× — more discoveries with memory than without
R² = 0.982 — O(1/√T) convergence fit (BUTTERS morphic warm-start)
1.83× — cross-domain novelty acceleration (DC-24 experiment)
7M params — TRM-CausalValidator size vs. 7B base model (1000× smaller)
45% — TRM accuracy on ARC-AGI-1 (Samsung SAIL Montreal, arXiv:2510.04871)
<10ms — target TRM validation latency per causal graph
~86K — quality-gated training examples per month from ASTRA
8 principles / 4 domains — Horn-clause safety constitution (Young 2026, arXiv:2501.15446)

v4.0 — Champagnat n-Morphic Framework ✅ Implemented

ATLAS v4.0 implements the Champagnat n-Morphic Framework (Issue #6), grounded in Champagnat-Méléard 2011 (PTRF) and Baar-Bovier-Champagnat 2017 (AAP). All Tier 1 (Sprint 1+2) proposals are live as of v4.0.0:

Module	Crate	Key idea
`InvasionFitnessScorer`	atlas-corpus	Replaces raw pheromone softmax; α_ij = ReLU(cos_sim − 0.2) — Lotka-Volterra valid (v4.0.3)
`CognitiveBranching`	atlas-astra	Detects explore_ratio plateau → bifurcates OODA
`CanonicalPheromoneUpdate`	atlas-palace	Principled decay λ = base_rate × exp(−canonical_term) — always positive, smooth (v4.0.3)
`HJConcentrationPrior`	atlas-trm	Hopf-Cole sharpening across TRM recursion steps
`PolymorphicTrainer`	atlas-corpus	k=2,3 morphs (fast/slow/creative) with competition matrix

Mathematical foundation: DeepSupervisionTrainer IS a k-Morphic Trait Substitution System (exact, not analogy). Each N_sup pass = one phenotypic morph. Champagnat Theorem 3.1 derivably explains TRM's >75% gain from deep supervision. Full theory: see research reports.

Hugging Face Model Card

ATLAS models are published to Hugging Face under the openhubresearch organization.

First release: openhubresearch/ATLAS-OLMo-3-7B-Think-v4 — OLMo-3-7B-Think run through the ATLAS v4.0.3 n-morphic framework with BF16 inference (19.9 tok/s A100-SXM4-40GB, W16A32, 532/532 tests, 47/47 GPU model tests).

---
language: en
license: apache-2.0
library_name: atlas
tags:
  - atlas
  - stigmergic-memory
  - active-inference
  - causal-inference
  - pure-rust
  - zero-dependencies
  - champagnat-morphic
  - bf16-inference
base_model: allenai/OLMo-3-0125-7B
---

Models run through ATLAS carry the full n-morphic framework: InvasionFitnessScorer (Lotka-Volterra valid competition), CanonicalPheromoneUpdate (principled adaptive decay), BarBovier2017Constraints (stability gates), CognitiveBranching (OODA bifurcation), and HJConcentrationPrior (Hopf-Cole sharpening). See atlasagi.org for model releases and the LiveDiscoveryCorpus dataset.

License

Code (crates/, kernels/, scripts/): Apache 2.0
Documentation, paper, figures, datasets: CC BY 4.0

See NOTICE for attribution to incorporated components.

Citation

@software{atlas2026,
  title       = {ATLAS: Active-inference Training with Learned Adaptive Stigmergy},
  author      = {Robin Dey},
  year        = {2026},
  institution = {OpenHub Research, Thailand},
  url         = {https://github.com/web3guru888/ATLAS},
  note        = {Pure Rust LLM training framework. Zero external dependencies.
                 v4.0.3: 21 crates, 532 tests, Champagnat n-morphic framework,
                 BF16 GPU inference — OLMo-3-7B-Think 19.9 tok/s on A100-SXM4-40GB (W16A32).
                 Math-validated: exp decay (Issue #11) + Lotka-Volterra competition fix.}
}

Name		Name	Last commit message	Last commit date
Latest commit History 78 Commits
.github/workflows		.github/workflows
crates		crates
docs/dashboard		docs/dashboard
kernels		kernels
scripts		scripts
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CHARTER.md		CHARTER.md
CITATION.cff		CITATION.cff
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
DEPLOY_REPORT.md		DEPLOY_REPORT.md
LICENSE		LICENSE
LICENSE-DOCS		LICENSE-DOCS
NOTICE		NOTICE
README.md		README.md
REPRODUCIBILITY.md		REPRODUCIBILITY.md
atlas-corpus.json		atlas-corpus.json

Folders and files

Latest commit

History

Repository files navigation

ATLAS

What is ATLAS?

The Big Idea

Pure Rust — Zero Dependencies

Seven Pillars

GPU Inference

A100-SXM4-40GB Benchmark (sm_80, CUDA 12.9)

CUDA Kernel Suite

atlas-api — OpenAI-Compatible Endpoint

Endpoints

Usage Examples

DeepSupervisionTrainer

Horn-Clause Safety Constitution

PalaceBackend Trait

Build Order (7 Stages, ~22 Weeks)

Architecture Diagrams

Paper Strategy

Getting Started

Status — v4.0.3

What Works

Version History

Crate Status

Quick Start

atlas-mcp — Model Context Protocol Server

Benchmarks

Key Numbers

v4.0 — Champagnat n-Morphic Framework ✅ Implemented

Hugging Face Model Card

License

Citation

About

Topics

Resources

License

Licenses found

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages