Real Sam

A language model built from brain-inspired spiking neurons — not transformers.

Real Sam uses Leaky Integrate-and-Fire (LIF) spiking neurons, curriculum learning, and environment-driven plasticity to learn language the way a brain does: through binary spikes, temporal dynamics, and sparse computation.

Results

Phase        Seq Length   Val Loss   Perplexity
Words        8            2.64       14
Phrases      16           2.55       13
Sentences    32           2.48       12
Stories      64           2.40       11

6M parameters. ~10% firing rate. Trained on a single GTX 1050 Ti.

Architecture

Token → STE Spike Encoder → Linear Projection (256 → 512)
    → 6x Environment Spiking Blocks (LIF + diversity + residual)
    → Weight-Tied Readout → Next Token

Each token becomes a binary spike vector via the Straight-Through Estimator. Six layers of LIF neurons with learnable decay process the sequence recurrently — membrane potential carries temporal context, and spikes are binary {0, 1}. The readout layer reuses the transpose of the embedding matrix, saving more than 60% of the parameters.

Generation is O(1) per token. No attention. No KV cache. Just spiking state.

Key Innovations

Curriculum Learning — Data complexity increases in phases (words, phrases, sentences, stories, conversations), mimicking infant language development. Phase transitions are automatic, triggered by loss convergence.
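The convergence-triggered phase transition could look like this. A hypothetical sketch: the window size, tolerance, and class names are assumptions, not the project's actual code:

```python
from collections import deque

class CurriculumScheduler:
    """Advance to the next data phase when validation loss converges,
    i.e. relative improvement over a window falls below a tolerance."""
    def __init__(self, phases, window=5, tol=0.01):
        self.phases = list(phases)
        self.idx = 0
        self.tol = tol
        self.window = window
        self.history = deque(maxlen=window)

    @property
    def phase(self):
        return self.phases[self.idx]

    def step(self, val_loss):
        self.history.append(val_loss)
        if len(self.history) == self.window:
            improvement = (self.history[0] - self.history[-1]) / self.history[0]
            if improvement < self.tol and self.idx < len(self.phases) - 1:
                self.idx += 1           # loss has plateaued: harder data
                self.history.clear()
        return self.phase

sched = CurriculumScheduler(["words", "phrases", "sentences", "stories"])
for loss in [2.66, 2.65, 2.645, 2.64, 2.64]:   # flat losses trigger a transition
    phase = sched.step(loss)
```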

Shared Environment — One global stress signal modulates all neurons simultaneously, like cortisol in the bloodstream. High loss = stressed environment = neurons explore more. Inspired by Cortical Labs' DishBrain and the Free Energy Principle.

Neuron Diversity — Each neuron has a fixed "personality" sampled at initialization (LogNormal diversity factor), simulating biological receptor density. Same environment, different responses. Sensitive explorers and resilient anchors.
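The shared environment and per-neuron diversity described above can be sketched together. The function name and the loss-to-stress mapping are assumptions; the actual modulation in the project may differ:

```python
import torch

def stress_gain(loss: float, diversity: torch.Tensor,
                lo: float = 2.0, hi: float = 4.0) -> torch.Tensor:
    # Map training loss to a global "stress" level in [0, 1]: high loss
    # means a stressed environment and more exploration (hypothetical mapping).
    stress = min(max((loss - lo) / (hi - lo), 0.0), 1.0)
    # One shared signal, but each neuron responds according to its own
    # fixed diversity factor: same environment, different responses.
    return 1.0 + stress * diversity

# Fixed "personalities" sampled once at initialization (LogNormal, per the README).
diversity = torch.distributions.LogNormal(0.0, 0.5).sample((512,))
gain = stress_gain(loss=3.5, diversity=diversity)   # high loss -> gains above 1
```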

Firing Rate Regularization — Neurons maintain ~10% sparse firing through a loss penalty, not threshold manipulation. Backprop naturally discovers efficient sparse codes, just like biological cortex.
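The regularizer might be as simple as a squared deviation of the batch-mean firing rate from the target. A sketch; the actual penalty form and its weight in the total loss are assumptions:

```python
import torch

def firing_rate_penalty(spikes: torch.Tensor,
                        target_rate: float = 0.10) -> torch.Tensor:
    # Penalize deviation of the mean firing rate from the ~10% target.
    # Added to the task loss, this lets backprop discover sparse codes
    # without manipulating thresholds directly.
    rate = spikes.mean()
    return (rate - target_rate) ** 2

# Toy batch of binary spikes firing at roughly 10%.
spikes = (torch.rand(32, 512) < 0.10).float()
penalty = firing_rate_penalty(spikes)   # near zero when the rate is on target
```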

Getting Started

git clone https://github.com/nakaiwilliams/real-sam.git
cd real-sam
pip install -r requirements.txt

Download training data

python -m src.large_data --data-dir data --vocab 4096

This downloads TinyStories, Alpaca, Dolly, and OpenAssistant data, trains a BPE tokenizer, and caches everything locally.

Train

python src/train.py --mode v4 --epochs 80 --batch-size 32 --grad-accum 4

Training runs on CUDA, Apple MPS, or CPU. A GTX 1050 Ti (4GB VRAM) handles batch_size=32 comfortably.

Resume from checkpoint

python src/train.py --mode v4 --resume --epochs 80

Chat

python -m src.chat --checkpoint checkpoints/real-sam-v4.pt

Generate text

python src/generate.py --checkpoint checkpoints/real-sam-v4.pt --prompt "Once upon a time"

Project Structure

src/
  neurons.py          LIF neuron implementations (V1-V4)
  encoder.py          STE spike encoder (tokens → binary spikes)
  network.py          Full model architectures (RealSam V1-V4)
  train.py            Training loop with curriculum learning
  data.py             BPE tokenizer and dataset utilities
  curriculum_data.py  Multi-phase curriculum data pipeline
  chat.py             Interactive chat interface
  generate.py         Text generation
  spiking_ner.py      Spiking NER model (for PII detection)
  train_spiking_ner.py  NER training pipeline

docs/
  index.html          Project landing page

checkpoints/          Model checkpoints (not in git — train or download)
data/                 Training data (not in git — download via script)

Model Versions

Version   Params   Key Feature                   Notes
V1        ~1M      Basic LIF + recurrence        Character-level Shakespeare
V2        ~3M      Residual blocks + LayerNorm   BPE tokenizer, conversation data
V3        ~6M      Homeostatic thresholds        Per-neuron adaptive thresholds (deprecated)
V4        ~6M      Environment + diversity       Shared stress signal, neuron personalities

Requirements

  • Python 3.9+
  • PyTorch 2.0+
  • snnTorch 0.7+
  • tokenizers
  • datasets (HuggingFace)
  • tqdm, numpy, matplotlib

How It Works

Real Sam processes language through spiking dynamics:

  1. Encoding: Each BPE token is embedded and passed through a sigmoid + threshold to produce a binary spike vector. The Straight-Through Estimator provides gradients for backpropagation.

  2. Processing: Six stacked spiking blocks process the sequence one token at a time. Each block has:

    • A feedforward path (fc_in)
    • A recurrent path from previous spikes (fc_rec)
    • A LIF neuron with learnable decay (beta)
    • A gated residual connection

  3. Environment: A shared stress signal, computed from the training loss, modulates all neurons' gain. Each neuron's response is scaled by its fixed diversity factor — some neurons are sensitive explorers, others are resilient anchors.

  4. Readout: The output projects back to embedding space and multiplies by the transposed embedding matrix (weight tying). This produces next-token logits without a separate vocabulary projection.

  5. Curriculum: Training data complexity increases automatically through phases. The model learns words before phrases, phrases before sentences, and so on — just like a child.
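The LIF dynamics in step 2 can be sketched as a single recurrent cell. Illustrative only: the names, the soft reset, and the fixed threshold are assumptions rather than the project's exact implementation:

```python
import torch

class LIFCell(torch.nn.Module):
    """Minimal leaky integrate-and-fire step with a learnable decay (beta)."""
    def __init__(self, dim: int, threshold: float = 1.0):
        super().__init__()
        self.beta = torch.nn.Parameter(torch.full((dim,), 0.9))  # learnable decay
        self.threshold = threshold

    def forward(self, current: torch.Tensor, mem: torch.Tensor):
        mem = self.beta * mem + current            # leaky integration of input
        spikes = (mem > self.threshold).float()    # binary spikes {0, 1}
        mem = mem - spikes * self.threshold        # soft reset after firing
        return spikes, mem

cell = LIFCell(dim=512)
mem = torch.zeros(1, 512)                          # membrane potential is the
for _ in range(8):                                 # only state carried in time
    spikes, mem = cell(torch.randn(1, 512), mem)
```

Because the only state is the membrane potential, generating each new token costs a fixed amount of work, which is the O(1)-per-token property claimed above.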

Why Spikes

Transformers are brilliant. But they're not how brains work.

Biological neurons communicate through binary spikes — discrete events in time. Information is encoded in when neurons fire, not in continuous activation values. This is fundamentally more efficient: most neurons are silent most of the time.

Real Sam explores whether this principle can work for language. It's not trying to beat GPT-4. It's asking: what if we built language models the way evolution built brains?

The answer, so far: a 6-million-parameter spiking network can learn grammar, narrative structure, and basic conversation. Not perfectly. But it does so with ~10% of neurons active at any time, O(1) generation per token, and no attention mechanism at all.

License

MIT. See LICENSE.

Built by Nakai Williams. Powered by spikes, not attention.

