Tobi-Adesoye/renorm-native


renorm-native



🚀 Overview

renorm-native is a PyTorch-compatible neural network module designed to improve numerical stability in deep learning models.

It provides transformer-ready layers that are robust to:

  • Training instability (NaNs / exploding gradients)
  • Irregular tensor shapes and sequence lengths
  • Mixed CPU/GPU execution environments
  • Memory pressure in large-scale workloads

It is designed to be a drop-in architectural component for modern deep learning pipelines.


📦 Installation

Install from PyPI:

pip install renorm-native

Upgrade to latest version:

pip install --upgrade renorm-native

⚑ Quick Start (30 seconds)

Transformer Layer Example

import torch
from renorm import RenormTransformerLayer

# Initialize layer
layer = RenormTransformerLayer(dim=512, heads=8)

# Dummy input: (batch, sequence, features)
x = torch.randn(2, 16, 512)

# Forward pass
y = layer(x)

print(y.shape)

Expected Output

torch.Size([2, 16, 512])

🧠 Core API

1. RenormTransformerLayer

A lightweight transformer block with built-in normalization stability.

RenormTransformerLayer(
    dim: int,
    heads: int,
    eps: float = 1e-5
)

Parameters:

  • dim: Hidden dimension size
  • heads: Number of attention heads
  • eps: Numerical stability constant

2. RenormLinear

A stable replacement for torch.nn.Linear.

from renorm.layers import RenormLinear

Example:

import torch

# Input: a batch of 4 samples with 256 features each
layer = RenormLinear(256, 128)
y = layer(torch.randn(4, 256))

⚙️ Device Compatibility

Automatically works across:

  • CPU (Windows / Linux / Mac)
  • CUDA (NVIDIA GPUs)
  • Mixed environments (fallback-safe execution)

Example:

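import torch
from renorm import RenormTransformerLayer

# Use the GPU when one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

layer = RenormTransformerLayer(dim=512, heads=8).to(device)
# Create the input directly on the target device
x = torch.randn(2, 16, 512, device=device)

y = layer(x)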

🧪 Minimal Validation Test

Run this to verify installation:

python -c "from renorm import RenormTransformerLayer; print(RenormTransformerLayer(dim=256, heads=4))"

Expected behavior: the command exits without errors and prints the layer's module summary.


🏗 Architecture Summary

renorm-native uses a dual-path execution design:

  • CUDA Path (GPU):

    • Optimized tensor execution path
    • High-performance kernel routing (where available)
  • CPU Path (Fallback):

    • Stable numerical execution engine
    • Strict variance preservation for stability

This ensures consistent behavior across heterogeneous compute environments.


📊 Stability Design Principles

1. Variance Stabilization

Prevents numerical collapse in deep stacks by maintaining bounded activation scaling.
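
As a rough illustration of bounded activation scaling, the sketch below uses generic RMS normalization in plain Python. It is not renorm-native's actual implementation. A vector is pushed through a stack of toy layers that each double its magnitude: without renormalization the values overflow to infinity, while rescaling to unit RMS after every layer keeps them bounded. The function names, the gain of 2.0, and the depth of 2000 are illustrative assumptions.

```python
import math

def rms_normalize(values, eps=1e-5):
    """Rescale values to roughly unit root-mean-square; eps guards against division by zero."""
    mean_square = sum(v * v for v in values) / len(values)
    scale = math.sqrt(mean_square + eps)
    return [v / scale for v in values]

def run_stack(values, depth, gain=2.0, renormalize=True):
    """Apply `depth` toy layers that multiply by `gain`, optionally renormalizing after each."""
    for _ in range(depth):
        values = [gain * v for v in values]
        if renormalize:
            values = rms_normalize(values)
    return values

x = [0.5, -1.0, 2.0, 0.0]
stable = run_stack(x, depth=2000, renormalize=True)
unstable = run_stack(x, depth=2000, renormalize=False)

print(max(abs(v) for v in stable))    # stays on the order of 1
print(max(abs(v) for v in unstable))  # inf: repeated doubling overflows float64
```

The `eps` term plays the same role as the `eps` constructor argument described under Core API, assuming that argument is added to the normalization denominator.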

2. Memory Safety

Ensures gradient computation remains isolated from unsafe tensor views in dynamic graphs.

3. Execution Portability

Same model behavior across CPU and GPU environments.


📌 Example Use Cases

  • Transformer models (LLMs)
  • Time-series forecasting systems
  • Anomaly detection pipelines
  • Edge-device inference systems
  • Low-memory GPU environments

⚠️ Notes

  • Requires PyTorch ≥ 2.0
  • Python ≥ 3.10 recommended
  • CUDA optional but supported
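
The requirements above can be checked before importing the package. The following is a minimal sketch using only the standard library; `parse_version` and `check_environment` are hypothetical helper names, not part of renorm-native, and the version parsing is deliberately simple (it ignores local suffixes such as `+cu118` and pre-release tags).

```python
import sys
from importlib import metadata

def parse_version(text):
    """Turn '2.2.0+cu118' into (2, 2, 0), ignoring local or pre-release suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def check_environment():
    """Return a list of human-readable problems; empty means the stated requirements are met."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or newer is recommended")
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if parse_version(torch_version) < (2, 0):
            problems.append(f"PyTorch {torch_version} is older than 2.0")
    return problems

for problem in check_environment():
    print("warning:", problem)
```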

📄 License

MIT License. See LICENSE for details.


🀝 Contributing

Contributions, issues, and improvements are welcome.


🔗 Project

Maintained by the renorm-native team.


🧩 Enterprise / Production Add-On Section


🏒 Enterprise / Production Usage

renorm-native can be used in production systems requiring deterministic numerical stability under high load.

Typical deployment environments:

  • GPU inference clusters (CUDA-enabled)
  • On-prem ML pipelines
  • Edge inference systems
  • Distributed training environments (PyTorch DDP)

🔐 Enterprise License Mode (Optional)

Some builds may enable enterprise validation for regulated or production deployments.

Environment Variable

export RENORM_ENTERPRISE_KEY="your_token_here"

Format

base64_payload.hex_hmac_signature

Programmatic Validation

from renorm.auth import check_enterprise_license

check_enterprise_license()

Failure Modes

| Condition | Behavior |
| --- | --- |
| Missing key | Raises PermissionError |
| Invalid signature | Raises PermissionError |
| Expired token | Raises TimeoutError |
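
To make the `base64_payload.hex_hmac_signature` shape and the failure modes above concrete, here is a minimal standard-library sketch of how a token of that shape could be built and verified. It is not the code behind `renorm.auth.check_enterprise_license`. The JSON payload, its `exp` field (a Unix timestamp), the use of HMAC-SHA256 with URL-safe base64, and the shared secret are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

def make_token(payload: dict, secret: bytes) -> str:
    """Build '<base64 payload>.<hex HMAC-SHA256>' (the payload schema is hypothetical)."""
    encoded = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    signature = hmac.new(secret, encoded.encode(), hashlib.sha256).hexdigest()
    return f"{encoded}.{signature}"

def verify_token(token, secret: bytes, now=None) -> dict:
    """Mirror the documented failure modes for a token of the documented shape."""
    if not token:
        raise PermissionError("missing key")
    encoded, sep, signature = token.rpartition(".")
    expected = hmac.new(secret, encoded.encode(), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing
    if not sep or not hmac.compare_digest(expected.encode(), signature.encode()):
        raise PermissionError("invalid signature")
    payload = json.loads(base64.urlsafe_b64decode(encoded))
    if payload.get("exp", 0) < (time.time() if now is None else now):
        raise TimeoutError("expired token")
    return payload

secret = b"demo-secret"  # hypothetical shared secret, for illustration only
token = make_token({"org": "example", "exp": time.time() + 3600}, secret)
print(verify_token(token, secret)["org"])  # example
```

The signature is checked before the payload is decoded, so a tampered token is rejected without parsing untrusted data.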

⚙️ Production Integration Pattern

Recommended structure in production pipelines:

import torch
from renorm import RenormTransformerLayer

def build_model():
    model = RenormTransformerLayer(dim=1024, heads=16)
    return model

def forward_pass(model, x):
    return model(x)

🧪 CI / Validation Test

Run a deterministic sanity check:

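python -c "
import torch
from renorm import RenormTransformerLayer

# Fix the seed so the random input is the same on every run
torch.manual_seed(0)

layer = RenormTransformerLayer(dim=256, heads=4)
x = torch.randn(2, 8, 256)
y = layer(x)

# The layer should preserve the input shape and produce finite values
assert y.shape == x.shape
assert torch.isfinite(y).all()
print('OK')
"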

📊 Performance Notes

renorm-native is optimized for:

  • Stable forward/backward propagation under long sequence lengths
  • Reduced numerical drift in deep stacks
  • Consistent execution across heterogeneous compute backends

It is not intended as a raw speed-optimized kernel replacement for PyTorch primitives.


🔄 Compatibility Matrix

| Environment | Status |
| --- | --- |
| CPU (Windows) | ✅ Supported |
| CPU (Linux) | ✅ Supported |
| CUDA 11+ | ✅ Supported |
| MPS (Apple Silicon) | ⚠️ Experimental |
| Distributed training (DDP) | ✅ Compatible |
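
One way to respect this matrix when choosing a device is to treat MPS as opt-in. The helper below is a hypothetical sketch, not part of renorm-native; it takes the availability flags as plain booleans so the selection logic can be read and tested without a GPU.

```python
def pick_device(cuda_available: bool, mps_available: bool, allow_experimental: bool = False) -> str:
    """Prefer CUDA, use MPS only when explicitly opted in, otherwise fall back to CPU."""
    if cuda_available:
        return "cuda"
    if mps_available and allow_experimental:
        return "mps"
    return "cpu"

# With PyTorch installed, the flags would typically come from
# torch.cuda.is_available() and torch.backends.mps.is_available().
print(pick_device(cuda_available=False, mps_available=True))                           # cpu
print(pick_device(cuda_available=False, mps_available=True, allow_experimental=True))  # mps
```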

🧠 Design Philosophy

renorm-native prioritizes:

  • Numerical correctness over raw speed
  • Stability over aggressive optimization
  • Cross-device consistency over hardware specialization

It is designed to behave predictably under:

  • gradient explosion conditions
  • low precision arithmetic
  • fragmented tensor memory layouts

📦 Recommended Deployment (Docker)

FROM pytorch/pytorch:2.2.0-cuda11.8-cudnn8-runtime

WORKDIR /app

RUN pip install renorm-native

COPY . .

CMD ["python", "main.py"]

📈 Benchmark (Example Placeholder)

| Layer | Stability Score | NaN Rate |
| --- | --- | --- |
| torch.nn.LayerNorm | baseline | medium under stress |
| renorm-native | improved | near-zero |

(Placeholder values only. Replace them with measured results before publishing this table publicly.)
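
A NaN rate for the table above could be measured with a small harness like the one sketched here. `nan_rate` and `toy_trial` are hypothetical names, and `toy_trial` is only a stand-in: in a real benchmark each trial would run the layer on a stress input and return the flattened output values (for example via `y.flatten().tolist()`). The number printed comes from the toy stand-in and is not a measurement of any layer.

```python
import math

def nan_rate(run_trial, trials):
    """Fraction of trials whose output contains a NaN or infinite value."""
    bad = 0
    for seed in range(trials):
        outputs = run_trial(seed)
        if any(not math.isfinite(v) for v in outputs):
            bad += 1
    return bad / trials

def toy_trial(seed):
    """Stand-in for a forward pass: every fourth trial produces a NaN."""
    return [float("nan")] if seed % 4 == 0 else [1.0, 2.0]

print(nan_rate(toy_trial, trials=8))  # 0.25
```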


🌐 Roadmap

Planned improvements:

  • Distributed kernel optimization (multi-GPU aware routing)
  • Expanded attention primitives
  • Quantization-aware renormalization mode
  • Torch compile integration (torch.compile support)

📩 Support

For production integration or enterprise deployment:

About

Custom CUDA & Triton fused layers for self-stabilizing transformer architectures. Accelerate forward/backward passes and prevent gradient explosions in large-scale LLM training.
