S4 is a high-performance, S3-compatible object storage server written in Rust. It solves the inode exhaustion problem common with traditional file-based storage systems and provides advanced features like atomic directory operations and content-addressable deduplication.
Demo Console: s4console · Login: root/password12345 · Resets every 10 min
Demo API: s4core · Access Key ID / Secret Access Key: my-secret-key_id/my-secret-access-key · Resets every 10 min
- S3 API Compatible: Full compatibility with AWS S3 API (AWS CLI, boto3, etc.)
- Inode Problem Solved: Append-only log storage eliminates inode exhaustion
- Content Deduplication: Automatic deduplication saves 30-50% storage space
- Object Versioning: S3-compatible versioning with delete markers
- Lifecycle Policies: Automatic object expiration and cleanup of old versions
- Atomic Operations: Rename directories with millions of files in milliseconds
- Strict Consistency: Data is guaranteed to be written before returning success
- IAM & Admin API: Role-based access control (Reader, Writer, SuperUser) with JWT authentication
- S3 Select SQL: Query CSV/JSON/Parquet objects with full SQL (powered by Apache DataFusion)
- Multi-Object SQL: Extended S3 Select with glob patterns for querying across multiple objects
- High Performance: Optimized for single-node performance
S4 uses a hybrid storage approach:
- Tiny Objects (< 4KB): Stored inline in the metadata database
- All Other Objects (≥ 4KB): Stored in append-only volume files (~1GB each)
This approach ensures:
- Minimal inode usage (1 billion objects = ~1000 files)
- Maximum write performance (sequential writes)
- Fast recovery (metadata in ACID database)
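The size-based routing above can be sketched in a few lines. This is purely illustrative; the function and names are not S4's actual internals, only a picture of the documented 4KB threshold:

```python
INLINE_THRESHOLD = 4096  # bytes; the documented 4KB default (config: inline_threshold)

def choose_backend(data: bytes) -> str:
    """Illustrative routing rule: tiny objects are stored inline in the
    metadata database, everything else in append-only volume files."""
    if len(data) < INLINE_THRESHOLD:
        return "inline-metadata"
    return "append-only-volume"

print(choose_backend(b"x" * 100))     # a tiny config blob -> inline-metadata
print(choose_backend(b"x" * 10_000))  # a larger object -> append-only-volume
```

Because each volume file holds many objects, the number of files on disk grows with total data size, not object count, which is why inode usage stays minimal.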
- Rust 1.70 or later
- Linux (recommended) or macOS
# Clone the repository
git clone https://github.com/org/s4.git
cd s4
# Build the project
cargo build --release
# Run the server
./target/release/s4-server

S4 provides official Docker images for easy deployment.
# Run S4 server (basic)
docker run -d \
--name s4core \
-p 9000:9000 \
-v s4-data:/data \
-e S4_BIND=0.0.0.0:9000 \
s4core/s4core:latest
# Run with custom credentials
docker run -d \
--name s4core \
-p 9000:9000 \
-v s4-data:/data \
-e S4_BIND=0.0.0.0:9000 \
-e S4_ACCESS_KEY_ID=myaccesskey \
-e S4_SECRET_ACCESS_KEY=mysecretkey \
s4core/s4core:latest
# Run with IAM enabled
docker run -d \
--name s4core \
-p 9000:9000 \
-v s4-data:/data \
-e S4_BIND=0.0.0.0:9000 \
-e S4_ROOT_PASSWORD=password12345 \
s4core/s4core:latest
# Build the image locally
docker build -t s4-server .

The project includes a docker-compose.yml that runs the S4 server together with the web admin console.
# Run full stack (server + web console)
docker compose up --build
# Run in background
docker compose up -d --build
# Run only the server
docker compose up s4core --build
# With custom environment variables
S4_ROOT_PASSWORD=password12345 docker compose up --build

After startup:
- S4 API: http://localhost:9000
- Web Console: http://localhost:3000 (login with root credentials)
docker-compose.yml overview:
services:
  s4core:
    build: .
    ports:
      - "9000:9000"
    volumes:
      - s4-data:/data
    environment:
      - S4_BIND=0.0.0.0:9000
      - S4_ROOT_PASSWORD=${S4_ROOT_PASSWORD:-}
      - S4_ACCESS_KEY_ID=${S4_ACCESS_KEY_ID:-}
      - S4_SECRET_ACCESS_KEY=${S4_SECRET_ACCESS_KEY:-}
  s4-console:
    image: s4core/s4console:latest
    ports:
      - "3000:3000"
    environment:
      - S4_BACKEND_URL=http://s4core:9000
    depends_on:
      - s4core

For web console-only development, see frontend/README.md.
S4 is configured through environment variables:
| Variable | Description | Default | Example |
|---|---|---|---|
| S4_BIND | Bind address (host:port) | 127.0.0.1:9000 | 0.0.0.0:9000 |
| S4_ROOT_USERNAME | Root admin username | root | admin |
| S4_ROOT_PASSWORD | Root admin password (enables IAM) | None (IAM disabled) | password12345 |
| S4_JWT_SECRET | Secret key for signing JWT tokens | Auto-generated at startup (dev mode only) | 256-bit-crypto-random-string-like-this-1234567890ABCDEF |
| S4_ACCESS_KEY_ID | Access key for S3 authentication | Auto-generated dev key | myaccesskey |
| S4_SECRET_ACCESS_KEY | Secret key for S3 authentication | Auto-generated dev key | mysecretkey |
| S4_DATA_DIR | Base directory for storage | System temp dir | /var/lib/s4 |
| S4_MAX_UPLOAD_SIZE | Maximum upload size per request | 5GB | 10GB, 100MB, 1024KB |
| S4_TLS_CERT | Path to TLS certificate (PEM format) | None (HTTP mode) | /etc/ssl/certs/s4.pem |
| S4_TLS_KEY | Path to TLS private key (PEM format) | None (HTTP mode) | /etc/ssl/private/s4-key.pem |
| S4_LIFECYCLE_ENABLED | Enable lifecycle policy worker | true | true, false, 1, 0 |
| S4_LIFECYCLE_INTERVAL_HOURS | Lifecycle evaluation interval (hours) | 24 | 1, 6, 24, 168 |
| S4_LIFECYCLE_DRY_RUN | Dry-run mode (log without deleting) | false | true, false, 1, 0 |
| S4_METRICS_ENABLED | Enable Prometheus metrics | true | false |
| S4_SELECT_ENABLED | Enable/disable S3 Select SQL engine | true | false |
| S4_SELECT_MAX_MEMORY | Per-query memory limit for SQL engine | 256MB | 512MB, 1GB |
| S4_SELECT_TIMEOUT | SQL query timeout (seconds) | 60 | 120 |
Size format: Supports GB/G, MB/M, KB/K, or bytes (no suffix).
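A rough sketch of how such size strings can be parsed, assuming binary multiples (1KB = 1024 bytes); S4's exact parsing rules may differ:

```python
import re

FACTORS = {"": 1, "K": 1024, "KB": 1024,
           "M": 1024**2, "MB": 1024**2,
           "G": 1024**3, "GB": 1024**3}

def parse_size(value: str) -> int:
    """Parse the documented size format: GB/G, MB/M, KB/K suffixes,
    or a bare byte count with no suffix (case-insensitive sketch)."""
    m = re.fullmatch(r"(\d+)\s*(GB|G|MB|M|KB|K)?", value.strip(), re.IGNORECASE)
    if not m:
        raise ValueError(f"invalid size: {value!r}")
    return int(m.group(1)) * FACTORS[(m.group(2) or "").upper()]

print(parse_size("10GB"))    # 10737418240
print(parse_size("1024KB"))  # 1048576
print(parse_size("4096"))    # 4096
```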
Example (HTTP):
export S4_ACCESS_KEY_ID=myaccesskey
export S4_SECRET_ACCESS_KEY=mysecretkey
export S4_DATA_DIR=/var/lib/s4
export S4_MAX_UPLOAD_SIZE=10GB
./target/release/s4-server

Configure AWS CLI to use S4:
aws configure set aws_access_key_id myaccesskey
aws configure set aws_secret_access_key mysecretkey

Basic operations:
# Create a bucket
aws --endpoint-url http://localhost:9000 s3 mb s3://mybucket
# Upload a file
aws --endpoint-url http://localhost:9000 s3 cp file.txt s3://mybucket/file.txt
# List objects
aws --endpoint-url http://localhost:9000 s3 ls s3://mybucket
# Download a file
aws --endpoint-url http://localhost:9000 s3 cp s3://mybucket/file.txt downloaded.txt
# Delete a file
aws --endpoint-url http://localhost:9000 s3 rm s3://mybucket/file.txt
# Delete a bucket
aws --endpoint-url http://localhost:9000 s3 rb s3://mybucket

S4 supports S3-compatible object versioning to preserve, retrieve, and restore every version of every object.
# Enable versioning on bucket
aws s3api put-bucket-versioning \
--bucket mybucket \
--versioning-configuration Status=Enabled \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl
# Upload file (version 1)
echo "version 1" | aws s3api put-object \
--bucket mybucket \
--key file.txt \
--body - \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl
# Upload again (version 2)
echo "version 2" | aws s3api put-object \
--bucket mybucket \
--key file.txt \
--body - \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl
# List all versions
aws s3api list-object-versions \
--bucket mybucket \
--prefix file.txt \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl
# Get specific version
aws s3api get-object \
--bucket mybucket \
--key file.txt \
--version-id "ff495d34-c292-4af4-9d10-e186272010ed" \
first_version.txt \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl
# Delete object (creates delete marker)
aws s3api delete-object \
--bucket mybucket \
--key file.txt \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl

S4 supports automatic object expiration and cleanup based on lifecycle rules.
# Create lifecycle configuration file
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-logs",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "logs/"
      },
      "Expiration": {
        "Days": 30
      }
    },
    {
      "ID": "cleanup-old-versions",
      "Status": "Enabled",
      "Filter": {
        "Prefix": ""
      },
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 90
      }
    }
  ]
}
EOF
# Set lifecycle configuration
aws s3api put-bucket-lifecycle-configuration \
--bucket mybucket \
--lifecycle-configuration file://lifecycle.json \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl
# Get lifecycle configuration
aws s3api get-bucket-lifecycle-configuration \
--bucket mybucket \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl
# Delete lifecycle configuration
aws s3api delete-bucket-lifecycle \
--bucket mybucket \
--endpoint-url https://127.0.0.1:9000 \
--no-verify-ssl

Lifecycle worker configuration:
# Enable/disable lifecycle worker (default: enabled)
export S4_LIFECYCLE_ENABLED=true
# Set evaluation interval in hours (default: 24)
export S4_LIFECYCLE_INTERVAL_HOURS=24
# Enable dry-run mode to test without deleting (default: false)
export S4_LIFECYCLE_DRY_RUN=true

S4 includes a built-in IAM system with role-based access control. IAM is enabled when S4_ROOT_PASSWORD is set.
Roles:
- Reader -- can list buckets/objects and download objects
- Writer -- Reader permissions plus create/delete buckets and objects
- SuperUser -- full admin access including user management
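The role hierarchy above can be pictured as nested capability sets. The action names here are hypothetical and for illustration only; S4's internal permission checks may be structured differently:

```python
# Illustrative capability model for the three documented roles.
READER_ACTIONS = {"ListBuckets", "ListObjects", "GetObject"}
WRITER_ACTIONS = READER_ACTIONS | {"CreateBucket", "DeleteBucket",
                                   "PutObject", "DeleteObject"}
SUPERUSER_ACTIONS = WRITER_ACTIONS | {"ManageUsers"}

ROLE_ACTIONS = {
    "Reader": READER_ACTIONS,
    "Writer": WRITER_ACTIONS,
    "SuperUser": SUPERUSER_ACTIONS,
}

def is_allowed(role: str, action: str) -> bool:
    """Each role inherits everything the role below it can do."""
    return action in ROLE_ACTIONS.get(role, set())

print(is_allowed("Reader", "GetObject"))    # True
print(is_allowed("Reader", "PutObject"))    # False
print(is_allowed("Writer", "ManageUsers"))  # False
```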
Starting with IAM enabled:
export S4_ROOT_PASSWORD=password12345
./target/release/s4-server

Admin API usage (curl):
# Login (get JWT token)
TOKEN=$(curl -s -k -X POST https://localhost:9000/api/admin/login \
-H 'Content-Type: application/json' \
-d '{"username":"root","password":"password12345"}' | jq -r '.token')
# List users
curl -s -k https://localhost:9000/api/admin/users \
-H "Authorization: Bearer $TOKEN"
# Create a user
curl -s -k -X POST https://localhost:9000/api/admin/users \
-H "Authorization: Bearer $TOKEN" \
-H 'Content-Type: application/json' \
-d '{"username":"alice","password":"alice123","role":"Writer"}'
# Generate S3 credentials for a user
curl -s -k -X POST https://localhost:9000/api/admin/users/<user-id>/credentials \
-H "Authorization: Bearer $TOKEN"
# Update user (change role, password, or active status)
curl -s -k -X PUT https://localhost:9000/api/admin/users/<user-id> \
-H "Authorization: Bearer $TOKEN" \
-H 'Content-Type: application/json' \
-d '{"role":"Reader"}'
# Delete S3 credentials
curl -s -k -X DELETE https://localhost:9000/api/admin/users/<user-id>/credentials \
-H "Authorization: Bearer $TOKEN"
# Delete user
curl -s -k -X DELETE https://localhost:9000/api/admin/users/<user-id> \
-H "Authorization: Bearer $TOKEN"Using S3 with IAM credentials:
After generating S3 credentials via the Admin API, use them with AWS CLI:
aws configure set aws_access_key_id S4AK_xxxxxxxx
aws configure set aws_secret_access_key xxxxxxxx
aws --endpoint-url https://localhost:9000 --no-verify-ssl s3 ls

Legacy S4_ACCESS_KEY_ID / S4_SECRET_ACCESS_KEY environment credentials continue to work as a fallback with full (SuperUser) access.
S4 includes a built-in SQL query engine powered by Apache DataFusion. Query your stored objects directly; there is no need to download them first.
Single-Object Query (S3 Select API):
# Upload a CSV file
aws --endpoint-url http://localhost:9000 s3 cp employees.csv s3://mybucket/employees.csv
# Query it with SQL (via curl; returns a binary event stream)
curl -X POST "http://localhost:9000/mybucket/employees.csv?select&select-type=2" \
-H "Content-Type: application/xml" \
-d '<?xml version="1.0" encoding="UTF-8"?>
<SelectObjectContentRequest>
<Expression>SELECT name, salary FROM s3object WHERE CAST(salary AS INT) > 100000</Expression>
<ExpressionType>SQL</ExpressionType>
<InputSerialization>
<CSV><FileHeaderInfo>USE</FileHeaderInfo></CSV>
</InputSerialization>
<OutputSerialization>
<CSV/>
</OutputSerialization>
</SelectObjectContentRequest>'

Supported input formats: CSV, JSON (Lines/Document), Parquet. Output formats: CSV, JSON.
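If you would rather not hand-write the XML, the request body from the curl example can be generated with the standard library. A sketch for a header-bearing CSV input with CSV output (POST the result with any HTTP client):

```python
import xml.etree.ElementTree as ET

def build_select_request(sql: str) -> bytes:
    """Build a SelectObjectContentRequest body matching the curl example:
    CSV input using the first row as headers, CSV output."""
    root = ET.Element("SelectObjectContentRequest")
    ET.SubElement(root, "Expression").text = sql
    ET.SubElement(root, "ExpressionType").text = "SQL"
    input_ser = ET.SubElement(root, "InputSerialization")
    csv_in = ET.SubElement(input_ser, "CSV")
    ET.SubElement(csv_in, "FileHeaderInfo").text = "USE"
    output_ser = ET.SubElement(root, "OutputSerialization")
    ET.SubElement(output_ser, "CSV")
    return ET.tostring(root, xml_declaration=True, encoding="UTF-8")

body = build_select_request(
    "SELECT name, salary FROM s3object WHERE CAST(salary AS INT) > 100000")
print(body.decode())
```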
Multi-Object SQL Query (S4 Extended):
S4 extends S3 Select with multi-object queries using glob patterns:
# Upload multiple CSV files
aws --endpoint-url http://localhost:9000 s3 cp data1.csv s3://mybucket/logs/data1.csv
aws --endpoint-url http://localhost:9000 s3 cp data2.csv s3://mybucket/logs/data2.csv
# Query across all matching objects (JSON output)
curl -X POST "http://localhost:9000/mybucket?sql" \
-H "Content-Type: application/json" \
-d '{"sql": "SELECT * FROM '\''logs/*.csv'\'' WHERE status = '\''ERROR'\''", "format": "csv", "output": "json"}'
# Aggregation across files (CSV output)
curl -X POST "http://localhost:9000/mybucket?sql" \
-H "Content-Type: application/json" \
-d '{"sql": "SELECT COUNT(*) as total, AVG(CAST(value AS DOUBLE)) as avg_val FROM '\''logs/*.csv'\''", "format": "csv", "output": "csv"}'Full SQL support includes WHERE, GROUP BY, ORDER BY, LIMIT, JOIN, window functions, CTEs, and aggregate functions.
S4 supports S3-compatible CORS (Cross-Origin Resource Sharing) for browser-based access.
# Set CORS configuration
curl -X PUT "http://localhost:9000/mybucket?cors" \
-H "Content-Type: application/xml" \
-d '<?xml version="1.0" encoding="UTF-8"?>
<CORSConfiguration>
<CORSRule>
<AllowedOrigin>https://example.com</AllowedOrigin>
<AllowedMethod>GET</AllowedMethod>
<AllowedMethod>PUT</AllowedMethod>
<AllowedHeader>*</AllowedHeader>
<MaxAgeSeconds>3600</MaxAgeSeconds>
</CORSRule>
</CORSConfiguration>'
# Get CORS configuration
curl "http://localhost:9000/mybucket?cors"
# Delete CORS configuration
curl -X DELETE "http://localhost:9000/mybucket?cors"S4 supports TLS for encrypted connections. TLS is disabled by default and enabled automatically when both certificate and key paths are provided.
Generating self-signed certificates (for development):
# Generate self-signed certificate and key
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes \
-subj "/CN=localhost"Running with TLS:
export S4_TLS_CERT=/path/to/cert.pem
export S4_TLS_KEY=/path/to/key.pem
./target/release/s4-server

Using with AWS CLI (HTTPS):
# For self-signed certificates, use --no-verify-ssl
aws --endpoint-url https://localhost:9000 --no-verify-ssl s3 ls
# For production with valid certificates
aws --endpoint-url https://s4.example.com:9000 s3 ls

Certificate requirements:
- PEM-encoded X.509 certificate
- PEM-encoded private key (RSA, ECDSA, or Ed25519)
- Certificate chain is supported (include intermediate certs in cert.pem)
You can also use a config.toml file:
[server]
bind = "0.0.0.0:9000"
[storage]
data_path = "/var/lib/s4/volumes"
metadata_path = "/var/lib/s4/metadata.redb"
[tuning]
inline_threshold = 4096 # 4KB
volume_size_mb = 1024 # 1GB
strict_sync = true

- Architecture Guide - Detailed architecture documentation
- Contributing Guide - How to contribute to S4
- API Documentation - API reference documentation
See CONTRIBUTING.md for development setup and guidelines.
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
🚧 Early Development - S4 is currently in active development. Not ready for production use.
