Solo Cheetah is a high-performance file uploader designed for automated data pipeline operations. It monitors directories for marker files, processes related files using configurable matchers, and uploads them to cloud storage backends with parallel execution.
It is built for high-throughput data pipelines and offers:
- 🚀 High Performance - Parallel processing with configurable workers
- 🔄 Hot Reload - Update configuration without restarting
- 📦 Multiple Storage - S3, GCS, Azure Blob, Local Directory support
- ☁️ Multicloud - Unified blob storage via go-cloud with workload identity
- 🔍 Smart Matching - Flexible file pattern matching (basic and sequential)
- ⚡ Reliable - Built-in retry logic and error handling
- 📊 Observable - Comprehensive logging and monitoring
Architecture: Scanner → Processor → Storage
- Scanner - Monitors directories and finds marker files in batches
- Processor - For each marker file, uses file matchers to find related files:
  - Basic matcher: Files by extension (e.g., `data.rcd.gz`, `data.rcd_sig`)
  - Sequential matcher: Numbered files (e.g., `sidecar/data_01.gz`, `sidecar/data_02.gz`)
- Storage - Uploads to multiple backends in parallel (S3, GCS, Azure Blob, Local Directory)
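The two matching strategies above can be sketched as follows. This is a hedged illustration, not the project's real matcher code: the marker name `data.done`, the function names `basicMatch` and `sequentialMatch`, the two-digit numbering format, and the injected `existsFunc` (used so the example runs without touching disk) are all assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// existsFunc reports whether a path is present. It is injected so the
// sketch can be exercised against an in-memory file set.
type existsFunc func(path string) bool

// stem strips the marker's extension, e.g. "in/data.done" -> "in/data".
func stem(marker string) string {
	return strings.TrimSuffix(marker, filepath.Ext(marker))
}

// basicMatch returns stem+suffix for every suffix whose file exists.
func basicMatch(marker string, suffixes []string, exists existsFunc) []string {
	var out []string
	for _, s := range suffixes {
		if p := stem(marker) + s; exists(p) {
			out = append(out, p)
		}
	}
	return out
}

// sequentialMatch collects <dir>/<subdir>/<name>_01<ext>, _02, ...
// and stops at the first missing number.
func sequentialMatch(marker, subdir, ext string, exists existsFunc) []string {
	dir, name := filepath.Split(stem(marker))
	var out []string
	for i := 1; ; i++ {
		p := filepath.Join(dir, subdir, fmt.Sprintf("%s_%02d%s", name, i, ext))
		if !exists(p) {
			return out
		}
		out = append(out, p)
	}
}

func main() {
	// In-memory stand-in for a scanned directory.
	files := map[string]bool{
		"data.rcd.gz":        true,
		"data.rcd_sig":       true,
		"sidecar/data_01.gz": true,
		"sidecar/data_02.gz": true,
	}
	exists := func(p string) bool { return files[p] }

	fmt.Println(basicMatch("data.done", []string{".rcd.gz", ".rcd_sig"}, exists))
	fmt.Println(sequentialMatch("data.done", "sidecar", ".gz", exists))
}
```

Stopping at the first gap keeps the sequential matcher from probing an unbounded range; whether the real matcher tolerates gaps is not stated in this README.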
Ready to start? Follow our comprehensive guide:
👉 Getting Started Guide - Complete tutorial with installation, configuration, and examples
What you'll learn:
- Installation and first pipeline setup
- Common use cases (backup, streaming, multi-destination)
- Key concepts (pipelines, scanner, processor, storage)
- Configuration hot reload
- Docker deployment
- Performance tuning
- Getting Started Guide ⭐ Start here!
  - Installation and first pipeline
  - Common use cases with examples
  - Key concepts explained
  - Performance tuning tips
- Complete configuration documentation
  - All available options
  - Pipeline, scanner, processor, storage settings
- All CLI flags and usage
  - Config hot reload with `--config-check-interval`
  - Environment variables
- System design and components
  - Scanner → Processor → Storage flow
  - File matching strategies
- Development Guide 👨‍💻
  - Building from source
  - Running locally with MinIO
  - Testing and linting
  - Docker development
  - Contributing guidelines
Find ready-to-use configurations in test/config/.cheetah/:
- `cheetah-local.yaml` - Local development with MinIO
- `cheetah-container.yaml` - Docker container deployment
- 🐛 Report Issues - Bug reports and feature requests
- 💬 Discussions - Questions and community help
- 📋 Releases - Download binaries
- 📖 Wiki - Additional guides
This project is licensed under the Apache License 2.0. See the LICENSE file for details.