Selgetabel

LLM-powered Excel data processing. Describe what you need in natural language — get structured operations, formulas, and downloadable Excel files.

How It Works

Upload Excel file(s)
Describe your data processing requirement in natural language
LLM generates structured JSON operations (not raw formulas)
Engine executes operations and produces Excel files with real formulas

All formulas are 100% reproducible — no LLM-generated code is executed directly.

Tech Stack

Layer	Technology
Frontend	React Router v7, Vite, TypeScript, Tailwind CSS
Backend	Python FastAPI, multi-provider LLM support
Storage	PostgreSQL, MinIO (S3-compatible)
Infra	pnpm workspace, Turborepo, Docker Compose

Quick Start (Docker)

Prerequisites

Docker & Docker Compose

Deploy

# Clone and enter the project
git clone https://github.com/xiefenga/selgetabel.git
cd selgetabel/docker

# Create environment config
cp .env.example .env

Edit .env and configure the required variables:

# Required
POSTGRES_PASSWORD=strong_password   # Database password
MINIO_ROOT_PASSWORD=strong_password # Object storage password
JWT_SECRET_KEY=xxx                  # Run: openssl rand -hex 32

Start the services:

docker compose up -d

# Access the app at http://localhost:8080

After startup, configure LLM providers through the admin panel (Settings > LLM Providers). See LLM Providers for details.

Upgrade

cd docker
./scripts/upgrade.sh <version>

Local Development

Prerequisites

Node.js 22+ / pnpm 10+
Python 3.11+
PostgreSQL & MinIO (or use docker compose -f docker/docker-compose.dev.yml up -d)

Setup

# Install frontend dependencies
pnpm install

# Install backend dependencies
pnpm --filter @selgetabel/api install

# Start all services
pnpm dev

Service	URL
Web	http://localhost:5173
API	http://localhost:8000
API Docs	http://localhost:8000/docs

Commands

pnpm dev          # Start web + API
pnpm dev:api      # Start API only
pnpm build        # Build all packages
pnpm format       # Format code (Prettier)
pnpm check-types  # Type checking

Architecture

selgetabel/
├── apps/
│   ├── api/           # Python FastAPI backend
│   │   ├── app/
│   │   │   ├── main.py        # App entry point
│   │   │   ├── api/routes/    # Route handlers
│   │   │   ├── engine/        # Core: parser, executor, formula gen, intent classifier
│   │   │   ├── processor/     # Processing pipeline stages
│   │   │   ├── services/      # Business logic: chat, intent, context, file I/O, auth
│   │   │   ├── persistence/   # Data access layer
│   │   │   ├── models/        # SQLAlchemy ORM
│   │   │   └── core/          # Config, DB, JWT
│   │   └── pyproject.toml
│   └── web/           # React Router v7 frontend
│       ├── app/
│       │   ├── routes/        # File-based routing
│       │   ├── components/    # Shared UI components
│       │   ├── features/      # Feature modules
│       │   └── lib/           # Utilities & API client
│       └── vite.config.ts
├── docker/            # Docker Compose deployment
├── docs/              # Technical documentation
│   ├── design/        # System design & architecture
│   ├── specs/         # Protocol & format specifications
│   ├── conventions/   # Coding standards & workflows
│   └── guides/        # How-to guides
├── package.json
├── pnpm-workspace.yaml
└── turbo.json

Key engine components:

intent_classifier.py — LLM-based intent classification (CHAT/ANALYSIS/PROCESSING/UNCLEAR)
context_builder.py — Formats context for LLM prompts based on intent type
excel_generator.py — Converts JSON expressions to Excel formulas
llm_client.py — Unified LLM API interface with multi-provider support

Key services:

intent_service.py — Orchestrates intent recognition and routing decisions
context_service.py — Manages multi-turn conversation context
chat_stream.py — Streams chat responses for non-processing intents

LLM Providers

The system supports multiple LLM providers with database-driven configuration. Providers, models, and credentials are managed via the admin API (/llm/*).

Supported providers:

Provider	Type	Status
OpenAI	`openai`	Available
OpenAI-compatible	`openai_compatible`	Available
Anthropic	`anthropic`	Planned
Azure OpenAI	`azure_openai`	Planned
DeepSeek	`deepseek`	Planned
Qwen	`qwen`	Planned
Ollama	`ollama`	Planned

Stage-level routing — different pipeline stages (intent, generate, title) can use different provider/model combinations.

See LLM Provider Design for the full architecture.

Intent Classification

The system uses LLM-based intent classification to understand user queries and route them appropriately:

Intent	Description
`chat`	Pure conversation without file processing (greetings, questions about features, etc.)
`analysis`	Data analysis/summary without modifying data
`processing`	Data processing that modifies/transforms/exports data
`unclear`	Ambiguous requirements or missing parameters, requires clarification

Classification rules:

If no files are uploaded, intent cannot be analysis or processing — must be chat
If user provides vague instructions (e.g., "group by" without specifying which column), system returns unclear with a clarification question
If user provides a short response like "yes", "correct", or a column name, system inherits the intent from the previous conversation turn

Clarification flow: When intent is unclear or requires clarification, the system asks a gentle follow-up question before proceeding.

Multi-turn Conversation

The system maintains conversation context through a Thread/Turn model:

Thread: A conversation session containing multiple turns
Turn: A single exchange (user query → system response)
Context snapshots: Each turn can save its context state for future reference
File inheritance: When user doesn't upload files in a new turn, the system automatically inherits files from previous turns

Context types are built based on intent:

chat: Conversation history, topic continuity analysis
analysis: Historical analysis records, data insights, file analysis history
processing: Operation history, data state, available files, file dependencies

Processing Pipeline

The backend streams SSE events through a multi-stage pipeline:

Intent Recognition → Generate + Validate (LLM, with retry) → Execute → Export

Intent Recognition: LLM classifies user intent (chat/analysis/processing/unclear)
Generate: LLM produces structured JSON operations from natural language
Validate: Parser checks format and applies function whitelist (with automatic retry on failure)
Execute: Engine runs operations and generates Excel formulas
Export: Outputs downloadable .xlsx with embedded formulas

Supported Operations

Operation	Description
`aggregate`	Column aggregation (SUM, AVERAGE, SUMIF, etc.)
`add_column`	Add calculated column with formula
`update_column`	Update existing column values
`compute`	Scalar computation on variables
`filter`	Filter rows by condition
`sort`	Sort by column(s)
`group_by`	Group and aggregate
`take`	Limit row count
`select_columns`	Select specific columns
`drop_columns`	Remove columns
`create_sheet`	Create new worksheet
`pivot`	Data pivot table (crosstab)

Documentation

Operation Specification — JSON operation format
SSE Protocol — Server-Sent Events protocol
Steps Storage — ThreadTurn steps format
LLM Provider Design — Multi-provider architecture
Engine Architecture — Core engine design
Database Design — Data model
Docker Scripts — Deployment scripts guide

License

Apache-2.0

Name		Name	Last commit message	Last commit date
Latest commit History 136 Commits
.agents/skills		.agents/skills
.cursor/skills		.cursor/skills
.github/workflows		.github/workflows
.vscode		.vscode
apps		apps
docker		docker
docs		docs
fixtures		fixtures
temp		temp
.dockerignore		.dockerignore
.env.local.example		.env.local.example
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
README.zh-CN.md		README.zh-CN.md
commitlint.config.ts		commitlint.config.ts
install.sh		install.sh
package.json		package.json
pivot.md		pivot.md
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
skills-lock.json		skills-lock.json
turbo.json		turbo.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Selgetabel

How It Works

Tech Stack

Quick Start (Docker)

Prerequisites

Deploy

Upgrade

Local Development

Prerequisites

Setup

Commands

Architecture

LLM Providers

Intent Classification

Multi-turn Conversation

Processing Pipeline

Supported Operations

Documentation

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Selgetabel

How It Works

Tech Stack

Quick Start (Docker)

Prerequisites

Deploy

Upgrade

Local Development

Prerequisites

Setup

Commands

Architecture

LLM Providers

Intent Classification

Multi-turn Conversation

Processing Pipeline

Supported Operations

Documentation

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages