An offline, rule‑based NLP engine that reads the grammar of your text with spaCy dependency parsing and generates Wh‑questions, True/False, MCQ and Cloze — each tagged with difficulty and a confidence score, exportable to 8 formats across 4 languages. No LLM. No API keys. No cloud.
Almost every "AI" question tool is a thin wrapper around a large language model — it guesses and you can't see why. QuestionGen is the opposite:
- 🧠 Transparent rule engine — spaCy dependency parsing + hand‑written rules. Same input → same output, every time.
- 🔌 100% offline — runs entirely on your machine. No API keys, no per‑request cost, no data leaving your device.
- 🪪 Shows its work — every question carries the source sentence and the rule that produced it.
- 🎓 Classroom‑ready — exports straight into Anki, Moodle, Canvas, Kahoot! and Quizizz.
| Feature | Description |
|---|---|
| Wh‑Questions | Who / What / Where / When / Why / How, via dependency parsing |
| True / False | Original statement + an auto‑negated false variant |
| MCQ | Fill‑in‑the‑blank with 4 choices (NER + noun‑chunk distractors) |
| Cloze | Fill‑in‑the‑blank deletions of key terms |
| Difficulty tagging | Easy / Medium / Hard from syntactic depth, clause count & NER density |
| Confidence scoring | 0.0–1.0 per question; filterable in the UI and API |
| File upload | Extract text from .txt, .pdf, .docx |
| 8 export formats | Markdown · JSON · PDF · Anki (.apkg) · Moodle XML · Canvas QTI · Kahoot! (.xlsx) · Quizizz (.xlsx) |
| 4 languages | English · German · French · Spanish (+ multilingual fallback) |
| Live dashboard | Animated stat chips, type breakdown, difficulty distribution & confidence ring |
| Two UIs | A polished dark glassmorphism theme and a bold brutalist "QGEN" theme — both vanilla, no build step |
| Polished UX | Skeleton loaders, toasts, scroll‑reveal, keyboard shortcuts (⌘/Ctrl+Enter), light/dark toggle, fully responsive & accessible |
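As a rough illustration of the Cloze and MCQ rows above, the sketch below blanks out a key term and pads the answer with distractors up to four choices. It is a minimal, stdlib-only sketch and not the project's code: `make_cloze`, `make_mcq`, the `"cloze"` type string and the hard-coded distractor list are all hypothetical. In the real pipeline the key term and the distractors would come from spaCy NER and noun chunks.

```python
import random

BLANK = "______"


def make_cloze(sentence: str, key_term: str) -> dict:
    """Blank out the first occurrence of key_term (hypothetical helper)."""
    if key_term not in sentence:
        raise ValueError(f"{key_term!r} not found in sentence")
    return {
        "question": sentence.replace(key_term, BLANK, 1),
        "answer": key_term,
        "type": "cloze",  # assumed label; the README only shows wh / true_false / mcq
    }


def make_mcq(sentence: str, key_term: str, distractors: list, seed: int = 0) -> dict:
    """Turn a cloze item into a 4-choice MCQ with a reproducible shuffle."""
    item = make_cloze(sentence, key_term)
    # Keep distinct distractors only, and never offer the answer twice.
    unique = []
    for candidate in distractors:
        if candidate != key_term and candidate not in unique:
            unique.append(candidate)
    choices = [key_term] + unique[:3]
    # A fixed seed keeps the output identical for identical input.
    random.Random(seed).shuffle(choices)
    item.update(choices=choices, type="mcq")
    return item


mcq = make_mcq(
    "Marie Curie discovered radium in 1898.",
    "radium",
    ["polonium", "uranium", "element", "radium"],
)
```

Seeding the shuffle is one way to preserve the "same input, same output" property while still varying the position of the correct answer.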
Screenshots: Generate (English) · Multilingual (German) · Live results dashboard · Stats band
🎬 Full 1‑minute walkthrough with voice‑over + subtitles: demo_video/qgen_demo_av.mp4
Input (pasted text or .txt / .pdf / .docx)
│
▼
Language detection (langdetect) → spaCy model selection
├── en_core_web_sm (English)
├── de_core_news_sm (German)
├── fr_core_news_sm (French)
├── es_core_news_sm (Spanish)
└── xx_ent_wiki_sm (multilingual fallback)
│
▼
Per‑sentence NLP
├── Dependency parsing → Wh‑questions (nlp/rules.py)
├── Negation → True / False (nlp/question_types.py)
├── NER + noun chunks → MCQ (nlp/question_types.py)
└── Key‑term deletion → Cloze (nlp/question_types.py)
│
▼
Enrichment
├── Difficulty label (nlp/difficulty.py)
└── Confidence score (nlp/confidence.py)
│
▼
FastAPI JSON → Browser UI → Export (8 formats)
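The model-selection step in the diagram can be pictured as a lookup with a fallback. This is a minimal sketch under assumptions: the model names are the ones listed above, but `select_model` and the two constants are hypothetical names, and the project's own logic in nlp/multilingual.py may differ.

```python
# Model names are taken from the pipeline diagram; everything else is illustrative.
MODEL_FOR_LANG = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "fr": "fr_core_news_sm",
    "es": "es_core_news_sm",
}
FALLBACK_MODEL = "xx_ent_wiki_sm"


def select_model(lang_code: str) -> str:
    """Map a detected language code to a spaCy model name, else the fallback."""
    # Language detectors may return region-tagged codes such as "zh-cn";
    # only the primary subtag matters for this lookup.
    primary = lang_code.strip().lower().split("-")[0]
    return MODEL_FOR_LANG.get(primary, FALLBACK_MODEL)
```

For example, `select_model("de")` gives `"de_core_news_sm"`, while an unsupported code such as `"it"` falls through to `"xx_ent_wiki_sm"`.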
- Backend: Python · FastAPI · Uvicorn · spaCy (dependency parsing + NER)
- Parsing/IO: langdetect · pdfminer.six · python‑docx
- Exporters: genanki (Anki) · fpdf2 (PDF) · openpyxl (Kahoot/Quizizz) · custom Moodle XML / Canvas QTI
- Frontend: Vanilla HTML / CSS / JS — no framework, no build step — self‑hosted web fonts
- Deploy: Render (native Python) · Docker · any VPS
Prerequisites: Python 3.9+ and a modern browser.
# 1 — install dependencies
pip install -r backend/requirements.txt
python -m spacy download en_core_web_sm
# 2 — (optional) extra language models
python -m spacy download de_core_news_sm # German
python -m spacy download fr_core_news_sm # French
python -m spacy download es_core_news_sm # Spanish
python -m spacy download xx_ent_wiki_sm # multilingual fallback
# 3 — run (serves the API AND the frontend on one port)
python -m uvicorn backend.app:app --reload --host 127.0.0.1 --port 8000
Open http://127.0.0.1:8000/ for the bold QGEN UI, or /index.html for the classic dark theme.
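Before starting the server it can help to confirm which language models are actually installed. The sketch below relies on the fact that models fetched with `python -m spacy download` are installed as ordinary Python packages named after the model; `installed_models` is a hypothetical helper, not part of the project.

```python
import importlib.util

MODELS = [
    "en_core_web_sm",
    "de_core_news_sm",
    "fr_core_news_sm",
    "es_core_news_sm",
    "xx_ent_wiki_sm",
]


def installed_models(names=MODELS) -> dict:
    """Report, per package name, whether it can be found without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in installed_models().items():
        print(f"{name}: {'installed' if present else 'missing'}")
```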
The app is a single service — one uvicorn process serves both the API and the static frontend.
A render.yaml blueprint is included. Push to GitHub → Render → New → Blueprint → pick the repo. It runs:
- Build: pip install -r backend/requirements.txt && python -m spacy download en_core_web_sm
- Start: uvicorn backend.app:app --host 0.0.0.0 --port $PORT
Docker:
docker build -t questiongen .
docker run -p 8000:8000 questiongen
Any VPS:
pip install -r backend/requirements.txt && python -m spacy download en_core_web_sm
uvicorn backend.app:app --host 0.0.0.0 --port 8000 # + nginx/Caddy for HTTPS
Memory: each spaCy model loads ~80–150 MB. English‑only fits a 512 MB free tier; all 4 languages need ~1 GB.
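The memory guidance above can be sanity-checked with simple arithmetic. The sketch below only multiplies the quoted per-model range; it deliberately ignores the Python interpreter, FastAPI and the other libraries, which add overhead on top and are presumably why the guidance rounds up. `model_memory_mb` is a hypothetical helper.

```python
PER_MODEL_MB = (80, 150)  # per-model range quoted in the memory note


def model_memory_mb(n_models: int) -> tuple[int, int]:
    """Return the (low, high) memory estimate in MB for n loaded models."""
    low, high = PER_MODEL_MB
    return n_models * low, n_models * high


english_only = model_memory_mb(1)    # (80, 150)
four_languages = model_memory_mb(4)  # (320, 600)
```

With all four language models loaded, the models alone take up to about 600 MB, which already exceeds a 512 MB tier before any runtime overhead is counted.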
Interactive docs are auto‑generated at /docs.
Lists supported language codes and their spaCy models.
{
"text": "Marie Curie discovered radium in 1898 in Paris.",
"lang": "en",
"include_tf": true,
"include_mcq": true,
"include_cloze": false,
"min_confidence": 0.0
}
Response:
{
"questions": [
{ "question": "Who discovered radium?", "type": "wh", "difficulty": "Easy", "confidence": 0.85,
"explanation": "Marie Curie discovered radium in 1898 in Paris." },
{ "statement": "Marie Curie discovered radium in 1898.", "answer": true, "type": "true_false",
"difficulty": "Medium", "confidence": 0.7 },
{ "question": "Marie Curie discovered ______ in 1898.", "answer": "radium",
"choices": ["radium","polonium","uranium","element"], "type": "mcq",
"difficulty": "Medium", "confidence": 0.9 }
],
"meta": { "sentences": 1, "language": "en", "wh_count": 1, "tf_count": 1, "mcq_count": 1, "cloze_count": 0 }
}
Multipart upload (.txt / .pdf / .docx). Query params: lang, include_tf, include_mcq, include_cloze, min_confidence.
curl -F "file=@notes.pdf" "http://127.0.0.1:8000/generate/upload?lang=en"
Export request body:
{ "questions": [ ... ], "format": "anki", "title": "My Question Bank" }
format: markdown · json · pdf · anki · moodle · canvas · kahoot · quizizz. Returns a file download.
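For scripting against the API without extra dependencies, a client could look roughly like the sketch below. It is a sketch under assumptions: the JSON generation route is assumed to be `POST /generate` (inferred from the `/generate/upload` path, so check `/docs` for the real route), and `build_generate_request` and `count_by_type` are hypothetical helpers. The request is built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request
from collections import Counter

BASE_URL = "http://127.0.0.1:8000"


def build_generate_request(text, lang="en", min_confidence=0.0, path="/generate"):
    """Build (but do not send) a JSON POST using the documented request fields."""
    payload = {
        "text": text,
        "lang": lang,
        "include_tf": True,
        "include_mcq": True,
        "include_cloze": False,
        "min_confidence": min_confidence,
    }
    return urllib.request.Request(
        BASE_URL + path,  # "/generate" is an assumed route
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def count_by_type(response: dict) -> Counter:
    """Tally questions per type from a response shaped like the documented one."""
    return Counter(q["type"] for q in response["questions"])


req = build_generate_request("Marie Curie discovered radium in 1898 in Paris.")
# To send it against a running server:
#   with urllib.request.urlopen(req) as resp:
#       data = json.load(resp)
sample = {"questions": [{"type": "wh"}, {"type": "true_false"}, {"type": "mcq"}]}
counts = count_by_type(sample)
```

The tally should line up with the `wh_count`, `tf_count` and `mcq_count` fields in `meta`, which makes it a cheap consistency check on a response.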
ai-question-generator/
├── backend/
│ ├── app.py # FastAPI app + static-frontend mount + home route
│ ├── requirements.txt
│ ├── nlp/
│ │ ├── pipeline.py # orchestration
│ │ ├── rules.py # Wh-question rule engine
│ │ ├── question_types.py # True/False · MCQ · Cloze
│ │ ├── difficulty.py # Easy/Medium/Hard labelling
│ │ ├── confidence.py # 0–1 confidence scoring
│ │ └── multilingual.py # language detection + model loading
│ ├── ingest/file_parser.py # .txt / .pdf / .docx extraction
│ └── export/exporters.py # 8 export formats
├── frontend/
│ ├── index.html # classic dark glassmorphism UI
│ ├── index-bold.html # bold "QGEN" brutalist UI (default at /)
│ ├── styles.css · styles-bold.css
│ ├── app.js # shared logic (no framework)
│ └── fonts/anton.woff2 # self-hosted display font
├── tools/ # test + demo automation (Playwright, TTS, etc.)
├── Dockerfile · render.yaml · Procfile · runtime.txt
└── README.md
A full backend test harness exercises every feature across all languages:
python tools/test_features.py # all 4 languages · all question types · upload · all 8 exports
Latest run: 17/17 checks passing (English, German, French, Spanish generation; auto‑detect; confidence filter; file upload; and all eight export formats).
- Works best on declarative, factual sentences.
- Non‑English quality depends on how cleanly each language's spaCy model maps to the dependency labels the rules expect.
- MCQ distractors are heuristic (NER + noun chunks), not semantically ranked.
- PDF export uses a basic layout (non‑Latin‑1 glyphs are sanitized).
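To make the last limitation concrete, one common way to sanitize text for a Latin-1-only PDF font is to round-trip it through the `latin-1` codec with replacement. This is an illustrative sketch, not necessarily how the project's exporter does it, and `sanitize_latin1` is a hypothetical name.

```python
def sanitize_latin1(text: str) -> str:
    """Replace every character outside Latin-1 with '?'."""
    # errors="replace" substitutes '?' for anything the codec cannot encode.
    return text.encode("latin-1", errors="replace").decode("latin-1")


clean = sanitize_latin1("Dvořák")    # "Dvo?ák": ř is outside Latin-1, á is inside
unchanged = sanitize_latin1("café")  # "café": already representable
```

The practical consequence is that text in scripts outside Western European alphabets will show placeholder characters in PDF exports, while the other export formats are unaffected by this particular step.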
Built with spaCy + FastAPI · Parsed, not guessed.




