pariksith/Q-GEN

QuestionGen — AI Question Generator

Turn any text into ready-to-use questions — parsed, not guessed.

An offline, rule‑based NLP engine that reads the grammar of your text with spaCy dependency parsing and generates Wh‑questions, True/False, MCQ and Cloze — each tagged with difficulty and a confidence score, exportable to 8 formats across 4 languages. No LLM. No API keys. No cloud.

Python · FastAPI · spaCy · No LLM · License: MIT

▶ Live demo

✨ Why this is different

Almost every "AI" question tool is a thin wrapper around a large language model — it guesses and you can't see why. QuestionGen is the opposite:

  • 🧠 Transparent rule engine — spaCy dependency parsing + hand‑written rules. Same input → same output, every time.
  • 🔌 100% offline — runs entirely on your machine. No API keys, no per‑request cost, no data leaving your device.
  • 🪪 Shows its work — every question carries the source sentence and the rule that produced it.
  • 🎓 Classroom‑ready — exports straight into Anki, Moodle, Canvas, Kahoot! and Quizizz.

🚀 Features

Feature | Description
--- | ---
Wh‑Questions | Who / What / Where / When / Why / How, via dependency parsing
True / False | Original statement + an auto‑negated false variant
MCQ | Fill‑in‑the‑blank with 4 choices (NER + noun‑chunk distractors)
Cloze | Fill‑in‑the‑blank deletions of key terms
Difficulty tagging | Easy / Medium / Hard from syntactic depth, clause count & NER density
Confidence scoring | 0.0–1.0 per question; filterable in the UI and API
File upload | Extract text from .txt, .pdf, .docx
8 export formats | Markdown · JSON · PDF · Anki (.apkg) · Moodle XML · Canvas QTI · Kahoot! (.xlsx) · Quizizz (.xlsx)
4 languages | English · German · French · Spanish (+ multilingual fallback)
Live dashboard | Animated stat chips, type breakdown, difficulty distribution & confidence ring
Two UIs | A polished dark glassmorphism theme and a bold brutalist "QGEN" theme; both vanilla, no build step
Polished UX | Skeleton loaders, toasts, scroll‑reveal, keyboard shortcuts (⌘/Ctrl+Enter), light/dark toggle, fully responsive & accessible
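
To make the dependency-parsing idea concrete, here is a minimal sketch of one Wh rule. It deliberately does not call spaCy: `Tok` is a hypothetical stand-in for a parsed token carrying the dependency label and entity type a parser would supply, and `who_question` is an illustrative name, not a function taken from nlp/rules.py.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Tok:
    """Hypothetical stand-in for a parsed token (not a spaCy object)."""
    text: str
    dep: str        # dependency label, e.g. "nsubj", "ROOT", "dobj"
    ent: str = ""   # entity type, e.g. "PERSON"; empty if none


def who_question(tokens: List[Tok]) -> Optional[str]:
    """If the subject is a PERSON, replace it with 'Who'.

    Assumption: only verb + direct object are kept; trailing
    modifiers such as prepositional phrases are dropped.
    """
    subj = [t for t in tokens
            if t.dep in ("nsubj", "compound") and t.ent == "PERSON"]
    verb = next((t for t in tokens if t.dep == "ROOT"), None)
    obj = next((t for t in tokens if t.dep == "dobj"), None)
    if not subj or verb is None or obj is None:
        return None  # the rule does not fire for this sentence
    return f"Who {verb.text} {obj.text}?"


sentence = [
    Tok("Marie", "compound", "PERSON"),
    Tok("Curie", "nsubj", "PERSON"),
    Tok("discovered", "ROOT"),
    Tok("radium", "dobj"),
    Tok("in", "prep"),
    Tok("1898", "pobj", "DATE"),
]
print(who_question(sentence))  # Who discovered radium?
```

Because the rule only inspects labels, the same input always yields the same question, and a sentence without a PERSON subject simply produces nothing.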

🖼️ Screenshots & demo

Full demo, end to end in seconds: generate → switch through 4 languages → dashboard → export.

Screenshots: Generate (English) · Multilingual (German) · Live results dashboard · Stats band

🎬 Full 1‑minute walkthrough with voice‑over + subtitles: demo_video/qgen_demo_av.mp4


🧩 How it works

Input (pasted text or .txt / .pdf / .docx)
      │
      ▼
Language detection (langdetect) → spaCy model selection
  ├── en_core_web_sm   (English)
  ├── de_core_news_sm  (German)
  ├── fr_core_news_sm  (French)
  ├── es_core_news_sm  (Spanish)
  └── xx_ent_wiki_sm   (multilingual fallback)
      │
      ▼
Per‑sentence NLP
  ├── Dependency parsing → Wh‑questions     (nlp/rules.py)
  ├── Negation           → True / False     (nlp/question_types.py)
  ├── NER + noun chunks  → MCQ              (nlp/question_types.py)
  └── Key‑term deletion  → Cloze            (nlp/question_types.py)
      │
      ▼
Enrichment
  ├── Difficulty label  (nlp/difficulty.py)
  └── Confidence score  (nlp/confidence.py)
      │
      ▼
FastAPI JSON  →  Browser UI  →  Export (8 formats)
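
The model-selection step in the diagram can be sketched as a plain lookup with a fallback. This is an illustration under assumptions: `MODELS`, `FALLBACK` and `pick_model` are hypothetical names, not the actual contents of nlp/multilingual.py, and the language code is assumed to come from a detector such as langdetect.

```python
# Language code -> spaCy model name, as listed in the diagram above.
MODELS = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "fr": "fr_core_news_sm",
    "es": "es_core_news_sm",
}
FALLBACK = "xx_ent_wiki_sm"  # multilingual fallback


def pick_model(lang_code: str) -> str:
    """Return the model for a language code, or the fallback.

    Codes are normalised so that "EN" or "en-US" both map to "en".
    """
    code = (lang_code or "").strip().lower().split("-")[0]
    return MODELS.get(code, FALLBACK)


print(pick_model("de"))  # de_core_news_sm
print(pick_model("ja"))  # xx_ent_wiki_sm
```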

🛠️ Tech stack

  • Backend: Python · FastAPI · Uvicorn · spaCy (dependency parsing + NER)
  • Parsing/IO: langdetect · pdfminer.six · python‑docx
  • Exporters: genanki (Anki) · fpdf2 (PDF) · openpyxl (Kahoot/Quizizz) · custom Moodle XML / Canvas QTI
  • Frontend: Vanilla HTML / CSS / JS (no framework, no build step), self‑hosted web fonts
  • Deploy: Render (native Python) · Docker · any VPS

⚡ Quick start (local)

Prerequisites: Python 3.9+ and a modern browser.

# 1 — install dependencies
pip install -r backend/requirements.txt
python -m spacy download en_core_web_sm

# 2 — (optional) extra language models
python -m spacy download de_core_news_sm   # German
python -m spacy download fr_core_news_sm   # French
python -m spacy download es_core_news_sm   # Spanish
python -m spacy download xx_ent_wiki_sm    # multilingual fallback

# 3 — run (serves the API AND the frontend on one port)
python -m uvicorn backend.app:app --reload --host 127.0.0.1 --port 8000

Open http://127.0.0.1:8000/ for the bold QGEN UI, or /index.html for the classic dark theme.


☁️ Deployment

The app is a single service — one uvicorn process serves both the API and the static frontend.

Render (no Docker — recommended)

A render.yaml blueprint is included. Push to GitHub → Render → New → Blueprint → pick the repo. It runs:

  • Build: pip install -r backend/requirements.txt && python -m spacy download en_core_web_sm
  • Start: uvicorn backend.app:app --host 0.0.0.0 --port $PORT

Docker

docker build -t questiongen .
docker run -p 8000:8000 questiongen

Any VPS

pip install -r backend/requirements.txt && python -m spacy download en_core_web_sm
uvicorn backend.app:app --host 0.0.0.0 --port 8000   # + nginx/Caddy for HTTPS

Memory: each spaCy model loads ~80–150 MB. English‑only fits a 512 MB free tier; all 4 languages need ~1 GB.


🔌 API reference

Interactive docs are auto‑generated at /docs.

GET /health

Returns { "ok": true }.
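
A small readiness check can be built on this endpoint, for example in a deploy script. The sketch below uses only the standard library; `fetch_json` and `wait_until_healthy` are hypothetical helper names, and the retry count and delay are arbitrary assumptions.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def wait_until_healthy(base_url: str, attempts: int = 10, delay: float = 0.5,
                       fetch: Callable[[str], dict] = fetch_json) -> bool:
    """Poll <base_url>/health until it reports {"ok": true}."""
    url = base_url.rstrip("/") + "/health"
    for _ in range(attempts):
        try:
            if fetch(url).get("ok") is True:
                return True
        except (OSError, ValueError):
            pass  # server not up yet, or body was not valid JSON
        time.sleep(delay)
    return False
```

Calling `wait_until_healthy("http://127.0.0.1:8000")` after starting uvicorn returns True once the server answers; the `fetch` parameter exists so the loop can be exercised without a network.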

GET /languages

Lists supported language codes and their spaCy models.

POST /generate

{
  "text": "Marie Curie discovered radium in 1898 in Paris.",
  "lang": "en",
  "include_tf": true,
  "include_mcq": true,
  "include_cloze": false,
  "min_confidence": 0.0
}

Response:

{
  "questions": [
    { "question": "Who discovered radium?", "type": "wh", "difficulty": "Easy", "confidence": 0.85,
      "explanation": "Marie Curie discovered radium in 1898 in Paris." },
    { "statement": "Marie Curie discovered radium in 1898.", "answer": true, "type": "true_false",
      "difficulty": "Medium", "confidence": 0.7 },
    { "question": "Marie Curie discovered ______ in 1898.", "answer": "radium",
      "choices": ["radium","polonium","uranium","element"], "type": "mcq",
      "difficulty": "Medium", "confidence": 0.9 }
  ],
  "meta": { "sentences": 1, "language": "en", "wh_count": 1, "tf_count": 1, "mcq_count": 1, "cloze_count": 0 }
}
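
For scripting against this endpoint, a minimal standard-library client might look like the following. `build_generate_request` and `count_by_type` are hypothetical helper names; the field names are the ones shown in the request and response above, and the base URL assumes the local quick-start server.

```python
import json
import urllib.request
from collections import Counter


def build_generate_request(base_url: str, text: str, lang: str = "en",
                           min_confidence: float = 0.0) -> urllib.request.Request:
    """Build (but do not send) a POST /generate request."""
    payload = {
        "text": text,
        "lang": lang,
        "include_tf": True,
        "include_mcq": True,
        "include_cloze": False,
        "min_confidence": min_confidence,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def count_by_type(response: dict) -> Counter:
    """Tally questions in a /generate response by their "type" field."""
    return Counter(q["type"] for q in response.get("questions", []))


req = build_generate_request(
    "http://127.0.0.1:8000",
    "Marie Curie discovered radium in 1898 in Paris.",
)
# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(count_by_type(json.load(resp)))
```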

POST /generate/upload

Multipart upload (.txt / .pdf / .docx). Query params: lang, include_tf, include_mcq, include_cloze, min_confidence.

curl -F "file=@notes.pdf" "http://127.0.0.1:8000/generate/upload?lang=en"
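
When several query parameters are needed, building the URL programmatically avoids quoting mistakes. This is a sketch with a hypothetical helper name (`upload_url`); it assumes the server accepts lowercase `true` / `false` for the boolean parameters, which is not confirmed above.

```python
from urllib.parse import urlencode


def upload_url(base_url: str, lang: str = "en", include_tf: bool = True,
               include_mcq: bool = True, include_cloze: bool = False,
               min_confidence: float = 0.0) -> str:
    """Compose the /generate/upload URL with its query parameters."""
    params = {
        "lang": lang,
        "include_tf": include_tf,
        "include_mcq": include_mcq,
        "include_cloze": include_cloze,
        "min_confidence": min_confidence,
    }
    # Render booleans as lowercase "true"/"false" (an assumption
    # about what the server accepts) rather than Python's "True"/"False".
    query = urlencode({
        k: (str(v).lower() if isinstance(v, bool) else v)
        for k, v in params.items()
    })
    return f"{base_url.rstrip('/')}/generate/upload?{query}"


print(upload_url("http://127.0.0.1:8000", lang="de", include_cloze=True))
```

The printed URL can be passed to curl in place of the hand-written one.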

POST /export

{ "questions": [ ... ], "format": "anki", "title": "My Question Bank" }

format: markdown · json · pdf · anki · moodle · canvas · kahoot · quizizz. Returns a file download.
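
A client can validate the format name before sending, so typos fail locally instead of on the server. The helper below is a sketch with hypothetical names (`EXPORT_FORMATS`, `build_export_body`); the format list and body fields are the ones documented above.

```python
import json
from typing import List

EXPORT_FORMATS = ("markdown", "json", "pdf", "anki",
                  "moodle", "canvas", "kahoot", "quizizz")


def build_export_body(questions: List[dict], fmt: str,
                      title: str = "My Question Bank") -> bytes:
    """Return the JSON body for POST /export, rejecting unknown formats."""
    fmt = fmt.strip().lower()
    if fmt not in EXPORT_FORMATS:
        raise ValueError(
            f"unknown format {fmt!r}; expected one of {', '.join(EXPORT_FORMATS)}"
        )
    body = {"questions": questions, "format": fmt, "title": title}
    return json.dumps(body).encode("utf-8")


body = build_export_body(
    [{"question": "Who discovered radium?", "type": "wh"}], "Anki"
)
```

The response is a file download, so a caller would write the returned bytes to disk rather than parse them as JSON.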


📂 Project structure

ai-question-generator/
├── backend/
│   ├── app.py                  # FastAPI app + static-frontend mount + home route
│   ├── requirements.txt
│   ├── nlp/
│   │   ├── pipeline.py         # orchestration
│   │   ├── rules.py            # Wh-question rule engine
│   │   ├── question_types.py   # True/False · MCQ · Cloze
│   │   ├── difficulty.py       # Easy/Medium/Hard labelling
│   │   ├── confidence.py       # 0–1 confidence scoring
│   │   └── multilingual.py     # language detection + model loading
│   ├── ingest/file_parser.py   # .txt / .pdf / .docx extraction
│   └── export/exporters.py     # 8 export formats
├── frontend/
│   ├── index.html              # classic dark glassmorphism UI
│   ├── index-bold.html         # bold "QGEN" brutalist UI (default at /)
│   ├── styles.css · styles-bold.css
│   ├── app.js                  # shared logic (no framework)
│   └── fonts/anton.woff2       # self-hosted display font
├── tools/                      # test + demo automation (Playwright, TTS, etc.)
├── Dockerfile · render.yaml · Procfile · runtime.txt
└── README.md

✅ Testing

A full backend test harness exercises every feature across all languages:

python tools/test_features.py     # all 4 languages · all question types · upload · all 8 exports

Latest run: 17/17 checks passing (English, German, French, Spanish generation; auto‑detect; confidence filter; file upload; and all eight export formats).


🚧 Limitations

  • Works best on declarative, factual sentences.
  • Non‑English quality depends on how cleanly each language's spaCy model maps to the dependency labels the rules expect.
  • MCQ distractors are heuristic (NER + noun chunks), not semantically ranked.
  • PDF export uses a basic layout (non‑Latin‑1 glyphs are sanitized).
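
To illustrate what "heuristic, not semantically ranked" means for distractors, here is a sketch of an MCQ builder that simply takes the first few distinct candidates in the order given. `make_mcq` is a hypothetical function, not the code in nlp/question_types.py; the candidate list stands in for entities and noun chunks a parser would extract, and the fixed seed is an assumption used to keep the output reproducible.

```python
import random
from typing import List, Optional

BLANK = "______"


def make_mcq(sentence: str, answer: str, candidates: List[str],
             n_choices: int = 4, seed: int = 0) -> Optional[dict]:
    """Blank out `answer` and pick distractors in candidate order.

    No semantic ranking: the first distinct candidates win, which is
    why a generic word such as "element" can appear as a choice.
    """
    if answer not in sentence:
        return None
    seen = {answer.lower()}
    distractors = []
    for cand in candidates:
        if cand.lower() not in seen:  # skip the answer and duplicates
            seen.add(cand.lower())
            distractors.append(cand)
    if len(distractors) < n_choices - 1:
        return None  # not enough material for a full question
    choices = [answer] + distractors[: n_choices - 1]
    random.Random(seed).shuffle(choices)  # seeded: same input, same order
    return {
        "question": sentence.replace(answer, BLANK, 1),
        "answer": answer,
        "choices": choices,
        "type": "mcq",
    }


mcq = make_mcq(
    "Marie Curie discovered radium in 1898.",
    "radium",
    ["polonium", "radium", "uranium", "element", "Polonium"],
)
print(mcq["question"])  # Marie Curie discovered ______ in 1898.
```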

Built with spaCy + FastAPI · Parsed, not guessed.
