A lint-style detector for common AI writing patterns. Built on spaCy for grammatical understanding (POS tags, dependency parsing, lemmatisation) plus targeted regex for character-level patterns. No LLM in the loop. Designed to run as a pre-commit hook so that when you (or your coding agent) write prose, the obvious AI tells get caught before you ship.
Seven categories of detection, drawn from Wikipedia's "Signs of AI writing" guidance, the Max Planck (2024) word-frequency study, and community-catalogued AI tells:
| Rule | What it flags |
|---|---|
| SLOP001 | Cliché phrases (a testament to, navigate the complexities of, ~50 more) |
| SLOP002 | Single-word AI tells (delve, tapestry, multifaceted, ...) |
| SLOP003 | Density-flagged words (robust, pivotal, leverage, ... when overused) |
| SLOP020 | Parallelism clichés (not just X but Y, it's not X, it's Y, ...) |
| SLOP021 | Imperative-negation parallelism (Sandbox the call. Not the process.) |
| SLOP022 | Cliché sentence openers (Certainly, Moreover, Furthermore, ...) |
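At heart, a rule like SLOP001 reduces to case-insensitive phrase matching. A minimal sketch of that idea (the phrase list here is a tiny illustrative subset, and the function name is not deslop's internal API):

```python
import re

# Tiny illustrative subset; the real SLOP001 list carries ~50 phrases.
CLICHE_PHRASES = [
    "a testament to",
    "navigate the complexities of",
    "in today's fast-paced world",
]

def find_cliches(text: str) -> list[tuple[str, int]]:
    """Return (phrase, character offset) pairs for every match."""
    hits = []
    for phrase in CLICHE_PHRASES:
        for m in re.finditer(re.escape(phrase), text, re.IGNORECASE):
            hits.append((phrase, m.start()))
    return hits
```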

| Rule | What it flags |
|---|---|
| SLOP010 | Em-dash density (only when both density AND raw count are high) |
| SLOP011 | Exclamation density |
| SLOP012 | Em-dash listicle (dash followed by 3+ comma-separated items) |
| SLOP030 | Emoji-led bullet points |
| SLOP031 | Repeated bold-headed bullet pattern (* **Header:** description) |
| SLOP032 | Decorative unicode (→, ✔, ★ in prose) |
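SLOP010's both-thresholds-must-fire behaviour is easy to express directly. A sketch using the defaults shown in the configuration section (illustrative code, not deslop's internals):

```python
def em_dash_flagged(text: str, per_500_words: float = 14, min_count: int = 6) -> bool:
    """Flag only when BOTH the raw em-dash count and the density
    (dashes per 500 words) exceed their thresholds."""
    words = len(text.split())
    dashes = text.count("\u2014")  # U+2014 EM DASH
    if words == 0 or dashes < min_count:
        return False
    return dashes / words * 500 > per_500_words
```

Two dashes in a short paragraph never reach the raw-count floor, so they pass; a dash-per-sentence paragraph trips both tests.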

| Rule | What it flags |
|---|---|
| SLOP040 | Uniform sentence length (low coefficient of variation, info) |
| SLOP050 | Wall-of-text paragraph (over 120 words by default) |
| SLOP051 | Semicolon stitching (more than 1 semicolon per paragraph) |
| SLOP053 | Difficult readability (Flesch reading ease below 40, info) |

| Rule | What it flags |
|---|---|
| SLOP060 | Passive voice density (>30% of finite verbs are passive) |
| SLOP061 | Nominalisation overuse (-tion/-ment/-ity density per 100 words) |
| SLOP062 | Weak-verb density (be/have/do/get/make as >50% of root verbs) |
| SLOP063 | Noun-pile syndrome (Kubernetes infrastructure cost optimization strategy, info) |
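SLOP061 is a plain density count. The shipped rule leans on spaCy's lemmas; a suffix-only approximation is enough to show the shape (illustrative, and cruder than the real rule):

```python
import re

def nominalisation_density(text: str) -> float:
    """Rough -tion/-ment/-ity endings per 100 words. Suffix matching only,
    so e.g. 'moment' counts too; the shipped rule is smarter about lemmas."""
    words = text.split()
    if not words:
        return 0.0
    pattern = re.compile(r"(tion|ment|ity)s?$")
    nominals = sum(1 for w in words if pattern.search(w.lower().strip(".,;:!?")))
    return nominals / len(words) * 100
```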
Run `deslop --list-rules` for the live list.
- It does not prove that text was AI-written. It detects patterns that AI overuses. Heavily edited AI text passes; well-written human prose can also occasionally trip a rule.
- It does not call any LLM or remote API. Everything runs locally — spaCy's small English model (~50 MB) provides POS tagging and dependency parsing, and the rest is regex and statistics.
- It is for self-policing your own output, not for accusing others.
If you want a probability score, use a perplexity-based detector. This is a linter, not a classifier.
```shell
pip install deslop
```
This pulls in spaCy 3.7+ and the `en_core_web_sm` English model (~50 MB) as part of the install; no separate `python -m spacy download` step is needed.
Add to `.pre-commit-config.yaml`:
```yaml
repos:
  - repo: https://github.com/adamcharnock/deslop
    rev: v0.1.0
    hooks:
      - id: deslop
```

The hook runs against staged Markdown / plain-text / RST files on every commit.
```shell
deslop README.md docs/*.md
deslop --list-rules
echo "in today's fast-paced world we delve into things" | deslop --stdin
```
Exit code is 0 when the input is clean and 1 when any blocking finding is reported. By default, only `warning` and `error` severities block; `info` findings are reported but pass. Adjust with `--fail-on={info,warning,error}`. Use `--max-findings N` to allow up to N blocking findings before failing.
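The exit-code policy above can be sketched as a pure function (the severity table and names are illustrative, not deslop's internals):

```python
SEVERITY_RANK = {"info": 0, "warning": 1, "error": 2}

def exit_code(findings: list[str], fail_on: str = "warning", max_findings: int = 0) -> int:
    """0 when clean enough, 1 otherwise: findings at or above the
    --fail-on severity block, and up to max_findings of them are tolerated."""
    floor = SEVERITY_RANK[fail_on]
    blocking = sum(1 for sev in findings if SEVERITY_RANK[sev] >= floor)
    return 1 if blocking > max_findings else 0
```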
Either a `.deslop.toml` at repo root, or a `[tool.deslop]` table in `pyproject.toml`:
```toml
[tool.deslop]
# Em-dash density: requires BOTH density and raw-count to fire, so two dashes
# in a short paragraph are stylistic, not flagged.
em_dash_per_500_words = 14
min_em_dashes = 6

exclamation_per_500_words = 2

sentence_cv_threshold = 0.35
min_sentences_for_uniformity = 8

overused_word_threshold = 3
min_words_for_density = 200

# Density / readability:
max_paragraph_words = 120
max_semicolons_per_paragraph = 1
flesch_reading_ease_floor = 40

disable = ["SLOP032", "SLOP040"]
```

Per-line escape, mirroring how other linters do it:
```markdown
We delve into things. <!-- deslop: ignore -->
We delve into things. <!-- deslop: ignore=SLOP002 -->
```

Fenced code blocks (``` and ~~~) and inline code spans (`) are blanked out before any rule runs, so prose checks don't fire on identifiers or sample strings.
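That blanking pre-pass can be approximated in a few lines: replace code characters with spaces rather than deleting them, so every later rule still reports correct line and column offsets (a sketch, not deslop's actual pre-processor):

```python
import re

def blank_code(markdown: str) -> str:
    """Blank fenced blocks (``` or ~~~) and inline `spans`, preserving
    newlines so downstream offsets stay valid."""
    def spaces(m: re.Match) -> str:
        return re.sub(r"[^\n]", " ", m.group(0))
    # Fenced blocks first (MULTILINE ^ anchors, DOTALL body), then inline spans.
    markdown = re.sub(r"(?ms)^(`{3}|~{3}).*?^\1[^\n]*$", spaces, markdown)
    return re.sub(r"`[^`\n]+`", spaces, markdown)
```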
- Some flagged words are perfectly legitimate vocabulary. The density-based rule (SLOP003) only fires when a word appears repeatedly in the same document. The single-use list (SLOP002) is intentionally conservative.
- Sentence segmentation uses a simple regex and can miscount with unusual punctuation. The uniformity rule defaults to `info` severity for that reason.
- Regex-based detection is easy to evade. That is the point: this catches unedited slop, which is the failure mode you actually want to avoid when committing prose.