diff --git a/skills/.curated/brooks-lint/LICENSE.txt b/skills/.curated/brooks-lint/LICENSE.txt new file mode 100644 index 00000000..829723d0 --- /dev/null +++ b/skills/.curated/brooks-lint/LICENSE.txt @@ -0,0 +1,21 @@ +MIT License + +Copyright (c) 2025 hyhmrright + +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in all +copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE +SOFTWARE. diff --git a/skills/.curated/brooks-lint/SKILL.md b/skills/.curated/brooks-lint/SKILL.md new file mode 100644 index 00000000..f9466627 --- /dev/null +++ b/skills/.curated/brooks-lint/SKILL.md @@ -0,0 +1,186 @@ +--- +name: brooks-lint +description: > + AI code reviews grounded in six classic engineering books (The Mythical Man-Month, + Code Complete, Refactoring, Clean Architecture, The Pragmatic Programmer, Domain-Driven + Design). Four analysis modes: PR review, architecture audit, tech debt assessment, test + quality review. Every finding follows Iron Law: Symptom, Source, Consequence, Remedy. 
+ Use when: reviewing code, checking a PR, auditing architecture, assessing tech debt or + test quality, or discussing Brooks's Law, conceptual integrity, code smells, refactoring, + clean architecture, DDD, SOLID, test debt, flaky tests, mock abuse, or legacy code testability. +--- + +# Brooks-Lint + +Code quality diagnosis using principles from six classic software engineering books. + +> **Full skill with slash commands and Claude Code plugin support:** +> install from `--repo hyhmrright/brooks-lint --path skills/brooks-lint` + +## The Iron Law + +``` +NEVER suggest fixes before completing risk diagnosis. +EVERY finding must follow: Symptom → Source → Consequence → Remedy. +``` + +Violating this law produces reviews that list rule violations without explaining why they +matter. A finding without a consequence and a remedy is not a finding — it is noise. + +## When to Use + +**Auto-triggers:** +- User asks to review code, check a PR, or assess code quality +- User shares code and asks "what do you think?" or "is this good?" +- User discusses architecture, module structure, or system design +- User asks why the codebase is hard to maintain, why velocity is declining +- User mentions: code smells, refactoring, clean architecture, DDD, SOLID, Brooks, + conceptual integrity, second system effect, tech debt, ubiquitous language, + test smells, test debt, unit testing quality, flaky tests, mock abuse, + legacy code testability, characterization tests + +## Mode Detection + +Read the context and pick ONE mode before doing anything else. 
+
+| Context | Mode |
+|---------|------|
+| Code diff, specific files/functions, PR description, "review this" | **Mode 1: PR Review** |
+| Project directory structure, module questions, "audit the architecture" | **Mode 2: Architecture Audit** |
+| "tech debt", "where to refactor", health check, systemic maintainability questions | **Mode 3: Tech Debt Assessment** |
+| Test files shared, "are our tests good?", test debt, flaky tests, mock abuse, legacy code testability | **Mode 4: Test Quality Review** |
+
+**If context is genuinely ambiguous after reading:** ask once — "Should I do a PR-level code
+review, a broader architecture audit, a tech debt assessment, or a test quality review?" —
+then proceed without further clarification questions.
+
+## The Six Decay Risks
+
+(Full definitions, symptoms, sources, and severity guides are in `references/decay-risks.md` —
+read it after selecting a mode.)
+
+| Risk | Diagnostic Question |
+|------|---------------------|
+| Cognitive Overload | How much mental effort to understand this? |
+| Change Propagation | How many unrelated things break on one change? |
+| Knowledge Duplication | Is the same decision expressed in multiple places? |
+| Accidental Complexity | Is the code more complex than the problem? |
+| Dependency Disorder | Do dependencies flow in a consistent direction? |
+| Domain Model Distortion | Does the code faithfully represent the domain? |
+
+## Modes
+
+### Mode 1: PR Review
+
+1. Read `references/pr-review-guide.md` for the analysis process
+2. Read `references/decay-risks.md` for symptom definitions and source attributions
+3. Scan the diff or code for each decay risk in the order specified in the guide
+4. Apply the Iron Law to every finding
+5. Output using the Report Template below
+
+### Mode 2: Architecture Audit
+
+1. Read `references/architecture-guide.md` for the analysis process
+2. Read `references/decay-risks.md` for symptom definitions and source attributions
+3. 
Draw the module dependency graph as a Mermaid diagram (Step 1 of the guide) +4. Scan for each decay risk in the order specified in the guide +5. Assign node colors in the Mermaid diagram based on findings (red/yellow/green) +6. Run the Conway's Law check +7. Output using the Report Template below — Mermaid graph FIRST, then Findings + +### Mode 3: Tech Debt Assessment + +1. Read `references/debt-guide.md` for the analysis process +2. Read `references/decay-risks.md` for symptom definitions and source attributions +3. Scan for all six decay risks; list every finding before scoring any of them +4. Apply the Pain x Spread priority formula +5. Output using the Report Template below, plus the Debt Summary Table + +### Mode 4: Test Quality Review + +1. Read `references/test-guide.md` for the analysis process +2. Read `references/test-decay-risks.md` for symptom definitions and source attributions +3. Build the test suite map (unit/integration/E2E counts and ratio) +4. Scan for each test decay risk in the order specified in the guide +5. Output using the Report Template below + +## Report Template + +```` +# Brooks-Lint Review + +**Mode:** [PR Review / Architecture Audit / Tech Debt Assessment / Test Quality Review] +**Scope:** [file(s), directory, or description of what was reviewed] +**Health Score:** XX/100 + +[One sentence overall verdict] + +--- + +## Module Dependency Graph + + + +```mermaid +graph TD + ... +``` + +--- + +## Findings + + + +### 🔴 Critical + +**[Risk Name] — [Short descriptive title]** +Symptom: [exactly what was observed in the code] +Source: [Book title — Principle or Smell name] +Consequence: [what breaks or gets worse if this is not fixed] +Remedy: [concrete, specific action] + +### 🟡 Warning + +**[Risk Name] — [Short descriptive title]** +Symptom: ... +Source: ... +Consequence: ... +Remedy: ... + +### 🟢 Suggestion + +**[Risk Name] — [Short descriptive title]** +Symptom: ... +Source: ... +Consequence: ... +Remedy: ... 
+ +--- + +## Summary + +[2–3 sentences: what is the most important action, and what is the overall trend] +``` + +## Health Score Calculation + +Base score: 100 +Deductions: +- Each 🔴 Critical finding: −15 +- Each 🟡 Warning finding: −5 +- Each 🟢 Suggestion finding: −1 +Floor: 0 (score cannot go below 0) + +## Reference Files + +Read on demand — do not preload all files: + +| File | When to Read | +|------|-------------| +| `references/decay-risks.md` | After selecting a mode, before starting the review | +| `references/pr-review-guide.md` | At the start of every Mode 1 (PR Review) | +| `references/architecture-guide.md` | At the start of every Mode 2 (Architecture Audit) | +| `references/debt-guide.md` | At the start of every Mode 3 (Tech Debt Assessment) | +| `references/test-guide.md` | At the start of every Mode 4 (Test Quality Review) | +| `references/test-decay-risks.md` | After selecting Mode 4, before starting the review | diff --git a/skills/.curated/brooks-lint/agents/openai.yaml b/skills/.curated/brooks-lint/agents/openai.yaml new file mode 100644 index 00000000..414fd627 --- /dev/null +++ b/skills/.curated/brooks-lint/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Brooks-Lint" + short_description: "Code reviews grounded in six classic engineering books" + default_prompt: "Review this code for decay risks using Brooks-Lint." diff --git a/skills/.curated/brooks-lint/references/architecture-guide.md b/skills/.curated/brooks-lint/references/architecture-guide.md new file mode 100644 index 00000000..355eefa6 --- /dev/null +++ b/skills/.curated/brooks-lint/references/architecture-guide.md @@ -0,0 +1,165 @@ +# Architecture Audit Guide — Mode 2 + +**Purpose:** Analyze the module and dependency structure of a system for decay risks that +operate at the architectural level. Every finding must follow the Iron Law: +Symptom → Source → Consequence → Remedy. + +**Monorepo note:** Treat each deployable service or library as a top-level module. 
Draw +dependencies between services, not between their internal packages. Apply the Conway's Law +check at the service ownership level. Within a single service, apply standard module-level analysis. + +--- + +## Analysis Process + +Work through these five steps in order. + +### Step 1: Draw the Module Dependency Graph (Mermaid) + +Before evaluating any risk, map the dependencies as a Mermaid diagram. Use this format: + +````mermaid +graph TD + subgraph UI + WebApp + MobileApp + end + + subgraph Domain + AuthService + OrderService + PaymentService + end + + subgraph Infrastructure + Database + MessageQueue + end + + WebApp --> AuthService + WebApp --> OrderService + MobileApp --> AuthService + MobileApp --> OrderService + OrderService --> PaymentService + OrderService --> Database + OrderService --> MessageQueue + PaymentService --> Database + AuthService -.->|circular| OrderService + + classDef critical fill:#ff6b6b,stroke:#c92a2a,color:#fff + classDef warning fill:#ffd43b,stroke:#e67700 + classDef clean fill:#51cf66,stroke:#2b8a3e,color:#fff + + class PaymentService critical + class OrderService warning + class Database,MessageQueue,AuthService,WebApp,MobileApp clean +```` + +**Phase A (during Step 1):** Generate the graph structure only — nodes, subgraphs, and edges. +Do NOT add `classDef` or `class` lines yet. You need findings from Steps 2-4 before coloring. + +**Phase B (after Step 4):** Add `classDef` definitions and `class` assignments based on findings. +The example above shows the final output after both phases. + +Rules: +1. **Nodes** — Use top-level directories or services as nodes, not individual files +2. **Grouping** — One `subgraph` per architectural layer or top-level directory (e.g., UI, Domain, Infrastructure) +3. **Edges** — Solid arrows (`-->`) point FROM the depending module TO the dependency; use dotted arrows with label (`-.->|circular|`) for circular dependencies. If no circular dependencies exist, use only solid arrows +4. 
**Node limit** — Keep the graph to ~50 nodes maximum; collapse low-risk leaf modules into their parent if needed
+5. **Fan-out** — For any node with fan-out > 5, use a descriptive label: `HighFanOutModule["ModuleName (fan-out: 7)"]`
+6. **Colors** — Apply `classDef` colors AFTER completing Steps 2-4: `critical` (red `#ff6b6b`) for nodes with Critical findings, `warning` (yellow `#ffd43b`) for Warning findings, `clean` (green `#51cf66`) for nodes with no findings or only Suggestions. If no findings at all, classify all nodes as `clean`
+7. **Direction** — Default to `graph TD` (top-down); use `graph LR` only if the architecture is clearly a left-to-right pipeline
+
+### Step 2: Scan for Dependency Disorder
+
+*The most architecturally consequential risk — scan this first.*
+
+Look for:
+- Circular dependencies (any dotted `-.->|circular|` edge in the graph above)
+- Arrows flowing upward (high-level domain depending on low-level infrastructure)
+- Stable, widely-depended-on modules that import from frequently-changing modules
+- Modules with fan-out > 5
+- Absence of a clear layering rule (no consistent answer to "what depends on what?")
+
+### Step 3: Scan for Domain Model Distortion
+
+Look for:
+- Do module names match the business domain vocabulary?
+- Is there a layer called "services" that contains all the business logic while domain objects
+  are pure data structures?
+- Are there modules that cross bounded context boundaries (e.g., billing logic in the user module)?
+- Is there an anti-corruption layer where external systems interface with the domain?
+
+### Step 4: Scan for Remaining Four Risks
+
+Check each in turn:
+
+**Knowledge Duplication:**
+- Are there multiple modules implementing the same concept independently?
+- Does the same domain concept appear under different names in different modules?
+
+**Accidental Complexity:**
+- Are there entire layers in the architecture that do not add value?
+- Are there modules whose responsibility cannot be stated in one sentence? 
+ +**Change Propagation:** +- Which modules are "blast radius hotspots"? (A change here requires changes in many other modules) +- Does the dependency map reveal why certain features are slow to develop? + +**Cognitive Overload:** +- Can the module responsibility of each module be stated in one sentence from its name alone? +- Would a new developer know which module to add a new feature to? + +### Step 5: Conway's Law Check + +After the six-risk scan, assess the relationship between architecture and team structure: + +- Does the module/service structure reflect the team structure? + (Conway's Law: "Organizations design systems that mirror their communication structure") +- If yes: is this intentional design or accidental coupling? +- A mismatch that causes cross-team coordination overhead for every feature is 🔴 Critical. +- A mismatch that is theoretical but not yet causing pain is 🟡 Warning. +- If team structure is unknown, note this as context missing and skip the check. + +--- + +## Applying the Iron Law + +For every finding identified above, write it in this format: + +``` +**[Risk Name] — [Short title]** +Symptom: [the exact structural evidence — reference module names from the dependency map] +Source: [Book title — Principle or Smell name] +Consequence: [what architectural consequence follows if this is not addressed] +Remedy: [concrete architectural action] +``` + +--- + +## Output + +Use the standard Report Template from `SKILL.md`. +Mode: Architecture Audit +Scope: the project or directory audited. + +Place the Mermaid dependency graph FIRST in the report, before the Findings section, +under the heading "Module Dependency Graph". In each finding, reference the relevant +node by name (e.g., "See the red node `PaymentService` in the graph above") so the +reader can cross-reference visually. The `classDef` color assignments must be added +LAST, after all findings have been identified and severity levels determined. 
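The sequencing rule above — structure first, colors appended only after findings exist — can be sketched as a tiny renderer. This is a hypothetical illustration, not part of the skill; `render_graph` and the module names are invented:

```python
# Sketch of two-phase Mermaid rendering: Phase A emits nodes and edges only;
# Phase B appends classDef/class color lines once findings are known.

def render_graph(edges, severities):
    """edges: (src, dst, is_circular) tuples; severities: node -> 'critical'|'warning'|'clean'."""
    lines = ["graph TD"]
    # Phase A: structure only -- no colors yet.
    for src, dst, is_circular in edges:
        arrow = "-.->|circular|" if is_circular else "-->"
        lines.append(f"    {src} {arrow} {dst}")
    # Phase B: color definitions and assignments, added LAST.
    lines += [
        "    classDef critical fill:#ff6b6b,stroke:#c92a2a,color:#fff",
        "    classDef warning fill:#ffd43b,stroke:#e67700",
        "    classDef clean fill:#51cf66,stroke:#2b8a3e,color:#fff",
    ]
    for node, level in sorted(severities.items()):
        lines.append(f"    class {node} {level}")
    return "\n".join(lines)

print(render_graph(
    [("OrderService", "Database", False), ("AuthService", "OrderService", True)],
    {"OrderService": "warning", "Database": "clean", "AuthService": "clean"},
))
```

Keeping the color pass separate means a re-graded finding only changes the Phase B lines, never the graph structure.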
+ +--- + +## Design Note: Analysis-Render Separation + +The dependency graph follows a two-step conceptual model: + +1. **Analysis** — Identify nodes (modules), edges (dependencies), groups (folders/layers), + and severity per node. This produces a logical dependency structure independent of + any diagram format. +2. **Render** — Convert the logical structure to Mermaid syntax (graph TD, subgraph, + classDef, etc.). + +This separation means adding an alternative output format (D2, Graphviz, SVG) in the +future only requires a new renderer — the analysis logic stays the same. diff --git a/skills/.curated/brooks-lint/references/debt-guide.md b/skills/.curated/brooks-lint/references/debt-guide.md new file mode 100644 index 00000000..6beb5913 --- /dev/null +++ b/skills/.curated/brooks-lint/references/debt-guide.md @@ -0,0 +1,105 @@ +# Tech Debt Assessment Guide — Mode 3 + +**Purpose:** Identify, classify, and prioritize technical debt across the entire codebase. +Every finding must follow the Iron Law: Symptom → Source → Consequence → Remedy. + +--- + +## Evidence Gathering + +If you have insufficient evidence to assess the codebase, ask the user ONE question — +choose the single question most relevant to what you already know: + +1. "Which part of the codebase takes the longest to modify for a typical feature?" +2. "Which module do developers avoid touching, and why?" +3. "Which parts of the system have the fewest tests and the most bugs?" +4. "Is there a module that only one person fully understands?" + +After one answer, proceed. Do not ask more than one question. +If the user declines or says they don't know, proceed with available evidence and note +which areas could not be assessed. + +--- + +## Analysis Process + +Work through these three steps in order. + +### Step 1: Full Decay Risk Scan + +Scan for all six decay risks across the entire codebase. List every finding before scoring +any of them. 
This prevents anchoring on early findings and missing systemic patterns. + +For each risk, look for: + +**Cognitive Overload:** Are there widespread naming problems, deeply nested logic, or +excessively long functions spread across many modules? + +**Change Propagation:** Which modules cause the most ripple effects when changed? +Are there modules that everyone must modify when adding a new feature? + +**Knowledge Duplication:** How many times is the same concept implemented independently? +Is the domain vocabulary consistent across the codebase? + +**Accidental Complexity:** Are there architectural layers or abstractions that add no value? +Is the infrastructure overhead proportional to the problem being solved? + +**Dependency Disorder:** Are there dependency cycles? Does domain logic depend on infrastructure? +Are there modules with no clear layering position? + +**Domain Model Distortion:** Is business logic in the right layer? +Do code names match business names? Are domain objects anemic? + +### Step 2: Score Each Finding with Pain × Spread + +After listing all findings, score each one: + +**Pain score (1–3):** How much does this slow down development today? +- 3: Developers actively avoid touching this area; it causes bugs on most changes +- 2: This area is noticeably slower to work in than the rest of the codebase +- 1: This is a quality issue but not currently causing active pain + +**Spread score (1–3):** How many files, modules, or developers does this affect? 
+- 3: Affects 5+ modules or all developers on the team +- 2: Affects 2–4 modules or a subset of the team +- 1: Isolated to one module or one developer's area + +**Priority = Pain × Spread** (max 9) + +| Priority | Classification | Action | +|----------|---------------|--------| +| 7–9 | Critical debt | Address in next sprint | +| 4–6 | Scheduled debt | Plan within quarter | +| 1–3 | Monitored debt | Log and watch | + +### Step 3: Group by Decay Risk + +Report findings grouped by risk type, not by file or module. +Grouping by risk reveals systemic patterns: +- "Change Propagation is systemic" → architectural intervention needed +- "Cognitive Overload is isolated" → localized refactoring sufficient + +--- + +## Output + +Use the standard Report Template from `SKILL.md`. +Mode: Tech Debt Assessment + +After the Findings section, append a Debt Summary Table: + +``` +## Debt Summary + +| Risk | Findings | Avg Priority | Dominant Classification | +|------|----------|-------------|------------------------| +| Cognitive Overload | N | X.X | Monitored / Scheduled / Critical | +| Change Propagation | N | X.X | ... | +| Knowledge Duplication | N | X.X | ... | +| Accidental Complexity | N | X.X | ... | +| Dependency Disorder | N | X.X | ... | +| Domain Model Distortion | N | X.X | ... | + +**Recommended focus:** [The one or two risks with the highest average priority — these are +where investment will have the most impact.] +``` diff --git a/skills/.curated/brooks-lint/references/decay-risks.md b/skills/.curated/brooks-lint/references/decay-risks.md new file mode 100644 index 00000000..ee1dda89 --- /dev/null +++ b/skills/.curated/brooks-lint/references/decay-risks.md @@ -0,0 +1,241 @@ +# Decay Risk Reference + +Six patterns that cause software to degrade over time. +For each finding, identify: which risk, which symptom, which source book. 
+ +Every finding must follow the Iron Law: +Symptom → Source → Consequence → Remedy + +--- + +## Risk 1: Cognitive Overload + +**Diagnostic question:** How much mental effort does a human need to understand this? + +When cognitive load exceeds working memory capacity, developers make mistakes, avoid touching +the code, and slow down. This risk compounds: hard-to-understand code resists the refactoring +that would make it easier to understand. + +### Symptoms + +- Function longer than 20 lines where multiple levels of abstraction are mixed together +- Nesting depth greater than 3 levels +- Parameter list with more than 4 parameters +- Magic numbers or unexplained constants +- Variable names that require reading the implementation to understand (e.g., `d`, `tmp2`, `flag`) +- Boolean expressions with 3 or more conditions combined +- Train-wreck chains: `a.getB().getC().doD()` +- Code names that do not match what the business calls the same concept + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Long Method | Fowler — Refactoring | Long Method | +| Long Parameter List | Fowler — Refactoring | Long Parameter List | +| Message Chains | Fowler — Refactoring | Message Chains | +| Function length and nesting | McConnell — Code Complete | Ch. 7: High-Quality Routines | +| Variable naming | McConnell — Code Complete | Ch. 11: The Power of Variable Names | +| Magic numbers | McConnell — Code Complete | Ch. 12: Fundamental Data Types | +| Domain name mismatch | Evans — Domain-Driven Design | Ubiquitous Language | + +### Severity Guide + +- 🔴 Critical: function > 50 lines, nesting > 5, or virtually no meaningful names +- 🟡 Warning: function 20–50 lines, nesting 4–5, some unclear names +- 🟢 Suggestion: minor naming issues, 1–2 magic numbers, isolated train-wreck chains + +--- + +## Risk 2: Change Propagation + +**Diagnostic question:** How many unrelated things break when you change one thing? 
+ +High change propagation means that a developer modifying feature A must also modify modules B, +C, and D — even though B, C, and D have nothing conceptually to do with A. This slows velocity +and multiplies regression risk on every change. + +### Symptoms + +- Modifying one feature requires touching more than 3 files in unrelated modules +- One class changes for multiple different business reasons (e.g., `UserService` changes for + billing logic AND notification logic AND profile logic) +- A method uses more data from another class than from its own class +- Two classes know each other's internal state directly +- Changing one module requires recompiling or retesting many unrelated modules + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Shotgun Surgery | Fowler — Refactoring | Shotgun Surgery | +| Divergent Change | Fowler — Refactoring | Divergent Change | +| Feature Envy | Fowler — Refactoring | Feature Envy | +| Inappropriate Intimacy | Fowler — Refactoring | Inappropriate Intimacy | +| Orthogonality violation | Hunt & Thomas — The Pragmatic Programmer | Ch. 2: Orthogonality | +| DIP violation | Martin — Clean Architecture | Dependency Inversion Principle | +| High change propagation radius | Brooks — The Mythical Man-Month | Ch. 2: Brooks's Law (communication overhead) | + +### Severity Guide + +- 🔴 Critical: one change touches > 5 files, or there is a structural dependency inversion (domain depends on infrastructure) +- 🟡 Warning: one change touches 3–5 files, mild coupling between modules +- 🟢 Suggestion: minor coupling, easily isolatable + +--- + +## Risk 3: Knowledge Duplication + +**Diagnostic question:** Is the same decision expressed in more than one place? + +When the same piece of knowledge lives in multiple places, those copies will inevitably drift +apart. This creates silent inconsistencies: both copies pass their tests but disagree on behavior +in edge cases. 
DRY is not about code lines — it is about decisions. + +### Symptoms + +- Same logic copy-pasted across multiple files or functions +- Same concept named differently in different parts of the codebase + (e.g., `user`, `account`, `member`, `customer` all referring to the same domain entity) +- Parallel class hierarchies that must change in sync + (e.g., adding a new payment type requires adding a class in 3 different hierarchies) +- Configuration values repeated as literals in multiple places +- Two modules that implement the same algorithm independently + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Code duplication | Fowler — Refactoring | Duplicate Code | +| Parallel Inheritance | Fowler — Refactoring | Parallel Inheritance Hierarchies | +| DRY violation | Hunt & Thomas — The Pragmatic Programmer | DRY: Don't Repeat Yourself | +| Inconsistent naming | Evans — Domain-Driven Design | Ubiquitous Language | +| Alternative Classes | Fowler — Refactoring | Alternative Classes with Different Interfaces | + +### Severity Guide + +- 🔴 Critical: core business logic duplicated across modules, or same domain concept named 3+ different ways +- 🟡 Warning: utility code duplicated, naming inconsistent within a subsystem +- 🟢 Suggestion: minor literal duplication, single naming inconsistency + +--- + +## Risk 4: Accidental Complexity + +**Diagnostic question:** Is the code more complex than the problem it solves? + +Every system has essential complexity (inherent to the problem) and accidental complexity +(introduced by implementation choices). Accidental complexity is the only kind that can be +eliminated. It accumulates silently: each addition seems justified in isolation, but the total +burden grows until developers spend more time maintaining the scaffolding than solving the problem. 
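A minimal, hypothetical sketch of the difference (all names invented): a "pluggable" exporter framework whose only plugin is CSV, beside the one function the problem actually requires —

```python
from abc import ABC, abstractmethod

# Accidental complexity: an abstraction layer built "for future use"
# when only one implementation exists.
class Exporter(ABC):
    @abstractmethod
    def export(self, rows: list) -> str: ...

class CsvExporter(Exporter):          # the sole "plugin"
    def export(self, rows):
        return export_csv(rows)       # all real work is one function anyway

EXPORTERS = {"csv": CsvExporter()}    # a registry with a single entry

# Essential complexity: the problem itself fits in one function.
def export_csv(rows: list) -> str:
    if not rows:
        return ""
    header = ",".join(rows[0])
    body = "\n".join(",".join(str(v) for v in row.values()) for row in rows)
    return f"{header}\n{body}"
```

Every extra layer here (abstract base, registry, lookup key) must be read and maintained, yet none of it changes what the code can do today.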
+ +### Symptoms + +- Abstractions built "for future use" with no current consumer + (e.g., a plugin system for a use case that has only one known implementation) +- Classes that barely justify their existence (wrap a single method call) +- Classes that only delegate to another class without adding behavior (pure middle-men) +- Second attempt at a system that is significantly more elaborate than the first, + adding generality for requirements that do not yet exist +- Switch statements that signal missing polymorphism +- Configuration options that have never been changed from their defaults +- Framework code larger than the application it powers + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Speculative Generality | Fowler — Refactoring | Speculative Generality | +| Lazy Class | Fowler — Refactoring | Lazy Class | +| Middle Man | Fowler — Refactoring | Middle Man | +| Switch Statements | Fowler — Refactoring | Switch Statements | +| Second System Effect | Brooks — The Mythical Man-Month | Ch. 5: The Second-System Effect | +| YAGNI violations | McConnell — Code Complete | Ch. 5: Design in Construction | +| Over-engineering | Hunt & Thomas — The Pragmatic Programmer | Ch. 2: The Evils of Duplication (YAGNI corollary) | + +### Severity Guide + +- 🔴 Critical: an entire subsystem built around a speculative requirement, or framework overhead dominates domain logic +- 🟡 Warning: several unnecessary abstractions or wrapper classes, unused configuration systems +- 🟢 Suggestion: one or two lazy classes or middle-man patterns in non-critical paths + +--- + +## Risk 5: Dependency Disorder + +**Diagnostic question:** Do dependencies flow in a consistent, predictable direction? + +Dependency direction is the skeleton of a software system. 
When high-level business logic +depends on low-level infrastructure details, or when components depend on things less stable +than themselves, every infrastructure change becomes a business logic change. Circular +dependencies make it impossible to understand or test any component in isolation. + +### Symptoms + +- Circular dependencies between modules or packages +- High-level business logic directly imports from low-level infrastructure + (e.g., a domain service imports from a specific database driver) +- Stable, widely-used components depend on unstable, frequently-changing ones +- Abstract components depending on concrete implementations +- Law of Demeter violations: `order.getCustomer().getAddress().getCity()` +- Module fan-out greater than 5 (imports from more than 5 other modules) +- The system feels like "one mind did not design this" — different modules use + incompatible architectural patterns with no clear rule for which to use where + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Dependency cycles | Martin — Clean Architecture | Acyclic Dependencies Principle (ADP) | +| DIP violation | Martin — Clean Architecture | Dependency Inversion Principle (DIP) | +| Instability direction | Martin — Clean Architecture | Stable Dependencies Principle (SDP) | +| Abstraction mismatch | Martin — Clean Architecture | Stable Abstractions Principle (SAP) | +| Conceptual integrity | Brooks — The Mythical Man-Month | Ch. 4: Conceptual Integrity | +| Law of Demeter | Hunt & Thomas — The Pragmatic Programmer | Ch. 
5: Decoupling and the Law of Demeter | +| SOLID violations | Martin — Clean Architecture | Single Responsibility, Open/Closed Principles | + +### Severity Guide + +- 🔴 Critical: dependency cycles present, or domain layer directly depends on infrastructure layer +- 🟡 Warning: several SDP or DIP violations but no cycles; conceptual inconsistency across modules +- 🟢 Suggestion: minor Demeter violations, slightly elevated fan-out in isolated modules + +--- + +## Risk 6: Domain Model Distortion + +**Diagnostic question:** Does the code faithfully represent the problem it is solving? + +Code that does not speak the language of the problem domain forces every developer to maintain +a mental translation layer between "what the business calls it" and "what the code calls it." +Over time, this translation layer diverges, and the code begins to model the database schema +or the API contract rather than the business concept. Domain logic bleeds into service layers +and the domain objects become empty data containers. 
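A minimal, hypothetical contrast (names invented): the same cancellation rule expressed anemically in a service layer, and then on the domain object where the business would expect to find it —

```python
from dataclasses import dataclass

# Anemic: Order is a data bag; the business rule lives in a service.
@dataclass
class AnemicOrder:
    status: str = "open"

class OrderService:
    def cancel(self, order: AnemicOrder) -> None:
        if order.status == "shipped":
            raise ValueError("cannot cancel a shipped order")
        order.status = "cancelled"

# Faithful: the rule lives on the domain object, in domain language.
@dataclass
class Order:
    status: str = "open"

    def cancel(self) -> None:
        # "A shipped order cannot be cancelled" -- stated where the concept lives.
        if self.status == "shipped":
            raise ValueError("cannot cancel a shipped order")
        self.status = "cancelled"
```

In the anemic version, a developer reading `Order` learns nothing about when cancellation is legal; that knowledge has migrated to a layer the business never named.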
+ +### Symptoms + +- Business logic scattered across service layers while domain objects have only getters and setters + (anemic domain model) +- Code variable, class, or method names that do not match what business stakeholders call the concept +- A class whose only purpose is to hold data with no behavior (pure data bag) +- A subclass that ignores or overrides most of its parent's behavior (refuses the inheritance) +- Bounded context boundaries crossed without any translation or anti-corruption layer +- Methods that are more interested in the data of another class than their own + (domain logic in the wrong place) + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Anemic Domain Model | Evans — Domain-Driven Design | Domain Model pattern | +| Ubiquitous Language drift | Evans — Domain-Driven Design | Ubiquitous Language | +| Bounded context violation | Evans — Domain-Driven Design | Bounded Context | +| Data Class | Fowler — Refactoring | Data Class | +| Refused Bequest | Fowler — Refactoring | Refused Bequest | +| Feature Envy | Fowler — Refactoring | Feature Envy | + +### Severity Guide + +- 🔴 Critical: domain logic entirely in service layer, domain objects are pure data bags with no behavior +- 🟡 Warning: partial anemia, some naming inconsistency between code and domain language +- 🟢 Suggestion: minor naming drift in non-core areas, isolated cases of Feature Envy diff --git a/skills/.curated/brooks-lint/references/pr-review-guide.md b/skills/.curated/brooks-lint/references/pr-review-guide.md new file mode 100644 index 00000000..0d81c49c --- /dev/null +++ b/skills/.curated/brooks-lint/references/pr-review-guide.md @@ -0,0 +1,152 @@ +# PR Review Guide — Mode 1 + +**Purpose:** Analyze a code diff or specific files for decay risks that are directly visible +in the changed code. Every finding must follow the Iron Law: Symptom → Source → Consequence → Remedy. 
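For calibration, a hypothetical two-line addition that already carries diff-visible symptoms — the function, its name, and the discount rate are all invented for illustration:

```python
# As it might appear in a diff -- two symptoms need no wider context to spot:
def calc(d):                      # Symptom: `d` requires reading the body to decode
    return d * 0.85               # Symptom: magic number -- what is 0.85?

# Remedy sketch: state the decision once, in the domain's language.
LOYALTY_DISCOUNT_RATE = 0.85      # hypothetical business term

def apply_loyalty_discount(price: float) -> float:
    return price * LOYALTY_DISCOUNT_RATE
```

A finding on this hunk would quote the two lines as Symptom, cite the magic-number and naming smells as Source, and propose the named-constant version as Remedy.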
+ +--- + +## Before You Start + +**Auto-generated files:** If the diff contains generated files (protobuf stubs, OpenAPI clients, +ORM migrations, lock files, minified bundles), skip those files entirely. Generated code reflects +tool choices, not developer decisions. Note in the report which files were skipped and why. + +--- + +## Analysis Process + +Work through these seven steps in order. Do not skip steps. + +### Step 1: Understand the scope + +Read the diff or files and answer: +- What is the stated purpose of this change? +- Which files were modified? +- Flag immediately if the PR changes more than 10 unrelated files — that itself is a + 🟡 Warning: Change Propagation (a PR that touches many unrelated things is a sign + that responsibilities are tangled). + +### Step 2: Scan for Change Propagation + +*Scan this first — it is the most visible risk in a diff.* + +Look for: +- Does this change touch files in modules that have no conceptual connection to the stated purpose? +- Does any modified class change for more than one business reason in this diff? +- Does any method use more data from another class than from its own? + +If the diff shows no cross-module changes beyond what the feature requires → skip, no finding. + +### Step 3: Scan for Cognitive Overload + +Look for: +- Are any new or modified functions longer than 20 lines? +- Is there nesting deeper than 3 levels in new or modified code? +- Are there more than 4 parameters in any new function signature? +- Are there magic numbers or unexplained constants in new code? +- Do new variable or function names require reading the implementation to understand? +- Are there train-wreck chains (3+ method calls chained)? + +### Step 4: Scan for Knowledge Duplication + +Look for: +- Does this change introduce logic that already exists elsewhere in the codebase? +- Does this change introduce a new name for a concept that already has a name? 
+- Does this change add a class to a hierarchy that has a parallel in another module? + +### Step 5: Scan for Accidental Complexity + +Look for: +- Does this change add an abstraction with only one concrete use? +- Does this change add a class that only wraps another class or delegates everything? +- Does this change add configuration options or extension points that serve no current requirement? + +### Step 6: Scan for Dependency Disorder and Domain Model Distortion + +Look for Dependency Disorder: +- Do any new imports create a dependency from a high-level module to a low-level one? +- Do any new imports introduce a cycle? + +Look for Domain Model Distortion: +- Do new class or variable names match the language the business uses? +- Does any new class hold only data with no behavior, where behavior was expected? + +--- + +## Applying the Iron Law + +For every finding identified above, write it in this format: + +``` +**[Risk Name] — [Short title]** +Symptom: [the exact thing you saw in the diff — quote line numbers if helpful] +Source: [Book title — Principle or Smell name] +Consequence: [what will happen if this is not addressed] +Remedy: [concrete action, specific to this code] +``` + +Do not write a finding that you cannot complete fully. If you can identify a symptom but +cannot state a consequence, you have not understood the risk well enough — re-read +`decay-risks.md` for that risk before writing the finding. + +--- + +## Output + +Use the standard Report Template from `SKILL.md`. +Mode: PR Review +Scope: list the files reviewed (excluding skipped generated files). + +--- + +### Step 7: Quick Test Check + +*Run this last. Three signals only — this is not a full Mode 4 review.* + +If the diff contains only generated files, configuration, or documentation with no +production logic changes → skip Step 7 entirely. + +**Signal 1: Do tests exist for the changed behavior?** + +- Does the diff modify production code? 
+- Are corresponding test file changes included in the diff? +- If new public behavior was added with no new tests: + → 🟡 Warning: Coverage Illusion — new behavior is untested + → Source: Feathers — Working Effectively with Legacy Code, Ch.1 +- If the change is a pure refactor and existing tests cover the behavior → no finding. + +**Signal 2: Quick Mock Abuse sniff** + +Only check if the diff includes test file changes. + +- Is mock setup code in new/modified tests obviously longer than the test logic? +- Are the primary assertions `expect(mock).toHaveBeenCalledWith(...)` with no behavior verification? +- Does the diff add any methods to production classes that are only called from test files? + +If any of these are true: + → 🟡 Warning: Mock Abuse — test complexity exceeds behavior complexity + → Source: Osherove — The Art of Unit Testing, mock usage guidelines + +**Signal 3: Quick Test Obscurity sniff** + +Only check if the diff includes test file changes. + +- Do new test names express scenario and expected outcome? + (Pattern: `methodName_scenario_expectedResult` or equivalent) +- Are there new tests with multiple assertions and no message strings on any of them? + +If test names are vague or assertions lack messages: + → 🟢 Suggestion: Test Obscurity — test intent is unclear from the test name or assertions + → Source: Meszaros — xUnit Test Patterns, Assertion Roulette (p.224) + +**Output rule:** + +If all three signals are clean → write no Test findings. Proceed directly to the report. + +If findings exist → add them to the Findings section using the standard Iron Law format. +Label the risk as the test decay risk name (e.g., "Coverage Illusion", "Mock Abuse", +"Test Obscurity"). + +> **Note:** Step 7 is a fast check, not a full test audit. When systemic test problems +> are found, note in the Summary: "Consider running `/brooks-lint:brooks-test` for a +> complete test quality diagnosis." 
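As a hedged illustration of Signal 3's naming pattern — hypothetical function and dates; any `method_scenario_expected` variant your codebase already uses works the same way:

```typescript
// Hypothetical sketch. Vague names (test1, shouldWork, testLogin) give no
// clue what broke; a name stating subject, scenario, and expected outcome
// reads as a specification.

function isExpired(expiresAt: Date, now: Date): boolean {
  return expiresAt.getTime() <= now.getTime();
}

// Name states scenario and expected result, so a failure is self-explaining:
function isExpired_whenExpiryInPast_returnsTrue(): boolean {
  return isExpired(new Date("2020-01-01"), new Date("2024-01-01")) === true;
}

function isExpired_whenExpiryInFuture_returnsFalse(): boolean {
  return isExpired(new Date("2030-01-01"), new Date("2024-01-01")) === false;
}
```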
diff --git a/skills/.curated/brooks-lint/references/test-decay-risks.md b/skills/.curated/brooks-lint/references/test-decay-risks.md new file mode 100644 index 00000000..9bc636d5 --- /dev/null +++ b/skills/.curated/brooks-lint/references/test-decay-risks.md @@ -0,0 +1,224 @@ +# Test Decay Risk Reference + +Six patterns that cause test suites to degrade over time. +For each finding, identify: which risk, which symptom, which source book. + +Every finding must follow the Iron Law: +Symptom → Source → Consequence → Remedy + +--- + +## Risk T1: Test Obscurity + +**Diagnostic question:** How much effort does it take to understand what this test verifies? + +When test intent is unclear, developers distrust the suite, skip reading failures carefully, +and add new tests that duplicate existing ones without knowing it. An obscure test suite +is one step from an abandoned one. + +### Symptoms + +- Assertion Roulette: multiple assertions with no message string — when one fails, it is + impossible to determine which behavior broke without reading every assertion +- Mystery Guest: test depends on external state (files, database rows, shared fixtures) + that is not visible in the test body +- Test names that do not express the scenario and expected outcome + (e.g., `test1`, `shouldWork`, `testLogin`, `testUserService`) +- General Fixture: an oversized setUp or beforeEach shared by unrelated tests, making + each test's preconditions invisible +- Test body requires reading production code to understand what is being verified + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Assertion Roulette | Meszaros — xUnit Test Patterns | Assertion Roulette (p.224) | +| Mystery Guest | Meszaros — xUnit Test Patterns | Mystery Guest (p.411) | +| General Fixture | Meszaros — xUnit Test Patterns | General Fixture (p.316) | +| Test naming | Osherove — The Art of Unit Testing | method_scenario_expected naming convention | + +### Severity Guide + +- 
🔴 Critical: no test name in the file describes the behavior being tested; all assertions lack messages +- 🟡 Warning: multiple Mystery Guests; several ambiguous test names +- 🟢 Suggestion: minor naming issues; isolated General Fixture + +--- + +## Risk T2: Test Brittleness + +**Diagnostic question:** Do tests break when you refactor without changing behavior? + +Brittle tests punish the act of improving code. When tests fail on every refactor, +developers stop refactoring. The codebase stagnates to protect the test suite — +which is the exact opposite of what tests are for. + +### Symptoms + +- Tests assert on private method results, internal state, or implementation details + rather than observable behavior +- Eager Test: one test method verifies multiple unrelated behaviors; any single change + causes it to fail regardless of which behavior was touched +- Over-specified: assertions enforce mock call order or exact parameter values that are + irrelevant to the behavior being tested +- Renaming or extracting a method causes 5 or more tests to fail even though no behavior changed + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Eager Test | Meszaros — xUnit Test Patterns | Eager Test (p.228) | +| Implementation coupling | Osherove — The Art of Unit Testing | Test isolation principle | +| Orthogonality violation | Hunt & Thomas — The Pragmatic Programmer | Ch.2: Orthogonality | + +### Severity Guide + +- 🔴 Critical: refactoring with no behavior change causes test failures; > 5 tests coupled to a single implementation detail +- 🟡 Warning: Eager Tests common across the suite; moderate implementation-detail assertions +- 🟢 Suggestion: isolated over-specification in non-critical tests + +--- + +## Risk T3: Test Duplication + +**Diagnostic question:** Is the same test scenario expressed in more than one place? + +Test duplication means that when behavior changes, tests must be updated in multiple +places. 
Worse, the duplicated tests create false confidence — the scenario passes in +three places, but none of the three actually tests distinct behavior. + +### Symptoms + +- Test Code Duplication: same setup or assertion logic copy-pasted across multiple tests + without extraction into a shared helper +- Lazy Test: multiple tests verifying identical behavior with no differentiation in input, + state, or expected output +- Same boundary condition tested identically at unit, integration, and E2E level — + three copies with no layer differentiation +- Test helper functions or fixtures duplicated across test files instead of shared + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Test Code Duplication | Meszaros — xUnit Test Patterns | Test Code Duplication (p.213) | +| Lazy Test | Meszaros — xUnit Test Patterns | Lazy Test (p.232) | +| DRY violation in tests | Hunt & Thomas — The Pragmatic Programmer | DRY: Don't Repeat Yourself | + +### Severity Guide + +- 🔴 Critical: core business scenario fully duplicated across all three test layers with no differentiation +- 🟡 Warning: common scenario setup repeated in 5 or more tests without extraction +- 🟢 Suggestion: minor helper duplication; isolated Lazy Tests + +--- + +## Risk T4: Mock Abuse + +**Diagnostic question:** Is the test more complex than the behavior it tests? + +Mock abuse produces tests that pass confidently while verifying nothing real. They create +the illusion of a test suite. The production code can be completely broken as long as +the mocks are set up correctly — and they always are, because the developer wrote both. 
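A minimal sketch of that failure mode, using a hand-rolled spy instead of a mocking library so the example depends on no real framework (all names hypothetical):

```typescript
// Hypothetical illustration of Mock Abuse: a test that verifies only that
// the collaborator was called vs. a test that verifies observable behavior.

type Notifier = { send: (msg: string) => void };

function greet(user: string, notifier: Notifier): string {
  const msg = `Hello, ${user}`;
  notifier.send(msg);
  return msg;
}

// Smell: the only check is that the spy was invoked. If greet() returned
// the wrong greeting, this test would still pass.
function testOnlyVerifiesMock(): boolean {
  const calls: string[] = [];
  greet("Ada", { send: (m) => calls.push(m) });
  return calls.length === 1; // mock-call verification, no behavior check
}

// Better: assert on the output and the message content; the spy is incidental.
function testVerifiesBehavior(): boolean {
  const calls: string[] = [];
  const result = greet("Ada", { send: (m) => calls.push(m) });
  return result === "Hello, Ada" && calls[0] === "Hello, Ada";
}
```

The first test stays green no matter what `greet` returns; only the second one fails when the behavior actually breaks.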
+ +### Symptoms + +- Mock setup code is longer than the test logic itself +- Primary assertion is `expect(mock).toHaveBeenCalledWith(...)` — the test verifies + that a mock was called, not that any real behavior occurred +- Test-only methods added to production classes for lifecycle management in tests +- Single unit test uses more than 3 mocks +- Incomplete Mock: mock object missing fields that downstream code will access, + causing silent failures only visible in integration +- Hard-Coded Test Data: test data has no resemblance to real data shapes or constraints + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Mock count > 3 | Osherove — The Art of Unit Testing | Mock usage guidelines | +| Testing mock behavior | Meszaros — xUnit Test Patterns | Behavior Verification (p.544) | +| Test-only production methods | Feathers — Working Effectively with Legacy Code | Ch.3: Sensing and Separation | +| Hard-Coded Test Data | Meszaros — xUnit Test Patterns | Hard-Coded Test Data (p.534) | +| Incomplete Mock | Osherove — The Art of Unit Testing | Mock completeness requirement | + +### Severity Guide + +- 🔴 Critical: mock setup > 50% of test code; production class has methods only called from tests +- 🟡 Warning: mocks consistently > 3 per test; primary assertions are mock call verifications +- 🟢 Suggestion: isolated Incomplete Mocks; minor Hard-Coded Test Data + +--- + +## Risk T5: Coverage Illusion + +**Diagnostic question:** Does the test suite actually protect against the failures that matter? + +Coverage percentage is a measure of what was executed during tests, not what was verified. +A test suite can achieve 90% line coverage and still allow every critical failure mode through. +The illusion is more dangerous than acknowledged ignorance — teams stop looking for gaps +because the number says they are covered. 
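A minimal sketch of the gap, with a hypothetical `parsePercentage`: a happy-path-only suite executes nearly every line of this function yet never confirms that the guard rejects bad input:

```typescript
// Hypothetical illustration of Coverage Illusion: high line coverage from
// the happy path alone, while the branch that matters stays unverified.

function parsePercentage(input: string): number {
  const n = Number(input);
  if (Number.isNaN(n) || n < 0 || n > 100) {
    throw new RangeError(`not a percentage: ${input}`);
  }
  return n / 100;
}

// A happy-path-only suite executes most of the lines above...
function happyPathOnly(): boolean {
  return parsePercentage("50") === 0.5;
}

// ...but only a sad-path test verifies the guard actually fires.
function sadPathCovered(): boolean {
  try {
    parsePercentage("150");
    return false; // guard silently let a bad value through
  } catch (e) {
    return e instanceof RangeError;
  }
}
```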
+ +### Symptoms + +- High line coverage but error-handling branches, boundary conditions, and exception paths + have no corresponding tests +- Happy-path only: no sad paths, no null/empty/zero inputs, no concurrency edge cases +- Legacy code areas are being actively modified with no tests present + (Feathers: "legacy code is code without tests") +- Coverage percentage treated as a sign-off criterion; critical change paths remain untested +- Tests assert on return values but not on important side effects such as database writes, + event publications, or state transitions + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Legacy code = no tests | Feathers — Working Effectively with Legacy Code | Ch.1: "Legacy code is code without tests" | +| Change coverage vs line coverage | Google — How Google Tests Software | Ch.11: Testing at Google Scale | +| Happy-path only | Osherove — The Art of Unit Testing | Test completeness principle | + +### Severity Guide + +- 🔴 Critical: legacy code area actively being modified with no tests; error-handling paths entirely absent +- 🟡 Warning: coverage > 80% but edge and exception paths are systematically absent +- 🟢 Suggestion: a few non-critical paths missing sad-path tests + +--- + +## Risk T6: Architecture Mismatch + +**Diagnostic question:** Does the test suite structure reflect the system's actual risk profile? + +A test suite with the wrong shape is slow, unreliable, and expensive to maintain — +not because the tests are badly written, but because the wrong test type is being used +for the wrong purpose. An inverted pyramid executes the maximum number of tests at the +level that is slowest, most environment-dependent, and hardest to debug. 
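One recurring cause of mismatch is production code with no seam at which a test can substitute a dependency. A minimal sketch in the spirit of Feathers's Seam Model — hypothetical names, not a prescription for any particular injection style:

```typescript
// Hypothetical illustration of a missing seam vs. an introduced seam.

// Before: no seam — the behavior is welded to the real system clock,
// so the weekend rule cannot be tested in isolation.
function weekendDiscountNoSeam(price: number): number {
  const day = new Date().getDay();
  return day === 0 || day === 6 ? price * 0.9 : price;
}

// After: the clock is injected, creating a seam the test can control.
function weekendDiscount(price: number, now: () => Date): number {
  const day = now().getDay();
  return day === 0 || day === 6 ? price * 0.9 : price;
}
```

With the injected clock, a fast unit test can pin the weekend rule at any time; the no-seam version can only be exercised end to end, which pushes tests up the pyramid.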
+ +### Symptoms + +- Inverted test pyramid: E2E or integration test count exceeds unit test count, + causing a slow and fragile suite +- Legacy code with no seam points: no interfaces, dependency injection, or seams exist, + making it impossible to test in isolation without modifying production code +- Legacy areas being modified have no Characterization Tests to capture current behavior + before changes are made +- Full suite execution time exceeds 10 minutes (indicates architectural problem, + not a performance problem — too many slow tests) +- High-risk and low-risk paths are tested at identical density; + no risk-based prioritization in test distribution + +### Sources + +| Symptom | Book | Principle / Smell | +|---------|------|-------------------| +| Inverted pyramid | Google — How Google Tests Software | 70:20:10 unit:integration:E2E ratio | +| No seam points | Feathers — Working Effectively with Legacy Code | Ch.4: Seam Model | +| Missing Characterization Tests | Feathers — Working Effectively with Legacy Code | Ch.13: Characterization Tests | +| Suite execution time | Meszaros — xUnit Test Patterns | Test suite design principles | + +### Severity Guide + +- 🔴 Critical: legacy code being modified has no seams and no characterization tests; pyramid fully inverted +- 🟡 Warning: suite execution > 10 minutes; integration/E2E count exceeds unit tests +- 🟢 Suggestion: localized pyramid ratio deviation; a few legacy areas missing characterization tests diff --git a/skills/.curated/brooks-lint/references/test-guide.md b/skills/.curated/brooks-lint/references/test-guide.md new file mode 100644 index 00000000..8a7eab40 --- /dev/null +++ b/skills/.curated/brooks-lint/references/test-guide.md @@ -0,0 +1,125 @@ +# Test Quality Review Guide — Mode 4 + +**Purpose:** Diagnose the health of a test suite using six test-space decay risks. +Every finding must follow the Iron Law: Symptom → Source → Consequence → Remedy. 
+ +--- + +## Before You Start: Build the Test Suite Map + +Before scanning for any risk, map the current test suite structure: + +``` +Unit tests: X files, ~N tests +Integration tests: X files, ~N tests +E2E tests: X files, ~N tests +Ratio: Unit X% : Integration X% : E2E X% +Coverage areas: [modules with tests] vs [modules without tests] +``` + +If you cannot access test files directly, ask the user **one question** — choose the +most relevant: +1. "Which module is hardest to test or has the least coverage?" +2. "When you make a change, how often do unrelated tests break?" +3. "Is there a part of the codebase your team avoids touching because it has no tests?" + +After one answer, proceed. Do not ask more than one question. + +--- + +## Analysis Process + +Work through these five steps in order. + +### Step 1: Scan for Test Obscurity + +*Scan this first — the most visible risk and the one that determines whether the suite +is maintainable at all.* + +Look for: +- Read 5–10 test names at random: can each one communicate subject + scenario + expected + outcome without opening the test body? +- Are there tests where a failure gives no clue which behavior broke (multiple assertions, + no message strings)? +- Does any test depend on external state (files, database rows, env variables, shared mutable + fixtures) that is invisible from within the test body? +- Is there a single massive setUp or beforeEach that every test inherits regardless of + what it actually needs? + +If all test names are clear and setups are minimal → no finding. + +### Step 2: Scan for Test Brittleness and Mock Abuse + +*These two risks co-occur: over-mocking produces tests that are both fragile and vacuous. +Scan them together.* + +Look for Test Brittleness: +- Ask (or check git history): did any recent refactor cause test failures with no + behavior change? +- Are there test methods where the name contains "and" or that assert on 3 or more + unrelated behaviors (Eager Test)? 
+- Do assertions specify mock call order or exact parameter values that are irrelevant + to the observable behavior? + +Look for Mock Abuse: +- Sample 3–5 tests: is mock setup longer than the test logic? +- Are the primary assertions `expect(mock).toHaveBeenCalledWith(...)` rather than + assertions on outputs, state, or events? +- Are there methods in production classes that are only called from test files? +- Does any single test create more than 3 mock objects? + +### Step 3: Scan for Test Duplication + +Look for: +- Is the same setup block (same variables initialized the same way) repeated across + 5 or more test files without a shared helper? +- Are there multiple tests that pass identical inputs and assert identical outputs + with no differentiation (Lazy Test)? +- Is the same business scenario covered at unit, integration, and E2E level with no + difference in what each layer is testing? + +If duplication is systemic (10 or more instances) → Critical. +If localized (3–5 instances) → Warning. + +### Step 4: Scan for Coverage Illusion and Architecture Mismatch + +Look for Coverage Illusion: +- Pick the most recently modified core module. Are its error-handling branches and + null/boundary inputs covered by tests? +- Are there legacy areas (old functions, no test files nearby) that are actively + being changed? +- Do the tests assert on side effects (DB writes, events emitted, state transitions) + or only on return values? + +Look for Architecture Mismatch: +- Compare the suite map from the start: is the ratio close to 70% unit / 20% integration / 10% E2E? +- If legacy code is being modified, are there Characterization Tests that captured + behavior before the change? +- Is the full suite execution time known? If > 10 minutes, note as 🟡 Warning. +- Are high-risk modules tested at higher density than trivial utilities? 
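The last Coverage Illusion check above — side effects vs. return values — can be sketched with hypothetical names (the array stands in for a database table or event bus):

```typescript
// Hypothetical illustration: a test that stops at the return value misses
// the write the rest of the system relies on.

const auditLog: string[] = []; // stand-in for a DB table or event bus

function deactivateUser(id: string): boolean {
  auditLog.push(`deactivated:${id}`); // the side effect that matters
  return true;
}

// Weak: only the return value is checked; a dropped audit write still passes.
function returnValueOnly(): boolean {
  return deactivateUser("u1") === true;
}

// Stronger: the side effect is asserted as well.
function sideEffectChecked(): boolean {
  const before = auditLog.length;
  const ok = deactivateUser("u2");
  return ok
    && auditLog.length === before + 1
    && auditLog[auditLog.length - 1] === "deactivated:u2";
}
```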
+ +### Step 5: Apply Iron Law, Output Report + +For every finding identified above, write it in this format: + +``` +**[Test Risk Name] — [Short title]** +Symptom: [the exact thing observed in the test files — quote file names or patterns] +Source: [Book title — Smell or Principle name] +Consequence: [what happens to the test suite if this is not addressed] +Remedy: [concrete, specific action] +``` + +Do not write a finding you cannot complete. If you can identify a symptom but cannot +state a consequence, re-read `test-decay-risks.md` for that risk before writing the finding. + +--- + +## Output + +Use the standard Report Template from `SKILL.md`. +Mode: Test Quality Review +Scope: the test files or directory reviewed. + +Include the Test Suite Map as a code block before the Findings section, +labeled "Test Suite Map".