Fix negative matching #1521 by lekanjava · Pull Request #1522 · OWASP/Nettacker

lekanjava · 2026-04-19T00:06:18Z

This PR significantly improves detection accuracy and reduces false positives for several core modules by implementing robust negative matching and WAF filtering logic for #1521.

Key Changes:

Optimized Clickjacking Logic: Refactored clickjacking.yaml to utilize Nettacker's native reverse: true functionality. This removes complex, computationally expensive negative lookahead regex patterns and improves maintainability.
WAF False Positive Suppression: Added negative content matching to admin.yaml and dir.yaml. This prevents these modules from incorrectly reporting a "Found" directory when a Web Application Firewall (like Cloudflare, Incapsula, or Sucuri) returns a generic 403 Forbidden block page.
Enhanced Takeover Accuracy: Updated subdomain_takeover.yaml with global negative conditions. The module now ignores 403 status codes and specific WAF-related content strings, ensuring that blocked requests do not trigger unintended takeover signatures.
These updates help ensure that Nettacker provides more reliable results when scanning modern infrastructure protected by security proxies.
Fixes # (link your issue here)

Type of change

New core framework functionality
Bugfix (non-breaking change that fixes an issue)
Code refactoring without any functionality changes
New or existing module/payload change
Documentation/localization improvement
Test coverage improvement
Dependency upgrade
Other improvement (best practice, cleanup, optimization, etc)

Checklist

…uce WAF false positives

coderabbitai · 2026-04-19T00:06:31Z

Summary by CodeRabbit

Improvements
- Enhanced admin-panel response filtering to exclude common error/access-denied responses.
- Refined directory scanning to better ignore WAF/edge-provider blocks and error pages.
- Improved clickjacking checks to more reliably detect protective headers and content.
- Strengthened subdomain-takeover detection by requiring stricter combined response criteria before further verification.

Walkthrough

Updated YAML rules across scan and vuln modules to refine HTTP response selection: added content-based regex exclusions, changed some header/content checks to inverted-match semantics, and tightened takeover gating by switching a top-level condition from or to and.

Changes

Cohort / File(s)	Summary
Scan modules `nettacker/modules/scan/admin.yaml`, `nettacker/modules/scan/dir.yaml`	Added `content` response condition with case-insensitive regexes covering CDN/WAF/vendor/error phrases and `reverse: true` to exclude matching response bodies from scan selection.
Clickjacking detection `nettacker/modules/vuln/clickjacking.yaml`	Reworked header/content checks: moved from negative-lookahead patterns to direct detection patterns and set `reverse: true`, inverting pass/fail logic for `X-Frame-Options`, CSP, and related content checks.
Subdomain takeover `nettacker/modules/vuln/subdomain_takeover.yaml`	Changed top-level response gating from `condition_type: or` to `condition_type: and`, requiring both non-403 status and non-matching provider/error content before provider-specific takeover matching.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

arkid15r

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Fix negative matching `#1521`' directly addresses the primary objective of the PR, which is to implement robust negative matching logic across multiple modules.
Description check	✅ Passed	The description is well-related to the changeset, clearly explaining the negative matching improvements, WAF false positive suppression, and specific module updates with concrete examples.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nettacker/modules/scan/admin.yaml`:
- Around line 47-49: The current admin finder condition (content.regex and
reverse: true used under condition_type: and) is too broad and will drop
legitimate admin pages; update content.regex to remove generic phrases like
"Access Denied" and "Error 403 Forbidden" and replace "Webroot" with specific
WAF/CDN signatures (e.g., Akamai, AWS WAF, F5 BIG-IP, Imperva) and anchored
patterns that indicate block pages (examples: "Attention Required! \\|
Cloudflare", "Incapsula incident ID", "Sucuri WebSite Firewall", or "Access
Denied.*reference\\s*#"); keep reverse: true but ensure the regex matches only
clear block-page markers rather than common auth/403 text so protected admin
endpoints aren’t filtered out.

In `@nettacker/modules/vuln/clickjacking.yaml`:
- Around line 37-46: The clickjacking rule should treat absent
X-Frame-Options/Content-Security-Policy headers as a positive finding and the
CSP content check must look for frame-ancestors, so update the rule and
referenced logic: in nettacker/modules/vuln/clickjacking.yaml change the
header/grouping so missing headers don't short-circuit an AND — e.g., make the
header checks an OR group (so absence of either header counts as vulnerable) or
mark each header sub-condition to treat missing key as a match; also replace the
content regex that currently searches for DENY|SAMEORIGIN inside a CSP meta tag
with a pattern that matches real CSP frame-ancestors directives (e.g., look for
"frame-ancestors" followed by 'none', 'self', or host sources), and ensure the
rule's reverse semantics still invert correctly; refer to the
response_conditions_matched function in nettacker/core/lib/http.py when
adjusting how absent headers are evaluated so the YAML behavior aligns with that
function's expectations.

In `@nettacker/modules/vuln/subdomain_takeover.yaml`:
- Around line 36-43: The top-level condition block using condition_type: and is
blocking 403-based takeover detections (e.g., the AWS Bucket Takeover) because
status_code reverse: "403" and broad content regex suppress
iterative_response_match; fix by either (a) removing 403 from the top-level
status_code exclusion and tightening the content regex to only vendor WAF
signatures (e.g., keep Cloudflare|Incapsula|Sucuri and drop generic "Access
Denied"/"Error 403 Forbidden"), or (b) move the WAF/content exclusions into each
provider's iterative_response_match entries so they don't globally block 403
detections; update the condition_type/status_code/content settings accordingly
to ensure the AWS Bucket Takeover sub-rule and iterative_response_match can run
for 403 responses.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 80722f88-6614-4568-915c-4781da0da234

📥 Commits

Reviewing files that changed from the base of the PR and between 50c0be2 and 32e3640.

📒 Files selected for processing (4)

nettacker/modules/scan/admin.yaml
nettacker/modules/scan/dir.yaml
nettacker/modules/vuln/clickjacking.yaml
nettacker/modules/vuln/subdomain_takeover.yaml

coderabbitai · 2026-04-19T00:11:27Z

+            content:
+              regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden)
+              reverse: true


⚠️ Potential issue | 🟡 Minor

Overly generic WAF regex risks filtering out legitimate admin endpoints.

With condition_type: and, this condition (with reverse: true) causes the admin finder to silently drop any response whose body matches these strings. Two concerns worth reconsidering:

Access Denied and Error 403 Forbidden are generic phrases that appear on many real admin/authentication pages (e.g., Tomcat Manager, Jenkins, custom admin panels after failed auth). Since the scan also treats 403 as a hit, many legitimate protected admin endpoints will now be silently hidden.

Webroot is not a common WAF/CDN block-page marker (Webroot is primarily an endpoint AV vendor). Please confirm this was intentional, or replace with more typical WAF signatures (e.g., Akamai, AWS WAF, F5 BIG-IP, ModSecurity, Imperva).

Consider anchoring patterns to phrases that strongly indicate a block page rather than substrings that appear in legitimate responses (e.g., Attention Required! \| Cloudflare, Incapsula incident ID, Sucuri WebSite Firewall, Access Denied.*reference\s*#).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@nettacker/modules/scan/admin.yaml` around lines 47 - 49, The current admin finder condition (content.regex and reverse: true used under condition_type: and) is too broad and will drop legitimate admin pages; update content.regex to remove generic phrases like "Access Denied" and "Error 403 Forbidden" and replace "Webroot" with specific WAF/CDN signatures (e.g., Akamai, AWS WAF, F5 BIG-IP, Imperva) and anchored patterns that indicate block pages (examples: "Attention Required! \\| Cloudflare", "Incapsula incident ID", "Sucuri WebSite Firewall", or "Access Denied.*reference\\s*#"); keep reverse: true but ensure the regex matches only clear block-page markers rather than common auth/403 text so protected admin endpoints aren’t filtered out.

coderabbitai · 2026-04-19T00:11:27Z

            headers:
              x-frame-options:
-                regex: ^((?!SAMEORIGIN|DENY).)+$
-                reverse: false
+                regex: SAMEORIGIN|DENY
+                reverse: true
              Content-Security-Policy:
-                regex: ^((?!frame-ancestors|frame-src).)+$
-                reverse: false
+                regex: frame-ancestors|frame-src
+                reverse: true
            content:
-              regex: ^((?!http-equiv=.*Content-Security-Policy.*content=.*(DENY|SAMEORIGIN)).)+$
-              reverse: false
+              regex: http-equiv=.*Content-Security-Policy.*content=.*(DENY|SAMEORIGIN)
+              reverse: true


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Confirm the except-TypeError path in response_conditions_matched and how missing headers are treated. rg -nP -C8 'def response_conditions_matched' nettacker/core/lib/http.py

Repository: OWASP/Nettacker

Length of output: 803

🏁 Script executed:

sed -n '45,100p' nettacker/core/lib/http.py

Repository: OWASP/Nettacker

Length of output: 2710

🏁 Script executed:

sed -n '45,130p' nettacker/core/lib/http.py

Repository: OWASP/Nettacker

Length of output: 3975

🏁 Script executed:

sed -n '37,46p' nettacker/modules/vuln/clickjacking.yaml

Repository: OWASP/Nettacker

Length of output: 433

Two issues with the inverted clickjacking matching.

Missing headers fail the AND, so the most common vulnerability case isn't detected. In nettacker/core/lib/http.py (response_conditions_matched, lines 95–96), when any header key is absent, the lookup falls back to False, re.findall raises TypeError, and the except branch (line 71–72) sets condition_results["headers"][header] = []. Under condition_type: and, if any [] appears in header results, the entire function returns {}. That means a site shipping no X-Frame-Options and no Content-Security-Policy — the canonical clickjacking-vulnerable configuration — will not be reported. The rule will only fire when both headers are present but misconfigured, which is a very narrow subset. Restructure the condition (e.g., using or at this level, or per-header OR groups) so that "header absent" contributes to a positive finding.

The content regex targets X-Frame-Options tokens inside a CSP meta tag, which don't exist in real code. DENY and SAMEORIGIN are X-Frame-Options directives; CSP uses frame-ancestors 'none' | 'self' | …. Meta tags declaring X-Frame-Options per spec are not valid. The regex http-equiv=.*Content-Security-Policy.*content=.*(DENY|SAMEORIGIN) will never match real markup. Combined with reverse: true, this sub-condition is effectively always true (the regex never matches), making it a no-op rather than a meaningful check. Update to search for actual CSP frame-ancestors syntax instead.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@nettacker/modules/vuln/clickjacking.yaml` around lines 37 - 46, The clickjacking rule should treat absent X-Frame-Options/Content-Security-Policy headers as a positive finding and the CSP content check must look for frame-ancestors, so update the rule and referenced logic: in nettacker/modules/vuln/clickjacking.yaml change the header/grouping so missing headers don't short-circuit an AND — e.g., make the header checks an OR group (so absence of either header counts as vulnerable) or mark each header sub-condition to treat missing key as a match; also replace the content regex that currently searches for DENY|SAMEORIGIN inside a CSP meta tag with a pattern that matches real CSP frame-ancestors directives (e.g., look for "frame-ancestors" followed by 'none', 'self', or host sources), and ensure the rule's reverse semantics still invert correctly; refer to the response_conditions_matched function in nettacker/core/lib/http.py when adjusting how absent headers are evaluated so the YAML behavior aligns with that function's expectations.

coderabbitai · 2026-04-19T00:11:27Z

+          condition_type: and
          conditions:
+            status_code:
+              regex: "403"
+              reverse: true
+            content:
+              regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden)
+              reverse: true


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Confirm iterative_response_match handling path and any pre-filtering contract. rg -nP -C5 'iterative_response_match'

Repository: OWASP/Nettacker

Length of output: 10968

🏁 Script executed:

#!/bin/bash # Read the subdomain_takeover.yaml file around the relevant section head -50 nettacker/modules/vuln/subdomain_takeover.yaml

Repository: OWASP/Nettacker

Length of output: 1969

🏁 Script executed:

#!/bin/bash # Search for AWS bucket takeover to understand if 403 is a legitimate concern rg -A 20 -B 5 'AWS Bucket Takeover|AWS S3' nettacker/modules/vuln/subdomain_takeover.yaml

Repository: OWASP/Nettacker

Length of output: 1182

🏁 Script executed:

#!/bin/bash # Check the response_conditions_matched function to understand what fields it evaluates rg -A 30 'def response_conditions_matched' nettacker/core/lib/http.py

Repository: OWASP/Nettacker

Length of output: 1613

Top-level and condition blocks AWS S3 and other 403-returning takeover detections.

With condition_type: and, these filters become a hard pre-gate: iterative_response_match runs only when status ≠ 403 and content lacks the listed terms. This causes two regressions:

AWS S3 (and other cloud services) legitimately return 403 with "The specified bucket does not exist". The AWS Bucket Takeover sub-rule can never fire under this logic—a direct behavioral regression.

Generic error phrases in the content regex (Access Denied, Error 403 Forbidden) will suppress valid takeovers that happen to include these strings.

The iterative_response_match section does have a separate evaluation path (confirmed in nettacker/core/lib/http.py lines 205–211), but it is gated by lines 198–201: only runs if the top-level condition passes or condition_type == "or". With condition_type: and, a failed pre-filter blocks it entirely.

Recommended fixes:

(a) Remove 403 from the status-code exclusion; tighten the content regex to concrete WAF signatures only (Cloudflare|Incapsula|Sucuri without the generic Access Denied / Error 403 Forbidden), or

(b) Move these WAF exclusions inside each provider's iterative_response_match entry so they don't blanket-suppress 403-based detections.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@nettacker/modules/vuln/subdomain_takeover.yaml` around lines 36 - 43, The top-level condition block using condition_type: and is blocking 403-based takeover detections (e.g., the AWS Bucket Takeover) because status_code reverse: "403" and broad content regex suppress iterative_response_match; fix by either (a) removing 403 from the top-level status_code exclusion and tightening the content regex to only vendor WAF signatures (e.g., keep Cloudflare|Incapsula|Sucuri and drop generic "Access Denied"/"Error 403 Forbidden"), or (b) move the WAF/content exclusions into each provider's iterative_response_match entries so they don't globally block 403 detections; update the condition_type/status_code/content settings accordingly to ensure the AWS Bucket Takeover sub-rule and iterative_response_match can run for 403 responses.

securestep9 · 2026-04-19T00:35:11Z

@coderabbitai full review

coderabbitai · 2026-04-19T00:35:16Z

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nettacker/modules/scan/dir.yaml`:
- Around line 47-49: The negative content regex under the content block (keys:
regex, reverse) is overly broad and will drop legitimate hits; tighten it by
removing generic terms like "Access Denied", "Error 403 Forbidden", "Webroot"
and vendor names that can appear in benign pages, and replace them with
anchored, block-page specific patterns (e.g., phrases like "Attention Required!
| Cloudflare", "Incapsula incident", "Sucuri WebSite Firewall", or "Access
Denied.*reference\\s*#"); keep reverse: true behavior but update the regex to
only match strong WAF/blockpage signatures so that 200/401/403 directory hits
(condition_type: and) are not suppressed erroneously.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 75e76299-5d46-4a54-bbd3-360ce925747b

📥 Commits

Reviewing files that changed from the base of the PR and between 50c0be2 and 32e3640.

📒 Files selected for processing (4)

nettacker/modules/scan/admin.yaml
nettacker/modules/scan/dir.yaml
nettacker/modules/vuln/clickjacking.yaml
nettacker/modules/vuln/subdomain_takeover.yaml

coderabbitai · 2026-04-19T00:37:31Z

+            content:
+              regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden)
+              reverse: true


⚠️ Potential issue | 🟠 Major

Overly broad negative content regex may hide legitimate directory hits.

With condition_type: and and reverse: true, any 200/401/403 response whose body merely contains strings like Access Denied, Error 403 Forbidden, or Webroot will be silently dropped. Several concerns:

Access Denied / Error 403 Forbidden are extremely common on legitimate protected directories (Apache/Nginx default 403 pages, IIS, Tomcat, Jenkins, many admin panels). Since the scan is explicitly configured to treat 403 as a hit (line 45), this filter will suppress the very signal the scan is looking for.

Webroot is an endpoint AV vendor, not a common CDN/WAF block-page marker; the word also appears in plenty of legitimate web-server documentation/error pages. Likely unintended — consider dropping it or replacing with real WAF signatures (Akamai, AWS WAF, F5 BIG-IP, ModSecurity, Imperva).

The vendor names Cloudflare|Incapsula|Sucuri match any page that merely references them (footers, docs, analytics), not just block pages.

Prefer anchoring to strong block-page phrases, e.g. Attention Required! \| Cloudflare, Incapsula incident ID, Sucuri WebSite Firewall, Access Denied.*reference\s*#.

🔧 Suggested tightening

content: - regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden) + regex: (?i)(Attention Required!\s*\|\s*Cloudflare|Incapsula incident ID|Sucuri WebSite Firewall|Access Denied.*reference\s*#) reverse: true

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

content:

regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden)

reverse: true

content:

regex: (?i)(Attention Required!\s*\|\s*Cloudflare|Incapsula incident ID|Sucuri WebSite Firewall|Access Denied.*reference\s*#)

reverse: true

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@nettacker/modules/scan/dir.yaml` around lines 47 - 49, The negative content regex under the content block (keys: regex, reverse) is overly broad and will drop legitimate hits; tighten it by removing generic terms like "Access Denied", "Error 403 Forbidden", "Webroot" and vendor names that can appear in benign pages, and replace them with anchored, block-page specific patterns (e.g., phrases like "Attention Required! | Cloudflare", "Incapsula incident", "Sucuri WebSite Firewall", or "Access Denied.*reference\\s*#"); keep reverse: true behavior but update the regex to only match strong WAF/blockpage signatures so that 200/401/403 directory hits (condition_type: and) are not suppressed erroneously.

lekanjava added 2 commits April 19, 2026 00:57

Fix: Improve detection accuracy and properly handle negative matches

7f1d1c5

Fix: Add global negative matching to subdomain_takeover module to red…

32e3640

…uce WAF false positives

lekanjava requested review from arkid15r and securestep9 as code owners April 19, 2026 00:06

coderabbitai bot reviewed Apr 19, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix negative matching #1521#1522

Fix negative matching #1521#1522
lekanjava wants to merge 2 commits intoOWASP:masterfrom
lekanjava:fix-negative-matching

lekanjava commented Apr 19, 2026

Uh oh!

coderabbitai bot commented Apr 19, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 19, 2026

Uh oh!

coderabbitai bot Apr 19, 2026

Uh oh!

coderabbitai bot Apr 19, 2026

Uh oh!

securestep9 commented Apr 19, 2026

Uh oh!

coderabbitai bot commented Apr 19, 2026

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

lekanjava commented Apr 19, 2026

Key Changes:

Type of change

Checklist

Uh oh!

coderabbitai bot commented Apr 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 19, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 19, 2026

Choose a reason for hiding this comment

Uh oh!

securestep9 commented Apr 19, 2026

Uh oh!

coderabbitai bot commented Apr 19, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 19, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Apr 19, 2026 •

edited

Loading