Conversation
Summary by CodeRabbit
WalkthroughUpdated YAML rules across scan and vuln modules to refine HTTP response selection: added content-based regex exclusions, changed some header/content checks to inverted-match semantics, and tightened takeover gating by switching a top-level condition from Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nettacker/modules/scan/admin.yaml`:
- Around line 47-49: The current admin finder condition (content.regex and
reverse: true used under condition_type: and) is too broad and will drop
legitimate admin pages; update content.regex to remove generic phrases like
"Access Denied" and "Error 403 Forbidden" and replace "Webroot" with specific
WAF/CDN signatures (e.g., Akamai, AWS WAF, F5 BIG-IP, Imperva) and anchored
patterns that indicate block pages (examples: "Attention Required! \\|
Cloudflare", "Incapsula incident ID", "Sucuri WebSite Firewall", or "Access
Denied.*reference\\s*#"); keep reverse: true but ensure the regex matches only
clear block-page markers rather than common auth/403 text so protected admin
endpoints aren’t filtered out.
In `@nettacker/modules/vuln/clickjacking.yaml`:
- Around line 37-46: The clickjacking rule should treat absent
X-Frame-Options/Content-Security-Policy headers as a positive finding and the
CSP content check must look for frame-ancestors, so update the rule and
referenced logic: in nettacker/modules/vuln/clickjacking.yaml change the
header/grouping so missing headers don't short-circuit an AND — e.g., make the
header checks an OR group (so absence of either header counts as vulnerable) or
mark each header sub-condition to treat missing key as a match; also replace the
content regex that currently searches for DENY|SAMEORIGIN inside a CSP meta tag
with a pattern that matches real CSP frame-ancestors directives (e.g., look for
"frame-ancestors" followed by 'none', 'self', or host sources), and ensure the
rule's reverse semantics still invert correctly; refer to the
response_conditions_matched function in nettacker/core/lib/http.py when
adjusting how absent headers are evaluated so the YAML behavior aligns with that
function's expectations.
In `@nettacker/modules/vuln/subdomain_takeover.yaml`:
- Around line 36-43: The top-level condition block using condition_type: and is
blocking 403-based takeover detections (e.g., the AWS Bucket Takeover) because
status_code reverse: "403" and broad content regex suppress
iterative_response_match; fix by either (a) removing 403 from the top-level
status_code exclusion and tightening the content regex to only vendor WAF
signatures (e.g., keep Cloudflare|Incapsula|Sucuri and drop generic "Access
Denied"/"Error 403 Forbidden"), or (b) move the WAF/content exclusions into each
provider's iterative_response_match entries so they don't globally block 403
detections; update the condition_type/status_code/content settings accordingly
to ensure the AWS Bucket Takeover sub-rule and iterative_response_match can run
for 403 responses.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 80722f88-6614-4568-915c-4781da0da234
📒 Files selected for processing (4)
nettacker/modules/scan/admin.yamlnettacker/modules/scan/dir.yamlnettacker/modules/vuln/clickjacking.yamlnettacker/modules/vuln/subdomain_takeover.yaml
| content: | ||
| regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden) | ||
| reverse: true |
There was a problem hiding this comment.
Overly generic WAF regex risks filtering out legitimate admin endpoints.
With condition_type: and, this condition (with reverse: true) causes the admin finder to silently drop any response whose body matches these strings. Two concerns worth reconsidering:
Access DeniedandError 403 Forbiddenare generic phrases that appear on many real admin/authentication pages (e.g., Tomcat Manager, Jenkins, custom admin panels after failed auth). Since the scan also treats403as a hit, many legitimate protected admin endpoints will now be silently hidden.Webrootis not a common WAF/CDN block-page marker (Webroot is primarily an endpoint AV vendor). Please confirm this was intentional, or replace with more typical WAF signatures (e.g.,Akamai,AWS WAF,F5 BIG-IP,ModSecurity,Imperva).
Consider anchoring patterns to phrases that strongly indicate a block page rather than substrings that appear in legitimate responses (e.g., Attention Required! \| Cloudflare, Incapsula incident ID, Sucuri WebSite Firewall, Access Denied.*reference\s*#).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nettacker/modules/scan/admin.yaml` around lines 47 - 49, The current admin
finder condition (content.regex and reverse: true used under condition_type:
and) is too broad and will drop legitimate admin pages; update content.regex to
remove generic phrases like "Access Denied" and "Error 403 Forbidden" and
replace "Webroot" with specific WAF/CDN signatures (e.g., Akamai, AWS WAF, F5
BIG-IP, Imperva) and anchored patterns that indicate block pages (examples:
"Attention Required! \\| Cloudflare", "Incapsula incident ID", "Sucuri WebSite
Firewall", or "Access Denied.*reference\\s*#"); keep reverse: true but ensure
the regex matches only clear block-page markers rather than common auth/403 text
so protected admin endpoints aren’t filtered out.
| headers: | ||
| x-frame-options: | ||
| regex: ^((?!SAMEORIGIN|DENY).)+$ | ||
| reverse: false | ||
| regex: SAMEORIGIN|DENY | ||
| reverse: true | ||
| Content-Security-Policy: | ||
| regex: ^((?!frame-ancestors|frame-src).)+$ | ||
| reverse: false | ||
| regex: frame-ancestors|frame-src | ||
| reverse: true | ||
| content: | ||
| regex: ^((?!http-equiv=.*Content-Security-Policy.*content=.*(DENY|SAMEORIGIN)).)+$ | ||
| reverse: false | ||
| regex: http-equiv=.*Content-Security-Policy.*content=.*(DENY|SAMEORIGIN) | ||
| reverse: true |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm the except-TypeError path in response_conditions_matched and how missing headers are treated.
rg -nP -C8 'def response_conditions_matched' nettacker/core/lib/http.pyRepository: OWASP/Nettacker
Length of output: 803
🏁 Script executed:
sed -n '45,100p' nettacker/core/lib/http.pyRepository: OWASP/Nettacker
Length of output: 2710
🏁 Script executed:
sed -n '45,130p' nettacker/core/lib/http.pyRepository: OWASP/Nettacker
Length of output: 3975
🏁 Script executed:
sed -n '37,46p' nettacker/modules/vuln/clickjacking.yamlRepository: OWASP/Nettacker
Length of output: 433
Two issues with the inverted clickjacking matching.
-
Missing headers fail the AND, so the most common vulnerability case isn't detected. In
nettacker/core/lib/http.py(response_conditions_matched, lines 95–96), when any header key is absent, the lookup falls back toFalse,re.findallraisesTypeError, and theexceptbranch (line 71–72) setscondition_results["headers"][header] = []. Undercondition_type: and, if any[]appears in header results, the entire function returns{}. That means a site shipping noX-Frame-Optionsand noContent-Security-Policy— the canonical clickjacking-vulnerable configuration — will not be reported. The rule will only fire when both headers are present but misconfigured, which is a very narrow subset. Restructure the condition (e.g., usingorat this level, or per-header OR groups) so that "header absent" contributes to a positive finding. -
The content regex targets X-Frame-Options tokens inside a CSP meta tag, which don't exist in real code.
DENYandSAMEORIGINare X-Frame-Options directives; CSP usesframe-ancestors 'none' | 'self' | …. Meta tags declaring X-Frame-Options per spec are not valid. The regexhttp-equiv=.*Content-Security-Policy.*content=.*(DENY|SAMEORIGIN)will never match real markup. Combined withreverse: true, this sub-condition is effectively always true (the regex never matches), making it a no-op rather than a meaningful check. Update to search for actual CSP frame-ancestors syntax instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nettacker/modules/vuln/clickjacking.yaml` around lines 37 - 46, The
clickjacking rule should treat absent X-Frame-Options/Content-Security-Policy
headers as a positive finding and the CSP content check must look for
frame-ancestors, so update the rule and referenced logic: in
nettacker/modules/vuln/clickjacking.yaml change the header/grouping so missing
headers don't short-circuit an AND — e.g., make the header checks an OR group
(so absence of either header counts as vulnerable) or mark each header
sub-condition to treat missing key as a match; also replace the content regex
that currently searches for DENY|SAMEORIGIN inside a CSP meta tag with a pattern
that matches real CSP frame-ancestors directives (e.g., look for
"frame-ancestors" followed by 'none', 'self', or host sources), and ensure the
rule's reverse semantics still invert correctly; refer to the
response_conditions_matched function in nettacker/core/lib/http.py when
adjusting how absent headers are evaluated so the YAML behavior aligns with that
function's expectations.
| condition_type: and | ||
| conditions: | ||
| status_code: | ||
| regex: "403" | ||
| reverse: true | ||
| content: | ||
| regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden) | ||
| reverse: true |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Confirm iterative_response_match handling path and any pre-filtering contract.
rg -nP -C5 'iterative_response_match'Repository: OWASP/Nettacker
Length of output: 10968
🏁 Script executed:
#!/bin/bash
# Read the subdomain_takeover.yaml file around the relevant section
head -50 nettacker/modules/vuln/subdomain_takeover.yamlRepository: OWASP/Nettacker
Length of output: 1969
🏁 Script executed:
#!/bin/bash
# Search for AWS bucket takeover to understand if 403 is a legitimate concern
rg -A 20 -B 5 'AWS Bucket Takeover|AWS S3' nettacker/modules/vuln/subdomain_takeover.yamlRepository: OWASP/Nettacker
Length of output: 1182
🏁 Script executed:
#!/bin/bash
# Check the response_conditions_matched function to understand what fields it evaluates
rg -A 30 'def response_conditions_matched' nettacker/core/lib/http.pyRepository: OWASP/Nettacker
Length of output: 1613
Top-level and condition blocks AWS S3 and other 403-returning takeover detections.
With condition_type: and, these filters become a hard pre-gate: iterative_response_match runs only when status ≠ 403 and content lacks the listed terms. This causes two regressions:
- AWS S3 (and other cloud services) legitimately return 403 with "The specified bucket does not exist". The
AWS Bucket Takeoversub-rule can never fire under this logic—a direct behavioral regression. - Generic error phrases in the content regex (
Access Denied,Error 403 Forbidden) will suppress valid takeovers that happen to include these strings.
The iterative_response_match section does have a separate evaluation path (confirmed in nettacker/core/lib/http.py lines 205–211), but it is gated by lines 198–201: only runs if the top-level condition passes or condition_type == "or". With condition_type: and, a failed pre-filter blocks it entirely.
Recommended fixes:
- (a) Remove
403from the status-code exclusion; tighten the content regex to concrete WAF signatures only (Cloudflare|Incapsula|Sucuriwithout the genericAccess Denied/Error 403 Forbidden), or - (b) Move these WAF exclusions inside each provider's
iterative_response_matchentry so they don't blanket-suppress 403-based detections.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nettacker/modules/vuln/subdomain_takeover.yaml` around lines 36 - 43, The
top-level condition block using condition_type: and is blocking 403-based
takeover detections (e.g., the AWS Bucket Takeover) because status_code reverse:
"403" and broad content regex suppress iterative_response_match; fix by either
(a) removing 403 from the top-level status_code exclusion and tightening the
content regex to only vendor WAF signatures (e.g., keep
Cloudflare|Incapsula|Sucuri and drop generic "Access Denied"/"Error 403
Forbidden"), or (b) move the WAF/content exclusions into each provider's
iterative_response_match entries so they don't globally block 403 detections;
update the condition_type/status_code/content settings accordingly to ensure the
AWS Bucket Takeover sub-rule and iterative_response_match can run for 403
responses.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@nettacker/modules/scan/dir.yaml`:
- Around line 47-49: The negative content regex under the content block (keys:
regex, reverse) is overly broad and will drop legitimate hits; tighten it by
removing generic terms like "Access Denied", "Error 403 Forbidden", "Webroot"
and vendor names that can appear in benign pages, and replace them with
anchored, block-page specific patterns (e.g., phrases like "Attention Required!
| Cloudflare", "Incapsula incident", "Sucuri WebSite Firewall", or "Access
Denied.*reference\\s*#"); keep reverse: true behavior but update the regex to
only match strong WAF/blockpage signatures so that 200/401/403 directory hits
(condition_type: and) are not suppressed erroneously.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 75e76299-5d46-4a54-bbd3-360ce925747b
📒 Files selected for processing (4)
nettacker/modules/scan/admin.yamlnettacker/modules/scan/dir.yamlnettacker/modules/vuln/clickjacking.yamlnettacker/modules/vuln/subdomain_takeover.yaml
| content: | ||
| regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden) | ||
| reverse: true |
There was a problem hiding this comment.
Overly broad negative content regex may hide legitimate directory hits.
With condition_type: and and reverse: true, any 200/401/403 response whose body merely contains strings like Access Denied, Error 403 Forbidden, or Webroot will be silently dropped. Several concerns:
Access Denied/Error 403 Forbiddenare extremely common on legitimate protected directories (Apache/Nginx default 403 pages, IIS, Tomcat, Jenkins, many admin panels). Since the scan is explicitly configured to treat403as a hit (line 45), this filter will suppress the very signal the scan is looking for.Webrootis an endpoint AV vendor, not a common CDN/WAF block-page marker; the word also appears in plenty of legitimate web-server documentation/error pages. Likely unintended — consider dropping it or replacing with real WAF signatures (Akamai,AWS WAF,F5 BIG-IP,ModSecurity,Imperva).- The vendor names
Cloudflare|Incapsula|Sucurimatch any page that merely references them (footers, docs, analytics), not just block pages.
Prefer anchoring to strong block-page phrases, e.g. Attention Required! \| Cloudflare, Incapsula incident ID, Sucuri WebSite Firewall, Access Denied.*reference\s*#.
🔧 Suggested tightening
content:
- regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden)
+ regex: (?i)(Attention Required!\s*\|\s*Cloudflare|Incapsula incident ID|Sucuri WebSite Firewall|Access Denied.*reference\s*#)
reverse: true📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| content: | |
| regex: (?i)(Cloudflare|Incapsula|Sucuri|Access Denied|Webroot|Error 403 Forbidden) | |
| reverse: true | |
| content: | |
| regex: (?i)(Attention Required!\s*\|\s*Cloudflare|Incapsula incident ID|Sucuri WebSite Firewall|Access Denied.*reference\s*#) | |
| reverse: true |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@nettacker/modules/scan/dir.yaml` around lines 47 - 49, The negative content
regex under the content block (keys: regex, reverse) is overly broad and will
drop legitimate hits; tighten it by removing generic terms like "Access Denied",
"Error 403 Forbidden", "Webroot" and vendor names that can appear in benign
pages, and replace them with anchored, block-page specific patterns (e.g.,
phrases like "Attention Required! | Cloudflare", "Incapsula incident", "Sucuri
WebSite Firewall", or "Access Denied.*reference\\s*#"); keep reverse: true
behavior but update the regex to only match strong WAF/blockpage signatures so
that 200/401/403 directory hits (condition_type: and) are not suppressed
erroneously.
This PR significantly improves detection accuracy and reduces false positives for several core modules by implementing robust negative matching and WAF filtering logic for #1521.
Key Changes:
clickjacking.yamlto utilize Nettacker's nativereverse: truefunctionality. This removes complex, computationally expensive negative lookahead regex patterns and improves maintainability.admin.yamlanddir.yaml. This prevents these modules from incorrectly reporting a "Found" directory when a Web Application Firewall (like Cloudflare, Incapsula, or Sucuri) returns a generic403 Forbiddenblock page.subdomain_takeover.yamlwith global negative conditions. The module now ignores403status codes and specific WAF-related content strings, ensuring that blocked requests do not trigger unintended takeover signatures.These updates help ensure that Nettacker provides more reliable results when scanning modern infrastructure protected by security proxies.
Fixes # (link your issue here)
Type of change
Checklist
make pre-commitand confirm it didn't generate any warnings/changesmake testand I confirm all tests passed locallydocs/folder