VPR-141 feat(healthchecks): add /health endpoints and UI dashboard #159
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #159 +/- ##
==========================================
- Coverage 43.24% 42.89% -0.35%
==========================================
Files 862 873 +11
Lines 50405 51007 +602
Branches 4706 4759 +53
==========================================
+ Hits 21796 21879 +83
- Misses 28086 28605 +519
Partials 523 523
Pull request overview
Adds operational health checking to the web app: anonymous liveness, IP-gated readiness details, and an internal HealthChecks.UI dashboard with UC Davis branding and reduced external probe traffic via adaptive polling.
Changes:
- Introduces `/health` and `/health/detail` endpoints plus the `/healthchecks` UI, including IP allowlisting and CSP bypass for the UI bundle.
- Adds multiple new health checks (DB contexts, disk space, LDAP, SMTP, CAS/VMACs HTTP probes, AWS SSM) and an adaptive polling decorator to reduce probe frequency when healthy.
- Updates Jenkins deploy stages to poll `/health` after deploy; adds UI branding assets and a small injected JS enhancer.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| web/wwwroot/js/healthchecks-ui-extras.js | Injected UI script to humanize duration cells and show a campus-status banner. |
| web/wwwroot/healthchecks-ui-logo.png | New logo asset for HealthChecks.UI branding. |
| web/wwwroot/css/healthchecks-ui-branding.css | UC Davis palette + UI CSS tweaks (contrast, layout, banner styling). |
| web/appsettings.json | Expands InternalAllowlist to CIDR ranges for health detail/UI access. |
| web/Viper.csproj | Adds DotNetDiag HealthChecks.UI packages and EF health checks package reference. |
| web/Program.cs | Hooks health check DI + pipeline wiring; conditionally applies CSP outside UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | MailKit-based SMTP reachability/TLS probe. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Real LDAPS bind probe for directory health. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | Generic HTTP endpoint reachability probe. |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralized health checks registration + endpoint/UI mapping + UI HTML injection. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Drive space (and optional writability) checks for app/photos/CMS/log paths. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | AWS SSM reachability probe using a lightweight DescribeParameters call. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Caches health results with status-dependent TTLs to reduce probe load. |
| JenkinsFile | Adds post-deploy /health polling for test and prod. |
Pull request overview
Adds first-class health check endpoints and an operator-facing HealthChecks.UI dashboard to VIPER, including custom probes (DB, disk, LDAP/CAS/SMTP/SSM) and Jenkins post-deploy verification.
Changes:
- Introduces `/health` liveness, `/health/detail` readiness JSON, and `/healthchecks` UI with IP allowlisting.
- Adds multiple custom `IHealthCheck` implementations plus an adaptive polling decorator to reduce external traffic.
- Updates Jenkins deploy stages to poll `/health` after deployment.
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| web/wwwroot/js/healthchecks-ui-extras.js | UI-side DOM tweaks (duration humanizer + campus-status banner). |
| web/wwwroot/healthchecks-ui-logo.png | Adds UC Davis-branded logo asset for the dashboard. |
| web/wwwroot/css/healthchecks-ui-branding.css | Custom palette/branding + minor UI layout/accessibility tweaks. |
| web/appsettings.json | Expands InternalAllowlist to CIDR ranges for readiness/UI access. |
| web/Viper.csproj | Adds HealthChecks.UI + EF Core health check package references. |
| web/Program.cs | Hooks health check DI + pipeline wiring; skips CSP on UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | Adds SMTP relay probe via MailKit connect/noop/disconnect. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Adds LDAPS bind probe matching existing LDAP service settings. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | Adds generic HTTP reachability probe for CAS/VMACs. |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralizes health check registration, endpoint mapping, UI config, and response-body script injection. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Adds disk free-space (and optional writability) probe for key volumes. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | Adds lightweight SSM reachability probe. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Adds status-based caching to reduce expensive probe frequency. |
| JenkinsFile | Adds post-deploy /health polling for test and prod stages. |
Bundle Report
Changes will increase total bundle size by 2.92kB (0.14%) ⬆️. This is within the configured threshold ✅
Affected bundle: viper-frontend-esm
80e474c to 24f416c
Pull request overview
Adds first-class health checking to VIPER, including liveness/readiness endpoints for deploy automation and an IP-gated HealthChecks.UI dashboard tailored to campus ops needs.
Changes:
- Introduces `/health` (anonymous liveness) and `/health/detail` (IP-gated readiness with tagged checks) plus HealthChecks.UI at `/healthchecks`.
- Adds health check implementations (LDAP, SMTP, HTTP endpoint probes, disk space, AWS SSM) and an adaptive polling decorator to reduce external probe traffic.
- Updates Jenkins deploy stages to poll `/2/health` post-deploy; adds UC Davis branding + UI tweaks (duration humanizer + campus-status banner).
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| web/wwwroot/js/healthchecks-ui-extras.js | Injected UI tweaks (duration humanizer, campus-status banner) via MutationObserver. |
| web/wwwroot/css/healthchecks-ui-branding.css | UC Davis palette + UI readability adjustments + campus-status banner styling. |
| web/appsettings.json | Replaces a single internal allowlisted IP with CIDR ranges for staff + infra. |
| web/Viper.csproj | Adds DotNetDiag HealthChecks.UI packages and EF Core health check package. |
| web/Program.cs | Hooks health check DI + pipeline wiring; skips CSP on HealthChecks.UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | MailKit-based SMTP reachability probe. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | Real LDAPS bind probe using existing LDAP service credentials. |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | HTTP(S) reachability probe (treats non-5xx as healthy). |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Centralizes health check registration, endpoint mapping, UI wiring, and IP gating. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | Disk free-space (and optional writability) probe for key volumes/paths. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | AWS SSM reachability probe via DescribeParameters. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | Caches healthy vs unhealthy results for different durations to reduce probe load. |
| JenkinsFile | Adds post-deploy polling of /2/health for TEST and PROD. |
Pull request overview
Adds operational health monitoring to VIPER by introducing anonymous liveness and IP-gated readiness endpoints, plus a branded HealthChecks.UI dashboard tailored for the campus environment (path base, reduced external probing, and operator-focused UI tweaks).
Changes:
- Adds `/health` liveness and `/health/detail` readiness endpoints plus a HealthChecks.UI dashboard (IP-restricted, CSP-exempted for UI paths).
- Implements custom health checks (LDAP bind, SMTP connect, HTTP probe, AWS SSM probe, disk space) and an adaptive polling decorator to reduce external traffic.
- Updates Jenkins deploy stages to poll `/health` after deployment and updates allowlists to CIDR ranges.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| web/wwwroot/js/healthchecks-ui-extras.js | Injected client-side UI enhancements (duration formatting + campus-status banner). |
| web/wwwroot/css/healthchecks-ui-branding.css | HealthChecks.UI UC Davis branding + accessibility/UX tweaks. |
| web/appsettings.json | Replaces single internal IP with CIDR-based allowlist ranges. |
| web/Viper.csproj | Adds DotNetDiag HealthChecks.UI packages and EF Core health check package. |
| web/Program.cs | Wires health checks via extensions and skips CSP for HealthChecks.UI paths. |
| web/Classes/HealthChecks/SmtpHealthCheck.cs | New MailKit-based SMTP probe health check. |
| web/Classes/HealthChecks/LdapHealthCheck.cs | New LDAPS bind health check (Windows-only). |
| web/Classes/HealthChecks/HttpEndpointHealthCheck.cs | New HTTP reachability probe health check. |
| web/Classes/HealthChecks/HealthCheckExtensions.cs | Central DI + pipeline wiring for health endpoints/UI, IP gating, and script injection. |
| web/Classes/HealthChecks/DiskSpaceHealthCheck.cs | New disk space (and optional writability) health check. |
| web/Classes/HealthChecks/AwsSsmHealthCheck.cs | New AWS SSM reachability health check. |
| web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs | New decorator to cache healthy results longer than unhealthy ones. |
| JenkinsFile | Adds post-deploy /health polling for test and prod stages. |
4752b79 to a465446
- /health (anonymous liveness for Jenkins) + /health/detail (tagged "ready", IP-gated to SVM /20 and infra /24, not CAS-gated so it stays reachable when auth is degraded).
- HealthChecks.UI dashboard at /healthchecks with UC Davis branding, duration humanizer, and a campus-status banner that appears when any campus-* check is non-healthy.
- Adaptive polling decorator on campus checks (LDAP/CAS/SMTP/VMACs): healthy results cached 1 hour, failures re-probe every 5 min (one UI poll cycle). Cuts external traffic from 12/hour to 1/hour per instance while healthy.
- Real LDAPS bind, MailKit SMTP connect, AWS SSM probe, disk checks for app/photos/CMS/logs, EF DbContext checks for all contexts.
- Adopts DotNetDiag.HealthChecks.UI 10.0.7 (fork of abandoned Xabaril packages; upstream does not build on .NET 10). Pinned exactly.
- Jenkins Deploy stages now poll /health post-deploy.
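The adaptive polling behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's `AdaptivePollingHealthCheck`: the helper name `MakeCachedProbe` and the boolean probe are hypothetical, and only the two TTLs (1 hour healthy, 5 minutes unhealthy) come from the message above.

```csharp
using System;

// TTLs taken from the commit message; everything else here is illustrative.
TimeSpan healthyTtl = TimeSpan.FromHours(1);
TimeSpan unhealthyTtl = TimeSpan.FromMinutes(5);

// Wraps a probe so repeated polls reuse the last result until its TTL expires.
// The caller passes the current time, which keeps the sketch deterministic.
Func<DateTimeOffset, bool> MakeCachedProbe(Func<bool> probe)
{
    bool? lastHealthy = null;
    DateTimeOffset expiresAt = DateTimeOffset.MinValue;
    return now =>
    {
        if (lastHealthy is null || now >= expiresAt)
        {
            lastHealthy = probe();
            // Healthy results are kept longer than failures.
            expiresAt = now + (lastHealthy.Value ? healthyTtl : unhealthyTtl);
        }
        return lastHealthy.Value;
    };
}

// Simulate one UI poll every 5 minutes for an hour against a healthy dependency.
int probeCalls = 0;
var cached = MakeCachedProbe(() => { probeCalls++; return true; });
var start = DateTimeOffset.UnixEpoch;
for (int poll = 0; poll < 12; poll++)
{
    cached(start.AddMinutes(5 * poll));
}
Console.WriteLine($"12 polls, {probeCalls} real probe(s)");
```

With a healthy dependency the twelve polls trigger a single real probe, which matches the "12/hour to 1/hour" figure; a probe that keeps failing is re-run on every 5-minute poll.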
a465446 to 77f2c9f
- F5 forwards client IP via X-Forwarded-For, so traffic from 192.168.56.0/24 hosts hitting the public URL gates on the egress IP, not the internal interface - the SVM infra entry never matched in practice.
- 169.237.251.0/24 covers the campus VPN pool so on-VPN admins can reach /health/detail and /healthchecks.
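As a rough illustration of CIDR-based gating, the check below tests a remote address against the two ranges named in this commit message. The helper name `IsAllowed` and the sample addresses are hypothetical; the sketch assumes .NET 8 or later for `System.Net.IPNetwork` and is not the PR's actual allowlist code.

```csharp
using System;
using System.Linq;
using System.Net;

// CIDR ranges quoted in the commit message; the helper itself is illustrative.
string[] allowlist = { "192.168.56.0/24", "169.237.251.0/24" };

// True when the address falls inside any allowlisted network.
bool IsAllowed(string remoteIp, string[] cidrs)
{
    var address = IPAddress.Parse(remoteIp);
    return cidrs
        .Select(cidr => System.Net.IPNetwork.Parse(cidr))
        .Any(network => network.Contains(address));
}

// An address in the VPN pool passes; an unrelated public address does not.
Console.WriteLine(IsAllowed("169.237.251.17", allowlist));
Console.WriteLine(IsAllowed("203.0.113.9", allowlist));
```

The check only helps if the address being tested is the real client address, which is why the commit message notes that the internal 192.168.56.0/24 entry never matched once the F5 reported the egress IP.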
864b4a4 to 462496d
- Cloudflare fronts vetmed.ucdavis.edu, so the proxy chain is User -> CF -> F5 -> app. Without trusting CF as a known proxy, UseForwardedHeaders ignores XFF and RemoteIpAddress stays at the CF edge - breaking every IP-based allowlist in the app, not just health checks.
- Fetches CF's published v4/v6 CIDRs from cloudflare.com at startup (cached for the process lifetime) and adds them to KnownIPNetworks. Falls back to a hardcoded snapshot on fetch failure so a CF blip during deploy doesn't break startup.
- Bumps ForwardLimit to 2 so the chain walk continues through both proxy hops (F5 and CF) and lands on the real client IP rather than stopping at the CF edge.
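The interaction between trusted networks and ForwardLimit can be modelled with a small right-to-left walk over X-Forwarded-For. This is a simplified sketch of the idea, not the ASP.NET Core middleware itself: `ResolveClientIp` is a hypothetical helper, and every address and range below is a documentation placeholder rather than a real F5 or Cloudflare value. It assumes .NET 8 or later for `System.Net.IPNetwork`.

```csharp
using System;
using System.Linq;
using System.Net;

// Walks X-Forwarded-For from the right, accepting an entry only when the hop
// that reported it is trusted, and consuming at most forwardLimit entries.
string ResolveClientIp(string peerIp, string xForwardedFor, string[] trustedCidrs, int forwardLimit)
{
    var trusted = trustedCidrs.Select(c => System.Net.IPNetwork.Parse(c)).ToArray();
    var hops = xForwardedFor.Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
    string current = peerIp;
    for (int i = hops.Length - 1, used = 0; i >= 0 && used < forwardLimit; i--, used++)
    {
        // Stop as soon as the current hop is not a trusted proxy.
        if (!trusted.Any(n => n.Contains(IPAddress.Parse(current)))) break;
        current = hops[i];
    }
    return current;
}

// Placeholder topology: client 203.0.113.9 -> CDN edge 198.51.100.7 -> load balancer 10.0.0.5 -> app.
string[] lbOnly = { "10.0.0.0/24" };
string[] lbAndCdn = { "10.0.0.0/24", "198.51.100.0/24" };
const string xff = "203.0.113.9, 198.51.100.7";

Console.WriteLine(ResolveClientIp("10.0.0.5", xff, lbOnly, 2));   // CDN not trusted
Console.WriteLine(ResolveClientIp("10.0.0.5", xff, lbAndCdn, 1)); // limit too low
Console.WriteLine(ResolveClientIp("10.0.0.5", xff, lbAndCdn, 2)); // both conditions met
```

In this model the first two calls stop at the CDN edge address and only the third reaches the client address, which mirrors why the commit both adds the Cloudflare ranges and raises ForwardLimit to 2.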
1058a5c to e2e1665
@bsedwards The detailed health check is at https://secure-test.vetmed.ucdavis.edu/2/healthchecks It is IP restricted to:
bsedwards left a comment
A couple things:
- Cloudflare is only in front of secure-test at the moment. We are working on getting internal NAT'ed IPs not routed through Cloudflare so we can "see" the internal IP and not the F5's external IP. Cloudflare will be in front of Prod servers at some point, but Ops is still working out the kinks.
- Can we either restrict the detail and UI health check endpoints to just developer and system (e.g. Jenkins) IPs, or require some other token-based or basic authentication? Allowing access to entire networks is overly broad.
- UI collector now self-calls via localhost so the request bypasses Cloudflare/F5; the public URL's NAT'd source IP fails the narrowed allowlist.
📝 Walkthrough
Issue: lack of comprehensive health monitoring and proxy-aware forwarding. Fix: adds multiple health checks (disk, AWS SSM, HTTP, LDAP, SMTP), adaptive polling, HealthChecks.UI and branding, Cloudflare CIDR fetching for ForwardedHeaders, Jenkins post-deploy health probes, and updated IP allowlist.
Sequence Diagram(s)
sequenceDiagram
participant Startup as Application Startup
participant CF as Cloudflare Endpoint
participant Registry as HealthCheck Registry
participant Middleware as ForwardedHeaders & HealthEndpoints
participant Jenkins as Jenkins Pipeline
Startup->>CF: FetchOrFallback(logger) (ips-v4, ips-v6)
alt fetch success
CF-->>Startup: CIDR list
else fetch failure / timeout
CF-->>Startup: Hardcoded fallback CIDRs
end
Startup->>Registry: AddViperHealthChecks() (register Disk, AWS SSM, HTTP, LDAP, SMTP, Adaptive)
Registry-->>Startup: Health checks configured
Startup->>Middleware: Configure ForwardedHeaders(ForwardLimit=2, KnownIPNetworks=CF CIDRs)
Startup->>Middleware: UseViperHealthChecks() (map /health, /health/detail, UI gating)
Jenkins->>Middleware: Post-deploy probe /2/health (retry loop)
alt 200 received
Middleware-->>Jenkins: 200 OK (deployment healthy)
else timeout/exhausted
Middleware-->>Jenkins: non-200 / failure (pipeline stage fails)
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@JenkinsFile`:
- Around line 146-165: The JenkinsFile contains duplicated PowerShell
health-check blocks; extract them into a single reusable Groovy function (e.g.,
verifyHealth(String url)) that wraps the existing powershell step logic and
accepts the URL as a parameter, then replace the two inline powershell blocks
with calls to verifyHealth("<env-specific-url>"); ensure the function preserves
the maxAttempts, delay logic, try/catch and exit codes exactly and keep the
powershell label 'Verify /health returns 200' so behavior and logs remain
unchanged.
In `@web/Classes/CloudflareNetworks.cs`:
- Around line 40-57: The current catch in FetchOrFallback only handles
HttpRequestException and TaskCanceledException, but DNS/socket failures can
surface as other exception types; change the exception handling on
FetchOrFallback to catch broader failures (e.g., catch Exception ex or add
SocketException) so any network/DNS/socket error is logged via logger.Warn and
the method returns HardcodedFallback; keep the existing log message and return
behavior, and ensure the method still uses HardcodedFallback when any
network-related exception occurs.
In `@web/Classes/HealthChecks/AwsSsmHealthCheck.cs`:
- Around line 36-46: The health check in AwsSsmHealthCheck currently calls
DescribeParametersAsync which requires different IAM rights than the config
loader; change the probe to call GetParametersByPathAsync (or GetParameterAsync
for a known key) against the same path/key your configuration loader uses so the
health check uses the same IAM permissions. In practice, inside the health check
method replace the DescribeParametersAsync call with a call to
client.GetParametersByPathAsync(new GetParametersByPathRequest { Path = <your
configured path prefix>, MaxResults = 1 }, cancellationToken) (or
client.GetParameterAsync for a single known parameter), ensure you pass
cancellationToken, and keep the existing exception handling that checks
_healthyWhenMissing and returns HealthCheckResult accordingly; update references
in AwsSsmHealthCheck to use the configuration path field (e.g. the class’s
path/prefix member) along with existing _region and _healthyWhenMissing.
In `@web/Classes/HealthChecks/HealthCheckExtensions.cs`:
- Around line 139-143: Wrap the call that constructs a Uri from
EmailSettings:BaseUrl in a safe check inside
AddViperHealthChecks/HealthCheckExtensions: instead of directly calling new
Uri(baseUrl).AbsolutePath, use Uri.TryCreate to validate the baseUrl (or catch
UriFormatException around new Uri) and if it fails log a warning and fall back
to a safe default path (e.g., "/") before continuing to register the health
check; update the code around the new Uri(baseUrl).AbsolutePath reference so
malformed EmailSettings:BaseUrl no longer crashes startup and the
HttpEndpointHealthCheck receives a validated/fallback path.
- Around line 194-203: HealthCheckExtensions.cs currently constructs a Uri from
configuration["EmailSettings:BaseUrl"] which can throw UriFormatException on
malformed/scheme-less values; change the logic that computes healthEndpointUrl
to validate baseUrl using Uri.TryCreate(..., UriKind.Absolute, out var uri)
(matching the pattern used in VerificationService.cs) and when TryCreate fails
or baseUrl is null/whitespace, set healthEndpointUrl to "/health/detail"; when
TryCreate succeeds use uri.AbsolutePath.TrimEnd('/') to build
$"http://localhost{pathBase}/health/detail".
In `@web/Classes/HealthChecks/HttpEndpointHealthCheck.cs`:
- Around line 39-43: The health check in HttpEndpointHealthCheck currently calls
client.GetAsync(_url, cancellationToken) which buffers the entire response body;
change the call in the CheckHealthAsync method to use the overload that includes
HttpCompletionOption.ResponseHeadersRead so only headers are read
(client.GetAsync(_url, HttpCompletionOption.ResponseHeadersRead,
cancellationToken)), preserving the existing response disposal and status code
logic so the probe returns based on StatusCode without downloading the full
body.
In `@web/wwwroot/css/healthchecks-ui-branding.css`:
- Around line 49-55: Update the decorative left border in the
`#campus-status-banner` rule to use rem units instead of pixels; replace the
hardcoded "4px" value for "border-left" with an equivalent rem measurement
(e.g., 0.25rem or whatever matches your base font sizing) so the selector
"#campus-status-banner" follows the project's sizing guideline using rem units.
In `@web/wwwroot/js/healthchecks-ui-extras.js`:
- Around line 43-54: The hasUnhealthyCampusCheck function uses a fragile DOM
selector (row.querySelector(".hc-status .material-icons") and comparing
icon.textContent to "check_circle") that is tightly coupled to Xabaril UI
internals; add an inline comment immediately above this selector explaining that
this check depends on the current forked UI structure/icon name, that changes
upstream could break it, and mark a TODO to replace with a more robust status
API or data-attribute if/when available (include function name
hasUnhealthyCampusCheck and the selector string in the comment for clarity).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 60ec9da2-248f-413f-bcda-83460be8a905
📒 Files selected for processing (14)
- JenkinsFile
- web/Classes/CloudflareNetworks.cs
- web/Classes/HealthChecks/AdaptivePollingHealthCheck.cs
- web/Classes/HealthChecks/AwsSsmHealthCheck.cs
- web/Classes/HealthChecks/DiskSpaceHealthCheck.cs
- web/Classes/HealthChecks/HealthCheckExtensions.cs
- web/Classes/HealthChecks/HttpEndpointHealthCheck.cs
- web/Classes/HealthChecks/LdapHealthCheck.cs
- web/Classes/HealthChecks/SmtpHealthCheck.cs
- web/Program.cs
- web/Viper.csproj
- web/appsettings.json
- web/wwwroot/css/healthchecks-ui-branding.css
- web/wwwroot/js/healthchecks-ui-extras.js
powershell label: 'Verify /health returns 200', script: '''
    $url = "https://secure.vetmed.ucdavis.edu/2/health"
    $maxAttempts = 15
    for ($i = 1; $i -le $maxAttempts; $i++) {
        $delay = if ($i -le 5) { 2 } else { 4 }
        try {
            $response = Invoke-WebRequest -Uri $url -UseBasicParsing -TimeoutSec 10
            if ($response.StatusCode -eq 200) {
                Write-Host "Attempt ${i}: $url returned 200 OK"
                exit 0
            }
            Write-Host "Attempt ${i}: $url returned $($response.StatusCode)"
        } catch {
            Write-Host "Attempt ${i}: $($_.Exception.Message)"
        }
        if ($i -lt $maxAttempts) { Start-Sleep -Seconds $delay }
    }
    Write-Error "Health check at $url failed after $maxAttempts attempts"
    exit 1
'''
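For a sense of how long this retry loop can hold a deploy stage, the arithmetic can be worked through as below. It is a back-of-the-envelope sketch that assumes the pessimistic case where every attempt runs into the full 10-second `-TimeoutSec`; the variable names are illustrative.

```csharp
using System;
using System.Linq;

int maxAttempts = 15;
int timeoutSec = 10;

// Sleep after attempt i: 2s for attempts 1-5, 4s afterwards, and no sleep after the last attempt.
int totalSleep = Enumerable.Range(1, maxAttempts - 1).Sum(i => i <= 5 ? 2 : 4);

// Pessimistic bound: every request also waits for the full timeout.
int worstCase = totalSleep + maxAttempts * timeoutSec;

Console.WriteLine($"sleep: {totalSleep}s, worst case with timeouts: {worstCase}s");
```

Under those assumptions the loop sleeps for 46 seconds in total and fails the stage after at most about 196 seconds.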
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Consider extracting duplicated health-check logic.
The test and prod PowerShell blocks are identical except for the URL. A shared function would reduce maintenance burden.
♻️ Example consolidation
def verifyHealth(String url) {
    powershell label: 'Verify /health returns 200', script: """
        \$url = "$url"
        \$maxAttempts = 15
        for (\$i = 1; \$i -le \$maxAttempts; \$i++) {
            \$delay = if (\$i -le 5) { 2 } else { 4 }
            try {
                \$response = Invoke-WebRequest -Uri \$url -UseBasicParsing -TimeoutSec 10
                if (\$response.StatusCode -eq 200) {
                    Write-Host "Attempt \${i}: \$url returned 200 OK"
                    exit 0
                }
                Write-Host "Attempt \${i}: \$url returned \$(\$response.StatusCode)"
            } catch {
                Write-Host "Attempt \${i}: \$(\$_.Exception.Message)"
            }
            if (\$i -lt \$maxAttempts) { Start-Sleep -Seconds \$delay }
        }
        Write-Error "Health check at \$url failed after \$maxAttempts attempts"
        exit 1
    """
}
Then call verifyHealth with the environment-specific URL in each stage, for example verifyHealth("https://secure-test.vetmed.ucdavis.edu/2/health").
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@JenkinsFile` around lines 146 - 165, The JenkinsFile contains duplicated
PowerShell health-check blocks; extract them into a single reusable Groovy
function (e.g., verifyHealth(String url)) that wraps the existing powershell
step logic and accepts the URL as a parameter, then replace the two inline
powershell blocks with calls to verifyHealth("<env-specific-url>"); ensure the
function preserves the maxAttempts, delay logic, try/catch and exit codes
exactly and keep the powershell label 'Verify /health returns 200' so behavior
and logs remain unchanged.
public static IReadOnlyList<string> FetchOrFallback(NLog.Logger logger)
{
    try
    {
        using var http = new HttpClient { Timeout = TimeSpan.FromSeconds(5) };
        var v4 = http.GetStringAsync("https://www.cloudflare.com/ips-v4/").GetAwaiter().GetResult();
        var v6 = http.GetStringAsync("https://www.cloudflare.com/ips-v6/").GetAwaiter().GetResult();
        var cidrs = (v4 + "\n" + v6)
            .Split('\n', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries);
        logger.Info("Fetched {Count} Cloudflare networks from cloudflare.com", cidrs.Length);
        return cidrs;
    }
    catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
    {
        logger.Warn(ex, "Failed to fetch Cloudflare IP ranges; using hardcoded fallback ({Count} entries)", HardcodedFallback.Length);
        return HardcodedFallback;
    }
}
🧹 Nitpick | 🔵 Trivial | 💤 Low value
LGTM. The synchronous blocking via GetAwaiter().GetResult() is appropriate here since this runs once at startup before the host starts. The 5-second timeout and hardcoded fallback provide resilience.
One edge case: SocketException or DNS failures may not always be wrapped in HttpRequestException depending on the runtime. Consider broadening the catch or adding a general fallback.
♻️ Broader exception handling
- catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
+ catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException or System.Net.Sockets.SocketException)
Or use a general catch that still logs and falls back gracefully:
- catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException)
+ catch (Exception ex)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/Classes/CloudflareNetworks.cs` around lines 40 - 57, The current catch in
FetchOrFallback only handles HttpRequestException and TaskCanceledException, but
DNS/socket failures can surface as other exception types; change the exception
handling on FetchOrFallback to catch broader failures (e.g., catch Exception ex
or add SocketException) so any network/DNS/socket error is logged via
logger.Warn and the method returns HardcodedFallback; keep the existing log
message and return behavior, and ensure the method still uses HardcodedFallback
when any network-related exception occurs.
    using var client = new AmazonSimpleSystemsManagementClient(_region);
    await client.DescribeParametersAsync(
        new DescribeParametersRequest { MaxResults = 1 },
        cancellationToken);
    return HealthCheckResult.Healthy("AWS SSM reachable.");
}
catch (AmazonServiceException ex)
{
    return _healthyWhenMissing
        ? HealthCheckResult.Healthy("AWS SSM not configured (skipped).")
        : HealthCheckResult.Unhealthy($"AWS SSM unreachable: {ex.ErrorCode}.");
🧩 Analysis chain
🌐 Web query:
What AWS SSM API operations does Amazon.Extensions.Configuration.SystemsManager use to load configuration, and how do their IAM permissions differ from DescribeParameters?
💡 Result:
Amazon.Extensions.Configuration.SystemsManager primarily uses the AWS SSM API operation GetParametersByPath (with Recursive=true and WithDecryption=true) to load configuration parameters from Parameter Store when a path prefix is specified (the typical usage via AddSystemsManager(path)). It may also use GetParameters (batched) if specific ParameterNames are provided, or GetParameter for Secrets Manager-referenced paths.
The corresponding IAM permission is ssm:GetParametersByPath on the resource ARN matching the path prefix (e.g., "arn:aws:ssm:*:*:parameter/myapp/*").
This differs from ssm:DescribeParameters (used by the DescribeParameters API), which only lists metadata about parameters (names, types, etc.) but does not return parameter values. ssm:DescribeParameters requires no resource-level restrictions (often "Resource": "*") and cannot access SecureString values or decrypt them. In contrast, ssm:GetParametersByPath directly retrieves and decrypts values but requires permission on the specific parameter resources matching the path, providing finer-grained control over value access.
Citations:
- 1: https://github.com/aws/aws-dotnet-extensions-configuration
- 2: https://github.com/aws/aws-dotnet-extensions-configuration/blob/master/src/Amazon.Extensions.Configuration.SystemsManager/Internal/SystemsManagerProcessor.cs
- 3: https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-paramstore-access.html
- 4: https://docs.aws.amazon.com/systems-manager/latest/APIReference/API_GetParametersByPath.html
Use the same SSM operation the configuration loader uses.
The health check calls DescribeParameters to validate AWS SSM reachability, but the app loads configuration via GetParametersByPath. These require different IAM permissions (ssm:DescribeParameters vs. ssm:GetParametersByPath). A least-privilege role with only read access to the config path will fail this readiness check even though startup succeeds.
Probe an actual parameter from the configured path instead—either query a known key with GetParameter or use GetParametersByPath with the same path prefix the configuration loader uses.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/Classes/HealthChecks/AwsSsmHealthCheck.cs` around lines 36 - 46, The
health check in AwsSsmHealthCheck currently calls DescribeParametersAsync which
requires different IAM rights than the config loader; change the probe to call
GetParametersByPathAsync (or GetParameterAsync for a known key) against the same
path/key your configuration loader uses so the health check uses the same IAM
permissions. In practice, inside the health check method replace the
DescribeParametersAsync call with a call to client.GetParametersByPathAsync(new
GetParametersByPathRequest { Path = <your configured path prefix>, MaxResults =
1 }, cancellationToken) (or client.GetParameterAsync for a single known
parameter), ensure you pass cancellationToken, and keep the existing exception
handling that checks _healthyWhenMissing and returns HealthCheckResult
accordingly; update references in AwsSsmHealthCheck to use the configuration
path field (e.g. the class’s path/prefix member) along with existing _region and
_healthyWhenMissing.
var baseUrl = configuration["EmailSettings:BaseUrl"]?.TrimEnd('/');
string healthEndpointUrl;
if (string.IsNullOrWhiteSpace(baseUrl))
{
    healthEndpointUrl = "/health/detail";
}
else
{
    var pathBase = new Uri(baseUrl).AbsolutePath.TrimEnd('/');
    healthEndpointUrl = $"http://localhost{pathBase}/health/detail";
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
fd -i 'appsettings*.json' . -x rg -n '"BaseUrl"\s*:' {}Repository: ucdavis/VIPER
Length of output: 39
🏁 Script executed:
# Find and examine the HealthCheckExtensions.cs file
fd -t f 'HealthCheckExtensions.cs' . -x wc -lRepository: ucdavis/VIPER
Length of output: 113
🏁 Script executed:
# Read the file around lines 194-203
fd -t f 'HealthCheckExtensions.cs' . -x cat -n | sed -n '180,220p'Repository: ucdavis/VIPER
Length of output: 2230
🏁 Script executed:
# Search for EmailSettings configuration patterns in the codebase
rg -i 'emailsettings|baseurl' --type cs -B 2 -A 2Repository: ucdavis/VIPER
Length of output: 31888
🏁 Script executed:
# Check if Uri.TryCreate is already used elsewhere in the codebase
rg 'Uri\.TryCreate' --type csRepository: ucdavis/VIPER
Length of output: 898
🏁 Script executed:
# Check if new Uri() is used elsewhere without Try-pattern
rg 'new Uri\(' --type cs -B 1 -A 1 | head -40
Repository: ucdavis/VIPER
Length of output: 2388
🏁 Script executed:
# Confirm UriFormatException is the thrown exception for invalid URIs
python3 <<'EOF'
# Verify Uri constructor behavior
try:
from urllib.parse import urlparse
# Simulate what happens with scheme-less or malformed URLs
test_urls = [
"localhost:5000", # scheme-less
"///invalid", # invalid format
"http://valid.com" # valid
]
print("Testing invalid URIs that would fail in .NET Uri constructor:")
for url in test_urls:
try:
# This shows why Uri(url) without scheme check fails
result = urlparse(url)
if not result.scheme:
print(f" '{url}' → scheme-less (Uri constructor would throw)")
else:
print(f" '{url}' → valid")
except Exception as e:
print(f" '{url}' → error: {e}")
EOF
Repository: ucdavis/VIPER
Length of output: 170
🏁 Script executed:
# Check appsettings files to see if BaseUrl is ever explicitly set
find . -name 'appsettings*.json' -o -name 'appsettings*.Development.json' | xargs cat 2>/dev/null | grep -i baseurl || echo "No BaseUrl found in config files"
Repository: ucdavis/VIPER
Length of output: 325
Guard BaseUrl parsing before service registration.
new Uri(baseUrl) throws UriFormatException during app startup when EmailSettings:BaseUrl is malformed or scheme-less. Use Uri.TryCreate(..., UriKind.Absolute, out var uri) and fall back to "/health/detail" when the value is invalid, matching the pattern already in VerificationService.cs.
Suggested fix
- else
- {
- var pathBase = new Uri(baseUrl).AbsolutePath.TrimEnd('/');
- healthEndpointUrl = $"http://localhost{pathBase}/health/detail";
- }
+ else if (Uri.TryCreate(baseUrl, UriKind.Absolute, out var uri))
+ {
+ var pathBase = uri.AbsolutePath.TrimEnd('/');
+ healthEndpointUrl = $"http://localhost{pathBase}/health/detail";
+ }
+ else
+ {
+ healthEndpointUrl = "/health/detail";
+ }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/Classes/HealthChecks/HealthCheckExtensions.cs` around lines 194 - 203,
HealthCheckExtensions.cs currently constructs a Uri from
configuration["EmailSettings:BaseUrl"] which can throw UriFormatException on
malformed/scheme-less values; change the logic that computes healthEndpointUrl
to validate baseUrl using Uri.TryCreate(..., UriKind.Absolute, out var uri)
(matching the pattern used in VerificationService.cs) and when TryCreate fails
or baseUrl is null/whitespace, set healthEndpointUrl to "/health/detail"; when
TryCreate succeeds use uri.AbsolutePath.TrimEnd('/') to build
$"http://localhost{pathBase}/health/detail".
| using var response = await client.GetAsync(_url, cancellationToken); | ||
| var code = (int)response.StatusCode; | ||
| return code < 500 | ||
| ? HealthCheckResult.Healthy($"{_displayName} reachable (HTTP {code}).") | ||
| : HealthCheckResult.Unhealthy($"{_displayName} returned HTTP {code}."); |
🧩 Analysis chain
🌐 Web query:
In .NET HttpClient, does GetAsync(string, CancellationToken) buffer the full response body by default, and is HttpCompletionOption.ResponseHeadersRead the right option for a health check that only needs the status code?
💡 Result:
Yes—by default, HttpClient’s GetAsync (without HttpCompletionOption.ResponseHeadersRead) buffers the entire response body before completing the task, reading all response content into memory. HttpCompletionOption.ResponseHeadersRead is the right option for a health check that only needs the status code/headers (it completes once the response headers are available and does not read/buffer the content yet). Practical implication: if you use ResponseHeadersRead, you should validate the status code immediately (e.g., response.EnsureSuccessStatusCode) and avoid reading the body unless you truly need it; if you do need to read the body, you’ll need to apply your own timeout/cancellation to the content read separately because HttpClient.Timeout (and the overload’s cancellationToken) only applies up to the point headers are read when using ResponseHeadersRead.
Citations:
- 1: https://learn.microsoft.com/en-us/dotnet/fundamentals/runtime-libraries/system-net-http-httpclient
- 2: https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httpcompletionoption?view=net-7.0
- 3: https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httpcompletionoption?view=net-10.0
Use HttpCompletionOption.ResponseHeadersRead for status-only probes.
GetAsync(_url, cancellationToken) buffers the full response body by default. This health check only reads StatusCode, so a slow or large body unnecessarily delays the probe, creating false negatives and wasting bandwidth on every poll.
Suggested fix
- using var response = await client.GetAsync(_url, cancellationToken);
+ using var response = await client.GetAsync(
+ _url,
+ HttpCompletionOption.ResponseHeadersRead,
+ cancellationToken);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| using var response = await client.GetAsync(_url, cancellationToken); | |
| var code = (int)response.StatusCode; | |
| return code < 500 | |
| ? HealthCheckResult.Healthy($"{_displayName} reachable (HTTP {code}).") | |
| : HealthCheckResult.Unhealthy($"{_displayName} returned HTTP {code}."); | |
| using var response = await client.GetAsync( | |
| _url, | |
| HttpCompletionOption.ResponseHeadersRead, | |
| cancellationToken); | |
| var code = (int)response.StatusCode; | |
| return code < 500 | |
| ? HealthCheckResult.Healthy($"{_displayName} reachable (HTTP {code}).") | |
| : HealthCheckResult.Unhealthy($"{_displayName} returned HTTP {code}."); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/Classes/HealthChecks/HttpEndpointHealthCheck.cs` around lines 39 - 43,
The health check in HttpEndpointHealthCheck currently calls
client.GetAsync(_url, cancellationToken) which buffers the entire response body;
change the call in the CheckHealthAsync method to use the overload that includes
HttpCompletionOption.ResponseHeadersRead so only headers are read
(client.GetAsync(_url, HttpCompletionOption.ResponseHeadersRead,
cancellationToken)), preserving the existing response disposal and status code
logic so the probe returns based on StatusCode without downloading the full
body.
| #campus-status-banner { | ||
| margin: 0 2rem 1rem 2rem; | ||
| padding: 0.75rem 1rem; | ||
| background-color: #fff8e1; | ||
| border-left: 4px solid var(--warningColor); | ||
| color: var(--darkColor); | ||
| } |
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Minor: Line 53 uses 4px for border-left. Per coding guidelines, prefer rem for sizing.
Given this is a decorative border for a third-party UI and unlikely to cause issues, this is very low priority.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/wwwroot/css/healthchecks-ui-branding.css` around lines 49 - 55, Update
the decorative left border in the `#campus-status-banner` rule to use rem units
instead of pixels; replace the hardcoded "4px" value for "border-left" with an
equivalent rem measurement (e.g., 0.25rem or whatever matches your base font
sizing) so the selector "#campus-status-banner" follows the project's sizing
guideline using rem units.
| function hasUnhealthyCampusCheck() { | ||
| const rows = document.querySelectorAll("tr"); | ||
| for (const row of rows) { | ||
| const nameCell = row.querySelector("td"); | ||
| if (!nameCell) continue; | ||
| if (!nameCell.textContent.trim().startsWith("campus-")) continue; | ||
| const icon = row.querySelector(".hc-status .material-icons"); | ||
| // check_circle = Healthy; anything else (error, warning, cancel) is a problem | ||
| if (icon && icon.textContent.trim() !== "check_circle") return true; | ||
| } | ||
| return false; | ||
| } |
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Fragile selector coupling. The check for .hc-status .material-icons with text check_circle is tightly coupled to the Xabaril UI's internal DOM structure. If the fork or upstream changes icon names or structure, this will silently fail.
Acceptable for now since you control the pinned fork version. Add a comment noting this coupling.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web/wwwroot/js/healthchecks-ui-extras.js` around lines 43 - 54, The
hasUnhealthyCampusCheck function uses a fragile DOM selector
(row.querySelector(".hc-status .material-icons") and comparing icon.textContent
to "check_circle") that is tightly coupled to Xabaril UI internals; add an
inline comment immediately above this selector explaining that this check
depends on the current forked UI structure/icon name, that changes upstream
could break it, and mark a TODO to replace with a more robust status API or
data-attribute if/when available (include function name hasUnhealthyCampusCheck
and the selector string in the comment for clarity).
The localhost loopback approach hit IIS host-header routing: requests with Host: localhost don't match the /2 site binding and get a 404. Going back to the public BaseUrl works because internal DNS resolves to the F5, keeping the self-call inside 192.168.56.0/24 - narrower than the original /20 + VPN /24 that was flagged in review.
… token
The /health/detail endpoint must stay IP-allowlisted, but the in-app UI collector also has to reach it without widening the list. On every poll, a delegating handler stamps a process-unique token header that the endpoint filter recognizes; the IP check still applies to all other callers, so the allowlist can stay scoped to dev IPs only.
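The token bypass described in this commit can be illustrated with a small, language-neutral sketch. It is written in JavaScript for Node only to show the decision logic; the actual change is a .NET delegating handler and endpoint filter, and the header name, function names, and the constant-time comparison here are assumptions, not details taken from the PR.

```javascript
const crypto = require("node:crypto");

// Hypothetical header name; the commit message does not state the real one.
const TOKEN_HEADER = "x-health-collector-token";

// Generated once at startup, so the value is unique to this process and
// changes on every restart.
const PROCESS_TOKEN = crypto.randomUUID();

// Collector side: stamp the token on each outgoing poll request.
function stampToken(headers = {}) {
  return { ...headers, [TOKEN_HEADER]: PROCESS_TOKEN };
}

// Assumed hardening: compare in constant time so the check does not reveal
// how much of a guessed token matched.
function tokenMatches(candidate) {
  if (typeof candidate !== "string") return false;
  const a = Buffer.from(candidate);
  const b = Buffer.from(PROCESS_TOKEN);
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

// Endpoint side: a valid token skips the IP check; every other caller still
// has to pass the allowlist predicate supplied by the host application.
function isDetailRequestAllowed(request, isAllowlistedIp) {
  if (tokenMatches(request.headers[TOKEN_HEADER])) return true;
  return isAllowlistedIp(request.remoteIp);
}
```

Because the token exists only in the running process, a caller outside the allowlist cannot obtain it from configuration, and the allowlist itself does not need an entry for the self-call.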
Summary by CodeRabbit
New Features
Improvements