✅ Deploy Preview for vcluster-docs-site ready!
| architecture. Those can be applied with modifications to your actual use cases. | ||
| ::: | ||
|
|
||
| This guide explains how to configure a monitoring architecture with Prometheus |
[Google.OxfordComma] Use the Oxford comma in 'This guide explains how to configure a monitoring architecture with Prometheus to collect workload metrics from across multiple virtual clusters and aggregate by cluster, project and'.
| - A Prometheus Operator (to scrape virtual cluster own metrics via `ServiceMonitors`) and a Prometheus Agent (remote_writer) per Cluster | ||
| - A Prometheus Agent (remote_writer) per virtual cluster with private nodes (Private Nodes Tenancy Model). | ||
|
|
||
| ## Configuration Prerequisites |
[Google.Headings] 'Configuration Prerequisites' should use sentence-style capitalization.
| ## Configuration Prerequisites | ||
|
|
||
| The reachable central prometheus must be configured as a remote write receiver. | ||
| E.g. following helm values would suffice for that: |
[Loft.capitalize-helm-project] 'Helm' should be capitalized when referring to the project.
| ## Configuration Prerequisites | ||
|
|
||
| The reachable central prometheus must be configured as a remote write receiver. | ||
| E.g. following helm values would suffice for that: |
🚫 [vale] reported by reviewdog 🐶
[Vale.Terms] Use 'Helm' instead of 'helm'.
| **2. Configure Helm values:** | ||
|
|
||
| Save the following as `prometheus-virtualcluster-values.yaml` and set the name of the virtual | ||
| cluster. This is necessary in order to be able to aggregate any workload |
[Google.WordList] Use 'to' instead of 'in order to'.
|
|
||
| The vCluster Platform agent emits a set of custom metrics carrying information | ||
| about virtual clusters as labels. These metrics always return `1` and can | ||
| therefore be joined via PromQL in order to make those labels available for |
[Google.WordList] Use 'to' instead of 'in order to'.
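The join described in this hunk could look roughly like the sketch below. It relies on the info metric always returning `1`, so multiplying by it leaves the workload value unchanged while `group_left` copies labels across. The metric name `vcluster_info_example` and the shared `namespace` join label are hypothetical placeholders, not names taken from the PR; only the `kind` label appears in the diff.

```promql
# Hypothetical: vcluster_info_example stands in for the agent's info metric,
# and `namespace` for whichever label both series actually share.
sum by (namespace, kind) (
  rate(container_cpu_usage_seconds_total[5m])
  * on (namespace) group_left (kind)
  vcluster_info_example{kind="VirtualClusterInstance"}
)
```

The many-to-one match only works if the info metric yields exactly one series per join-label value.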
|
|
||
| Following labels are attached: | ||
|
|
||
| - `kind`: `VirtualClusterInstance` |
[Loft.kubernetes-api-kinds] Kubernetes/Platform API kinds like 'VirtualClusterInstance' should not use backticks. Write them as plain text (e.g., StatefulSet not `StatefulSet`).
|
|
||
| ### Latency | ||
|
|
||
| #### kube-apiserver request latency (p99, by verb) (virtual cluster only) |
[Google.Headings] 'kube-apiserver request latency (p99, by verb) (virtual cluster only)' should use sentence-style capitalization.
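For reference, a p99-by-verb panel of this kind is commonly built on the standard kube-apiserver latency histogram. This is a minimal sketch, not necessarily the query in the PR; the 5m window is an assumption, and any label matchers that select a specific virtual cluster are omitted.

```promql
# p99 request latency per verb from the apiserver latency histogram
histogram_quantile(
  0.99,
  sum by (verb, le) (
    rate(apiserver_request_duration_seconds_bucket[5m])
  )
)
```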
| type (GET, LIST, PUT, POST, PATCH, DELETE, WATCH). The p99 captures outliers | ||
| that averages hide. WATCH is expected to show 60s (long-poll). | ||
|
|
||
| #### kube-apiserver request latency (p95, non-WATCH) (virtual cluster only) |
[Google.Headings] 'kube-apiserver request latency (p95, non-WATCH) (virtual cluster only)' should use sentence-style capitalization.
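A non-WATCH p95 can be sketched by filtering the verb before aggregating. The matcher and 5m window here are assumptions for illustration; the PR may exclude further long-running verbs.

```promql
# p95 request latency across all verbs except WATCH
histogram_quantile(
  0.95,
  sum by (le) (
    rate(apiserver_request_duration_seconds_bucket{verb!="WATCH"}[5m])
  )
)
```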
| **Why:** Excludes long-running connections to focus on latency for synchronous | ||
| API calls. | ||
|
|
||
| #### etcd backend latency (p99, by operation) (virtual cluster only) |
[Google.Headings] 'etcd backend latency (p99, by operation) (virtual cluster only)' should use sentence-style capitalization.
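One plausible shape for an etcd backend latency panel uses the storage-layer histogram exposed by the API server. This is a sketch under the assumption that the virtual cluster exposes `etcd_request_duration_seconds_bucket`; the PR's actual query is not shown in this thread.

```promql
# p99 latency of API server calls to the etcd backend, per operation
histogram_quantile(
  0.99,
  sum by (operation, le) (
    rate(etcd_request_duration_seconds_bucket[5m])
  )
)
```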
Force-pushed from fc72f2a to 7101e08.
|
|
||
| ### Traffic | ||
|
|
||
| #### kube-apiserver request rate (by verb) (virtual cluster only) |
[Google.Headings] 'kube-apiserver request rate (by verb) (virtual cluster only)' should use sentence-style capitalization.
|
|
||
| **Why:** The most fundamental measure of cluster workload. Shows how many requests per second the API server handles, broken down by verb. | ||
|
|
||
| #### kube-apiserver request rate (by resource) (virtual cluster only) |
[Google.Headings] 'kube-apiserver request rate (by resource) (virtual cluster only)' should use sentence-style capitalization.
|
|
||
| **Why:** Measures platform-level traffic through the gateway, split by Kubernetes API proxy, auth, and UI. | ||
|
|
||
| #### REST client outbound request rate (by code) (virtual cluster only) |
[Google.Headings] 'REST client outbound request rate (by code) (virtual cluster only)' should use sentence-style capitalization.
|
|
||
| ### Errors | ||
|
|
||
| #### kube-apiserver error rate (4xx/5xx, by code) (virtual cluster only) |
[Google.Headings] 'kube-apiserver error rate (4xx/5xx, by code) (virtual cluster only)' should use sentence-style capitalization.
|
|
||
| **Why:** HTTP-level error rates. | ||
|
|
||
| #### kube-apiserver error ratio (errors / total) (virtual cluster only) |
[Google.Headings] 'kube-apiserver error ratio (errors / total) (virtual cluster only)' should use sentence-style capitalization.
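An errors-over-total ratio like the one named in this heading is typically the quotient of two rates over `apiserver_request_total`. The sketch below assumes a 5m window and counts every 4xx and 5xx code as an error; the PR may define errors more narrowly.

```promql
# Fraction of API server requests that ended in a 4xx or 5xx response
sum(rate(apiserver_request_total{code=~"[45].."}[5m]))
/
sum(rate(apiserver_request_total[5m]))
```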
|
|
||
| **Why:** Container runtime failures (image pulls, container create/start failures). | ||
|
|
||
| #### REST client error rate (outbound 5xx) (virtual cluster only) |
[Google.Headings] 'REST client error rate (outbound 5xx) (virtual cluster only)' should use sentence-style capitalization.
|
|
||
| **Why:** Shows which pods are being throttled by cgroup CPU limits. | ||
|
|
||
| #### kube-apiserver inflight requests (virtual cluster only) |
[Google.Headings] 'kube-apiserver inflight requests (virtual cluster only)' should use sentence-style capitalization.
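Inflight requests are exposed as a gauge, so a panel for this heading needs no `rate()`. A minimal sketch, assuming the standard `apiserver_current_inflight_requests` gauge and its `request_kind` label:

```promql
# Current in-flight requests, split into mutating and readOnly
sum by (request_kind) (apiserver_current_inflight_requests)
```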
|
|
||
| **Why:** Shows current request concurrency for mutating vs read-only. When this approaches flow control limits, requests start queuing. | ||
|
|
||
| #### kube-apiserver flow-control queue depth (virtual cluster only) |
[Google.Headings] 'kube-apiserver flow-control queue depth (virtual cluster only)' should use sentence-style capitalization.
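Queue depth under API Priority and Fairness can be read from the flow-control gauges. The following is a sketch, assuming `apiserver_flowcontrol_current_inqueue_requests` is available in the virtual cluster; grouping by `priority_level` is one reasonable choice among several.

```promql
# Requests currently waiting in flow-control queues, per priority level
sum by (priority_level) (apiserver_flowcontrol_current_inqueue_requests)
```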
|
|
||
| **Why:** Controller work queues. Growing depth = controllers can't keep up with the event rate. | ||
|
|
||
| #### WATCH connection count (long-running requests) (virtual cluster only) |
[Google.Headings] 'WATCH connection count (long-running requests) (virtual cluster only)' should use sentence-style capitalization.
|
|
||
| **Why:** Overall node memory pressure. | ||
|
|
||
| #### Filesystem usage (by PVC / volume) (vCluster Platform only) |
[Google.Headings] 'Filesystem usage (by PVC / volume) (vCluster Platform only)' should use sentence-style capitalization.
WIP
Content Description
Preview Link
Internal Reference
Closes DOC-
AI review: mention @claude in a comment to request a review or changes. See CONTRIBUTING.md for available commands.
@netlify /docs