Conversation
Signed-off-by: Jirka Kremser <jiri.kremser@gmail.com>
There was a problem hiding this comment.
Pull request overview
Adds an automated workflow and initial generated JSON schemas for Kedify CRDs so tools like kubeconform can validate CR manifests out-of-cluster.
Changes:
- Add initial JSON schema files under
crd-schemas/for multiplev1alpha1CRDs. - Add a GitHub Action workflow to regenerate schemas from
kedify-agent/files/*and open an automated PR. - Add a Python conversion script + Python requirements to generate JSON schemas from CRD OpenAPI.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
crd-schemas/scalingpolicy_v1alpha1.json |
Adds generated JSON schema for ScalingPolicy CRD. |
crd-schemas/podresourceprofile_v1alpha1.json |
Adds generated JSON schema for PodResourceProfile CRD. |
crd-schemas/podresourceautoscaler_v1alpha1.json |
Adds generated JSON schema for PodResourceAutoscaler CRD. |
crd-schemas/kedifyconfiguration_v1alpha1.json |
Adds generated JSON schema for KedifyConfiguration CRD. |
crd-schemas/distributedscaledobject_v1alpha1.json |
Adds generated JSON schema for DistributedScaledObject CRD. |
.github/workflows/crd-schemas.yaml |
New workflow to regenerate schemas and open a PR with updates. |
.github/scripts/requirements.txt |
Adds Python deps for schema generation. |
.github/scripts/openapi2jsonschema.py |
Adds CRD OpenAPI → JSON Schema conversion script used by the workflow. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| delete-branch: true | ||
| base: main | ||
| signoff: true | ||
| token: ${{ secrets.PAT_TOKEN }} |
There was a problem hiding this comment.
The workflow uses a long-lived PAT (secrets.PAT_TOKEN) for create-pull-request. Unless this PR creation needs elevated scopes beyond what GITHUB_TOKEN provides, prefer secrets.GITHUB_TOKEN (as in .github/workflows/helm-docs.yaml) to reduce credential risk and avoid requiring an extra secret for a maintenance workflow.
| token: ${{ secrets.PAT_TOKEN }} | |
| token: ${{ secrets.GITHUB_TOKEN }} |
| build-helm-doc: | ||
| permissions: | ||
| contents: write # for peter-evans/create-pull-request to create branch | ||
| pull-requests: write # for peter-evans/create-pull-request to create a PR | ||
| name: Update Helm Doc |
There was a problem hiding this comment.
Job id/name appear copy-pasted from the Helm docs workflow (build-helm-doc, Update Helm Doc) and don’t match what this workflow does. Renaming them to reflect CRD schema generation will make the Actions UI and logs less confusing.
| build-helm-doc: | |
| permissions: | |
| contents: write # for peter-evans/create-pull-request to create branch | |
| pull-requests: write # for peter-evans/create-pull-request to create a PR | |
| name: Update Helm Doc | |
| build-crd-schemas: | |
| permissions: | |
| contents: write # for peter-evans/create-pull-request to create branch | |
| pull-requests: write # for peter-evans/create-pull-request to create a PR | |
| name: Update CRD Schemas |
| - name: generate schemas for Kedify CRDs | ||
| run: | | ||
| python3 -m pip install -r .github/scripts/requirements.txt | ||
| CONVERSION_SCRIPT="${PWD}/.github/scripts/openapi2jsonschema.py" | ||
| pushd ./crd-schemas | ||
| for file in ../kedify-agent/files/* ; do | ||
| if [[ -f "$file" ]]; then | ||
| ${CONVERSION_SCRIPT} $file | ||
| fi | ||
| done |
There was a problem hiding this comment.
openapi2jsonschema.py only adds additionalProperties: false at the root when DENY_ROOT_ADDITIONAL_PROPERTIES is set. In this workflow that env var isn’t set, so kubeconform will allow unknown top-level fields in CR instances (e.g., typos next to spec/status). If strict validation is desired, export DENY_ROOT_ADDITIONAL_PROPERTIES=1 for the generation step (or change the script default).
| import ssl | ||
| ssl._create_default_https_context = ssl._create_unverified_context |
There was a problem hiding this comment.
The script supports disabling TLS certificate validation via DISABLE_SSL_CERT_VALIDATION, which makes URL-based schema generation vulnerable to MITM and can hide configuration mistakes. Consider removing this option, or at least scoping it to an explicit CLI flag and printing a prominent warning when it’s enabled.
| import ssl | |
| ssl._create_default_https_context = ssl._create_unverified_context | |
| print( | |
| "WARNING: DISABLE_SSL_CERT_VALIDATION is set but is no longer supported; " | |
| "TLS certificate validation remains enabled.", | |
| file=sys.stderr, | |
| ) |
| pyyaml | ||
| pytest |
There was a problem hiding this comment.
Python dependencies are unpinned, which can make the workflow non-reproducible and introduces avoidable supply-chain drift over time. Consider pinning versions (or at least upper-bounding) for pyyaml/pytest, and installing only what the workflow needs if tests aren’t being run here.
| pyyaml | |
| pytest | |
| pyyaml==6.0.2 | |
| pytest==8.3.3 |
| "description": "Target is the reference to a workload whose pods should be modified\nif target is specified, don't use the selector", | ||
| "properties": { | ||
| "kind": { | ||
| "description": "Kind specifies the target workload whose containers will be obtaining the resource updates\nCaveat: if set to 'scaledobject' the ScaledObject has to have its\n '.spec.scaleTargetRef.kind' equal to 'deployment' or 'statefulset' so that\n we can reach to its pods transitively.DeepCopy In other words not every\n scalable CR will work here. For MachindeDeployments this feature doesn't\n make sense.", |
There was a problem hiding this comment.
Typo in the generated user-facing description: MachindeDeployments should be MachineDeployments. Since these schemas are generated, the fix likely belongs in the source CRD/OpenAPI description so future regenerations don’t reintroduce it.
| "description": "Kind specifies the target workload whose containers will be obtaining the resource updates\nCaveat: if set to 'scaledobject' the ScaledObject has to have its\n '.spec.scaleTargetRef.kind' equal to 'deployment' or 'statefulset' so that\n we can reach to its pods transitively.DeepCopy In other words not every\n scalable CR will work here. For MachindeDeployments this feature doesn't\n make sense.", | |
| "description": "Kind specifies the target workload whose containers will be obtaining the resource updates\nCaveat: if set to 'scaledobject' the ScaledObject has to have its\n '.spec.scaleTargetRef.kind' equal to 'deployment' or 'statefulset' so that\n we can reach to its pods transitively.DeepCopy In other words not every\n scalable CR will work here. For MachineDeployments this feature doesn't\n make sense.", |
| "properties": { | ||
| "after": { | ||
| "default": "containerReady", | ||
| "description": "After specifies the event or the state pod/container needs to be in when we start counting (see Delay)\n\nAllowed values are:\n- 'containerReady': specifies whether the container is currently passing its\n readiness check. The value will change as readiness probes keep executing.\n If no readiness probes are specified, this field defaults to true once the\n container is fully started.\n field: pod.status.containerStatuses.ready\n time: pod.status.containerStatuses.state.running.startedAt\n\n- 'containerStarted': indicates whether the container has finished its\n postStart lifecycle hook and passed its startup probe. Initialized as\n false, becomes true after startupProbe is considered successful. Resets\n to false when the container is restarted, or if kubelet loses state\n temporarily. In both cases, startup probes will run again. Is always\n true when no startupProbe is defined and container is running and has\n passed the postStart lifecycle hook. The null value must be treated the\n same as false.\n\t field: pod.status.containerStatuses.started\n\t time: pod.status.containerStatuses.state.running.startedAt\n\n- 'podReady': means the pod is able to service requests and should be added\n to the load balancing pools of all matching services\n field: pod.status.conditions[?(.type=='Ready')].status\n time: pod.status.conditions[?(.type=='Ready')].lastTransitionTime\n\n- 'podScheduled': represents status of the scheduling process for this pod\n field: pod.status.conditions[?(.type=='PodScheduled')].status\n time: pod.status.conditions[?(.type=='PodScheduled')].lastTransitionTime\n\n- 'podRunning' means the pod has been bound to a node and all the\n containers have been started. At least one container is still running\n or is in the process of being restarted.\n field: pod.status.phase\n time: pod.status.startTime.time\n\n- 'deactivated' This is specific to KEDA and works only with\n .target.kind == scaledobject.\n It denotes a state of a ScaledObject whose triggers are not active. This\n is useful for throttling the CPU for a workloads that are not being\n actively used - stand-by mode.\n field: scaledobject.conditions[?(.type=='Active')].status\n time: scaledobject.conditions[?(.type=='Active')].lastTransitionTime\n when the aforementioned field goes from true (active) -> false (passive)\n\n- 'activated' Dual event to 'deactivated'.\n field: scaledobject.conditions[?(.type=='Active')].status\n time: time: scaledobject.conditions[?(.type=='Active')].lastTransitionTime\n when the aforementioned field goes from true (passive) -> false (active)", |
There was a problem hiding this comment.
The 'activated' trigger description appears inconsistent: it says the field goes from true (passive) -> false (active) and also contains time: time:. This is user-facing documentation and may mislead consumers; please correct it in the source CRD/OpenAPI description before regenerating schemas.
| "description": "After specifies the event or the state pod/container needs to be in when we start counting (see Delay)\n\nAllowed values are:\n- 'containerReady': specifies whether the container is currently passing its\n readiness check. The value will change as readiness probes keep executing.\n If no readiness probes are specified, this field defaults to true once the\n container is fully started.\n field: pod.status.containerStatuses.ready\n time: pod.status.containerStatuses.state.running.startedAt\n\n- 'containerStarted': indicates whether the container has finished its\n postStart lifecycle hook and passed its startup probe. Initialized as\n false, becomes true after startupProbe is considered successful. Resets\n to false when the container is restarted, or if kubelet loses state\n temporarily. In both cases, startup probes will run again. Is always\n true when no startupProbe is defined and container is running and has\n passed the postStart lifecycle hook. The null value must be treated the\n same as false.\n\t field: pod.status.containerStatuses.started\n\t time: pod.status.containerStatuses.state.running.startedAt\n\n- 'podReady': means the pod is able to service requests and should be added\n to the load balancing pools of all matching services\n field: pod.status.conditions[?(.type=='Ready')].status\n time: pod.status.conditions[?(.type=='Ready')].lastTransitionTime\n\n- 'podScheduled': represents status of the scheduling process for this pod\n field: pod.status.conditions[?(.type=='PodScheduled')].status\n time: pod.status.conditions[?(.type=='PodScheduled')].lastTransitionTime\n\n- 'podRunning' means the pod has been bound to a node and all the\n containers have been started. At least one container is still running\n or is in the process of being restarted.\n field: pod.status.phase\n time: pod.status.startTime.time\n\n- 'deactivated' This is specific to KEDA and works only with\n .target.kind == scaledobject.\n It denotes a state of a ScaledObject whose triggers are not active. This\n is useful for throttling the CPU for a workloads that are not being\n actively used - stand-by mode.\n field: scaledobject.conditions[?(.type=='Active')].status\n time: scaledobject.conditions[?(.type=='Active')].lastTransitionTime\n when the aforementioned field goes from true (active) -> false (passive)\n\n- 'activated' Dual event to 'deactivated'.\n field: scaledobject.conditions[?(.type=='Active')].status\n time: time: scaledobject.conditions[?(.type=='Active')].lastTransitionTime\n when the aforementioned field goes from true (passive) -> false (active)", | |
| "description": "After specifies the event or the state pod/container needs to be in when we start counting (see Delay)\n\nAllowed values are:\n- 'containerReady': specifies whether the container is currently passing its\n readiness check. The value will change as readiness probes keep executing.\n If no readiness probes are specified, this field defaults to true once the\n container is fully started.\n field: pod.status.containerStatuses.ready\n time: pod.status.containerStatuses.state.running.startedAt\n\n- 'containerStarted': indicates whether the container has finished its\n postStart lifecycle hook and passed its startup probe. Initialized as\n false, becomes true after startupProbe is considered successful. Resets\n to false when the container is restarted, or if kubelet loses state\n temporarily. In both cases, startup probes will run again. Is always\n true when no startupProbe is defined and container is running and has\n passed the postStart lifecycle hook. The null value must be treated the\n same as false.\n\t field: pod.status.containerStatuses.started\n\t time: pod.status.containerStatuses.state.running.startedAt\n\n- 'podReady': means the pod is able to service requests and should be added\n to the load balancing pools of all matching services\n field: pod.status.conditions[?(.type=='Ready')].status\n time: pod.status.conditions[?(.type=='Ready')].lastTransitionTime\n\n- 'podScheduled': represents status of the scheduling process for this pod\n field: pod.status.conditions[?(.type=='PodScheduled')].status\n time: pod.status.conditions[?(.type=='PodScheduled')].lastTransitionTime\n\n- 'podRunning' means the pod has been bound to a node and all the\n containers have been started. At least one container is still running\n or is in the process of being restarted.\n field: pod.status.phase\n time: pod.status.startTime.time\n\n- 'deactivated' This is specific to KEDA and works only with\n .target.kind == scaledobject.\n It denotes a state of a ScaledObject whose triggers are not active. This\n is useful for throttling the CPU for a workloads that are not being\n actively used - stand-by mode.\n field: scaledobject.conditions[?(.type=='Active')].status\n time: scaledobject.conditions[?(.type=='Active')].lastTransitionTime\n when the aforementioned field goes from true (active) -> false (passive)\n\n- 'activated' Dual event to 'deactivated'.\n field: scaledobject.conditions[?(.type=='Active')].status\n time: scaledobject.conditions[?(.type=='Active')].lastTransitionTime\n when the aforementioned field goes from false (passive) -> true (active)", |
Signed-off-by: Jirka Kremser <jiri.kremser@gmail.com>
Signed-off-by: Jirka Kremser <jiri.kremser@gmail.com>
then it can be used as:
the command can also accept an url or even a scheme where crd name and version are path of the url, like in here:
notes: