Recommendation: Yes, it's feasible and beneficial to generate documentation directly from CRDs instead of maintaining separate YAML configuration files.
Benefits:
- Single source of truth (CRDs)
- No manual sync required
- Descriptions stay current with API changes
- Reduced maintenance burden
- Automatic consistency
Trade-offs:
- Less control over documentation formatting
- Need to enhance CRD descriptions for documentation quality
- Examples must be stored separately or embedded in CRDs
config/resources/*.yaml (Human-maintained)
↓
python/resources.py (Generation)
↓
input/resources/*.md (Generated docs)
YAML configs contain:
- Resource descriptions
- Property descriptions
- Examples
- Grouping (frequently-used, advanced)
- Cross-references (related_resources, related_concepts, links)
- Default values
- Updatable flags
- Platform-specific notes
CRDs contain:
- OpenAPI schema
- Property types
- Descriptions (NOW AVAILABLE!)
- Required fields
- Validation rules
- Default values (in some cases)
crds/*.yaml (Single source of truth)
↓
python/resources.py (Enhanced generation)
↓
input/resources/*.md (Generated docs)
What needs to be added to CRDs:
- Examples (as annotations or separate files)
- Documentation metadata (grouping, cross-references)
- Enhanced descriptions where needed
crds/*.yaml (Technical definitions + descriptions)
+
config/resources/metadata.yaml (Documentation metadata only)
↓
python/resources.py (Enhanced generation)
↓
input/resources/*.md (Generated docs)
Metadata file would contain:
- Examples
- Cross-references (related_resources, related_concepts)
- External links
- Property grouping (frequently-used, advanced)
- Overview text
- Resource-level descriptions:

      openAPIV3Schema:
        description: |-
          A site is a place on the network where application workloads are running. Sites are joined by links.

- Property descriptions:

      linkAccess:
        description: |-
          Configure external access for links from remote sites...

- Property types: string, boolean, integer, object, array
- Property formats: duration, date-time
- Required fields:

      required:
        - routingKey
        - port

- Validation constraints: enum, pattern, minimum, maximum
- Default values (in some properties)
- Nested object structures
- Examples: No standard place for usage examples
- Property grouping: No "frequently-used" vs "advanced" distinction
- Cross-references: No links to related resources/concepts
- External links: No references to external documentation
- Updatable flags: No indication if property can be changed after creation
- Platform-specific notes: No Kubernetes vs Docker/Podman distinctions
- Choice descriptions: Enum values lack detailed descriptions
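The fields a CRD does provide can be read with plain dictionary access once the YAML is parsed. The following is a minimal sketch under that assumption; the helper name `summarize_properties` and the sample schema (including the `timeout` property) are illustrative and not taken from python/resources.py or a real Skupper CRD.

```python
def summarize_properties(object_schema):
    """Return one summary dict per property of an OpenAPI object schema."""
    required = set(object_schema.get("required", []))
    summaries = []
    for name, prop in object_schema.get("properties", {}).items():
        summaries.append({
            "name": name,
            "type": prop.get("type"),
            "format": prop.get("format"),
            "required": name in required,
            "default": prop.get("default"),
            "enum": prop.get("enum"),
            "description": prop.get("description"),
        })
    return summaries


# Hypothetical schema fragment, shaped like the "spec" part of a CRD.
sample_spec_schema = {
    "type": "object",
    "required": ["routingKey", "port"],
    "properties": {
        "routingKey": {"type": "string", "description": "The routing key."},
        "port": {"type": "integer"},
        "timeout": {"type": "string", "format": "duration"},
    },
}

summary = {item["name"]: item for item in summarize_properties(sample_spec_schema)}
```

Anything that comes back as `None` here (for example the missing description on `port`) is either a gap to fix in the CRD or data that has to come from somewhere else.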
Add documentation metadata as annotations:
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: sites.skupper.io
      annotations:
        skupper.io/examples: |
          - description: A minimal site
            yaml: |
              apiVersion: skupper.io/v2alpha1
              kind: Site
              ...
        skupper.io/related-resources: "link,listener,connector"
        skupper.io/related-concepts: "network,platform"
    spec:
      versions:
        - name: v2alpha1
          schema:
            openAPIV3Schema:
              properties:
                spec:
                  properties:
                    linkAccess:
                      description: ...
                      x-skupper-group: frequently-used
                      x-skupper-updatable: true
                      x-skupper-related-concepts: "link"

Pros:
- Keeps everything in CRD files
- Uses the OpenAPI extension convention (x- prefix)
- Single source of truth
Cons:
- CRD files become larger
- Non-standard use of annotations
- Harder to edit/maintain
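To make the embedded approach concrete, here is a rough sketch of how a generator might pull this metadata back out of a parsed CRD file. It assumes the annotation keys and `x-skupper-*` names shown above and that it reads the YAML file on disk rather than a CRD fetched from a cluster; the function name `read_embedded_metadata` and the sample data are hypothetical.

```python
PREFIX = "x-skupper-"


def read_embedded_metadata(crd, version_index=0):
    """Collect documentation metadata embedded in a parsed CRD dict."""
    annotations = crd.get("metadata", {}).get("annotations", {})

    def split_list(key):
        # Annotations are strings, so lists are stored comma-separated.
        value = annotations.get(key, "")
        return [item.strip() for item in value.split(",") if item.strip()]

    schema = crd["spec"]["versions"][version_index]["schema"]["openAPIV3Schema"]
    spec_props = schema.get("properties", {}).get("spec", {}).get("properties", {})

    properties = {}
    for name, prop in spec_props.items():
        extensions = {
            key[len(PREFIX):]: value
            for key, value in prop.items()
            if key.startswith(PREFIX)
        }
        if extensions:
            properties[name] = extensions

    return {
        "related_resources": split_list("skupper.io/related-resources"),
        "related_concepts": split_list("skupper.io/related-concepts"),
        "properties": properties,
    }


# Hypothetical, heavily trimmed CRD for illustration only.
sample_crd = {
    "metadata": {"annotations": {
        "skupper.io/related-resources": "link,listener,connector",
        "skupper.io/related-concepts": "network,platform",
    }},
    "spec": {"versions": [{"name": "v2alpha1", "schema": {"openAPIV3Schema": {
        "properties": {"spec": {"properties": {
            "linkAccess": {
                "description": "...",
                "x-skupper-group": "frequently-used",
                "x-skupper-updatable": True,
            },
        }}},
    }}}]},
}

embedded = read_embedded_metadata(sample_crd)
```

The string splitting and prefix stripping hint at the "harder to edit/maintain" cost: structured data has to be squeezed into annotation strings and custom schema keys.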
Keep CRDs clean, add minimal metadata files:
    # config/resources/metadata/site.yaml
    name: Site
    examples:
      - description: A minimal site
        yaml: |
          apiVersion: skupper.io/v2alpha1
          kind: Site
          ...
    related_resources: [link, listener, connector]
    related_concepts: [network, platform]
    links: [skupper/site-configuration]
    properties:
      linkAccess:
        group: frequently-used
        updatable: true
        related_concepts: [link]
        links: [skupper/site-linking]
      ha:
        updatable: true
        platforms: [Kubernetes]
        links: [skupper/high-availability]

Pros:
- CRDs stay clean and standard
- Easy to edit metadata
- Clear separation of concerns
- Smaller files
Cons:
- Two files to maintain (but much simpler than current YAML)
- Need to merge data during generation
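With two files per resource, the main risk is that a metadata file drifts away from the CRD, for example through a misspelled property name. A small consistency check can catch that at generation time. This is a sketch only: `check_metadata` is a hypothetical helper, and the allowed group names are taken from the groups mentioned in this document.

```python
ALLOWED_GROUPS = {"frequently-used", "advanced"}


def check_metadata(metadata, crd_property_names):
    """Return a list of problems found in one resource's metadata dict."""
    problems = []
    for name, prop_meta in metadata.get("properties", {}).items():
        if name not in crd_property_names:
            problems.append(f"{name}: not defined in the CRD")
        group = prop_meta.get("group")
        if group is not None and group not in ALLOWED_GROUPS:
            problems.append(f"{name}: unknown group {group!r}")
    return problems


# Hypothetical metadata with one typo and one invalid group.
sample_metadata = {
    "properties": {
        "linkAccess": {"group": "frequently-used"},
        "linkAcess": {"updatable": True},
        "ha": {"group": "common"},
    }
}

problems = check_metadata(sample_metadata, {"linkAccess", "ha"})
```

An empty result means every property in the metadata file maps onto a property the CRD actually defines.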
    crds/
      skupper_site_crd.yaml
      skupper_connector_crd.yaml
      ...
    config/resources/
      examples/
        site.yaml        # Just examples
        connector.yaml
      metadata/
        site.yaml        # Just metadata (grouping, links)
        connector.yaml

Pros:
- Very clean separation
- Examples easy to test
- Metadata minimal
Cons:
- Three places to look (CRD, examples, metadata)
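One way to read "examples easy to test": with examples in their own files, each one can be checked against the CRD schema. The sketch below only checks required `spec` fields and assumes both the example and the schema are already parsed into dicts; `missing_required_fields` and the sample schema are hypothetical, and a fuller check would use a proper schema validator.

```python
def missing_required_fields(example_resource, crd_schema):
    """List required spec fields that an example resource does not set."""
    spec_schema = crd_schema.get("properties", {}).get("spec", {})
    spec = example_resource.get("spec") or {}
    return [name for name in spec_schema.get("required", []) if name not in spec]


# Hypothetical schema fragment; the required list mirrors the snippet shown earlier.
sample_schema = {
    "properties": {
        "spec": {
            "type": "object",
            "required": ["routingKey", "port"],
            "properties": {
                "routingKey": {"type": "string"},
                "port": {"type": "integer"},
            },
        }
    }
}

# Hypothetical example resource that forgets to set "port".
incomplete_example = {
    "apiVersion": "skupper.io/v2alpha1",
    "kind": "Connector",
    "spec": {"routingKey": "backend"},
}

missing = missing_required_fields(incomplete_example, sample_schema)
```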
- Review all CRD descriptions for completeness
- Add missing descriptions
- Ensure descriptions are documentation-quality
- Add choice descriptions for enum values
- Create the config/resources/metadata/ directory
- For each resource, create a metadata file with:
  - Examples
  - Cross-references (related_resources, related_concepts)
  - External links
  - Property grouping (frequently-used, advanced)
  - Updatable flags
  - Platform-specific notes
Example structure:
    # config/resources/metadata/site.yaml
    name: Site
    examples:
      - description: A minimal site
        yaml: |
          apiVersion: skupper.io/v2alpha1
          kind: Site
          metadata:
            name: east
            namespace: hello-world-east
    related_resources: [link]
    related_concepts: [network, platform]
    links: [skupper/site-configuration]
    properties:
      linkAccess:
        group: frequently-used
        updatable: true
        related_concepts: [link]
      ha:
        updatable: true
      defaultIssuer:
        group: advanced
        updatable: true
      edge:
        group: advanced
      serviceAccount:
        group: advanced
      settings:
        group: advanced

Update python/resources.py to:
- Load CRDs (already done)
- Load metadata files (new)
- Merge data:
  - Use CRD for: descriptions, types, required fields, validation
  - Use metadata for: examples, grouping, cross-references, links
- Generate markdown (similar to current process)
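As an illustration of the final step, the sketch below renders one merged property as a Markdown list item. The output layout is invented for this example and is not the format python/resources.py currently produces; `render_property` and the sample values are likewise hypothetical.

```python
def render_property(prop):
    """Render one merged property dict as a Markdown list item."""
    # "required" would come from the CRD, "updatable" from the metadata file.
    flags = [flag for flag in ("required", "updatable") if prop.get(flag)]
    heading = f"- **{prop['name']}** ({prop.get('type') or 'unknown type'})"
    if flags:
        heading += " [" + ", ".join(flags) + "]"
    description = prop.get("description") or "No description in CRD."
    return f"{heading}\n  {description}"


# Hypothetical merged property; the type and description are placeholders.
rendered = render_property({
    "name": "linkAccess",
    "type": "string",
    "updatable": True,
    "description": "Configure external access for links from remote sites.",
})
```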
Key changes needed:
    # Model, ModelObject, Property, list_dir, read_yaml, and join are assumed
    # to be in scope already in python/resources.py.

    class ResourceModel(Model):
        def __init__(self):
            super().__init__(Resource, "config/resources/metadata")  # Changed path

            # Load CRDs (already exists)
            self.crds_by_name = dict()
            for crd_file in list_dir("crds"):
                ...  # existing code

            # Load metadata files (new); each file must define a top-level "name"
            self.metadata_by_name = dict()
            for metadata_file in list_dir("config/resources/metadata"):
                data = read_yaml(join("config/resources/metadata", metadata_file))
                self.metadata_by_name[data["name"]] = data

    class Resource(ModelObject):
        def __init__(self, model, crd_data, metadata_data):
            # Merge CRD schema with metadata
            schema = crd_data["spec"]["versions"][0]["schema"]["openAPIV3Schema"]
            self.name = crd_data["spec"]["names"]["kind"]
            self.description = schema.get("description")
            self.examples = metadata_data.get("examples", [])
            self.related_resources = metadata_data.get("related_resources", [])
            # ... etc

            # Extract properties from CRD schema
            self.spec_properties = []
            for prop_name, prop_schema in schema["properties"]["spec"]["properties"].items():
                prop_metadata = metadata_data.get("properties", {}).get(prop_name, {})
                prop = Property(
                    name=prop_name,
                    type=prop_schema.get("type"),
                    description=prop_schema.get("description"),
                    group=prop_metadata.get("group"),
                    updatable=prop_metadata.get("updatable"),
                    # ... merge CRD and metadata
                )
                self.spec_properties.append(prop)

- Create metadata files from existing YAML configs (extract non-CRD data)
- Test generation with new code
- Compare output with current generated docs
- Iterate until output matches or improves
- Remove old YAML configs once satisfied
- Update ./plano update_crds to also validate metadata files
- Update documentation (resources.md)
- Update contributor guidelines
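The "compare output" step in the list above can be scripted with the standard library. The following sketch assumes the old and new Markdown have already been read into strings; `diff_generated` and the sample text are illustrative.

```python
import difflib


def diff_generated(old_text, new_text, name="resource.md"):
    """Return unified-diff lines between old and new generated Markdown."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=f"old/{name}",
        tofile=f"new/{name}",
        lineterm="",
    ))


# Hypothetical before/after output for one resource page.
old_doc = "# Site\n\nA site is a place on the network.\n"
new_doc = "# Site\n\nA site is a place on the network where application workloads are running.\n"

changes = diff_generated(old_doc, new_doc, "site.md")
unchanged = diff_generated(old_doc, old_doc, "site.md")
```

An empty list means the new pipeline reproduces the old page exactly; any remaining lines are the differences to review while iterating.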
Create a script to extract metadata from current YAML configs:
    # scripts/extract_metadata.py
    import yaml

    # Per-property keys that the CRD does not provide
    PROPERTY_KEYS = ["group", "updatable", "related_concepts", "related_resources", "links", "platforms"]

    def extract_metadata(yaml_config_path):
        """Extract non-CRD data from existing YAML config"""
        with open(yaml_config_path) as f:
            data = yaml.safe_load(f)

        metadata = {
            "examples": data.get("examples", []),
            "related_resources": data.get("related_resources", []),
            "related_concepts": data.get("related_concepts", []),
            "links": data.get("links", []),
            "properties": {}
        }

        # Extract property metadata
        for section in ["spec", "status"]:
            if section in data:
                for prop in data[section].get("properties", []):
                    # Keep only the keys that are present on this property
                    prop_meta = {key: prop[key] for key in PROPERTY_KEYS if key in prop}
                    if prop_meta:
                        metadata["properties"][prop["name"]] = prop_meta

        return metadata

- Single Source of Truth: CRDs are authoritative for schema and descriptions
- Reduced Duplication: No need to maintain descriptions in two places
- Automatic Sync: Descriptions update automatically when CRDs update
- Simpler Maintenance: Metadata files are much smaller than current YAML configs
- Better Consistency: Schema and docs always match
- Easier Updates: Update CRD descriptions in Skupper repo, they flow through automatically
Mitigation (CRD descriptions may fall short of documentation quality):
- Review and enhance CRD descriptions before migration
- Establish guidelines for CRD description quality
- Can still override in metadata if needed
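Part of the description review can be automated. The sketch below walks a parsed schema and reports properties whose description is missing or very short. The 20-character threshold and the name `find_weak_descriptions` are arbitrary choices for illustration, not an established guideline.

```python
def find_weak_descriptions(schema, path="", min_length=20):
    """Yield (property path, reason) for missing or very short descriptions."""
    for name, prop in schema.get("properties", {}).items():
        prop_path = f"{path}.{name}" if path else name
        description = (prop.get("description") or "").strip()
        if not description:
            yield prop_path, "missing description"
        elif len(description) < min_length:
            yield prop_path, "description is very short"
        # Recurse into nested objects, and into the item schema of arrays.
        nested = prop.get("items", prop) if prop.get("type") == "array" else prop
        yield from find_weak_descriptions(nested, prop_path, min_length)


# Hypothetical schema fragment for illustration.
sample_schema = {
    "properties": {
        "spec": {
            "type": "object",
            "description": "The desired configuration of the resource.",
            "properties": {
                "linkAccess": {"type": "string"},
                "ha": {"type": "boolean", "description": "HA mode"},
            },
        }
    }
}

findings = list(find_weak_descriptions(sample_schema))
```

A length check is only a rough proxy for quality, so the output is best treated as a list of candidates for human review.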
Mitigation (less control over documentation content):
- Metadata files provide override capability
- Can add supplementary text in metadata
- Examples remain fully controllable
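The override idea can be expressed as a simple precedence rule: use the metadata value when present, otherwise fall back to the CRD. In this sketch the metadata keys `description` and `notes` are assumptions made for illustration; the document does not define how overrides or supplementary text would be spelled.

```python
def resolve_description(crd_prop, prop_metadata):
    """Pick the description to publish for one property."""
    # A metadata "description" (hypothetical key) replaces the CRD text.
    base = prop_metadata.get("description") or crd_prop.get("description") or ""
    # Metadata "notes" (hypothetical key) are appended as a separate paragraph.
    notes = prop_metadata.get("notes")
    return f"{base}\n\n{notes}" if notes else base


crd_prop = {"description": "Configure external access for links from remote sites."}

from_crd = resolve_description(crd_prop, {})
overridden = resolve_description(crd_prop, {"description": "Controls how remote sites reach this site."})
with_notes = resolve_description(crd_prop, {"notes": "Kubernetes only."})
```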
Mitigation (migration could disrupt the existing docs):
- Phased migration approach
- Keep old system working during transition
- Extensive testing before cutover
- Phase 1 (Enhance CRDs): 2-4 hours (review and update descriptions)
- Phase 2 (Create metadata files): 4-6 hours (extract from existing YAML)
- Phase 3 (Modify generation code): 8-12 hours (rewrite resource loading/merging)
- Phase 4 (Migration/testing): 4-6 hours (validate output, fix issues)
- Phase 5 (Documentation): 2-3 hours (update resources.md)
Total: 20-31 hours
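The total is the sum of the low and high ends of the five phase estimates, which can be checked quickly:

```python
# (low, high) hour estimates per phase, copied from the list above.
phase_hours = {
    "Phase 1": (2, 4),
    "Phase 2": (4, 6),
    "Phase 3": (8, 12),
    "Phase 4": (4, 6),
    "Phase 5": (2, 3),
}

total_low = sum(low for low, _ in phase_hours.values())
total_high = sum(high for _, high in phase_hours.values())
```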
Proceed with Approach B (Separate Metadata Files):
- ✅ Keeps CRDs clean and standard
- ✅ Minimal metadata files (much simpler than current YAML)
- ✅ Clear separation: CRDs = schema/descriptions, Metadata = doc structure
- ✅ Easy to maintain and understand
- ✅ Preserves flexibility for documentation needs
Next Steps:
- Review CRD descriptions for quality
- Create migration script to extract metadata
- Implement enhanced generation code
- Test with one resource (e.g., Site)
- Migrate remaining resources
- Update documentation
This approach gives you the best of both worlds: authoritative descriptions from CRDs with minimal metadata for documentation structure.