Merged
6 changes: 3 additions & 3 deletions AGENTS.md
@@ -188,7 +188,7 @@ perfectly detailed crate that doesn't build at all.
### Validation Gate Ordering

The three-pass validation in `profiles/validator.py` has a strict dependency:
- **Base RO-Crate 1.1** must pass before ISA validation is meaningful
- **Base RO-Crate 1.2** must pass before ISA validation is meaningful
- **ISA Profile** must pass before ISA-Tox validation is meaningful
- **ISA-Tox Profile** depends on both lower layers
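
A minimal sketch of how a caller might act on this ordering, assuming each pass reports whether its REQUIRED checks passed. The function name `meaningful_layers` and the `(layer, passed_required)` tuples are hypothetical and not part of `profiles/validator.py`, which (per its docstring) runs all three passes regardless:

```python
def meaningful_layers(results: list[tuple[str, bool]]) -> list[str]:
    """Return the layers whose results can be trusted.

    `results` is ordered base -> ISA -> ISA-Tox; each entry is
    (layer name, whether its REQUIRED checks passed). Every layer up to and
    including the first failure is meaningful; later layers are not.
    """
    trusted: list[str] = []
    for layer, passed_required in results:
        trusted.append(layer)
        if not passed_required:
            break  # higher layers depend on this one, so stop here
    return trusted


# Hypothetical outcome: the base layer passes, the ISA layer fails.
ordered = [
    ("Base RO-Crate 1.2", True),
    ("ISA Profile", False),
    ("ISA-Tox Profile", True),
]
print(meaningful_layers(ordered))
```

In this reading, an ISA-Tox "pass" reported on top of a failing ISA layer is ignored until the lower layer is fixed.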

@@ -308,7 +308,7 @@ This project builds on the existing RO-Crate Python ecosystem rather than reinve
| Package | PyPI | What it provides | How we use it |
|---------|------|-----------------|---------------|
| [`ro-crate-py`](https://github.com/ResearchObject/ro-crate-py) | `uv add rocrate`<br>(import `rocrate`) | Official Python SDK for creating and manipulating RO-Crates. Provides `ROCrate`, `ContextEntity`, `File`, and other base entity classes. | The entity model classes in `profiles/models/isa.py` and `profiles/models/tox.py` subclass `rocrate.model.ContextEntity` and `rocrate.model.File`. The builder uses `ROCrate` to assemble the crate and serialise `ro-crate-metadata.json`. |
| [`rocrate-validator`](https://github.com/crs4/rocrate-validator) | `uv add roc-validator`<br>(import `rocrate_validator`) | Official SHACL-based validation library. Supports multi-profile validation (base RO-Crate → ISA → domain extensions) with severity levels. | `profiles/validator.py` wraps this in three passes (RO-Crate 1.1, ISA, ISA-Tox), suppressing inherited-profile duplicates so each pass reports only its own layer. |
| [`rocrate-validator`](https://github.com/crs4/rocrate-validator) | `uv add roc-validator`<br>(import `rocrate_validator`) | Official SHACL-based validation library. Supports multi-profile validation (base RO-Crate → ISA → domain extensions) with severity levels. | `profiles/validator.py` wraps this in three passes (RO-Crate 1.2, ISA, ISA-Tox), suppressing inherited-profile duplicates so each pass reports only its own layer. |
| [`rocrate-wizard`](https://github.com/ResearchObject/rocrate-wizard) *(external frontend)* | TBD | Frontend/UI layer that uses this backend (vitro-crate) to provide a user-facing RO-Crate builder. | This repo is the dependency — `rocrate-wizard` imports from `vitro-crate` and adds the web UI/CLI on top. Referenced in the ARC template's conversion workflow. |

These packages are imported directly — we do not fork or vendor them. Version requirements are declared in `pyproject.toml`.
@@ -589,7 +589,7 @@ Every tool call and graph node execution is automatically timed and recorded by

| Layer | Severity | Meaning | Agent Action |
|-------|----------|---------|--------------|
| Base RO-Crate 1.1 | REQUIRED | Structural validity | MUST fix before proceeding |
| Base RO-Crate 1.2 | REQUIRED | Structural validity | MUST fix before proceeding |
| ISA Profile | REQUIRED | ISA conformance | MUST fix |
| ISA Profile | SHOULD | Recommended metadata | Fix if data available |
| ISA Profile | MAY | Optional metadata | Note for user |
18 changes: 8 additions & 10 deletions builder/tools/_crate_mapping.py
@@ -36,7 +36,7 @@
LabProcessExposure,
)

ROCRATE_SPEC = "https://w3id.org/ro/crate/1.1"
ROCRATE_SPEC = "https://w3id.org/ro/crate/1.2"
# The ISA layer the tox profile actually extends (profiles/shapes/tox/profile.ttl
# prof:isProfileOf) and that resolves — the w3id ISA permalink is not yet live.
PROFILE_ISA = "https://github.com/nfdi4plants/isa-ro-crate-profile"
@@ -492,14 +492,10 @@ def _populate_root_and_conformance(state: CrateState, crate: ROCrate) -> None:
# reserved for the single base-spec URI, while the profiles the crate targets
# are declared on the Root Data Entity (./) — Issue #91.
#
# The base spec stays pinned to 1.1 (not 1.2) deliberately: roc-validator
# 0.10.0 bundles no ro-crate-1.2 base profile, and its base pass hard-requires
# the 1.1 URI on the descriptor (profiles/ro-crate/must/1_file-descriptor_
# metadata.ttl: `sh:hasValue <https://w3id.org/ro/crate/1.1>`). Declaring 1.2
# there fails REQUIRED validation, which build_and_validate (#87) and the
# golden fixtures (#97) rely on staying green. ro-crate-py 0.15 still emits a
# 1.2 @context; fully unifying the version on 1.2 is deferred until an upstream
# validator ships a 1.2 base profile (tracked on #91).
# The base spec is now 1.2 (ROCRATE_SPEC). The #105 deferral to 1.1 is lifted:
# roc-validator 0.11.0 ships a ro-crate-1.2 base profile
# (crs4/rocrate-validator#164), so the base pass validates against 1.2 and
# build_and_validate (#87) + the golden fixtures (#97) stay green — Issue #110.
crate.metadata["conformsTo"] = {"@id": ROCRATE_SPEC}

# Profiles the crate TARGETS, declared on ./ unconditionally — the three-layer
@@ -897,7 +893,9 @@ def _build_condition_table_schema(
title = col["titles"]
props: dict[str, Any] = {"@type": "csvw:Column", **col}
if value_urls.get(title):
props["valueUrl"] = value_urls[title]
# Emit valueUrl as an {@id} reference (not a bare string): RO-Crate 1.2
# REQUIRES entity links be reference objects, and flags string @ids.
props["valueUrl"] = {"@id": value_urls[title]}
column = crate.add(
ContextEntity(crate, f"#{exp_slug}_col_{title}", properties=props)
)
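
The `valueUrl` change in this hunk can be illustrated in isolation. The sketch below assumes a hypothetical helper `as_reference` and a made-up column and IRI; it only shows the shape of the emitted JSON-LD and is not code from `_crate_mapping.py`:

```python
from typing import Any


def as_reference(value: Any) -> Any:
    """Wrap a bare IRI string as a JSON-LD reference object.

    Values that are already reference objects (or anything non-string)
    are returned unchanged.
    """
    if isinstance(value, str):
        return {"@id": value}
    return value


# Hypothetical column properties; the IRI is a placeholder.
column = {
    "@type": "csvw:Column",
    "titles": "dose",
    "valueUrl": "http://example.org/ontology/dose",
}
column["valueUrl"] = as_reference(column["valueUrl"])
print(column["valueUrl"])
```

Applying the wrapper a second time leaves the value unchanged, so it is safe to call on properties that may already hold references.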
2 changes: 1 addition & 1 deletion builder/tools/validation.py
@@ -132,7 +132,7 @@ def validate(state: CrateState, crate_path: str) -> ValidationReport:
"name": "Add a `name` to `{entity}`.",
"description": "Add a `description` to `{entity}`.",
"identifier": "Add an `identifier` to `{entity}`.",
"conformsTo": "Add `conformsTo` to `{entity}` referencing the RO-Crate 1.1 spec.",
"conformsTo": "Add `conformsTo` to `{entity}` referencing the RO-Crate 1.2 spec.",
"license": "Add a `license` to `{entity}`.",
"author": "Add an `author` to `{entity}`.",
"datePublished": "Add a `datePublished` to `{entity}`.",
2 changes: 1 addition & 1 deletion builder/writers/maturity_report.py
@@ -136,7 +136,7 @@ def build_maturity_html(
# --- Profile adherence (from existing validation results) ---
if _validation_has_signal(val):
layers = [
("RO-Crate 1.1", val.base_passed),
("RO-Crate 1.2", val.base_passed),
("ISA", val.isa_passed),
("ISA-Tox", val.tox_passed),
]
34 changes: 21 additions & 13 deletions profiles/docs/isa_tox.md
@@ -110,20 +110,30 @@ Protocol --reagent---> ont

## Conformance

A crate following this profile declares conformance using the **RO-Crate 1.1** convention: the
*RO-Crate Metadata Descriptor* carries the base specification and the profile URIs together in its `conformsTo`
array, and each referenced profile is declared as a `Profile` contextual entity.
A crate following this profile declares conformance using the **RO-Crate 1.2** convention: the
*RO-Crate Metadata Descriptor*'s `conformsTo` carries only the single base-specification URI, while the
profile URIs the crate targets are declared on the **Root Data Entity** (`./`). Each referenced profile is
also declared as a `Profile` contextual entity.

```json
{
"@id": "ro-crate-metadata.json",
"@type": "CreativeWork",
"conformsTo": {"@id": "https://w3id.org/ro/crate/1.2"},
"about": {"@id": "./"}
}
```

The Root Data Entity declares the targeted profiles:

```json
{
"@id": "./",
"@type": "Dataset",
"conformsTo": [
{"@id": "https://w3id.org/ro/crate/1.1"},
{"@id": "https://github.com/nfdi4plants/isa-ro-crate-profile"},
{"@id": "https://w3id.org/ro/crate/isa-tox/1.0"}
],
"about": {"@id": "./"}
]
}
```

@@ -145,13 +155,11 @@ permalink is not yet registered. The referenced profiles are declared as context
}
```

> **Note — RO-Crate 1.1 vs 1.2 placement.** Under RO-Crate **1.1**, profile conformance is declared on the *Metadata
> Descriptor*, whose `conformsTo` MAY be an array of the base specification plus profile URIs. RO-Crate **1.2** instead
> recommends declaring profiles on the *Root Data Entity* (`./`) and reserving the descriptor's `conformsTo` for a single
> base-specification value. This profile follows the **1.1** placement because the current toolchain targets 1.1:
> `ro-crate-py` emits a 1.1 `@context`, and the `rocrate-validator` base profile requires
> `https://w3id.org/ro/crate/1.1` to be present in the descriptor's `conformsTo`. Move to the Root Data Entity placement
> when the toolchain validates against 1.2.
> **Note — RO-Crate 1.2 placement.** RO-Crate **1.2** recommends declaring the profiles a crate targets on the
> *Root Data Entity* (`./`) and reserving the *Metadata Descriptor*'s `conformsTo` for a single base-specification
> value — which this profile now follows (Issue #110). The earlier 1.1 placement (all URIs on the descriptor) was a
> temporary measure while the toolchain lagged: it was lifted once `rocrate-validator` 0.11.0 shipped a `ro-crate-1.2`
> base profile (crs4/rocrate-validator#164), so the base pass validates against 1.2.

A machine-readable [Profile Crate](https://www.researchobject.org/ro-crate/specification/1.2/profiles.html#profile-crate) bundling this
description with the SHACL shapes MAY additionally be published at the profile URI; this is planned but not yet provided.
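
The 1.2 placement described in the Conformance section can be checked mechanically. The following is a rough sketch, assuming the `@graph` has already been loaded as a list of dicts; the helper names `conforms_ids` and `follows_1_2_placement` are hypothetical and not part of this repository:

```python
BASE_SPEC = "https://w3id.org/ro/crate/1.2"
PROFILES = [
    "https://github.com/nfdi4plants/isa-ro-crate-profile",
    "https://w3id.org/ro/crate/isa-tox/1.0",
]


def conforms_ids(value) -> list[str]:
    """Normalise a conformsTo value (None, dict, or list of dicts) to @id strings."""
    if value is None:
        return []
    items = value if isinstance(value, list) else [value]
    return [item["@id"] for item in items]


def follows_1_2_placement(graph: list[dict]) -> bool:
    """True if the descriptor carries only the base spec and ./ lists the profiles."""
    by_id = {entity["@id"]: entity for entity in graph}
    descriptor = conforms_ids(by_id["ro-crate-metadata.json"].get("conformsTo"))
    root = conforms_ids(by_id["./"].get("conformsTo"))
    return descriptor == [BASE_SPEC] and all(p in root for p in PROFILES)


graph = [
    {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": BASE_SPEC},
        "about": {"@id": "./"},
    },
    {
        "@id": "./",
        "@type": "Dataset",
        "conformsTo": [{"@id": p} for p in PROFILES],
    },
]
print(follows_1_2_placement(graph))
```

A crate that still uses the earlier 1.1 placement (all URIs on the descriptor) would fail this check because the descriptor's `conformsTo` would list more than the base specification.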
30 changes: 21 additions & 9 deletions profiles/validator.py
@@ -3,7 +3,7 @@
validation and returns structured, plain-English results.

Passes:
1. Base RO-Crate 1.1 -> bundled ``ro-crate`` profile
1. Base RO-Crate 1.2 -> bundled ``ro-crate`` profile
2. ISA RO-Crate -> bundled ``isa-ro-crate`` profile (roc-validator >=0.10)
3. ISA-Tox RO-Crate -> our ``tox-ro-crate`` profile, loaded from SHAPES_DIR
via ``extra_profiles_path`` and composed on top of the
@@ -334,7 +334,7 @@ class ValidationResult:
# three passes in validate_crate(); the only difference is the document is fed as
# a dict via services.validate_metadata_as_dict instead of read from disk.
_PROFILE_PASSES: dict[str, tuple[str, dict]] = {
"base": ("ro-crate-1.1", {}),
"base": ("ro-crate-1.2", {}),
"isa": ("isa-ro-crate", {"disable_inherited_profiles_issue_reporting": True}),
"tox": (
"tox-ro-crate",
@@ -346,6 +346,12 @@ class ValidationResult:
),
}

# Checks that validate the on-disk metadata FILE itself (e.g. its byte encoding)
# and therefore cannot be evaluated on the in-memory dict path — they false-positive
# on validate_crate_dict where no file exists. Dropped from the dict path only; the
# on-disk validate_crate still enforces them. (ro-crate-1.2_3.1 = descriptor UTF-8.)
_DICT_PATH_NA_CHECKS = frozenset({"ro-crate-1.2_3.1"})

# Severity name <-> roc-validator Severity enum.
_SEVERITY_BY_NAME = {
"required": models.Severity.REQUIRED,
@@ -453,13 +459,19 @@ def validate_crate_dict(
**extra,
)
result = services.validate_metadata_as_dict(metadata_doc, settings)
issues = [_routable_issue(i, key) for i in result.get_issues()]
# Drop file-only checks that can't apply to an in-memory document (see
# _DICT_PATH_NA_CHECKS), then derive pass/fail from the filtered issues.
issues = [
ri
for i in result.get_issues()
if (ri := _routable_issue(i, key)).check_id not in _DICT_PATH_NA_CHECKS
]
_raise_on_transport_failure(issues, profile=key)
results.append(
DictValidationResult(
profile=key,
passed=not result.has_issues(),
passed_required=not result.has_issues(min_severity=models.Severity.REQUIRED),
passed=not issues,
passed_required=not any(i.severity == "required" for i in issues),
issues=issues,
)
)
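
The effect of this hunk, dropping file-only checks and then deriving pass/fail from what remains, can be sketched without the validator library. The `Issue` dataclass and `summarise` function below are hypothetical stand-ins for the routable issue objects and the inline logic in `validate_crate_dict`; only the check ID `ro-crate-1.2_3.1` comes from the diff:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    """Hypothetical stand-in for a routable validation issue."""

    check_id: str
    severity: str  # e.g. "required" or "recommended"


# Checks that only make sense for an on-disk metadata file.
DICT_PATH_NA_CHECKS = frozenset({"ro-crate-1.2_3.1"})


def summarise(issues: list[Issue]) -> tuple[list[Issue], bool, bool]:
    """Filter out non-applicable checks, then compute pass flags from the rest."""
    kept = [i for i in issues if i.check_id not in DICT_PATH_NA_CHECKS]
    passed = not kept
    passed_required = not any(i.severity == "required" for i in kept)
    return kept, passed, passed_required


# Hypothetical input: one file-only REQUIRED issue plus one recommendation.
raw = [
    Issue("ro-crate-1.2_3.1", "required"),
    Issue("example-check", "recommended"),
]
kept, passed, passed_required = summarise(raw)
print(kept, passed, passed_required)
```

Computing the flags from the filtered list rather than from the raw result is the point of the change: otherwise a dropped file-only issue would still mark the in-memory pass as failed.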
@@ -509,7 +521,7 @@ def validate_crate(crate_dir: Path) -> list[ValidationResult]:
"""Run all three validation passes against crate_dir.

Returns one ValidationResult per pass in order:
1. Base RO-Crate 1.1
1. Base RO-Crate 1.2
2. ISA RO-Crate Profile
3. ISA-Tox RO-Crate Profile
"""
@@ -518,14 +530,14 @@ def validate_crate(crate_dir: Path) -> list[ValidationResult]:
# --- Pass 1: base RO-Crate 1.2 ---
settings = services.ValidationSettings(
rocrate_uri=crate_dir, # ty: ignore[unknown-argument]
profile_identifier="ro-crate-1.1",
profile_identifier="ro-crate-1.2",
requirement_severity=models.Severity.OPTIONAL,
)
result = services.validate(settings)
_raise_on_transport_failure_result(result, profile="Base RO-Crate 1.1")
_raise_on_transport_failure_result(result, profile="Base RO-Crate 1.2")
results.append(
ValidationResult(
profile="Base RO-Crate 1.1",
profile="Base RO-Crate 1.2",
passed=not result.has_issues(),
issues=_format_issues(result),
required_issues=_format_issues(result, models.Severity.REQUIRED),
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -6,7 +6,7 @@ readme = "README.md"
requires-python = ">=3.12"
dependencies = [
"rocrate>=0.15.0",
"roc-validator>=0.10.0",
"roc-validator>=0.11.0",
"requests>=2.31.0",
"pyyaml>=6.0",
"openpyxl>=3.1.0",
10 changes: 5 additions & 5 deletions tests/test_builder_domain.py
@@ -386,11 +386,11 @@ def test_profiles_declared_on_root_data_entity(self, tmp_path):
assert "Profile" in (pt if isinstance(pt, list) else [pt])

def test_descriptor_conformsto_is_base_spec_only(self, tmp_path):
# Issue #91: the metadata file descriptor's conformsTo is reserved for
# the single base-spec URI (profiles moved to ./). The base spec stays
# 1.1 because roc-validator 0.10.0 bundles no 1.2 base profile and its
# base pass requires the 1.1 URI on the descriptor (sh:hasValue).
# Issue #91/#110: the metadata file descriptor's conformsTo is reserved
# for the single base-spec URI (profiles moved to ./). The base spec is
# now 1.2 — roc-validator 0.11.0 ships a ro-crate-1.2 base profile
# (crs4/rocrate-validator#164), so the #105 deferral is lifted.
state = CrateState()
_, by_id = _build(state, tmp_path)
desc_conforms = _ids(by_id["ro-crate-metadata.json"].get("conformsTo"))
assert desc_conforms == ["https://w3id.org/ro/crate/1.1"]
assert desc_conforms == ["https://w3id.org/ro/crate/1.2"]
7 changes: 7 additions & 0 deletions tests/test_e2e_agent_eval.py
@@ -45,6 +45,13 @@

FIXTURE_INPUT_DIR = Path(__file__).parent / "fixtures" / "svhps21_input"

# This module is a heavy integration harness: each test drives the full scripted
# tool sequence plus real SHACL validation (build_and_validate + an on-disk
# round-trip validate). Under RO-Crate 1.2 the validator is slower (larger
# ontology / pyshacl ontology-mixing), so the global CI --timeout=30 is too tight
# for these on shared runners. Raise the per-test ceiling for this file only.
pytestmark = pytest.mark.timeout(120)


@pytest.fixture(autouse=True)
def _no_network(monkeypatch):
13 changes: 5 additions & 8 deletions tests/test_offline_validation.py
@@ -48,7 +48,7 @@ def _base_valid_doc() -> dict:
crate.root_dataset["name"] = "Test"
crate.root_dataset["description"] = "Test crate"
crate.root_dataset["license"] = "ALL RIGHTS RESERVED BY THE AUTHORS"
crate.metadata["conformsTo"] = {"@id": "https://w3id.org/ro/crate/1.1"}
crate.metadata["conformsTo"] = {"@id": "https://w3id.org/ro/crate/1.2"}
return crate.metadata.generate()


@@ -131,19 +131,16 @@ def test_base_pass_green_with_network_down(self):
assert attempted == [], attempted

def test_no_remote_context_check_failures_with_network_down(self):
"""Checks ``ro-crate-1.1_2.1`` / ``2.2`` (the ones that flaked) stay clean."""
"""The base pass stays clean with the network down — the remote-context
resolution checks (e.g. ro-crate-1.2_2.*) are served from the bundled
context, so they don't spuriously fail (the regression that flaked CI)."""
from profiles.validator import validate_crate_dict

with _network_down():
results = validate_crate_dict(_base_valid_doc(), profile="base")

base = next(r for r in results if r.profile == "base")
offending = [
i
for i in base.issues
if (i.check_id or "").startswith(("ro-crate-1.1_2.1", "ro-crate-1.1_2.2"))
]
assert offending == [], [(i.check_id, i.message) for i in offending]
assert base.passed_required, [(i.check_id, i.message) for i in base.issues]


class TestTransportErrorNotReportedAsRequired: