Fixes Docker Image Publishing #5379

Open
myurasov-nv wants to merge 15 commits into isaac-sim:develop from myurasov-nv:my-postmerge-ci-fix

@myurasov-nv
Member

@myurasov-nv myurasov-nv commented Apr 23, 2026

Fixes the Docker image publishing workflow, simplifies the tags on published images, replaces hardcoded Isaac Sim base image versions with values from config.yaml, and no longer fails silently when a variable does not exist.

New tagging scheme:
   - Every build:        $IMAGE:<full-sha>             (immutable, retained for re-testing)
   - Push to develop:    $IMAGE:latest-develop         (moves to newest develop build)
   - Push to release/X:  $IMAGE:latest-release-X       (moves to newest build on that release branch)
   - Push to main:       $IMAGE:latest                 (moves to newest main build)
                         $IMAGE:v<VERSION>             (from the VERSION file, e.g. v3.0.0)
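
For illustration, the scheme above could be expressed as a small shell function. This is only a sketch of the mapping, not the code in publish-images.yaml: the function name `compute_tags` and its argument order are hypothetical, and the choice to fail on an empty version (rather than skip the versioned tag) is an assumption.

```shell
# Print one tag per line for an image, branch, full commit SHA, and version.
# compute_tags and its arguments are illustrative, not taken from the workflow.
compute_tags() (
  image="$1"; branch="$2"; sha="$3"; version="$4"

  # Every build gets the immutable full-SHA tag.
  printf '%s:%s\n' "$image" "$sha"

  case "$branch" in
    develop)
      printf '%s:latest-develop\n' "$image"
      ;;
    release/*)
      # release/X -> latest-release-X
      printf '%s:latest-release-%s\n' "$image" "${branch#release/}"
      ;;
    main)
      printf '%s:latest\n' "$image"
      # Assumption: fail loudly instead of silently skipping the versioned tag.
      if [ -z "$version" ]; then
        echo "version is empty; cannot build the v<VERSION> tag" >&2
        exit 1
      fi
      printf '%s:v%s\n' "$image" "$version"
      ;;
  esac
)
```

Calling `compute_tags "$IMAGE" main "$GITHUB_SHA" 3.0.0` would print the sha tag, `latest`, and `v3.0.0`; any branch not listed gets only the sha tag.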

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@github-actions github-actions Bot added bug Something isn't working infrastructure labels Apr 23, 2026

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

🤖 Isaac Lab Review Bot

Summary

This PR fixes the Docker image publishing workflow by replacing the old postmerge-ci.yml with a new publish-images.yaml that uses a simpler branch-based tagging scheme. The core fix addresses illegal @sha256:... characters in Docker tags by no longer embedding the base image version in the output tag names. The new tagging scheme uses short SHA commits as immutable tags plus branch-specific moving tags (latest, latest-develop).

Architecture Impact

Self-contained CI/CD change. No impact on the Isaac Lab codebase itself. The change affects:

  • Docker image consumers who may rely on the old tag naming convention (<repo>-<branch>-<base-version>)
  • The new tags are simpler but represent a breaking change in tag naming for downstream automation

Implementation Verdict

Minor fixes needed

Test Coverage

CI workflow changes are not unit-testable in the traditional sense. The workflow should be validated by a successful run on the target branches. The PR appropriately cannot include traditional tests.

CI Status

No CI checks available yet — this is expected since the workflow itself is what's being fixed.

Findings

🟡 Warning: publish-images.yaml:94 — Short SHA collision risk without disambiguation

TAGS=("$IMAGE:$SHORT_SHA")

Using only a 7-character short SHA as an "immutable" tag creates collision risk at scale. While unlikely, if a collision occurs the tag would be silently overwritten. Consider prefixing with branch or using the full SHA for the immutable tag. The old workflow used $COMBINED_TAG-${GITHUB_SHA::7} which included branch context.

🟡 Warning: publish-images.yaml:106 — Missing VERSION file causes silent fallback, not failure

VERSION="$(tr -d '[:space:]' < VERSION)"
if [ -n "$VERSION" ]; then
  TAGS+=("$IMAGE:v$VERSION")
else
  echo "🟠 VERSION file is empty; skipping versioned tag"
fi

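If the VERSION file doesn't exist, the input redirection fails before tr even runs and VERSION ends up empty. Unless the script runs under set -e, execution then falls through to the "empty" branch and the only trace is a shell error in the log. On the main branch, publishing without a version tag may be unintended. Consider an explicit file existence check: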

if [ -f VERSION ]; then
  VERSION="$(tr -d '[:space:]' < VERSION)"
  ...
fi
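
A slightly fuller sketch of that check, treating both a missing and an empty file as errors. The helper name `read_version` is hypothetical, and whether the workflow should fail or merely skip the tag here is a judgment call for the author.

```shell
# Read a version string from a file, stripping all whitespace.
# Fails with a message if the file is missing or empty.
# read_version is an illustrative helper, not part of the workflow.
read_version() (
  file="${1:-VERSION}"
  if [ ! -f "$file" ]; then
    echo "version file '$file' not found" >&2
    exit 1
  fi
  version="$(tr -d '[:space:]' < "$file")"
  if [ -z "$version" ]; then
    echo "version file '$file' is empty" >&2
    exit 1
  fi
  printf '%s\n' "$version"
)
```

A caller could then write `VERSION="$(read_version)" || exit 1`, so the failure stops the step even when set -e is not in effect.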

🟡 Warning: publish-images.yaml:113-115 — release/** branches get no semantic tag

*)
  # Other tracked branches (release/**, ...) only get the immutable sha tag.
  echo "Branch '$BRANCH_NAME' has no moving tag; only $IMAGE:$SHORT_SHA will be published."

Release branches (e.g., release/1.2.3) only get a short SHA tag, making it difficult to identify release candidate images. Consider adding a $IMAGE:$BRANCH_NAME tag (sanitized) or $IMAGE:rc-<version> pattern for release branches.
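
One way to sanitize a branch name into a tag, sketched under the Docker tag rules (letters, digits, underscores, periods, and hyphens only; no leading period or hyphen; at most 128 characters). The function name `sanitize_tag` is hypothetical.

```shell
# Turn an arbitrary branch name into a string that is valid as a Docker tag.
# sanitize_tag is an illustrative helper, not part of the workflow.
sanitize_tag() (
  # Replace every character outside [A-Za-z0-9_.-] with '-'.
  tag="$(printf '%s' "$1" | tr -c 'A-Za-z0-9_.-' '-')"

  # A tag may not start with '.' or '-', so drop such leading characters.
  while :; do
    case "$tag" in
      [.-]*) tag="${tag#?}" ;;
      *) break ;;
    esac
  done

  # Tags are limited to 128 characters.
  tag="$(printf '%s' "$tag" | cut -c1-128)"

  if [ -z "$tag" ]; then
    echo "branch name '$1' yields an empty tag" >&2
    exit 1
  fi
  printf '%s\n' "$tag"
)
```

With this, `sanitize_tag release/1.2.3` prints `release-1.2.3`, which could then be appended to the tag list as `$IMAGE:release-1.2.3`.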

🔵 Improvement: publish-images.yaml:65-70 — NGC login proceeds silently on failure

docker login -u \$oauthtoken -p ${{ env.NGC_API_KEY }} nvcr.io
echo "🟢 Successfully logged into NGC registry"

The login command's exit status isn't checked. If login fails, the workflow continues and will fail later at push time with a confusing error. Add set -e at the script start or explicit error checking:

if ! docker login -u \$oauthtoken -p ${{ env.NGC_API_KEY }} nvcr.io; then
  echo "🔴 Failed to log into NGC registry"
  exit 1
fi
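
A variant of the same idea that also keeps the key off the command line by using `docker login --password-stdin`. This is a sketch: the wrapper name `ngc_login` is hypothetical, and it assumes the key is exported to the step as the environment variable NGC_API_KEY.

```shell
# Log into nvcr.io and stop with a clear message if anything is wrong.
# ngc_login is an illustrative wrapper; NGC_API_KEY is assumed to be exported.
ngc_login() {
  if [ -z "${NGC_API_KEY:-}" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  # Single quotes keep the literal user name $oauthtoken from being expanded.
  if ! printf '%s' "$NGC_API_KEY" |
      docker login -u '$oauthtoken' --password-stdin nvcr.io; then
    echo "Failed to log into NGC registry" >&2
    return 1
  fi
  echo "Successfully logged into NGC registry"
}
```

Piping the key through stdin avoids exposing it in the process list, and the explicit checks mean a bad or missing key fails at the login step instead of at push time.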

🔵 Improvement: publish-images.yaml:155-164 — Build failure won't surface which tag failed

The docker buildx build ... --push command pushes all tags atomically, so partial failure isn't an issue. However, if the build fails, there's no explicit error message before the script ends. Consider adding:

if ! docker buildx build ...; then
  echo "🔴 Docker build failed"
  exit 1
fi

🔵 Improvement: publish-images.yaml:27 — Hardcoded internal registry may cause issues

ISAACSIM_BASE_IMAGE: 'nvcr.io/nvidian/isaac-sim' # ${{ vars.ISAACSIM_BASE_IMAGE || 'nvcr.io/nvidia/isaac-sim' }}

The commented-out variable reference suggests this should be configurable but is hardcoded to an internal NVIDIA registry (nvidian). This is fine for internal use but the comment is misleading about the actual default behavior.

@myurasov-nv myurasov-nv force-pushed the my-postmerge-ci-fix branch from 039952a to 3348c1e Compare April 23, 2026 22:20
@myurasov-nv myurasov-nv marked this pull request as ready for review April 24, 2026 01:29
@greptile-apps
Contributor

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR replaces the broken postmerge-ci.yml with a new publish-images.yaml, introduces a centralized config.yaml to eliminate hardcoded Isaac Sim image references, and wires a new config job into all three CI workflows (build.yaml, daily-compatibility.yml, publish-images.yaml). The previously flagged stale env.ISAACSIM_BASE_IMAGE reference in daily-compatibility.yml is now fixed — both test-isaaclab-tasks-compat and test-general-compat correctly use needs.config.outputs.isaacsim_image_name.
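
As a rough illustration of what such a config job might do, the sketch below reads one key with yq and refuses to continue when the key is absent. It assumes the Go implementation of yq (v4), which prints `null` for a missing key; the helper name `config_value` is hypothetical and the actual job may be written differently.

```shell
# Read one top-level key from a YAML file and fail if it is missing or empty.
# config_value is an illustrative helper; assumes yq v4 (Go implementation).
config_value() (
  key="$1"; file="$2"
  value="$(yq ".${key}" "$file")" || exit 1
  if [ -z "$value" ] || [ "$value" = "null" ]; then
    echo "config key '$key' is missing or empty in $file" >&2
    exit 1
  fi
  printf '%s\n' "$value"
)

# Possible use inside a workflow step (assign first so a failure is not
# swallowed by the echo):
#   name="$(config_value isaacsim_image_name .github/workflows/config.yaml)" || exit 1
#   echo "isaacsim_image_name=$name" >> "$GITHUB_OUTPUT"
```

Downstream jobs would then consume the value as needs.config.outputs.isaacsim_image_name, as described above.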

Confidence Score: 5/5

Safe to merge — all logic is correct, the previously flagged stale reference is resolved, and only a minor style nit remains.

All P0/P1 concerns from prior review rounds are addressed. The remaining finding (unquoted --build-arg) is P2 style and does not affect runtime behavior for the current image name value.
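
To make the quoting nit concrete, here is a small sketch of why it is harmless today but fragile. The build-arg name ISAACSIM_BASE_IMAGE_ARG and the value containing a space are contrived for the demonstration; the real image name has no whitespace.

```shell
# Print how many arguments the callee actually receives.
count_args() { echo "$#"; }

# Contrived value containing a space; the real image name has none.
value='nvcr.io/nvidia/isaac-sim extra'

# Unquoted: word splitting turns the value into two arguments.
count_args --build-arg ISAACSIM_BASE_IMAGE_ARG=$value      # prints 3

# Quoted: the whole NAME=value pair stays one argument.
count_args --build-arg "ISAACSIM_BASE_IMAGE_ARG=$value"    # prints 2
```

Quoting costs nothing and keeps the command correct if the value ever gains whitespace or glob characters.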

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/publish-images.yaml | New workflow replacing postmerge-ci.yml; implements clean tagging scheme, explicit NGC login failure, and proper docker buildx error handling. Minor: one unquoted --build-arg. |
| .github/workflows/config.yaml | New centralized config file; cleanly consolidates IsaacSim and IsaacLab image references previously duplicated across workflows. |
| .github/workflows/build.yaml | Removes hardcoded ISAACSIM_BASE_IMAGE/VERSION env vars and replaces with a config job; all test jobs now depend on config and use its outputs consistently. |
| .github/workflows/daily-compatibility.yml | Removes hardcoded ISAACSIM_BASE_IMAGE; adds config job; both test jobs now correctly reference needs.config.outputs.isaacsim_image_name — the previously flagged stale reference is resolved. |
| .github/workflows/postmerge-ci.yml | Deleted and superseded by publish-images.yaml; the new file is cleaner and fixes the previously broken tagging/publishing logic. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Push to main / develop / release/**] --> B[config job\nreads config.yaml via yq]
    B --> C[build-and-push-images job]
    C --> D{Branch?}
    D -- develop --> E["$IMAGE:latest-develop\n$IMAGE:<sha>"]
    D -- main --> F["$IMAGE:latest\n$IMAGE:v<VERSION>\n$IMAGE:<sha>"]
    D -- release/X --> G["$IMAGE:latest-release-X\n$IMAGE:<sha>"]
    D -- other --> H["$IMAGE:<sha> only"]
    E & F & G & H --> I[docker buildx build --push\nmultiarch if base supports it]

    subgraph config.yaml
        J[isaacsim_image_name]
        K[isaacsim_image_tag]
        L[isaaclab_image_name]
    end
    config.yaml --> B

Reviews (3): Last reviewed commit: "Rename config loading jobs"

Comment thread .github/workflows/publish-images.yaml
Comment thread .github/workflows/publish-images.yaml Outdated
@myurasov-nv
Member Author

@greptile-apps

@myurasov-nv myurasov-nv marked this pull request as draft April 24, 2026 02:12
@myurasov-nv myurasov-nv force-pushed the my-postmerge-ci-fix branch from e643455 to 2022578 Compare April 24, 2026 07:31
@myurasov-nv myurasov-nv force-pushed the my-postmerge-ci-fix branch from ecf4d7f to 44c3562 Compare April 24, 2026 07:51
@myurasov-nv myurasov-nv marked this pull request as ready for review April 24, 2026 09:09