Fixes compatibility issues with latest Isaac Sim updates#5358
Fixes compatibility issues with latest Isaac Sim updates#5358kellyguo11 wants to merge 33 commits intoisaac-sim:developfrom
Conversation
…broken links Combined fixes from PR isaac-sim#5279 on top of deprecation migration from PR isaac-sim#5280: 1. install.py: Add prebundle probe functions that check pip_prebundle paths instead of extsDeprecated, and uninstall pip torch when prebundle shadows it 2. isaaclab/__init__.py: Deprioritize ml_archive/pip_prebundle on sys.path so pip-installed torch/numpy/etc take priority over prebundled copies 3. setup.py: Pin mujoco==3.5.0 4. Kit files: Remove omni.warp.core to prevent conflict with pip warp-lang 5. check-links.yml: Add ubuntu.com to link checker exclusions (returns 403) 6. New tests: test_install_commands.py, test_install_prebundle.py
- Add isaacsim.pip.newton and omni.warp.core to app.extensions.excluded in all 6 kit files to prevent Kit from loading these extensions - Add isaacsim.pip.newton to _CONFLICTING_EXTS in __init__.py so its pip_prebundle paths are deprioritized on sys.path - isaacsim.pip.newton prebundles newton 1.0.0 which shadows the pip-installed newton 1.1.0.dev0 needed by Isaac Lab - omni.warp.core loads a bundled warp that conflicts with pip-installed warp-lang used by Isaac Lab
Remove the pinned SHA digest from ISAACSIM_BASE_VERSION so CI tests against the most recent Isaac Sim nightly image, which includes the pip_prebundle path changes that this PR fixes.
The previous approach filtered only omni.isaac.ml_archive and isaacsim.pip.newton prebundle paths. On newer Isaac Sim nightly images, additional or renamed extensions may also inject conflicting packages via pip_prebundle. Match on 'pip_prebundle' in the path alone to catch all current and future prebundle directories.
The renderer's render() method launches warp kernels that populate the camera output tensors on the GPU. Without an explicit synchronization barrier, downstream torch operations (e.g. torch.mean on camera data) can execute on a different CUDA stream before the warp writes complete, resulting in 'CUDA error: an illegal memory access was encountered'. This was exposed by newer Isaac Sim nightly images that changed warp's default CUDA stream behavior. Add wp.synchronize() in Camera's _update_buffers_impl after the render pass to guarantee kernel completion before any torch access.
The render() method launches a warp reshape kernel via wp.launch() then immediately performs torch operations (depth clipping, inf assignment) on the same output buffer. These run on different CUDA streams — warp's stream vs torch's default stream — with no synchronization, causing illegal memory access errors that corrupt the CUDA context. Add wp.synchronize() after wp.launch() and before any torch post- processing within the per-data-type loop. This complements the existing wp.synchronize() in Camera._update_buffers_impl() which guards the render→read_output boundary.
Extend _deprioritize_prebundle_paths() to also demote known conflicting extension directories (omni.warp.core, omni.isaac.ml_archive, etc.) in addition to pip_prebundle paths. This prevents kit extensions that bundle their own Python packages from shadowing pip-installed versions. Also re-run the sanitization after SimulationApp + extensions load, since Kit may insert new paths onto sys.path during startup that were not present when isaaclab/__init__.py first ran.
The nightly Isaac Sim Docker (>= 04/14) ships warp 1.13.x inside kit/python/lib/python3.12/site-packages. setup_python_env.sh places that directory on PYTHONPATH ahead of pip's site-packages, so warp 1.13.x was imported despite the _pin_warp_import pre-import added in 4.6.8 — that pre-import was locking the kit warp, not the pip warp. Warp 1.13.x deprecates warp.types.array with changed semantics. omni.replicator.core accesses this symbol during rendering, triggering CUDA error 700 (illegal memory access). That error poisons the CUDA context: all subsequent warp kernel lookups return NULL, producing the "Failed to find forward kernel 'update_outdated_envs_kernel'" crash. Fix: extend _should_demote() to catch kit/python/lib/*/site-packages paths so that pip-installed warp-lang always wins. Also add explicit strip_hash=False to isaaclab.sensors.kernels as a defensive safety net, and broaden the AppLauncher post-startup diagnostic to cover kit paths.
The 4.6.9 path-demotion fix moved kit/python/lib/python3.12/site-packages to the end of sys.path, but warp-lang was not a pip requirement. With no pip-managed warp installed, the demoted kit warp 1.13.x was still the only warp available and was still imported. Root fix: add warp-lang>=1.0.0,<1.13 to pip requirements. When ./isaaclab.sh --install runs, pip uses kit's own Python (kit/python/bin/python3) which installs into kit/python/lib/python3.12/ site-packages — the same directory kit shipped warp 1.13.x — overwriting it with a compatible version. Also replace path-based warnings in _pin_warp_import and AppLauncher with version-based checks (>= 1.13). Path-based checks would fire false positives after pip installs warp-lang into kit's site-packages (a kit-managed path, but pip-managed content). Version-based checks correctly identify the actual incompatibility regardless of install location.
Greptile SummaryThis PR fixes three regressions introduced by recent Isaac Sim All three core fixes are well-scoped and correctly implemented. The import migration is comprehensive across all affected assets and sensors. The remaining inline findings are P2 style/cleanup items: stale CI comments left over from the now-reverted digest pin, and hardcoded pinocchio version strings that diverge from the project's dependency manifests. Confidence Score: 5/5Safe to merge — all three regressions are correctly addressed with no new logic errors introduced All inline findings are P2 (stale comments, version string maintenance). The three core bug fixes are correct and well-tested. The import migration is exhaustive. Prior P2 concerns about wp.synchronize scope and the Carb subscription fallback remain valid suggestions but do not block merge. .github/workflows/build.yaml has stale TODO comments that should be cleaned up before merging to avoid misleading future CI maintainers. Important Files Changed
Sequence DiagramsequenceDiagram
participant Kit as Kit/Extension Startup
participant Init as isaaclab.__init__
participant PhysxInit as isaaclab_physx.__init__
participant SM as Isaac Sim SimulationManager
participant PM as PhysxManager
participant RTX as IsaacRtxRenderer
Init->>Init: _deprioritize_prebundle_paths() — demote pip_prebundle on sys.path
PhysxInit->>SM: import isaacsim.core.simulation_manager (force)
PhysxInit->>SM: SM.enable_all_default_callbacks(False) — tear down Carb subscriptions
PhysxInit->>Kit: sys.modules['...'].SimulationManager = PhysxManager
Note over PM: PhysxManager is now sole owner of physics lifecycle
Kit->>Init: AppLauncher._load_extensions()
Init->>Init: _deprioritize_prebundle_paths() (2nd pass — catches Kit-injected paths)
loop per render frame
RTX->>RTX: get tiled_data_buffer from annotator
alt buffer.size == 0 (first 1-2 frames)
RTX->>RTX: continue — skip kernel launch, output stays zero-initialised
else buffer populated
RTX->>RTX: wp.launch(reshape_tiled_image, ...)
RTX->>RTX: wp.synchronize()
end
end
Reviews (2): Last reviewed commit: "Merge branch 'develop' into kellyg/fix-i..." | Re-trigger Greptile |
There was a problem hiding this comment.
🤖 Isaac Lab Review Bot
Summary
This PR fixes two critical regressions introduced by recent Isaac Sim develop builds: (1) an empty-annotator-buffer issue that causes CUDA context poisoning in the RTX renderer, and (2) Isaac Sim's SimulationManager invalidating the shared physics view on timeline stop events. The implementation is solid and well-documented, with comprehensive test coverage for the install-related changes.
Architecture Impact
Low cross-module impact. The two core fixes are self-contained:
- The RTX renderer fix (
isaac_rtx_renderer.py) only affects therender()method's loop over annotators - The SimulationManager patch (
isaaclab_physx/__init__.py) touches module initialization but correctly tears down the original class's subscriptions before redirecting
The install-related changes (isaaclab/__init__.py, install.py) are path sanitization and probe functions that don't affect runtime behavior after startup.
Implementation Verdict
Ship it — Clean implementation with thorough root-cause analysis. The fixes directly address the documented failure modes, and the defensive fallback in the patch code handles API variations gracefully.
Test Coverage
Excellent for install commands; adequate for runtime fixes.
For the install.py changes:
test_install_commands.py(786 lines) provides comprehensive unit tests covering torch probe, prebundle uninstall, CUDA torch installation, and package repointing across all environment types (uv, pip, conda, kit Python, Windows).test_install_prebundle.pyadds targeted tests for the probe script content and_split_install_itemsparser.
For the runtime fixes:
- The PR description mentions
test_tiled_camera.pywas used to verify the RTX renderer fix. The empty-buffer guard is a pure defensive check (skip kernel launch when no data), so existing camera tests implicitly cover the happy path. - The SimulationManager patch is harder to unit-test (requires full Isaac Sim runtime), but the fix is well-reasoned and the fallback is defensive.
CI Status
All CI failures are infrastructure-related (Docker buildkit errors: mkdir /var/lib/docker/containers/...: no such file or directory), not caused by this PR. Pre-commit ✅, Build Wheel ✅, Check for Broken Links ✅.
Findings
🔵 Improvement: source/isaaclab_physx/isaaclab_physx/renderers/isaac_rtx_renderer.py:321 — wp.synchronize() is called inside the per-data-type loop, which may cause unnecessary GPU pipeline stalls
The sync is now called after every data type's wp.launch(), but only the depth-clipping torch operations actually need it. For cameras with multiple data types (e.g., rgba + depth + semantic_segmentation), this adds unnecessary sync points.
Consider moving the sync outside the loop or only syncing before torch operations that actually modify the buffer:
# After the for loop, sync once before any torch post-processing
wp.synchronize()
# Then do all torch post-processing
for data_type in render_data.annotators:
if data_type == "distance_to_camera":
output_data[data_type][output_data[data_type] > cfg.spawn.clipping_range[1]] = torch.inf
# ... etcThis is a minor performance consideration, not a correctness issue. The current implementation is safe.
🔵 Improvement: source/isaaclab_physx/isaaclab_physx/init.py:80-86 — The fallback attribute-clearing loop could be more explicit about what it's doing
The fallback when enable_all_default_callbacks(False) fails manually sets subscription handles to None. While this works, it relies on internal implementation details of the original SimulationManager class. Adding a log statement would help debugging:
import logging
logger = logging.getLogger(__name__)
# ...
except Exception as e:
logger.debug(f"enable_all_default_callbacks unavailable ({e}), clearing subscription handles manually")
for attr in (
"_default_callback_warm_start",
# ...
):
if hasattr(original_class, attr):
setattr(original_class, attr, None)There was a problem hiding this comment.
🤖 Isaac Lab Review Bot — Re-review (b0acf8b)
Summary
This push updates all imports of the PhysX tensors API from the removed omni.physics.tensors.impl.api path to the new omni.physics.tensors.api path, tracking an upstream Isaac Sim module relocation. The changes are straightforward and correct.
What Changed Since Last Review (f41b744 → b0acf8b)
Single commit: "Updates omni.physics.tensor.api import"
This commit updates 17 files across three packages to use the new import path:
| Package | Files Changed | Change Type |
|---|---|---|
isaaclab |
4 files | Runtime imports + docstring links |
isaaclab_physx |
11 files | Runtime imports + docstring links |
isaaclab_newton |
1 file | Docstring link only |
docs |
3 files | Migration guide + example outputs |
Changed imports:
# Before
import omni.physics.tensors.impl.api as physx
# After
import omni.physics.tensors.api as physxArchitecture Impact
No behavioral changes. This is a pure import path update to track an upstream Isaac Sim API relocation. The class identities (ArticulationView, RigidBodyView, SimulationView, etc.) are unchanged — only the module path moved.
The migration documentation added to migrating_to_isaaclab_3-0.rst is clear and helpful for downstream users who need to update their own code.
Implementation Verdict
Ship it — Clean, mechanical update with correct changes across all affected files. The CHANGELOGs are properly updated and migration docs are comprehensive.
Test Coverage
No new tests needed — this is a pure import path update with no behavioral changes. Existing tests (especially test_tiled_camera.py and other sensor tests) will verify the imports work correctly at runtime.
CI Status
- ✅ pre-commit
- ✅ Build Wheel
- ✅ Check for Broken Links
- ⏳ Docker + Tests (in progress)
- ⏳ Build Base Docker Image (in progress)
- ❌ Installation Tests — Infrastructure failure (Docker buildkit
resolv.conf.tmp: no such file or directory), not caused by this PR
Findings
No issues with the new changes. The import path updates are correct and comprehensive.
Note from previous review: The two suggestions from my earlier review (batching wp.synchronize() calls and adding debug logging to the fallback path) remain valid but are minor improvements, not blockers. The core fixes are solid.
Made-with: Cursor
…Lab-public into kellyg/fix-installation
|
While looking at the CI failures on this PR I - with Claude, traced the root cause of some of the failures to _patch_isaacsim_simulation_manager() in isaaclab_physx/init.py. The call to enable_all_default_callbacks(False) disables all four Carb subscriptions including warm_start (PLAY). The warm_start callback is required to initialise the rendering pipeline on sim.reset() — without it, tiled-camera RGB output stays black and visualization-marker tests fail. Does this make sense to you - I am not familiar enough with the whole code base to be sure - but I thought it worth trying to investigate? Only _default_callback_on_stop needs to be unsubscribed — it's the one that calls invalidate_physics() and wrecks the shared omni.physics.tensors view. The other three callbacks, including warm_start, should be left intact. Looking at the isaaclab (core) [1/3] CI log from this PR, the specific failures consistent with this are: test_first_frame_textured_rendering (0/1) original_class = getattr(original_module, "SimulationManager", None) |
Signed-off-by: Kelly Guo <kellyg@nvidia.com>
Signed-off-by: Kelly Guo <kellyguo123@hotmail.com>
…Lab-public into kellyg/fix-installation
Signed-off-by: Kelly Guo <kellyg@nvidia.com>
…Lab-public into kellyg/fix-installation
Description
Fixes two regressions introduced by recent Isaac Sim
developbuilds (between 2026-04-13 and 2026-04-22) that broke the standard training entry point:1. Isaac Sim
SimulationManagerinvalidating the shared physics viewA recent Isaac Sim commit (
8df6beeb0, 6.0.0-alpha.180) decoratedSimulationManager._on_stop/_on_play/_on_stage_*with@staticmethodso they finally fire from their Carb subscriptions. The newly-working_on_stopcallsSimulationManager.invalidate_physics(), which invalidates the sharedomni.physics.tensorssimulation view thatPhysxManagerand every articulation_root_viewrely on, raising:on the first
scene.update()aftersim.reset().source/isaaclab_physx/isaaclab_physx/__init__.py:_patch_isaacsim_simulation_manager()now (a) force-importsisaacsim.core.simulation_managerso the patch always runs deterministically, (b) callsoriginal_class.enable_all_default_callbacks(False)to tear down the original class's Carb subscriptions, and (c) only then swaps the module attribute.PhysxManagerbecomes the single owner of the simulation view's lifecycle.2. PhysX tensors API module relocation
Recent Isaac Sim builds removed
omni.physics.tensors.impland now expose the PhysX Tensor API types directly underomni.physics.tensors.api. The old import raisesModuleNotFoundErrorat Isaac Lab import time, taking down every entry point that touches articulations, rigid objects, deformables, contact sensors, or ray caster sensors.import omni.physics.tensors.impl.api as physx→import omni.physics.tensors.api as physxacrossisaaclab(ray-caster sensors) andisaaclab_physx(assets and contact sensor), includingTYPE_CHECKING-gated imports.PhysX Tensor APISphinx cross-reference URLs in the articulation data classes and the example console output in the contact sensor / IMU docs.docs/source/migration/migrating_to_isaaclab_3-0.rst.Fixes # (issue)
Type of change
Screenshots
N/A — runtime fixes with no UI surface.
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfileCONTRIBUTORS.mdor my name already exists there