Skip to content

Fix multi-world joint indexing in warp-to-mj coord conversion#2332

Open
jsw7460 wants to merge 5 commits intonewton-physics:mainfrom
jsw7460:sangwooshin/fix-warp-coords-kernel-indexing
Open

Fix multi-world joint indexing in warp-to-mj coord conversion#2332
jsw7460 wants to merge 5 commits intonewton-physics:mainfrom
jsw7460:sangwooshin/fix-warp-coords-kernel-indexing

Conversation

@jsw7460
Copy link
Copy Markdown

@jsw7460 jsw7460 commented Apr 7, 2026

Description

convert_warp_coords_to_mj_kernel used per-world joint index for joint_type and joint_child instead of the global index. This caused wrong body CoM lookups for worldid > 0. The sibling kernel convert_mj_coords_to_warp_kernel already uses the global index.

Checklist

  • New or existing tests cover these changes
  • The documentation is up to date with these changes
  • CHANGELOG.md has been updated (if user-facing change)

Test plan

  • Without fix: test_perworld_com_produces_consistent_physics FAILS
  • With fix: both tests PASS

Bug fix

Steps to reproduce:

  1. Create a multi-world model with an articulated body (free + revolute joint)
  2. Set different body_com per world via model.body_com + notify_model_changed(BODY_INERTIAL_PROPERTIES)
  3. Set non-zero initial angular velocity
  4. Step physics and compare world 1 result against a single-world reference with the same parameters
  5. World 1 diverges because convert_warp_coords_to_mj_kernel reads world 0's joint_child

Minimal reproduction:
See newton/tests/test_multiworld_body_properties.py

Summary by CodeRabbit

  • Bug Fixes

    • Fixed joint indexing across multiple simulation worlds, resolving incorrect center-of-mass and joint handling when running concurrent worlds.
  • Tests

    • Added regression tests that verify per-world body mass/center-of-mass produce simulation results consistent with equivalent single-world runs.

@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla bot commented Apr 7, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 7, 2026 02:20 — with GitHub Actions Error
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Fix per-world joint indexing in the MuJoCo kernel so joint metadata lookups use a global per-world joint index; add regression tests that verify replicated worlds with per-world body mass/COM yield consistent physics; record the fix in the changelog.

Changes

Cohort / File(s) Summary
Kernel Joint Indexing Fix
newton/_src/solvers/mujoco/kernels.py
In convert_warp_coords_to_mj_kernel, compute flattened per-world joint index joint_id = joints_per_world * worldid + jntid and use it for joint metadata lookups (joint_type, joint_q_start, joint_qd_start, joint_child, joint_dof_dim) so joint-type/child/DOF resolution is world-aware.
Multiworld Body Properties Tests
newton/tests/test_multiworld_body_properties.py
Add TestMultiworldBodyProperties with a parameterized regression test that compares a multi-world simulation (per-world body_mass and body_com) to independent single-world runs and asserts consistent final joint states.
Changelog
CHANGELOG.md
Add Unreleased → Fixed entry describing the correction: joint metadata lookups in convert_warp_coords_to_mj_kernel now use global joint indices, fixing incorrect CoM references for worldid > 0.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

1.0-release

Suggested reviewers

  • eric-heiden
  • vreutskyy
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and specifically describes the main change: fixing multi-world joint indexing in the warp-to-mj coordinate conversion kernel.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 7, 2026 02:20 — with GitHub Actions Error
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
newton/tests/test_multiworld_body_properties.py (1)

114-121: Remove unused variable bodies_per_world in the reference loop.

Line 117 computes bodies_per_world but it's never used within the loop. For single-world models, direct indexing with body_mass[0] is correct.

♻️ Remove unused variable
         for mass_val, label in [(mass_a, "A"), (mass_b, "B")]:
             model = self._build_articulated_model(num_worlds=1, device=device)
-            bodies_per_world = model.body_count // model.world_count
             body_mass = model.body_mass.numpy()
             body_mass[0] = mass_val  # base body
             model.body_mass.assign(body_mass)
             ref_q[label] = self._run_sim(model)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@newton/tests/test_multiworld_body_properties.py` around lines 114 - 121,
Remove the unused local variable bodies_per_world from the reference loop in
test_multiworld_body_properties.py: inside the loop that builds the model (see
_build_articulated_model and use of model.body_mass), delete the computation
bodies_per_world = model.body_count // model.world_count since it is never used;
leave the rest of the loop (setting body_mass[0] and
model.body_mass.assign(...), and calling self._run_sim(model) to populate ref_q)
unchanged.
newton/_src/solvers/mujoco/kernels.py (1)

578-580: Fix is correct; consider extracting the global index for consistency with the sibling kernel.

The change correctly uses the global joint index for joint_type and joint_child lookups, aligning with how joint_q_start and joint_qd_start are already indexed.

For consistency with convert_mj_coords_to_warp_kernel (line 453), consider precomputing the global index once:

♻️ Optional: Extract global index to match sibling kernel pattern
     worldid, jntid = wp.tid()

+    joint_id = joints_per_world * worldid + jntid
+
     # Skip loop joints — they have no MuJoCo qpos/qvel entries
     q_i = mj_q_start[jntid]
     if q_i < 0:
         return

     qd_i = mj_qd_start[jntid]
-    type = joint_type[joints_per_world * worldid + jntid]
-    wq_i = joint_q_start[joints_per_world * worldid + jntid]
-    wqd_i = joint_qd_start[joints_per_world * worldid + jntid]
+    type = joint_type[joint_id]
+    wq_i = joint_q_start[joint_id]
+    wqd_i = joint_qd_start[joint_id]

And at line 610:

         # Get CoM offset in world frame
-        child = joint_child[joints_per_world * worldid + jntid]
+        child = joint_child[joint_id]

Also applies to: 610-611

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@newton/_src/solvers/mujoco/kernels.py` around lines 578 - 580, Extract the
global joint index once (e.g., compute a local variable like global_jnt =
joints_per_world * worldid + jntid) and use that variable for all array lookups
instead of repeating the expression; update usages in the kernel where
joint_type, joint_child, joint_q_start, and joint_qd_start are accessed (the
occurrences around the current block and the similar lookups at the later spots
around the 610-611 area) so indexing is consistent with
convert_mj_coords_to_warp_kernel and avoids duplicated arithmetic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@newton/_src/solvers/mujoco/kernels.py`:
- Around line 578-580: Extract the global joint index once (e.g., compute a
local variable like global_jnt = joints_per_world * worldid + jntid) and use
that variable for all array lookups instead of repeating the expression; update
usages in the kernel where joint_type, joint_child, joint_q_start, and
joint_qd_start are accessed (the occurrences around the current block and the
similar lookups at the later spots around the 610-611 area) so indexing is
consistent with convert_mj_coords_to_warp_kernel and avoids duplicated
arithmetic.

In `@newton/tests/test_multiworld_body_properties.py`:
- Around line 114-121: Remove the unused local variable bodies_per_world from
the reference loop in test_multiworld_body_properties.py: inside the loop that
builds the model (see _build_articulated_model and use of model.body_mass),
delete the computation bodies_per_world = model.body_count // model.world_count
since it is never used; leave the rest of the loop (setting body_mass[0] and
model.body_mass.assign(...), and calling self._run_sim(model) to populate ref_q)
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 47675d84-a083-42ea-bc7a-93c49c729551

📥 Commits

Reviewing files that changed from the base of the PR and between 68fc4df and 003034f.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • newton/_src/solvers/mujoco/kernels.py
  • newton/tests/test_multiworld_body_properties.py

@preist-nvidia preist-nvidia requested a review from vreutskyy April 7, 2026 08:52
@preist-nvidia preist-nvidia added this to the 1.2 Release milestone Apr 7, 2026
@vreutskyy vreutskyy had a problem deploying to external-pr-approval April 7, 2026 09:02 — with GitHub Actions Error
@vreutskyy vreutskyy had a problem deploying to external-pr-approval April 7, 2026 09:03 — with GitHub Actions Error
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 7, 2026

❌ 1 Tests Failed:

Tests completed Failed Passed Skipped
2072 1 2071 392
View the top 1 failed test(s) by shortest run time
TestMultiworldBodyProperties::test_perworld_com_produces_consistent_physics
Stack Traces | 0.026s run time
Traceback (most recent call last):
  File "D:\a\newton\newton\newton\tests\test_multiworld_body_properties.py", line 116, in test_perworld_com_produces_consistent_physics
    model = self._build_articulated_model(num_worlds=1, device=device)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\newton\newton\newton\tests\test_multiworld_body_properties.py", line 60, in _build_articulated_model
    model = template.finalize(device=device)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\newton\newton\newton\_src\sim\builder.py", line 9499, in finalize
    with wp.ScopedDevice(device):
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\newton\newton\.venv\Lib\site-packages\warp\_src\utils.py", line 1253, in __init__
    self.device = wp.get_device(device)
                  ^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\newton\newton\.venv\Lib\site-packages\warp\_src\context.py", line 5867, in get_device
    return runtime.get_device(ident)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\a\newton\newton\.venv\Lib\site-packages\warp\_src\context.py", line 5629, in get_device
    raise ValueError(f"Invalid device identifier: {ident}")
ValueError: Invalid device identifier: cuda:0

To view more test analytics, go to the Test Analytics Dashboard
📋 Got 3 mins? Take this short survey to help us improve Test Analytics.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Line 88: Edit the CHANGELOG.md entry so it’s a user-facing sentence (avoid
kernel/indexing internals) and move it to the end of the "Fixed" list under the
[Unreleased] section; replace the current line referencing
convert_warp_coords_to_mj_kernel and joint_type/joint_child with a concise
user-facing entry such as "Fix multi-world MuJoCo coordinate conversion using
the wrong body center of mass in replicated worlds" and append it as the last
item in the Fixed section (do not include implementation details like per-world
vs global indices or function symbols).

In `@newton/tests/test_multiworld_body_properties.py`:
- Line 1: The SPDX header in newton/tests/test_multiworld_body_properties.py
uses the wrong creation year (2025); update the SPDX copyright line by replacing
"2025" with "2026" so it reads the correct creation year (2026).
- Around line 135-145: The quaternion comparisons for free joints are
sign-ambiguous; implement a helper named _assert_joint_q_close (place it near
_run_sim()) that compares position parts ([:3]) and tail parts ([7:]) with
np.testing.assert_allclose, but for the quaternion slice [3:7] flips actual when
np.dot(actual_quat, expected_quat) < 0 before asserting; then replace the two
direct np.testing.assert_allclose calls that compare multi_q[0]/multi_q[1] to
ref_q["A"]/ref_q["B"] (and the similar block at lines 176-186) with calls to
_assert_joint_q_close(actual=multi_q[i], expected=ref_q[...], atol=1e-4,
err_msg=...) so q and -q are treated as equivalent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 3a7c4321-241a-4663-b113-2980d40dd0ad

📥 Commits

Reviewing files that changed from the base of the PR and between 9a0a1ae and 97e4167.

📒 Files selected for processing (2)
  • CHANGELOG.md
  • newton/tests/test_multiworld_body_properties.py

Comment thread CHANGELOG.md Outdated
Comment thread newton/tests/test_multiworld_body_properties.py Outdated
Comment on lines +135 to +145
np.testing.assert_allclose(
multi_q[0],
ref_q["A"],
atol=1e-4,
err_msg="World 0 (mass_A) diverges from single-world reference",
)
np.testing.assert_allclose(
multi_q[1],
ref_q["B"],
atol=1e-4,
err_msg="World 1 (mass_B) diverges from single-world reference",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Make these joint_q comparisons quaternion-sign invariant.

Both assertion blocks compare the free-joint quaternion verbatim, so q and -q would fail here even though they represent the same pose. That makes the regression vulnerable to false negatives.

🧪 Suggested change
-        np.testing.assert_allclose(
+        self._assert_joint_q_close(
             multi_q[0],
             ref_q["A"],
             atol=1e-4,
             err_msg="World 0 (mass_A) diverges from single-world reference",
         )
-        np.testing.assert_allclose(
+        self._assert_joint_q_close(
             multi_q[1],
             ref_q["B"],
             atol=1e-4,
             err_msg="World 1 (mass_B) diverges from single-world reference",
         )

Add a helper once, e.g. near _run_sim():

def _assert_joint_q_close(self, actual, expected, *, atol: float, err_msg: str) -> None:
    np.testing.assert_allclose(actual[:3], expected[:3], atol=atol, err_msg=err_msg)

    actual_quat = actual[3:7].copy()
    expected_quat = expected[3:7]
    if np.dot(actual_quat, expected_quat) < 0.0:
        actual_quat *= -1.0
    np.testing.assert_allclose(actual_quat, expected_quat, atol=atol, err_msg=err_msg)

    np.testing.assert_allclose(actual[7:], expected[7:], atol=atol, err_msg=err_msg)

Based on learnings, “Quaternions in builder/test assertions use Warp’s XYZW convention ... prefer identity/non-identity checks that account for q and -q equivalence.”

Also applies to: 176-186

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@newton/tests/test_multiworld_body_properties.py` around lines 135 - 145, The
quaternion comparisons for free joints are sign-ambiguous; implement a helper
named _assert_joint_q_close (place it near _run_sim()) that compares position
parts ([:3]) and tail parts ([7:]) with np.testing.assert_allclose, but for the
quaternion slice [3:7] flips actual when np.dot(actual_quat, expected_quat) < 0
before asserting; then replace the two direct np.testing.assert_allclose calls
that compare multi_q[0]/multi_q[1] to ref_q["A"]/ref_q["B"] (and the similar
block at lines 176-186) with calls to _assert_joint_q_close(actual=multi_q[i],
expected=ref_q[...], atol=1e-4, err_msg=...) so q and -q are treated as
equivalent.

@vreutskyy vreutskyy had a problem deploying to external-pr-approval April 7, 2026 10:19 — with GitHub Actions Error
@vreutskyy vreutskyy had a problem deploying to external-pr-approval April 7, 2026 10:19 — with GitHub Actions Error
@vreutskyy vreutskyy temporarily deployed to external-pr-approval April 7, 2026 10:19 — with GitHub Actions Inactive
@vreutskyy vreutskyy temporarily deployed to external-pr-approval April 7, 2026 10:19 — with GitHub Actions Inactive
@eric-heiden
Copy link
Copy Markdown
Member

🟡 Nice fix for the multi-world joint indexing in the coordinate conversion kernels.

One more instance of the same per-world indexing pattern was missed in the forward kernel, and the new mass regression test doesn't fully exercise the code path it targets.


🟡 joint_dof_dim still uses bare jntid in the generic-joint branch

In convert_warp_coords_to_mj_kernel (newton/_src/solvers/mujoco/kernels.py, line 641), the generic-joint else branch still indexes joint_dof_dim with the per-world jntid:

axis_count = joint_dof_dim[jntid, 0] + joint_dof_dim[jntid, 1]

The reverse kernel (convert_mj_coords_to_warp_kernel, line 541) already uses the global joint_id for the same lookup:

axis_count = joint_dof_dim[joint_id, 0] + joint_dof_dim[joint_id, 1]

Today this is functionally benign because _validate_homogeneous_worlds() ensures joint_dof_dim values are identical across worlds. But this is exactly the same indexing pattern this PR is correcting everywhere else, and leaving it creates an asymmetry between the forward and reverse kernels. Fixing it now costs nothing and prevents a latent bug if homogeneity validation is ever relaxed.

Suggested fix:

axis_count = joint_dof_dim[joints_per_world * worldid + jntid, 0] + joint_dof_dim[joints_per_world * worldid + jntid, 1]

return state_0.joint_q.numpy()


def test_perworld_mass_produces_consistent_physics(test, device):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚪ Both links in _build_articulated_model use com=(0, 0, 0), so the cross-product term in v_origin = v_com - ω × com_world vanishes no matter which joint_child index is read. This means the test would pass both before and after the indexing fix.

The companion test_perworld_com_produces_consistent_physics is the test that actually exercises the CoM-lookup regression. Additionally, test_randomize_body_mass in test_mujoco_solver.py already covers per-world mass propagation into MuJoCo.

Consider either:

  • Giving the base link a non-zero CoM (e.g. com=wp.vec3(0.02, 0.0, -0.01)) so varying mass also stresses the CoM-lookup path, or
  • Removing this test if the existing coverage is sufficient, saving ~300 simulation steps.

@jsw7460
Copy link
Copy Markdown
Author

jsw7460 commented Apr 8, 2026

Thanks for the review! Both comments addressed:

  1. Fixed joint_dof_dim indexing in the generic-joint branch (line 643). Extracted joint_id = joints_per_world * worldid + jntid once at the top of the kernel and reused it for all 5 lookups, matching the pattern in convert_mj_coords_to_warp_kernel.

  2. Removed test_perworld_mass_produces_consistent_physics. As you pointed out, the buggy kernel reads body_com[wrong_child], so varying mass alone doesn't trigger the bug regardless of CoM offset (all worlds share the same CoM in the mass test, so the wrong index reads the same value). test_randomize_body_mass already covers per-world mass propagatioon. I kept only the CoM test that actually exercises the buggy lookup.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
newton/_src/solvers/mujoco/kernels.py (2)

573-645: ⚠️ Potential issue | 🟠 Major

Apply the same global joint indexing in convert_qfrc_actuator_from_mj_kernel.

This hunk fixes the CoM lookup in convert_warp_coords_to_mj_kernel, but the later convert_qfrc_actuator_from_mj_kernel still indexes joint-indexed arrays with bare jntid. For worldid > 0, its FREE-joint branch will still use world 0's joint_child / body_com when shifting actuator torques, so per-world inertial-property updates remain wrong on that path.

Suggested follow-up
 def convert_qfrc_actuator_from_mj_kernel(
     ...
 ):
     worldid, jntid = wp.tid()
+    joint_id = joints_per_world * worldid + jntid

     # Skip loop joints — they have no MuJoCo DOF entries
     q_i = mj_q_start[jntid]
     if q_i < 0:
         return

     qd_i = mj_qd_start[jntid]
     wqd_i = joint_qd_start[joints_per_world * worldid + jntid]

-    type = joint_type[jntid]
+    jtype = joint_type[joint_id]

-    if type == JointType.FREE:
+    if jtype == JointType.FREE:
         ...
-        child = joint_child[jntid]
+        child = joint_child[joint_id]
         com_world = wp.quat_rotate(rot, body_com[child])
         ...
     else:
-        axis_count = joint_dof_dim[jntid, 0] + joint_dof_dim[jntid, 1]
+        axis_count = joint_dof_dim[joint_id, 0] + joint_dof_dim[joint_id, 1]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@newton/_src/solvers/mujoco/kernels.py` around lines 573 - 645, The
convert_qfrc_actuator_from_mj_kernel uses bare jntid to index joint-indexed
arrays (e.g., joint_child, body_com) causing wrong per-world lookups when
worldid > 0; compute joint_id = joints_per_world * worldid + jntid (same as in
convert_warp_coords_to_mj_kernel) and use joint_id for all accesses to
joint_child, joint_type, joint_q_start, joint_qd_start, body_com, joint_dof_dim,
etc., ensuring FREE-joint branches and actuator torque shifts reference the
per-world joint entries rather than world-0 entries.

581-645: ⚠️ Potential issue | 🟡 Minor

Rename type to jtype to avoid Ruff A001 (builtin shadowing).

Line 581 shadows Python's builtin type. Ruff rule A001 is enabled (via ruff.toml), and jtype is the consistent naming convention used throughout the same file for similar variables.

Minimal rename
-    type = joint_type[joint_id]
+    jtype = joint_type[joint_id]
 ...
-    if type == JointType.FREE:
+    if jtype == JointType.FREE:
 ...
-    elif type == JointType.BALL:
+    elif jtype == JointType.BALL:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@newton/_src/solvers/mujoco/kernels.py` around lines 581 - 645, The local
variable named type shadows Python builtin type and should be renamed to jtype:
locate the assignment "type = joint_type[joint_id]" and replace it with "jtype =
joint_type[joint_id]", then update all subsequent uses in this code block (e.g.,
the conditionals "type == JointType.FREE", "type == JointType.BALL", and any
other references) to use jtype; ensure you also update any other occurrences of
this same local variable within the same function so naming is consistent with
the file's convention (use symbols joint_type, joint_id, joint_q_start, joint_q,
joint_qd_start, joint_qd, JointType).
♻️ Duplicate comments (2)
newton/tests/test_multiworld_body_properties.py (2)

1-1: ⚠️ Potential issue | 🟡 Minor

Update the SPDX header to the 2026 creation year.

This file is new in the current PR, so the header should use 2026, not 2025.

As per coding guidelines, “In SPDX copyright lines, use the year the file was first created. Do not create date ranges or update the year when modifying a file.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@newton/tests/test_multiworld_body_properties.py` at line 1, Update the SPDX
copyright header line in this new file by changing the creation year from 2025
to 2026: modify the SPDX line (the top-of-file comment starting with
"SPDX-FileCopyrightText") to use "2026" and do not create a date range; ensure
no other SPDX lines or ranges are added.

134-145: ⚠️ Potential issue | 🟡 Minor

Make the joint_q comparisons quaternion-sign invariant.

These assertions still compare the free-joint quaternion verbatim. q and -q encode the same pose, so this regression can fail even when the multi-world and single-world trajectories match. Please route them through a helper that flips one side when the quaternion dot product is negative.

Based on learnings, “Quaternions in builder/test assertions use Warp’s XYZW convention (identity [0, 0, 0, 1]). In newton/tests/test_import_mjcf.py (e.g., test_fit_box_to_mesh_inertia_rotated), prefer identity/non-identity checks that account for q and -q equivalence (e.g., via np.abs on components).”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@newton/tests/test_multiworld_body_properties.py` around lines 134 - 145, The
quaternion comparisons between multi_q[...] and ref_q[...] need to be made
sign-invariant: implement or call a helper (e.g.,
align_quaternion_sign(quat_ref, quat_test) or flip_quat_if_needed) that computes
the dot product between the two 4D quaternions and multiplies one by -1 when the
dot is negative, then use the aligned quaternions in the
np.testing.assert_allclose calls for multi_q[0] vs ref_q["A"] and multi_q[1] vs
ref_q["B"] so q and -q are treated as equivalent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@newton/_src/solvers/mujoco/kernels.py`:
- Around line 573-645: The convert_qfrc_actuator_from_mj_kernel uses bare jntid
to index joint-indexed arrays (e.g., joint_child, body_com) causing wrong
per-world lookups when worldid > 0; compute joint_id = joints_per_world *
worldid + jntid (same as in convert_warp_coords_to_mj_kernel) and use joint_id
for all accesses to joint_child, joint_type, joint_q_start, joint_qd_start,
body_com, joint_dof_dim, etc., ensuring FREE-joint branches and actuator torque
shifts reference the per-world joint entries rather than world-0 entries.
- Around line 581-645: The local variable named type shadows Python builtin type
and should be renamed to jtype: locate the assignment "type =
joint_type[joint_id]" and replace it with "jtype = joint_type[joint_id]", then
update all subsequent uses in this code block (e.g., the conditionals "type ==
JointType.FREE", "type == JointType.BALL", and any other references) to use
jtype; ensure you also update any other occurrences of this same local variable
within the same function so naming is consistent with the file's convention (use
symbols joint_type, joint_id, joint_q_start, joint_q, joint_qd_start, joint_qd,
JointType).

---

Duplicate comments:
In `@newton/tests/test_multiworld_body_properties.py`:
- Line 1: Update the SPDX copyright header line in this new file by changing the
creation year from 2025 to 2026: modify the SPDX line (the top-of-file comment
starting with "SPDX-FileCopyrightText") to use "2026" and do not create a date
range; ensure no other SPDX lines or ranges are added.
- Around line 134-145: The quaternion comparisons between multi_q[...] and
ref_q[...] need to be made sign-invariant: implement or call a helper (e.g.,
align_quaternion_sign(quat_ref, quat_test) or flip_quat_if_needed) that computes
the dot product between the two 4D quaternions and multiplies one by -1 when the
dot is negative, then use the aligned quaternions in the
np.testing.assert_allclose calls for multi_q[0] vs ref_q["A"] and multi_q[1] vs
ref_q["B"] so q and -q are treated as equivalent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 1ecaeed4-5dd2-4ca8-be28-91f1834c6096

📥 Commits

Reviewing files that changed from the base of the PR and between 867c6a0 and 8002c98.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • newton/_src/solvers/mujoco/kernels.py
  • newton/tests/test_multiworld_body_properties.py
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md

Copy link
Copy Markdown
Member

@adenzler-nvidia adenzler-nvidia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix — the core indexing change in kernels.py looks correct. A few things still need attention before this can land (the test device issue is the main CI blocker).

Comment thread newton/tests/test_multiworld_body_properties.py Outdated
"""Verify that multi-world simulations with per-world body mass/CoM
produce physically consistent results across all worlds."""

def _build_articulated_model(self, num_worlds: int = 2, device: str = "cuda:0"):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding device="cuda:0" breaks all CPU-only CI runners (ValueError: Invalid device identifier: cuda:0) — this is the sole test failure across ubuntu, ubuntu-arm, macos, and windows.

Please use the get_test_devices() / add_function_test pattern (like test_mujoco_solver.py and other tests in this repo) so the test auto-skips when CUDA is unavailable. An earlier commit (867c6a0) had this conversion but it was lost on the force-push.

# Set initial height and angular velocity for all worlds.
# Non-zero angular velocity is critical: the bug in
# convert_warp_coords_to_mj_kernel affects the velocity
# conversion v_origin = v_com - ω × com. With ω=0 the cross
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unicode × (MULTIPLICATION SIGN) triggers Ruff RUF003, which is what fails pre-commit.ci. Use ASCII x instead.

Suggested change
# conversion v_origin = v_com - ω × com. With ω=0 the cross
# convert_warp_coords_to_mj_kernel affects the velocity
# conversion v_origin = v_com - w x com. With w=0 the cross

Comment thread CHANGELOG.md Outdated
### Fixed

- Fix GL viewer crash when enabling "Gap + Margin" for soft-body-only states with no rigid body transforms
- Fix incorrect joint indexing in `convert_warp_coords_to_mj_kernel` that used per-world index instead of global index for `joint_type` and `joint_child` lookups, causing wrong CoM references for `worldid > 0`
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per repo guidelines, changelog entries should be user-facing and avoid internal implementation details. This references kernel names, parameter names, and indexing mechanics. Something like:

Suggested change
- Fix incorrect joint indexing in `convert_warp_coords_to_mj_kernel` that used per-world index instead of global index for `joint_type` and `joint_child` lookups, causing wrong CoM references for `worldid > 0`
- Fix multi-world coordinate conversion using the wrong body center of mass for replicated worlds

Comment thread newton/_src/solvers/mujoco/kernels.py Outdated
type = joint_type[jntid]
wq_i = joint_q_start[joints_per_world * worldid + jntid]
wqd_i = joint_qd_start[joints_per_world * worldid + jntid]
type = joint_type[joint_id]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: type shadows the Python builtin (Ruff A001, enabled in this repo's ruff.toml). The rest of this file uses jtype for the same purpose — rename for consistency?

Suggested change
type = joint_type[joint_id]
jtype = joint_type[joint_id]

(Would need to update the if type == checks below as well.)

@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 13, 2026 13:49 — with GitHub Actions Error
@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 13, 2026 13:49 — with GitHub Actions Error
@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 13, 2026 14:45 — with GitHub Actions Error
Copy link
Copy Markdown
Member

@adenzler-nvidia adenzler-nvidia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing all the feedback! One more thing: the test should also run on CPU.

Sibling kernel bug filed as #2471.

Comment thread newton/tests/test_multiworld_body_properties.py
adenzler-nvidia added a commit to adenzler-nvidia/newton that referenced this pull request Apr 16, 2026
convert_qfrc_actuator_from_mj_kernel indexed joint_type, joint_child,
and joint_dof_dim with per-world jntid instead of the global index
(joints_per_world * worldid + jntid). For joint_child this caused
body_com lookups to always read world 0's CoM when computing the
free-joint force transform, producing wrong angular forces for
worldid > 0. The same class of bug was reported for
convert_warp_coords_to_mj_kernel in newton-physics#2332.

Closes newton-physics#2471
@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 17, 2026 00:02 — with GitHub Actions Error
@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 17, 2026 00:02 — with GitHub Actions Error
@jsw7460 jsw7460 force-pushed the sangwooshin/fix-warp-coords-kernel-indexing branch from 6a32136 to 7a572a3 Compare April 17, 2026 00:20
@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 17, 2026 00:20 — with GitHub Actions Error
@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 17, 2026 00:20 — with GitHub Actions Error
@@ -0,0 +1,167 @@
# SPDX-FileCopyrightText: Copyright (c) 2026 The Newton Developers
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few questions about the test scope — I wonder if it can be tightened.

The bug causes world-N to silently read world-0's per-world-replicated data. So with per-world-differing body properties and identical initial joint state, buggy code should produce identical trajectories across worlds while correct code produces divergent ones. Given that:

  1. Do the single-world reference runs add anything beyond "world 0 and world 1 diverge"? It seems like a single multi-world run with a not np.allclose(multi_q[0], multi_q[1]) assertion would catch the same regression at 1/3 the cost.

  2. The comment "Using multiple joints/bodies is essential because the indexing bug only manifests when joint_child maps to different bodies across the flat array" — is this actually true? For any worldid > 0, buggy indexing reads slot jntid instead of joints_per_world * worldid + jntid, so the bug should fire with 1 joint per world as long as body properties differ across worlds. Am I missing something about why the child revolute joint is load-bearing here?

  3. The body_com.numpy().reshape(...).assign(...) patching post-finalize() (with the BODY_INERTIAL_PROPERTIES notify) — is there a reason to do it this way rather than building each world with its target CoM directly via add_builder? The post-hoc mutation seems to be a consequence of using replicate, which forces identical worlds.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adenzler-nvidia
Thanks for the review.

After implementing the assertFalse(allclose(multi_q[0], multi_q[1])) check, I found it doesn't actually detect the bug. Wanted to walk through why in case I'm missing something.

Root cause. notify_model_changed(BODY_INERTIAL_PROPERTIES) syncs
model.body_com into the per-world mjw_model.body_ipos array (see _update_model_inertial_properties launching update_body_mass_ipos_kernel). So MuJoCo's internal dynamics always use the correct per-world CoM, independent of the conversion kernel.

The kernel's role is the Newton <-> MuJoCo velocity convention transform v_origin = v_com - ω × rotate(rot, body_com[child]). With the bug, it makes v_origin identical across worlds (when initial v_com / ω / rot are set identically in the test), but MuJoCo then simulates each world with its own per-world body_ipos. Identical initial v_origin + different body_ipos => different dynamics. Trajectories diverge in the buggy case too.

The options I'm considering is either

  1. Keep the current test as-is.
  2. Round-trip test at the kernel level: joint_qd → warp_to_mj → mj_to_warp should be identity; with the bug it drifts by ω × rotate(delta_com) per application. It has no stepping, and directly targets the kernel matth, but calls internal conversion methods rather than the public solver surface.

Any preference?

For your second and third points, I agree. Will modify it after we decide (1).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right — (1) is the right call, and my original (1) suggestion doesn't work. I missed that notify_model_changed(BODY_INERTIAL_PROPERTIES) syncs per-world body_ipos through a separately (and correctly) indexed kernel, so MuJoCo steps with the right per-world inertial state regardless of the conversion bug. Trajectories diverge either way, just to the wrong places in the buggy case. Apologies for the noise.

Between your two options, I'd go with (1). It tests the externally-observable behavior through the public solver surface, which is what we actually want to guard against regressing. (2) is tighter but tests an internal invariant between two kernels — a symmetric break in both would pass.

Happy to proceed with (1) as-is. Could you also resolve the merge conflicts on this branch when you get a chance?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased onto latest main and simplified the test setup per points (2) and (3):

  • Single free-floating body per world (removed the articulated revolute joint)
  • Build each world directly via add_builder instead of replicate + post-hoc body_com mutation

Keeping the reference-based comparison as agreed in option (1).

jsw7460 added 4 commits April 20, 2026 09:00
convert_warp_coords_to_mj_kernel used per-world joint index for joint_type, joint_child, and joint_dof_dim lookups instead of the global
  index. This caused wrong body CoM references for worldid > 0. The sibling convert_mj_coords_to_warp_kernel already uses the global index.

Extract joint_id = joints_per_world * worldid + jntid once and reuse it for all
  per-joint lookups. Rename the shadowed type local to jtype for consistency with other kernels in the file.

Add regression test that exercises the CoM lookup path via per-world body_com overrides and non-zero initial
  angular velocity. The test uses get_test_devices() and auto-skips on CPU-only runners since the buggy code path only executes on CUDA (SolverMuJoCo with use_mujoco_cpu=False).
  Reduce the test model to a single free-floating body per world
  (the articulated 2-link structure was not required to trigger the
  indexing bug). Build each world directly via add_builder with its
  target base CoM
@jsw7460 jsw7460 force-pushed the sangwooshin/fix-warp-coords-kernel-indexing branch from 7a572a3 to 66f2b01 Compare April 20, 2026 14:44
@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 20, 2026 14:44 — with GitHub Actions Error
@jsw7460 jsw7460 had a problem deploying to external-pr-approval April 20, 2026 14:44 — with GitHub Actions Error
@jsw7460 jsw7460 requested a deployment to external-pr-approval April 20, 2026 14:51 — with GitHub Actions Waiting
@jsw7460 jsw7460 requested a deployment to external-pr-approval April 20, 2026 14:51 — with GitHub Actions Waiting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants