[main] qwen-vl THD packed-sequence support and fixes #3323
DAISY-gh wants to merge 40 commits into NVIDIA-NeMo:main from
Conversation
Force-pushed from e9ff163 to 8d8b782
📝 Walkthrough

This update introduces batch-level sequence packing support for Qwen Vision Language models with THD (Token Head Dimension) diagnostics infrastructure, adds optional dependency handling for the diffusion and modelopt modules, and extends the Energon dataset configuration for flexible path resolution and packing strategies.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Energon as Energon Provider
    participant TaskEnc as Task Encoder
    participant Step as VLM Step
    participant Model as Qwen3VL Model
    rect rgba(100, 150, 200, 0.5)
        Note over Energon,Model: Unpacked Path (Standard)
        Energon->>TaskEnc: samples (unpacked)
        TaskEnc->>TaskEnc: batch() → QwenVLTaskBatch<br/>(no packing metadata)
        Step->>Step: forward_step() → standard<br/>attention_mask, position_ids
        Step->>Model: forward(position_ids, attention_mask)
    end
    rect rgba(150, 200, 100, 0.5)
        Note over Energon,Model: Energon Pre-Packed Path (THD)
        Energon->>TaskEnc: samples (packed)<br/>or batch_level_packing=True
        TaskEnc->>TaskEnc: select_samples_to_pack()<br/>+ pack_selected_samples()
        TaskEnc->>TaskEnc: batch() → QwenVLTaskBatch<br/>(cu_lengths, cu_seqlens_argmin)
        Step->>Step: forward_step() detects cu_seqlens<br/>in batch → fast return<br/>cu_seqlens, moe_padding_mask
        Step->>Model: forward(rope_cu_seqlens,<br/>moe_padding_mask, position_ids=None)
        Model->>Model: derive MRoPE positions<br/>from cu_seqlens
    end
```
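The "derive MRoPE positions from cu_seqlens" step in the packed path amounts to resetting position ids at every packed-sequence boundary. A minimal plain-Python sketch of that idea follows; the helper name is illustrative and this is not the actual Qwen3VL implementation, which also handles the multimodal 3D rope axes:

```python
# Hypothetical sketch: rebuild per-sequence position ids from cu_seqlens,
# restarting at 0 for each sub-sequence in the packed (THD) batch.
def positions_from_cu_seqlens(cu_seqlens):
    positions = []
    # consecutive cu_seqlens entries delimit one packed sub-sequence each
    for start, end in zip(cu_seqlens, cu_seqlens[1:]):
        positions.extend(range(end - start))  # 0 .. len-1 per sub-sequence
    return positions

# Three packed sequences of lengths 3, 2, and 4:
print(positions_from_cu_seqlens([0, 3, 5, 9]))
# -> [0, 1, 2, 0, 1, 0, 1, 2, 3]
```

This is why the packed path passes `position_ids=None`: the model can reconstruct boundary-aware positions from `cu_seqlens` alone.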
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/megatron/bridge/models/qwen_vl/qwen3_vl_step.py (1)
251-265: ⚠️ Potential issue | 🟠 Major

`batch_level_packing` is enabled before the THD metadata is actually wired through.

This path now turns on packed handling for `dataset.batch_level_packing`, but it still re-synthesizes `PackedSeqParams`, rewrites the mask to all ones, and never forwards the new `padding_mask` / rope-MoE boundary metadata. THD batches will therefore keep counting alignment padding as real tokens on this Qwen3-specific step. Mirror the metadata flow in `src/megatron/bridge/training/vlm_step.py` before enabling this mode here.

Also applies to: 280-292
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/megatron/bridge/models/qwen_vl/qwen3_vl_step.py` around lines 251 - 265, The code enables batch_level_packing (via dataset.batch_level_packing and variable batch_level_pack_enabled) before the THD/packed-sequence metadata is propagated, but then calls pack_or_pad_batch_sequences which re-synthesizes PackedSeqParams, overwrites masks (loss_mask/attention_mask) and never forwards the original padding_mask / rope-MoE boundary metadata, causing THD batches to treat alignment padding as real tokens; fix by mirroring the metadata flow from src/megatron/bridge/training/vlm_step.py: when batch_level_packing (and in_batch_pack_enabled) is true, do not re-create PackedSeqParams blindly—retrieve and forward the existing PackedSeqParams/padding_mask/rope-MoE boundary metadata from this_pg_collection (or the THD batch metadata), ensure pack_or_pad_batch_sequences is invoked in a mode that preserves and returns the original packed_seq_params and padding_mask, and wire those returned values into subsequent steps so that batch_level_packing is only enabled after the proper metadata has been wired through.

src/megatron/bridge/training/utils/packed_seq_utils.py (1)
46-57: ⚠️ Potential issue | 🔴 Critical

The `torch.argmin` fallback trims every `cu_seqlens*` tensor to empty.

`cu_seqlens` starts with `0` by construction, so `torch.argmin(...)` returns the first element, not the first sentinel tail. Any batch that reaches this fallback path will build empty boundary arrays and invalid `PackedSeqParams`.

Suggested change
```diff
 if cu_seqlens_argmin is not None:
     cu_seqlens_padded = cu_seqlens_padded[: cu_seqlens_argmin.item()]
 else:
-    cu_seqlens_padded = cu_seqlens_padded[: torch.argmin(cu_seqlens_padded)]
+    padded_tail = torch.nonzero(cu_seqlens_padded[1:] == 0, as_tuple=False)
+    first_tail = padded_tail[0].item() + 1 if padded_tail.numel() else cu_seqlens_padded.numel()
+    cu_seqlens_padded = cu_seqlens_padded[:first_tail]
 if cu_seqlens_unpadded is not None:
     if cu_seqlens_unpadded_argmin is not None:
         cu_seqlens_unpadded = cu_seqlens_unpadded[: cu_seqlens_unpadded_argmin.item()]
     else:
-        cu_seqlens_unpadded = cu_seqlens_unpadded[: torch.argmin(cu_seqlens_unpadded)]
+        unpadded_tail = torch.nonzero(cu_seqlens_unpadded[1:] == 0, as_tuple=False)
+        first_tail = unpadded_tail[0].item() + 1 if unpadded_tail.numel() else cu_seqlens_unpadded.numel()
+        cu_seqlens_unpadded = cu_seqlens_unpadded[:first_tail]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/megatron/bridge/training/utils/packed_seq_utils.py` around lines 46 - 57, The fallback use of torch.argmin on cu_seqlens_padded and cu_seqlens_unpadded yields 0 (because they start with 0) and trims tensors to empty; replace both torch.argmin fallbacks with logic that finds the start index of the sentinel/tail region (e.g., the first index whose value equals the final cumulative value or otherwise the first index where values stop increasing) and slice up to that index; update the branches that reference cu_seqlens_padded, cu_seqlens_argmin, cu_seqlens_unpadded, and cu_seqlens_unpadded_argmin to use this computed tail_start instead of torch.argmin to avoid producing empty boundary arrays.
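To make the failure mode concrete: every `cu_seqlens` tensor begins with `0`, so `argmin` lands on index 0 and the slice `[:0]` is empty. A plain-Python sketch of the safe tail-trim the suggestion implements (the real code operates on torch tensors; the helper name is illustrative):

```python
# Sketch: trim sentinel tail padding from a cu_seqlens list. The first
# element is always 0 by construction, so we look for the first zero AFTER
# position 0, which marks the start of the padded tail.
def trim_cu_seqlens(cu_seqlens):
    for i, v in enumerate(cu_seqlens[1:], start=1):
        if v == 0:              # first sentinel entry in the tail
            return cu_seqlens[:i]
    return cu_seqlens           # no tail padding present

print(trim_cu_seqlens([0, 3, 7, 12, 0, 0, 0]))  # -> [0, 3, 7, 12]
print(trim_cu_seqlens([0, 4, 9]))               # -> [0, 4, 9]
```

A naive `argmin`-style search would pick index 0 in both cases and return an empty list, which is exactly the bug described above.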
🧹 Nitpick comments (3)
src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py (3)
470: Remove unused `merge_length` variable.

The `merge_length` variable is assigned but never used in `pack_selected_samples()`.

Suggested fix

```diff
- merge_length = self.merge_size ** 2
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py` at line 470, In pack_selected_samples(), remove the unused local variable merge_length (assignment: merge_length = self.merge_size ** 2) since it is never referenced; simply delete that line from the pack_selected_samples function (reference symbols: merge_length, self.merge_size, pack_selected_samples) so no unused variable remains.
528-531: Rename ambiguous loop variable `l` to `length`.

Single-letter `l` is easily confused with `1` (one). Use a descriptive name.

Suggested fix

```diff
- for l in sub_lengths:
-     cu_lengths_list.append(cu_lengths_list[-1] + l)
+ for length in sub_lengths:
+     cu_lengths_list.append(cu_lengths_list[-1] + length)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py` around lines 528 - 531, The loop uses an ambiguous single-letter variable `l` when iterating `sub_lengths`; rename it to a descriptive name like `length` in the loop that builds `cu_lengths_list` (the block that defines `cu_lengths_list = [0]`, iterates over `sub_lengths`, appends cumulative sums, and constructs `cu_lengths`), updating the iteration variable wherever referenced to avoid confusion with the digit `1`.
68-104: Add `strict=True` to `zip()` for safer iteration.

The `zip()` at line 85 should include `strict=True` to catch length mismatches between `item_sizes` and `samples`, which would indicate a bug in the caller.

Suggested fix

```diff
- sorted_sizes, sorted_samples = zip(*sorted(zip(item_sizes, samples), key=lambda x: x[0]))
+ sorted_sizes, sorted_samples = zip(*sorted(zip(item_sizes, samples, strict=True), key=lambda x: x[0]), strict=True)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py` around lines 68 - 104, The zip() call that builds sorted_sizes and sorted_samples in greedy_knapsack should use strict=True to catch any unexpected length mismatches between item_sizes and samples; update the unzip step (currently: sorted_sizes, sorted_samples = zip(*sorted(zip(item_sizes, samples), key=lambda x: x[0]))) to pass strict=True to the inner zip so the tuple unpacking of sorted_sizes and sorted_samples will raise immediately if lengths differ, keeping the existing sort and subsequent logic (sorted_sizes, sorted_samples, _search_for_fit, popping) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/megatron/bridge/models/conversion/auto_bridge.py`:
- Around line 53-59: The optional import of is_quantized currently uses a broad
except Exception which can mask real errors; change the except to catch
ImportError only and emit a fallback log message including the exception reason
before defining the fallback is_quantized function so callers know quantization
detection was unavailable; update the try/except around "from
modelopt.torch.quantization.utils import is_quantized" to except ImportError as
e and log the fallback (using the module logger or logging.getLogger(__name__))
consistent with the other optional-import patterns in this file.
In `@src/megatron/bridge/recipes/__init__.py`:
- Around line 21-27: The current single try/except around "from
megatron.bridge.diffusion.recipes.flux.flux import *" and "from
megatron.bridge.diffusion.recipes.wan.wan import *" hides one optional family if
the other fails; split them into independent guarded imports so a
ModuleNotFoundError in flux doesn't suppress wan (and vice versa) by wrapping
each import in its own try/except (catch ModuleNotFoundError and pass) and keep
the same module names (flux, wan) so exports remain correct.
In `@src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py`:
- Around line 737-739: cu_seqlens_argmin is created with dtype=torch.int32 but
downstream code (and vlm_step.py) expects int64; change the creation of
cu_seqlens_argmin from torch.tensor(cu_seqlens_argmin_list, dtype=torch.int32)
to use dtype=torch.int64 so its .item()/host handling is consistent with
vlm_step.py; ensure related tensors (e.g., max_lengths if needed) remain
unchanged unless they also require int64.
In `@src/megatron/bridge/recipes/qwen_vl/qwen3_vl.py`:
- Around line 29-32: Replace the broad "except Exception" guards around the
Qwen3VLProcessor import and the runtime .from_pretrained() load with narrow
exception types and record a log when falling back: for the import block
catching Qwen3VLProcessor, catch only ImportError and ModuleNotFoundError and
log the exception and that the processor is unavailable; for the runtime load
around Qwen3VLProcessor.from_pretrained(), catch AttributeError (missing
submodule) and expected load errors such as OSError/RuntimeError (or ValueError)
and log the exception and that you're using the fallback path; reference
Qwen3VLProcessor and its from_pretrained() call when implementing these changes
so the handlers are applied to the correct places.
In `@src/megatron/bridge/recipes/qwen_vl/qwen35_vl.py`:
- Around line 393-417: The docstring for qwen35_vl_35b_a3b_sft_energon_config
incorrectly claims "enables batch-level packing by default (THD path)" while the
implementation sets cfg.dataset.batch_level_packing = False (and also disables
cfg.dataset.pack_sequences_in_batch); either update the docstring to state that
both in-batch and batch-level packing are disabled by default and must be
enabled via CLI overrides, or change the default in the function to
cfg.dataset.batch_level_packing = True (and adjust packing_buffer_size/policy
comments accordingly) so the code matches the docstring; ensure references to
cfg.dataset.pack_sequences_in_batch, cfg.dataset.batch_level_packing, and
cfg.dataset.packing_buffer_size are consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 48c45fd4-5bfc-4e7e-957c-c94172451c57
📒 Files selected for processing (17)

- 3rdparty/Megatron-LM
- src/megatron/bridge/data/energon/energon_provider.py
- src/megatron/bridge/models/conversion/auto_bridge.py
- src/megatron/bridge/models/qwen_vl/modelling_qwen3_vl/model.py
- src/megatron/bridge/models/qwen_vl/modelling_qwen3_vl/text_model.py
- src/megatron/bridge/models/qwen_vl/modelling_qwen3_vl/transformer_block.py
- src/megatron/bridge/models/qwen_vl/modelling_qwen3_vl/utils.py
- src/megatron/bridge/models/qwen_vl/qwen35_vl_provider.py
- src/megatron/bridge/models/qwen_vl/qwen3_vl_step.py
- src/megatron/bridge/recipes/__init__.py
- src/megatron/bridge/recipes/qwen_vl/__init__.py
- src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py
- src/megatron/bridge/recipes/qwen_vl/qwen35_vl.py
- src/megatron/bridge/recipes/qwen_vl/qwen3_vl.py
- src/megatron/bridge/training/config.py
- src/megatron/bridge/training/utils/packed_seq_utils.py
- src/megatron/bridge/training/vlm_step.py
```python
cu_seqlens_argmin_list = [len(cu) for cu in cu_lengths_list]
cu_seqlens_argmin = torch.tensor(cu_seqlens_argmin_list, dtype=torch.int32)
max_lengths = torch.tensor(max_lengths_list, dtype=torch.int32)
```

`cu_seqlens_argmin` dtype should be `torch.int64` to match downstream expectations.

The downstream code in vlm_step.py (lines 122-125) expects `cu_seqlens_argmin` on host with a dtype compatible with `.item()` returning an integer. While int32 works, vlm_step.py line 414 explicitly creates it as `torch.int64`. For consistency:

Suggested fix

```diff
- cu_seqlens_argmin = torch.tensor(cu_seqlens_argmin_list, dtype=torch.int32)
+ cu_seqlens_argmin = torch.tensor(cu_seqlens_argmin_list, dtype=torch.int64)
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
cu_seqlens_argmin_list = [len(cu) for cu in cu_lengths_list]
cu_seqlens_argmin = torch.tensor(cu_seqlens_argmin_list, dtype=torch.int64)
max_lengths = torch.tensor(max_lengths_list, dtype=torch.int32)
```
Thanks for the suggestion. cu_seqlens_argmin is currently consumed via .item() / host-side scalar reads, where torch.int32 and torch.int64 are functionally equivalent. In this code path, we intentionally keep int32 to stay consistent with surrounding packed-seq metadata tensors and avoid unnecessary dtype conversions. So here, changing to int64 is a consistency preference, not a functional requirement. If we later standardize all sequence-length/index metadata to int64 across the full pipeline, we can update this in one pass.
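The functional-equivalence claim is easy to verify: for host-side scalar reads, `.item()` yields the same plain Python `int` regardless of whether the tensor is stored as int32 or int64:

```python
# Quick check: .item() on an int32 tensor and an int64 tensor both return
# an ordinary Python int with the same value.
import torch

a = torch.tensor([4], dtype=torch.int32)
b = torch.tensor([4], dtype=torch.int64)
assert a[0].item() == b[0].item() == 4
print(type(a[0].item()).__name__)  # -> int
```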
The PR description image shows four training metric charts for the Qwen 35B-A3B model, comparing three runs: qwen35vl_35B-A3B_bin0_mbs1_tp1cp1pp1ep4_energon_cord (pink), qwen35vl_35B-A3B_bin0_mbs1_tp1cp1pp1ep4_energon_cord_nopa (purple), and qwen35vl_35B-A3B_mbs4_tp1cp1pp1ep8 (brown). The charts show lm loss, mtp_1 loss, grad-norm, and global_load_balancing_loss over ~150 steps. The lm and mtp losses converge smoothly and comparably across all three runs. The grad-norm shows some early spikes but stabilizes similarly. The global_load_balancing_loss shows more differentiation — the brown (non-Energon) run has a faster declining curve starting around step 25, while the two Energon runs track closely together with a slightly higher and slower decline.
@DAISY-gh, that's a fair design rationale — since cu_seqlens_argmin is only consumed via .item() / host-side scalar reads in this path, int32 is functionally equivalent to int64 and keeping it consistent with the surrounding int32 packed-seq metadata (e.g., padded_cu, max_lengths) avoids unnecessary conversions. Acknowledged, and I'll note this for future reviews.
✏️ Learnings added
Learnt from: DAISY-gh
Repo: NVIDIA-NeMo/Megatron-Bridge PR: 3323
File: src/megatron/bridge/recipes/qwen_vl/data/energon/task_encoder.py:737-739
Timestamp: 2026-04-14T13:42:22.684Z
Learning: In NVIDIA-NeMo/Megatron-Bridge (task_encoder.py), `cu_seqlens_argmin` for Qwen-VL Energon packed batches is intentionally created as `torch.int32` (not `torch.int64`) to stay consistent with surrounding packed-sequence metadata tensors (`padded_cu`, `max_lengths`). It is only consumed via `.item()` / host-side scalar reads where `int32` and `int64` are functionally equivalent. Do not flag the `int32` dtype as a bug or inconsistency.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: cuichenx
Repo: NVIDIA-NeMo/Megatron-Bridge PR: 3100
File: src/megatron/bridge/recipes/qwen_vl/qwen35_vl.py:196-270
Timestamp: 2026-04-01T22:37:23.790Z
Learning: In NVIDIA-NeMo/Megatron-Bridge, treat Qwen3.5 model HuggingFace identifiers (e.g., "Qwen/Qwen3.5-9B", "Qwen/Qwen3.5-35B-A3B", "Qwen/Qwen3.5-122B-A10B", "Qwen/Qwen3.5-397B-A17B") as vision-language models even though the identifier text does not include "VL". When review feedback suggests these are text-only due to the model name, do not flag them as incorrect if they are used as VLM pretrain/SFT/PEFT `hf_path` values (as in `qwen35_vl.py` and its `qwen35_vl_*` functions).
Learnt from: aroshanghias-nvd
Repo: NVIDIA-NeMo/Megatron-Bridge PR: 2007
File: src/megatron/bridge/data/mimo/dataset.py:158-166
Timestamp: 2026-01-28T20:41:39.210Z
Learning: In Bridge Megatron-Core datasets, use torch.ones_like(input_ids) for attention masks instead of masking padding tokens. The Megatron-Core attention kernel handles causal masking internally during the forward pass, not at the dataset level. This pattern should apply broadly to Bridge datasets: SFT Dataset uses get_attention_mask_from_fusion=True (setting attention_mask=None), VLM Providers use skip_getting_attention_mask_from_dataset=True, and Qwen VL uses torch.ones_like(input_ids). Ensure dataset code consistently generates input_ids with a corresponding attention_mask of ones (or None where the kernel derives it) and verify through the dataset and forward-pass configuration that the model relies on the kernel's internal masking.
Learnt from: yaoyu-33
Repo: NVIDIA-NeMo/Megatron-Bridge PR: 2342
File: src/megatron/bridge/models/qwen_omni/thinker_model.py:326-345
Timestamp: 2026-02-13T00:22:54.549Z
Learning: In Python implementations (e.g., src/megatron/bridge/models/qwen_omni/thinker_model.py), when a feature is not supported (such as audio embeddings), raise an explicit error (e.g., NotImplementedError) instead of silently ignoring the input to fail fast with a clear message. Use a descriptive exception message that identifies the unsupported feature and the expected behavior.
Pass a THD padding_mask derived from lm_input_ids into GPTModel so MoE router aux/global-aux metrics exclude tail padding and match BSHD token-count semantics. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Allow Qwen3VLGPTModel and Qwen3VLTransformerBlock to accept and propagate padding_mask so THD MoE routing can consume valid-token masking without crashing. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Forward padding_mask from Qwen3VLGPTModel into _postprocess so MTP MoE global aux-loss uses masked token accounting consistently. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Point submodule to the main-based Megatron-LM branch commit that propagates padding_mask through MTP MoE routing. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Guard diffusion recipe imports in recipes.__init__ so missing optional dependencies like diffusers do not block LLM/VLM training startup. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
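The guarded-import approach this commit describes, combined with the review's request to split the guards per recipe family, can be sketched as below. The module paths come from the review thread; the comments are illustrative:

```python
# Sketch: independent try/except per optional recipe family, so a missing
# `diffusers` dependency for one family does not hide the other.
try:
    from megatron.bridge.diffusion.recipes.flux.flux import *  # noqa: F401,F403
except ModuleNotFoundError:
    pass  # flux recipes are optional

try:
    from megatron.bridge.diffusion.recipes.wan.wan import *  # noqa: F401,F403
except ModuleNotFoundError:
    pass  # wan recipes are optional
```

A single try block wrapping both imports would skip the wan recipes whenever the flux import fails first, even if wan's dependencies are present.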
Avoid import-time crashes in training environments missing optional diffusion/modelopt/transformers pieces by adding graceful fallbacks for quantization and processor loading. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
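One of the graceful fallbacks this commit describes, narrowed per the review feedback, might look like the following sketch. The import path is taken from the review thread; the warning text and fallback body are illustrative assumptions, not the actual auto_bridge.py code:

```python
# Sketch: optional modelopt import with a logged fallback, catching only
# ImportError so real bugs in modelopt are not masked.
import logging

logger = logging.getLogger(__name__)

try:
    from modelopt.torch.quantization.utils import is_quantized
except ImportError as e:  # only the missing-dependency case
    logger.warning("modelopt unavailable (%s); quantization detection disabled", e)

    def is_quantized(module):
        # Without modelopt we cannot detect quantization; report not-quantized.
        return False
```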
Signed-off-by: Daisy Gao <daisyg@nvidia.com>
…cessor fallback Signed-off-by: Daisy Gao <daisyg@nvidia.com>
…step Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Signed-off-by: Daisy Gao <daisyg@nvidia.com>
…sync TaskEncoder seq_len from CLI override - QwenVLTaskBatch: pass __key__ and __restore_key__ to satisfy Energon Batch base class - energon_provider: sync TaskEncoder.seq_len from EnergonProvider.seq_length at build time so CLI overrides (e.g. dataset.seq_length=8192) propagate to packing capacity and padding Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Signed-off-by: Daisy Gao <daisyg@nvidia.com>
…D format Signed-off-by: Daisy Gao <daisyg@nvidia.com>
…ng segment to avoid GDN NaN Signed-off-by: Daisy Gao <daisyg@nvidia.com>
…schedule.py convention Signed-off-by: Daisy Gao <daisyg@nvidia.com>
…oss regression Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Pass position_ids=None for VLM batches so Qwen3VLModel runs its packed MRoPE path, and add short-lived debug logs to verify packed boundary position resets are actually applied. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Add local helper functions used by THD diagnostic logging after cherry-picking THD MRoPE fixes, preventing runtime NameError for _thd_diag_enabled/_rank0. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Add missing Energon batch key fields and normalize/concat visual blocks along token dimension so packed multimodal batches no longer fail on mixed visual lengths. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Drop hardcoded first-iteration debug dumps from vlm_step and Qwen3VL MRoPE path so training logs stay clean while preserving core packed-sequence behavior. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Route THD diagnostic logs behind THD_DIAG and compute image token counts from the active model config so IterStats reflects real tokenizer ids. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Add switch-gated low-content packed-sample diagnostics and gate remaining THD debug logs so all related debug output is controlled by THD_DIAG. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Connect THD_FORCE_BSHD, THD_FORCE_SINGLE_SEGMENT_CU, and THD_SKIP_PREPROCESS_PACKED_POS to actual vlm_step control flow, and log switch activation only when enabled. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Keep unpadded cu_seqlens for MRoPE sub-sequence semantics while passing padded boundaries for TE/GDN kernels, and limit THD switch logs to rank0. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Avoid switching to the unstable unpadded TE packed path by keeping PackedSeqParams padded-only, and pass unpadded cu_seqlens separately for Qwen MRoPE sub-sequence indexing. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Add rank0 THD_DIAG logs to correlate packed boundaries, MoE padding mask counts, and loss-mask valid tokens without changing training behavior. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Build MoE padding mask from packed boundaries and pass it explicitly through Qwen3VL instead of inferring padding from lm_input_ids.eq(0), with fallback kept for compatibility. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Ensure packed mask remapping uses boolean indexing semantics when preprocessing moe_padding_mask, preventing explicit THD padding mask from collapsing to zero. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Add a default-off THD_DIAG_ALIGN switch to log pre/post packed remap summaries for labels and loss_mask, so we can directly verify whether THD preprocessing preserves supervision alignment. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Add opt-in THD MRoPE pre/post remap diagnostics behind THD_DIAG_MROPE to inspect position semantics without affecting default behavior. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Use boolean attention masks at all THD preprocess_packed_seqs call sites and add a function-level bool cast guard to prevent silent positional remap corruption. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Add opt-in strict diagnostics to compare THD-remapped position ids against explicit bool-mask gather semantics for cp_size==1. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Add opt-in THD boundary diagnostics to compare kernel vs rope cu summaries and report boundary-window label/loss/moe mask consistency, while keeping all new logic disabled by default. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Document how argmin-based trimming and padded boundary fields are used so THD packed-kernel boundary expectations are explicit. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Allow EnergonProvider to accept and forward shuffle_buffer_size so training scripts can explicitly control sample shuffling behavior in THD/BSHD comparison runs. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Apply nodebug recipe/provider updates while intentionally keeping the Megatron-LM submodule pointer unchanged. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Introduce independent batch-level packing control for THD dataloader, keep in-batch packing semantics isolated, and update validation/step logic so the two modes cannot be mixed. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Keep both in-batch and THD batch-level packing disabled by default, requiring explicit CLI overrides to enable either mode and avoid unintended coupling. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Force-pushed from 8d8b782 to 30106c3
Narrow optional-import exception scopes, add fallback warnings, split diffusion optional imports, and align the qwen35 energon docstring with actual default packing behavior. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
…level packing Use safe cu_seqlens tail trimming when argmin metadata is absent, and prevent qwen3_vl_step from enabling batch-level packing before full THD metadata wiring. Signed-off-by: Daisy Gao <daisyg@nvidia.com>
Summary
Summary by CodeRabbit
Release Notes
New Features
Improvements