39 changes: 33 additions & 6 deletions docs/releases/release-process.md
@@ -28,7 +28,32 @@ From RC3 onward, RCs are cut **more frequently and as needed**, rather than stri

## Golden Values

Golden values are reference outputs used to validate model behavior in CI.
Golden values are reference outputs used to validate model behavior in CI. They live in the **internal CI repository** and are the baseline for the internal regression tracker — keeping them current and accurate is therefore critical for meaningful signal.

### When to update golden values

Any PR that can affect performance metrics (e.g. changes to model code, training loop, optimizer, or numerical kernels) **must be accompanied by a corresponding internal PR that updates the golden values** before merging. Do not wait until after the PR lands.
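
As a rough illustration of the rule above, a pre-merge check could flag PRs that touch performance-sensitive areas. This is only a sketch: the path prefixes and the `needs_golden_update` helper are hypothetical and do not describe the real repository layout or any existing tooling.

```python
from typing import Iterable

# Hypothetical prefixes standing in for "model code, training loop,
# optimizer, or numerical kernels"; the real layout is an assumption.
PERF_SENSITIVE_PREFIXES = (
    "src/models/",
    "src/training/",
    "src/optim/",
    "src/kernels/",
)


def needs_golden_update(
    changed_paths: Iterable[str],
    prefixes: tuple[str, ...] = PERF_SENSITIVE_PREFIXES,
) -> bool:
    """Return True if any changed file falls under a performance-sensitive prefix."""
    # str.startswith accepts a tuple and matches if any prefix matches.
    return any(path.startswith(prefixes) for path in changed_paths)
```

A PR for which this returns `True` would be expected to have a matching internal golden-values PR open before it merges.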

Review comment (Contributor):

Should there be a statement covering exceptions? For example: "Exceptions can be made on rare occasions when GPU availability is an issue (the cluster is offline, compute availability is low, etc.)."

### Updating golden values for PRs targeting `main`

1. **Rebase the MBridge PR against `main`** so it is at top-of-tree before launching CI.
2. **Launch an internal CI run** using:
- The **latest nightly container** as the base image.
- The **latest MCore commit** on `main`.
- The **MBridge PR commit** (the head of your MBridge branch).
3. Collect the outputs and open a PR against the **internal CI repository's `main` branch** with the updated golden values.
4. The MBridge PR and the internal golden-values PR should be merged together (or the golden-values PR first).

### Updating golden values during a release

When golden values need to be refreshed on the release branch (e.g. at the start of code-freeze or after an accepted regression):

1. **Rebase the MBridge PR against the MBridge release branch** so it is at the head of that branch.
2. **Launch an internal CI run** using:
- The **latest internal RC container** for the release.
- The **MCore commit pinned on the release branch**.
- The **MBridge PR commit** (head of the MBridge release branch).
3. Open a PR against the **internal CI repository's release branch** with the updated golden values.
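
The two procedures above differ only in which container, which MCore commit, and which internal branch they use. A minimal sketch of that mapping follows; `CIRunInputs` and `plan_ci_run` are hypothetical names, and the string values are descriptive placeholders rather than real image tags or refs. The sketch takes the PR's own head SHA as an explicit argument so that the commit under test is never inferred from a branch name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CIRunInputs:
    container: str         # base image for the internal CI run
    mcore_ref: str         # MCore commit to build against
    mbridge_ref: str       # MBridge commit under test (the PR head SHA)
    golden_pr_target: str  # internal CI repository branch receiving the golden-values PR


def plan_ci_run(target: str, pr_head_sha: str) -> CIRunInputs:
    """Map a PR target ("main" or "release") to the CI inputs listed in the steps above."""
    if target == "main":
        return CIRunInputs(
            container="latest nightly container",
            mcore_ref="latest MCore commit on main",
            mbridge_ref=pr_head_sha,
            golden_pr_target="main",
        )
    if target == "release":
        return CIRunInputs(
            container="latest internal RC container",
            mcore_ref="MCore commit pinned on the release branch",
            mbridge_ref=pr_head_sha,
            golden_pr_target="release branch",
        )
    raise ValueError(f"unknown target: {target!r}")
```

For example, `plan_ci_run("release", "<PR head SHA>")` yields the release-branch inputs, with the MBridge ref set to whatever SHA was passed in.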

Review comment (Contributor) on lines +47 to 57:

⚠️ Potential issue | 🟡 Minor

Clarify the commit source on Line 55 to avoid wrong SHA selection.

Line 55 says “MBridge PR commit” and then defines it as “head of the MBridge release branch.” That can conflict when multiple PRs target the same release branch. Recommend explicitly saying “the head commit of the PR branch (or exact PR SHA).”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/releases/release-process.md` around lines 47 - 57, The phrase "MBridge
PR commit" defined as "head of the MBridge release branch" is ambiguous; update
the text so it explicitly instructs to use the head commit of the PR branch (or
the exact PR SHA) to avoid picking the wrong commit when multiple PRs target the
same release branch—replace the current wording "MBridge PR commit (head of the
MBridge release branch)" with something like "MBridge PR commit — the head
commit of the PR branch (use the exact PR branch tip SHA or PR merge SHA)".

### During the RC Phase (before code-freeze)

@@ -41,24 +66,26 @@ This means golden values are not automatically updated with every run — a deli

### On the Release Branch (during code-freeze)

When the release branch is created at code-freeze, all golden values are updated **unconditionally**. Whatever the current output is becomes the new reference baseline for the release.
When the release branch is created at code-freeze, all golden values are updated **unconditionally** — whatever the current output is becomes the new reference baseline for the release.

In **Week 5**, the last bulk update of golden values is performed. After that point, engineers are individually responsible for updating any remaining golden values on the release branch, reviewing discrepancies and ensuring the suite is clean ahead of the release.

-----

## Code-Freeze

Code-freeze lasts **two weeks** and begins when RC3 is cut. This is the **stabilization phase** — no new features are landed.

### First Half
### First Half (Weeks 3–5)

- **Release branches are created.**
- All golden values on the release branch are updated unconditionally (see above).
- The **last bulk CI run** occurs one week into the code-freeze period.
- The **last bulk update of golden values** happens in **Week 5**.
- RCs continue to be cut as needed.

### Second Half
### Second Half (Weeks 6–7)

- **Engineers are responsible for updating golden values** on the release branch — reviewing any remaining discrepancies and ensuring the suite is in a clean state ahead of release.
- **Engineers are individually responsible for updating golden values** on the release branch — reviewing any remaining discrepancies and ensuring the suite is in a clean state ahead of release.
- RCs continue to be cut as needed.

### Release Day