Use the GPU family macros for the ROCm warp size#49016

Merged
cmsbuild merged 1 commit into cms-sw:master from
fwyzard:update_warpsize_for_ROCm_7.0
Sep 30, 2025

@fwyzard
Contributor

@fwyzard fwyzard commented Sep 29, 2025

PR description:

Use the GPU family macros instead of the individual GPU macros to define the ROCm warp size.
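
As a rough illustration of the idea (not the code changed in this PR), the sketch below selects a compile-time warp size from the compiler's GPU family macros (`__GFX9__`, `__GFX10__`, `__GFX11__`, `__GFX12__`) rather than listing individual targets such as `__gfx942__` or `__gfx1100__`. The names `warp_size_for_family` and `kWarpSize` are hypothetical, and the 64/32 values assume the default wavefront mode of each family.

```cpp
// Illustrative sketch only: hypothetical names, not the CMSSW implementation.

// Wavefront ("warp") size assumed for each AMD GPU family, keyed by the major
// gfx version. GFX9 devices (e.g. MI300X, gfx942) use 64 lanes; GFX10, GFX11
// and GFX12 devices (e.g. W7900, gfx1100) are assumed to run in their default
// 32-lane mode. Any other value maps to 0, meaning "unknown".
constexpr int warp_size_for_family(int gfx_major) {
  return gfx_major == 9 ? 64 : (gfx_major >= 10 && gfx_major <= 12) ? 32 : 0;
}

// The family macros are defined by the compiler only while compiling device
// code for the matching target, so one check per family replaces a long list
// of per-GPU macros.
#if defined(__GFX9__)
constexpr int kWarpSize = warp_size_for_family(9);
#elif defined(__GFX10__)
constexpr int kWarpSize = warp_size_for_family(10);
#elif defined(__GFX11__)
constexpr int kWarpSize = warp_size_for_family(11);
#elif defined(__GFX12__)
constexpr int kWarpSize = warp_size_for_family(12);
#else
// Host compilation pass, or a target this sketch does not know about.
constexpr int kWarpSize = 0;
#endif
```

With this layout, a new GPU in an existing family needs no change, and a new family needs a single additional branch.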

PR validation:

Tests pass.

@cmsbuild
Contributor

cmsbuild commented Sep 29, 2025

cms-bot internal usage

@fwyzard fwyzard changed the title from "Use the GPU family macros for the ROCm warp sie" to "Use the GPU family macros for the ROCm warp size" Sep 29, 2025
@cmsbuild
Contributor

A new Pull Request was created by @fwyzard for master.

It involves the following packages:

  • HeterogeneousCore/AlpakaInterface (heterogeneous)

@cmsbuild, @fwyzard, @makortel can you please review it and eventually sign? Thanks.
@makortel, @rovere this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@fwyzard
Contributor Author

fwyzard commented Sep 29, 2025

test parameters:

  • enable_tests = gpu
  • gpu = amd_w7900,amd_mi300x

@fwyzard
Contributor Author

fwyzard commented Sep 29, 2025

please test

@fwyzard
Contributor Author

fwyzard commented Sep 29, 2025

+heterogeneous

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @sextonkennedy, @ftenchini (and backports should be raised in the release meeting by the corresponding L2)

@cmsbuild
Contributor

-1

Failed Tests: RelVals
Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-54d638/48322/summary.html
COMMIT: 6970d6c
CMSSW: CMSSW_16_0_X_2025-09-28-2300/el8_amd64_gcc12
Additional Tests: GPU,AMD_MI300X,AMD_W7900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/49016/48322/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

  • 2025.0000001 DAS Error
  • 2024.0030001 DAS Error
  • 2024.0070001 DAS Error
  • 2024.0050001
  • 2024.0060001
  • 2024.0040001

AMD_W7900 Comparison Summary

Summary:

@fwyzard
Contributor Author

fwyzard commented Sep 29, 2025

please test

@cmsbuild
Contributor

+1

Size: This PR adds an extra 20KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-54d638/48326/summary.html
COMMIT: 6970d6c
CMSSW: CMSSW_16_0_X_2025-09-28-2300/el8_amd64_gcc12
Additional Tests: GPU,AMD_MI300X,AMD_W7900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/49016/48326/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
2024.0050001 step 1
The results for the comparisons for these workflows could be incomplete
This means most likely that the IB is having errors in the relvals. The error does NOT come from this pull request

Summary:

  • You potentially removed 1 lines from the logs
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3861349
  • DQMHistoTests: Total failures: 23
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3861306
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 184 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

AMD_MI300X Comparison Summary

Summary:

AMD_W7900 Comparison Summary

Summary:

@ftenchini

+1

@ftenchini

Actually, what's going on here? It's got both the "tests approved" and "tests started" label.

@fwyzard
Contributor Author

fwyzard commented Sep 30, 2025

The reason is that the same commit (same branch) is used for this PR and for #48948.

@fwyzard
Contributor Author

fwyzard commented Sep 30, 2025

But by now the tests should be completed for both PRs, so it looks like the bot got confused 🤔

@iarspider can you check?

@iarspider
Contributor

iarspider commented Sep 30, 2025 via email

@iarspider
Contributor

There is a bug in the library that cmsbot uses to communicate with GitHub: if there are too many statuses, it only returns the first 30, so the bot doesn't see the status that should trigger posting the results. For now, could you please push a dummy (empty) commit to get a new sha and rerun the tests?

@fwyzard
Contributor Author

fwyzard commented Sep 30, 2025

Ehm... since the tests passed and the PR is fully signed, can we just merge it?

@iarspider
Contributor

Fine with me, but ultimately it's up to @cms-sw/orp-l2.

@mandrenguyen
Contributor

merge

@cmsbuild cmsbuild merged commit 5cbdf19 into cms-sw:master Sep 30, 2025
35 checks passed
@fwyzard fwyzard deleted the update_warpsize_for_ROCm_7.0 branch March 22, 2026 17:52
6 participants