make PR_TEST_MATRIX_EXTRAS_GPU use 2026-based relvals instead of 2025 #2665

Merged
smuzaffar merged 1 commit into cms-sw:master from mmusich:2026relvals_forGPUmatrix
Jan 30, 2026
@mmusich
Contributor

mmusich commented Jan 27, 2026

In PR:

we changed the default HLT menu run in 2025 relvals to be the Fake one.
This makes the relval GPU matrix less useful than it could be, because we can't run the heterogeneous reco in the HLT itself.
This PR proposes to run the equivalent 2026 workflows that are already in the relval_gpu matrix.

@AdrianoDee FYI

@cmsbuild
Contributor

A new Pull Request was created by @mmusich for branch master.

@akritkbehera, @cmsbuild, @iarspider, @raoatifshad, @smuzaffar can you please review it and eventually sign? Thanks.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Jan 27, 2026

cms-bot internal usage

@mmusich
Contributor Author

mmusich commented Jan 29, 2026

enable gpu

@mmusich
Contributor Author

mmusich commented Jan 29, 2026

@cmsbuild, please test for CMSSW_16_0_X

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3e29f6/50991/summary.html
COMMIT: 59df7be
CMSSW: CMSSW_16_0_X_2026-01-28-2300/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cms-bot/2665/50991/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 1 lines to the logs
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4149103
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4149083
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 198 edm output root files, 53 DQM output files
  • TriggerResults: found differences in 1 / 51 workflows

AMD_MI300X Comparison Summary

Summary:

  • You potentially added 1 lines to the logs
  • Reco comparison results: 227 differences found in the comparisons
  • DQMHistoTests: Total files compared: 11
  • DQMHistoTests: Total histograms compared: 149765
  • DQMHistoTests: Total failures: 33641
  • DQMHistoTests: Total nulls: 10
  • DQMHistoTests: Total successes: 116114
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 10 files compared)
  • Checked 42 log files, 45 edm output root files, 11 DQM output files
  • TriggerResults: found differences in 1 / 10 workflows

AMD_W7900 Comparison Summary

Summary:

NVIDIA_H100 Comparison Summary

Summary:

  • You potentially added 4 lines to the logs
  • Reco comparison results: 234 differences found in the comparisons
  • DQMHistoTests: Total files compared: 11
  • DQMHistoTests: Total histograms compared: 149765
  • DQMHistoTests: Total failures: 25196
  • DQMHistoTests: Total nulls: 8
  • DQMHistoTests: Total successes: 124561
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 10 files compared)
  • Checked 42 log files, 45 edm output root files, 11 DQM output files
  • TriggerResults: found differences in 2 / 10 workflows

NVIDIA_L40S Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 194 differences found in the comparisons
  • DQMHistoTests: Total files compared: 11
  • DQMHistoTests: Total histograms compared: 149765
  • DQMHistoTests: Total failures: 22373
  • DQMHistoTests: Total nulls: 7
  • DQMHistoTests: Total successes: 127385
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 10 files compared)
  • Checked 42 log files, 45 edm output root files, 11 DQM output files
  • TriggerResults: found differences in 1 / 10 workflows
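
The DQMHistoTests counters in the summaries above can be cross-checked: in each populated report, failures, nulls, successes and skipped add up to the total number of histograms compared. Below is a minimal Python sketch of that check, with the numbers copied from the summaries above. The `SUMMARIES` table and the helper names `is_consistent` and `failure_fraction` are illustrative only and are not part of cms-bot; that the counters always sum to the total is an observation about these four reports, not a documented guarantee.

```python
# Counters copied by hand from the comparison summaries above.
# Tuple layout (an assumption of this sketch, not a cms-bot format):
# (total histograms, failures, nulls, successes, skipped)
SUMMARIES = {
    "default": (4149103, 0, 0, 4149083, 20),
    "AMD_MI300X": (149765, 33641, 10, 116114, 0),
    "NVIDIA_H100": (149765, 25196, 8, 124561, 0),
    "NVIDIA_L40S": (149765, 22373, 7, 127385, 0),
}


def is_consistent(total, failures, nulls, successes, skipped):
    """Return True if the four outcome counters add up to the total."""
    return failures + nulls + successes + skipped == total


def failure_fraction(total, failures, *_):
    """Fraction of compared histograms reported as failures."""
    return failures / total


for name, counts in SUMMARIES.items():
    status = "consistent" if is_consistent(*counts) else "INCONSISTENT"
    print(f"{name}: {status}, failure fraction {failure_fraction(*counts):.1%}")
```

The AMD_W7900 report is left out because its summary above is empty.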

@smuzaffar
Contributor

please test

@smuzaffar
Contributor

@mmusich, this looks good to me and I can go ahead and merge it. Note that #2663 also updates 16.1.X GPU workflows.

@mmusich
Contributor Author

mmusich commented Jan 30, 2026

Note that #2663 also updates 16.1.X GPU workflows

let's start with this one, otherwise we're blind to Run3 HLT on GPU...

@cmsbuild
Contributor

-1

Failed Tests: RelVals-AMD_W7900
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-3e29f6/51010/summary.html
COMMIT: 59df7be
CMSSW: CMSSW_16_1_X_2026-01-29-2300/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cms-bot/2665/51010/install.sh to create a dev area with all the needed externals and cmssw changes.

Failed RelVals-AMD_W7900

  • 34634.403: 34634.403_TTbar_14TeV+Run4D121PU_Patatrack_PixelOnlyAlpaka_Validation/step2_TTbar_14TeV+Run4D121PU_Patatrack_PixelOnlyAlpaka_Validation.log
  • 34634.402: 34634.402_TTbar_14TeV+Run4D121PU_Patatrack_PixelOnlyAlpaka/step2_TTbar_14TeV+Run4D121PU_Patatrack_PixelOnlyAlpaka.log
  • 34634.751: 34634.751_TTbar_14TeV+Run4D121PU_HLT75e33TimingAlpaka/step2_TTbar_14TeV+Run4D121PU_HLT75e33TimingAlpaka.log
Expand to see more relval errors ...

Comparison Summary

Summary:

  • You potentially removed 1 lines from the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 52
  • DQMHistoTests: Total histograms compared: 4028550
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4028530
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 51 files compared)
  • Checked 222 log files, 193 edm output root files, 52 DQM output files
  • TriggerResults: no differences found

@mmusich
Contributor Author

mmusich commented Jan 30, 2026

-1

these failures are unrelated, due to cms-sw/cmssw#49795

@smuzaffar
Contributor

+externals

@smuzaffar smuzaffar merged commit 659a091 into cms-sw:master Jan 30, 2026
17 of 21 checks passed
@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (but tests are reportedly failing). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @mandrenguyen, @ftenchini (and backports should be raised in the release meeting by the corresponding L2)

@mmusich mmusich deleted the 2026relvals_forGPUmatrix branch January 30, 2026 14:58