
Change max-memory-used check to use real difference in MB not percentage diff. Increase the threshold to 80MB. Flag PR if check fails. #2622

Merged: smuzaffar merged 1 commit into cms-sw:master from gartung:master on Dec 15, 2025

Conversation

@gartung
Member

@gartung gartung commented Nov 20, 2025

No description provided.

@cmsbuild
Contributor

A new Pull Request was created by @gartung for branch master.

@akritkbehera, @cmsbuild, @iarspider, @smuzaffar can you please review it and eventually sign? Thanks.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.
cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Nov 20, 2025

cms-bot internal usage

Comment thread comparisons/compare-maxmem.py Outdated
Comment thread pr_testing/run-pr-comparisons Outdated
@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-20c417/49585/summary.html
COMMIT: 1ecd417
CMSSW: CMSSW_16_0_X_2025-11-20-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cms-bot/2622/49585/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 4 lines from the logs
  • Reco comparison results: 6 differences found in the comparisons
  • Reco comparison had 2 failed jobs
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3906528
  • DQMHistoTests: Total failures: 71
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3906437
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
  • Checked 218 log files, 188 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@smuzaffar
Contributor

@gartung, currently nothing on the summary page indicates whether any memory comparison triggered a threshold warning or error. Also, the system treats memory increases and decreases the same way (red color whether the change is +10MB or -10MB). Was this done on purpose?

@gartung
Member Author

gartung commented Nov 21, 2025

Yes. It's the absolute value of the difference. Maybe it should only be for positive values.
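
The two behaviours under discussion can be sketched as follows. This is only an illustration, assuming the diff is already expressed in MiB; the function name and the `positive_only` flag are hypothetical and are not taken from compare-maxmem.py.

```python
def exceeds_threshold(diff_mib, threshold_mib, positive_only=False):
    """Return True when a memory diff should be flagged.

    With positive_only=False the check uses abs(diff), so increases and
    decreases are treated alike. With positive_only=True only increases count.
    """
    if positive_only:
        return diff_mib > threshold_mib
    return abs(diff_mib) > threshold_mib


# A 12 MiB decrease is flagged only by the absolute-value check.
print(exceeds_threshold(-12.0, 10.0))                      # True
print(exceeds_threshold(-12.0, 10.0, positive_only=True))  # False
```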

@cmsbuild
Contributor

Pull request #2622 was updated.

4 similar comments

@gartung
Member Author

gartung commented Nov 21, 2025

@smuzaffar Do I need to modify report-pull-request-results.py to flag the PR as failed if the maxmem comparison fails?

@cmsbuild
Contributor

Pull request #2622 was updated.

1 similar comment

@gartung changed the title from "Increase the max memory difference threshold to 100MB. Add failed/FAILED to message to ." to "Change max-memory-used check to use real difference in MB not percentage diff. Increase the threshold to 100MB. Flag PR is check fails." on Nov 21, 2025
@makortel
Contributor

> Also, the system treats memory increases and decreases the same way (red color whether the change is +10MB or -10MB). Was this done on purpose?

> Maybe it should only be for positive values.

I think flagging negative values (i.e. PR decreases the memory usage) would be useful as well. We could start with the same thresholds (10 MB and 80 MB), but maybe use different colors? (two shades of green? blue and green? we will probably end up with a very color blind unfriendly palette though)
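
A rough sketch of what a signed four-way classification could look like, using the 10 MB and 80 MB thresholds mentioned above. The label names and the colour palette are placeholders chosen for illustration, not the ones used in cms-bot.

```python
WARN_MIB = 10.0   # warning threshold from the discussion
ERROR_MIB = 80.0  # error threshold from the discussion

# Hypothetical palette; these colours are placeholders only.
COLORS = {
    "increase-error": "red",
    "increase-warn": "orange",
    "decrease-warn": "lightgreen",
    "decrease-error": "green",
    "ok": "",
}


def classify(diff_mib):
    """Map a signed diff in MiB to a direction-and-severity label."""
    size = abs(diff_mib)
    if size > ERROR_MIB:
        level = "error"
    elif size > WARN_MIB:
        level = "warn"
    else:
        return "ok"
    direction = "increase" if diff_mib > 0 else "decrease"
    return f"{direction}-{level}"


for diff in (3.0, 12.5, 95.0, -12.5, -95.0):
    label = classify(diff)
    print(f"{diff:+7.1f} MiB -> {label} ({COLORS[label] or 'no highlight'})")
```

Keeping the direction in the label means the colour choice stays a separate lookup, so the palette can be changed without touching the threshold logic.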

@makortel
Contributor

If any job's memory usage goes over the "error" threshold of 80 MB, I'd like the PR test summary message (i.e. cms-sw/cmsdist#10199 (comment)) to say so at a high level, e.g. the number of workflow steps (jobs) whose memory usage increased (or decreased).

@smuzaffar Would you be ok for now to tag @cms-sw/core-l2 in that message? For the beginning I'd like to inspect the behavior. Eventually we could drop the tagging.
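
The kind of high-level message requested here could be built roughly like this. It is a sketch under assumptions: the input is a dictionary keyed by (workflow, step) with diffs in MiB, the wording is invented, and the workflow names are made up.

```python
ERROR_MIB = 80.0  # error threshold mentioned above


def summarize(diffs_mib, threshold_mib=ERROR_MIB):
    """Build a one-line summary from {(workflow, step): diff in MiB}.

    Returns an empty string when no step crosses the threshold.
    """
    increased = [key for key, diff in diffs_mib.items() if diff > threshold_mib]
    decreased = [key for key, diff in diffs_mib.items() if diff < -threshold_mib]
    if not increased and not decreased:
        return ""
    return (
        "@cms-sw/core-l2 max memory changed by more than "
        f"{threshold_mib:.0f} MiB: {len(increased)} workflow step(s) increased, "
        f"{len(decreased)} decreased"
    )


# Made-up workflows and numbers, for illustration only.
example = {
    ("wf-A", "step3"): 82.5,
    ("wf-B", "step3"): -90.1,
    ("wf-C", "step2"): 4.1,
}
print(summarize(example))
```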

@smuzaffar
Contributor

> @smuzaffar Would you be ok for now to tag @cms-sw/core-l2 in that message? For the beginning I'd like to inspect the behavior. Eventually we could drop the tagging.

Sorry, I missed this comment. Yes, I am OK with tagging @cms-sw/core-l2.

@cmsbuild
Contributor

Pull request #2622 was updated.

4 similar comments

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-20c417/49914/summary.html
COMMIT: a816dbf
CMSSW: CMSSW_16_0_X_2025-12-11-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2622/49914/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 2 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 2 differences found in the comparisons
  • Reco comparison had 4 failed jobs
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4273241
  • DQMHistoTests: Total failures: 44
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4273177
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 198 edm output root files, 53 DQM output files
  • TriggerResults: no differences found

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 39 workflow(s) with memory usage exceeding the error threshold:

Workflow 11634.0_TTbar_14TeV+2022 step3 max memory used diff 66.025848 exceeds error threshold +/- 1.000000 MiB

Workflow 12834.0_TTbar_14TeV+2024 step3 max memory used diff -4.102814 exceeds error threshold +/- 1.000000 MiB

Workflow 12846.0_ZEE_14+2024 step3 max memory used diff -65.996857 exceeds error threshold +/- 1.000000 MiB

Workflow 1306.0_SingleMuPt1_UP15 step3 max memory used diff -65.996857 exceeds error threshold +/- 1.000000 MiB

Workflow 13234.0_TTbar_14TeV+2022FS step2 max memory used diff -2.017700 exceeds error threshold +/- 1.000000 MiB

Workflow 1330.0_ZMM_13 step3 max memory used diff -70.038803 exceeds error threshold +/- 1.000000 MiB

Workflow 1330.0_ZMM_13 step5 max memory used diff 3.566948 exceeds error threshold +/- 1.000000 MiB

Workflow 135.4_ZEEFS_13 step4 max memory used diff -4.150223 exceeds error threshold +/- 1.000000 MiB

Workflow 136.731_RunSinglePh2016B step3 max memory used diff 1.035248 exceeds error threshold +/- 1.000000 MiB

Workflow 136.793_RunDoubleEG2017C step3 max memory used diff -65.987762 exceeds error threshold +/- 1.000000 MiB

Workflow 136.874_RunEGamma2018C step3 max memory used diff -9.289871 exceeds error threshold +/- 1.000000 MiB

Workflow 139.001_RunMinimumBias2021 step3 max memory used diff 4.134583 exceeds error threshold +/- 1.000000 MiB

Workflow 14034.0_TTbar_14TeV+2023FS step2 max memory used diff -8.164169 exceeds error threshold +/- 1.000000 MiB

Workflow 14234.0_TTbar_14TeV+2023FSPU step2 max memory used diff 78.212036 exceeds error threshold +/- 1.000000 MiB

Workflow 16834.0_TTbar_14TeV+2025 step3 max memory used diff 7.203690 exceeds error threshold +/- 1.000000 MiB

Workflow 18434.0_TTbar_14TeV+2026 step3 max memory used diff 4.107956 exceeds error threshold +/- 1.000000 MiB

Workflow 18634.0_TTbar_14TeV+2026PU step3 max memory used diff 1.101791 exceeds error threshold +/- 1.000000 MiB

Workflow 2024.0000001_RunZeroBias2024B_10k step3 max memory used diff -4.057953 exceeds error threshold +/- 1.000000 MiB

Workflow 2024.0010001_RunJetMET02024C_10k step3 max memory used diff -51.587845 exceeds error threshold +/- 1.000000 MiB

Workflow 2024.0020001_RunEGamma02024D_10k step3 max memory used diff -65.994888 exceeds error threshold +/- 1.000000 MiB

Workflow 2024.0030001_RunDisplacedJet2024E_10k step3 max memory used diff -70.057892 exceeds error threshold +/- 1.000000 MiB

Workflow 2024.0040001_RunPark2MuonLowMass02024F_10k step3 max memory used diff -8.223495 exceeds error threshold +/- 1.000000 MiB

Workflow 2024.0050001_RunBTagMu2024G_10k step3 max memory used diff 8.255478 exceeds error threshold +/- 1.000000 MiB

Workflow 2024.0060001_RunMuon02024H_10k step3 max memory used diff 74.253967 exceeds error threshold +/- 1.000000 MiB

Workflow 2024.0070001_RunTau2024I_10k step3 max memory used diff 7.222870 exceeds error threshold +/- 1.000000 MiB

Workflow 2025.0000001_RunZeroBias2025B_10k step3 max memory used diff 68.085022 exceeds error threshold +/- 1.000000 MiB

Workflow 2025.0010001_RunJetMET02025C_10k step3 max memory used diff 4.700195 exceeds error threshold +/- 1.000000 MiB

Workflow 25.0_TTbar step3 max memory used diff 7.262878 exceeds error threshold +/- 1.000000 MiB

Workflow 250202.181_TTbar13TeVPUppmx2018 step4 max memory used diff 57.755692 exceeds error threshold +/- 1.000000 MiB

Workflow 25202.0_TTbar_13 step2 max memory used diff -9.503555 exceeds error threshold +/- 1.000000 MiB

Workflow 25202.0_TTbar_13 step3 max memory used diff -66.004333 exceeds error threshold +/- 1.000000 MiB

Workflow 312.0_Pyquen_ZeemumuJets_pt10_2760GeV_2022 step2 max memory used diff -5.794762 exceeds error threshold +/- 1.000000 MiB

Workflow 34434.0_TTbar_14TeV+Run4D121 step3 max memory used diff -12.370499 exceeds error threshold +/- 1.000000 MiB

Workflow 34434.911_TTbar_14TeV+Run4D121_DD4hep step3 max memory used diff -8.493011 exceeds error threshold +/- 1.000000 MiB

Workflow 34496.0_CloseByPGun_CE_E_Front_120um+Run4D121 step3 max memory used diff -65.998337 exceeds error threshold +/- 1.000000 MiB

Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step2 max memory used diff -6.272766 exceeds error threshold +/- 1.000000 MiB

Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step4 max memory used diff -71.105377 exceeds error threshold +/- 1.000000 MiB

Workflow 4.53_RunPhoton2012B step3 max memory used diff 66.100830 exceeds error threshold +/- 1.000000 MiB

Workflow 9.0_Higgs200ChargedTaus step3 max memory used diff 3.140152 exceeds error threshold +/- 1.000000 MiB

@cmsbuild
Contributor

Pull request #2622 was updated.

@makortel
Contributor

> workflow(s) with memory usage exceeding the error threshold:
>
> Workflow 11634.0_TTbar_14TeV+2022 step3 max memory used diff 66.025848 exceeds error threshold +/- 1.000000 MiB
>
> Workflow 12834.0_TTbar_14TeV+2024 step3 max memory used diff -4.102814 exceeds error threshold +/- 1.000000 MiB

I'm thinking about the presentation. My first thought was a bulleted list along the lines of

  • Workflow 11634.0_TTbar_14TeV+2022 step3 max memory used diff 66.025848 exceeds error threshold +/- 1.000000 MiB
  • Workflow 12834.0_TTbar_14TeV+2024 step3 max memory used diff -4.102814 exceeds error threshold +/- 1.000000 MiB

but in the GitHub rendering the "MiB" wrapping onto the next line is annoying. Maybe using e.g. the {:.1f} format (fewer decimals) would help?
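
For reference, the effect of the shorter format on one of the lines above can be checked with plain Python string formatting. The message template is copied from the bot output; how cms-bot actually builds the string is not shown in this thread, so treat the `format` calls as an assumption.

```python
diff_mib = 66.025848
threshold_mib = 1.0

# Default float formatting ({:f}) prints six decimals, as in the report above.
long_line = "max memory used diff {:f} exceeds error threshold +/- {:f} MiB".format(
    diff_mib, threshold_mib
)
# One decimal ({:.1f}) shortens the line by ten characters.
short_line = "max memory used diff {:.1f} exceeds error threshold +/- {:.1f} MiB".format(
    diff_mib, threshold_mib
)

print(long_line)   # max memory used diff 66.025848 exceeds error threshold +/- 1.000000 MiB
print(short_line)  # max memory used diff 66.0 exceeds error threshold +/- 1.0 MiB
```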

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-20c417/49915/summary.html
COMMIT: 9bb588d
CMSSW: CMSSW_16_0_X_2025-12-11-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2622/49915/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 3 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 2 differences found in the comparisons
  • Reco comparison had 4 failed jobs
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4273241
  • DQMHistoTests: Total failures: 44
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4273177
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 198 edm output root files, 53 DQM output files
  • TriggerResults: no differences found

@cmsbuild
Contributor

Pull request #2622 was updated.

1 similar comment

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-20c417/49916/summary.html
COMMIT: 99a1e78
CMSSW: CMSSW_16_0_X_2025-12-11-2300/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2622/49916/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 2 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 10 differences found in the comparisons
  • Reco comparison had 4 failed jobs
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4273241
  • DQMHistoTests: Total failures: 140
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4273081
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 198 edm output root files, 53 DQM output files
  • TriggerResults: no differences found

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 31 workflow step(s) with memory usage exceeding the error threshold:

 - Workflow 11634.0_TTbar_14TeV+2022 step3 max memory used diff 16.6 exceeds error threshold +/- 1.0 MiB

 - Workflow 12834.0_TTbar_14TeV+2024 step3 max memory used diff 64.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 12846.0_ZEE_14+2024 step3 max memory used diff -4.1 exceeds error threshold +/- 1.0 MiB

 - Workflow 13034.0_TTbar_14TeV+2024PU step3 max memory used diff 6.7 exceeds error threshold +/- 1.0 MiB

 - Workflow 13234.0_TTbar_14TeV+2022FS step2 max memory used diff 70.1 exceeds error threshold +/- 1.0 MiB

 - Workflow 1330.0_ZMM_13 step5 max memory used diff 2.2 exceeds error threshold +/- 1.0 MiB

 - Workflow 135.4_ZEEFS_13 step3 max memory used diff -61.8 exceeds error threshold +/- 1.0 MiB

 - Workflow 136.731_RunSinglePh2016B step3 max memory used diff 64.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 136.793_RunDoubleEG2017C step3 max memory used diff -69.1 exceeds error threshold +/- 1.0 MiB

 - Workflow 136.874_RunEGamma2018C step3 max memory used diff 2.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 139.001_RunMinimumBias2021 step3 max memory used diff -65.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 14234.0_TTbar_14TeV+2023FSPU step2 max memory used diff -70.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 16834.0_TTbar_14TeV+2025 step3 max memory used diff -4.1 exceeds error threshold +/- 1.0 MiB

 - Workflow 17034.0_TTbar_14TeV+2025PU step3 max memory used diff 66.2 exceeds error threshold +/- 1.0 MiB

 - Workflow 18434.0_TTbar_14TeV+2026 step3 max memory used diff 4.1 exceeds error threshold +/- 1.0 MiB

 - Workflow 18634.0_TTbar_14TeV+2026PU step3 max memory used diff 2.5 exceeds error threshold +/- 1.0 MiB

 - Workflow 2024.0000001_RunZeroBias2024B_10k step3 max memory used diff 4.1 exceeds error threshold +/- 1.0 MiB

 - Workflow 2024.0010001_RunJetMET02024C_10k step3 max memory used diff 14.4 exceeds error threshold +/- 1.0 MiB

 - Workflow 2024.0020001_RunEGamma02024D_10k step3 max memory used diff 1.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 2024.0070001_RunTau2024I_10k step3 max memory used diff -66.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 2025.0000001_RunZeroBias2025B_10k step3 max memory used diff -70.1 exceeds error threshold +/- 1.0 MiB

 - Workflow 2025.0010001_RunJetMET02025C_10k step3 max memory used diff -6.4 exceeds error threshold +/- 1.0 MiB

 - Workflow 25.0_TTbar step3 max memory used diff 8.3 exceeds error threshold +/- 1.0 MiB

 - Workflow 250202.181_TTbar13TeVPUppmx2018 step4 max memory used diff -66.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 25202.0_TTbar_13 step2 max memory used diff -9.4 exceeds error threshold +/- 1.0 MiB

 - Workflow 25202.0_TTbar_13 step3 max memory used diff 64.7 exceeds error threshold +/- 1.0 MiB

 - Workflow 312.0_Pyquen_ZeemumuJets_pt10_2760GeV_2022 step2 max memory used diff -5.8 exceeds error threshold +/- 1.0 MiB

 - Workflow 34434.0_TTbar_14TeV+Run4D121 step3 max memory used diff -70.0 exceeds error threshold +/- 1.0 MiB

 - Workflow 34434.911_TTbar_14TeV+Run4D121_DD4hep step3 max memory used diff -9.2 exceeds error threshold +/- 1.0 MiB

 - Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step4 max memory used diff 4.1 exceeds error threshold +/- 1.0 MiB

 - Workflow 9.0_Higgs200ChargedTaus step3 max memory used diff -8.2 exceeds error threshold +/- 1.0 MiB

@cmsbuild
Contributor

Pull request #2622 was updated.

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-20c417/49929/summary.html
COMMIT: f80c0f6
CMSSW: CMSSW_16_0_X_2025-12-12-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2622/49929/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 1 lines to the logs
  • Reco comparison results: 0 differences found in the comparisons
  • Reco comparison had 4 failed jobs
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4273241
  • DQMHistoTests: Total failures: 106
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4273115
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 198 edm output root files, 53 DQM output files
  • TriggerResults: no differences found

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 37 workflow step(s) with memory usage exceeding the error threshold:

  • Workflow 10224.0_TTbar_13+2017PU step3 max memory used diff 56.5 exceeds error threshold +/- 1.0 MiB

  • Workflow 11634.0_TTbar_14TeV+2022 step3 max memory used diff -65.9 exceeds error threshold +/- 1.0 MiB

  • Workflow 12834.0_TTbar_14TeV+2024 step3 max memory used diff -57.7 exceeds error threshold +/- 1.0 MiB

  • Workflow 13034.0_TTbar_14TeV+2024PU step3 max memory used diff -8.3 exceeds error threshold +/- 1.0 MiB

  • Workflow 1306.0_SingleMuPt1_UP15 step3 max memory used diff -66.0 exceeds error threshold +/- 1.0 MiB

  • Workflow 13234.0_TTbar_14TeV+2022FS step2 max memory used diff -61.9 exceeds error threshold +/- 1.0 MiB

  • Workflow 1330.0_ZMM_13 step3 max memory used diff -70.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 135.4_ZEEFS_13 step3 max memory used diff 66.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 136.731_RunSinglePh2016B step3 max memory used diff 6.2 exceeds error threshold +/- 1.0 MiB

  • Workflow 136.793_RunDoubleEG2017C step3 max memory used diff 82.5 exceeds error threshold +/- 1.0 MiB

  • Workflow 136.874_RunEGamma2018C step3 max memory used diff 16.5 exceeds error threshold +/- 1.0 MiB

  • Workflow 139.001_RunMinimumBias2021 step3 max memory used diff -68.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 14034.0_TTbar_14TeV+2023FS step2 max memory used diff 66.0 exceeds error threshold +/- 1.0 MiB

  • Workflow 14234.0_TTbar_14TeV+2023FSPU step2 max memory used diff -8.3 exceeds error threshold +/- 1.0 MiB

  • Workflow 16834.0_TTbar_14TeV+2025 step3 max memory used diff -4.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 18434.0_TTbar_14TeV+2026 step3 max memory used diff 70.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 18634.0_TTbar_14TeV+2026PU step3 max memory used diff -78.2 exceeds error threshold +/- 1.0 MiB

  • Workflow 2023.0020001_RunJetMET02023D_10k step3 max memory used diff 4.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 2024.0000001_RunZeroBias2024B_10k step3 max memory used diff 66.0 exceeds error threshold +/- 1.0 MiB

  • Workflow 2024.0020001_RunEGamma02024D_10k step3 max memory used diff -4.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 2024.0030001_RunDisplacedJet2024E_10k step3 max memory used diff 4.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 2024.0040001_RunPark2MuonLowMass02024F_10k step3 max memory used diff 65.0 exceeds error threshold +/- 1.0 MiB

  • Workflow 2024.0050001_RunBTagMu2024G_10k step3 max memory used diff 7.2 exceeds error threshold +/- 1.0 MiB

  • Workflow 2024.0060001_RunMuon02024H_10k step3 max memory used diff 66.0 exceeds error threshold +/- 1.0 MiB

  • Workflow 2025.0000001_RunZeroBias2025B_10k step3 max memory used diff -2.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 2025.0010001_RunJetMET02025C_10k step3 max memory used diff 5.9 exceeds error threshold +/- 1.0 MiB

  • Workflow 25.0_TTbar step3 max memory used diff 8.3 exceeds error threshold +/- 1.0 MiB

  • Workflow 250202.181_TTbar13TeVPUppmx2018 step4 max memory used diff 4.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 25202.0_TTbar_13 step2 max memory used diff 9.4 exceeds error threshold +/- 1.0 MiB

  • Workflow 25202.0_TTbar_13 step3 max memory used diff 66.0 exceeds error threshold +/- 1.0 MiB

  • Workflow 312.0_Pyquen_ZeemumuJets_pt10_2760GeV_2022 step2 max memory used diff -5.8 exceeds error threshold +/- 1.0 MiB

  • Workflow 34434.0_TTbar_14TeV+Run4D121 step3 max memory used diff -12.2 exceeds error threshold +/- 1.0 MiB

  • Workflow 34434.911_TTbar_14TeV+Run4D121_DD4hep step3 max memory used diff 10.4 exceeds error threshold +/- 1.0 MiB

  • Workflow 34496.0_CloseByPGun_CE_E_Front_120um+Run4D121 step3 max memory used diff 66.1 exceeds error threshold +/- 1.0 MiB

  • Workflow 34500.0_CloseByPGun_CE_H_Coarse_Scint+Run4D121 step3 max memory used diff -4.0 exceeds error threshold +/- 1.0 MiB

  • Workflow 34634.999_TTbar_14TeV+Run4D121PU_PMXS1S2PR step4 max memory used diff 7.2 exceeds error threshold +/- 1.0 MiB

  • Workflow 9.0_Higgs200ChargedTaus step3 max memory used diff -7.2 exceeds error threshold +/- 1.0 MiB

@makortel
Contributor

Thanks @gartung. I like the bullet list more than the verbatim block. Maybe the lines could be shortened more by removing e.g. the words "used" and "error threshold"? Those words are already in the section description ("with memory usage exceeding the error threshold"), so I think it is not necessary to repeat them here. What do you think?
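
As an illustration of the suggested shortening, here is one possible template with "used" and "error threshold" dropped. The function name and exact wording are hypothetical; the final message chosen in the PR may differ.

```python
def short_message(workflow, step, diff_mib, threshold_mib):
    """Format one list entry without the words repeated in the section header."""
    return (
        f"Workflow {workflow} {step} max memory diff {diff_mib:.1f} "
        f"exceeds +/- {threshold_mib:.1f} MiB"
    )


# One of the entries from the report above, before and after shortening.
original = "Workflow 25.0_TTbar step3 max memory used diff 8.3 exceeds error threshold +/- 1.0 MiB"
shortened = short_message("25.0_TTbar", "step3", 8.3, 1.0)

print(shortened)
print(len(original) - len(shortened), "characters saved")
```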

@gartung
Member Author

gartung commented Dec 12, 2025

I will shorten the message.

@cmsbuild
Contributor

Pull request #2622 was updated.

1 similar comment

commit fc3f8e2
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Fri Dec 12 11:51:45 2025 -0600

    Make warning and error messages shorter

commit 25ca561
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Fri Dec 12 08:20:11 2025 -0600

    Take out <pre></pre> from list output

commit 3783bbb
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Thu Dec 11 18:07:05 2025 -0600

    Markdown formatting for list of workflow failing threshold

commit 9281a51
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Thu Dec 11 15:09:40 2025 -0600

    Change the wording of failure messages

commit 0a30ff4
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Thu Dec 11 13:28:58 2025 -0600

    Split the grep output and report just the match string

commit fcac2ad
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Thu Dec 11 12:47:34 2025 -0600

    Remove --repo option with empty var as arg

commit bbe9538
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Thu Dec 11 11:28:21 2025 -0600

    Echo report-pull-request-results command

commit 419951a
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Thu Dec 11 10:24:20 2025 -0600

    Make sure testResults directory is created before writing maxmem files

commit 6dda6b2
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Wed Dec 10 17:40:38 2025 -0600

    Fix read_maxmem_comparison_file

commit 59dbbb2
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Wed Dec 10 17:35:10 2025 -0600

    Process the maxmem_comparison.log file instead of the individual .err file

commit 70343d5
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Wed Dec 10 17:24:03 2025 -0600

    Missing space

commit 48e99bd
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Wed Dec 10 17:12:02 2025 -0600

    Log output of grep to maxmem_summary.log

commit 78ba103
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Wed Dec 10 14:25:56 2025 -0600

    Send the output of maxmem comparison to log

commit 0253cca
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Wed Dec 10 11:39:17 2025 -0600

    Add --no-post report-pull-request-results so it prints to the console to see if send_nessage_pr is called for maxmem

commit 62ab359
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Tue Dec 9 16:41:22 2025 -0600

    COMMIT_SHA not defined

commit dc11547
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Mon Dec 8 13:51:10 2025 -0600

    Fix report-pull-request-result invocation

commit fbf497b
Author: Patrick Gartung <gartung@fnal.gov>
Date:   Mon Dec 8 11:37:32 2025 -0600

    Change max-memory-used check to use real difference in MB not percentage diff. Increase the threshold to 80MB. Flag PR if check fails.

    Squashed commit of the following:

    commit b9f2dc1
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Mon Dec 8 11:35:14 2025 -0600

        Use yellow and green to indicate negative warn and error thresholds

    commit fe76627
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Mon Dec 8 11:12:30 2025 -0600

        Use blue and green to indicate negative warn and error thresholds

    commit 2cbb34a
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Dec 5 13:27:06 2025 -0600

        black formatting

    commit cd466d9
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Dec 5 13:24:38 2025 -0600

        Use MiB scaling 1024*1024

    commit 79d93d4
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Dec 5 12:43:36 2025 -0600

        Add missing \n

    commit 1ed1123
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Dec 5 12:40:15 2025 -0600

        Use MiB not MB. Make sys.stderr.write append instead of overwriting

    commit 4f894aa
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Dec 5 12:21:42 2025 -0600

        get_result_file_name should handle maxmem

    commit 998aa34
    Author: gartung <gartung@fnal.gov>
    Date:   Thu Dec 4 16:49:29 2025 -0600

        More black formatting

    commit e1f45eb
    Author: gartung <gartung@fnal.gov>
    Date:   Thu Dec 4 16:46:01 2025 -0600

        black formatting

    commit 888be72
    Author: gartung <gartung@fnal.gov>
    Date:   Thu Dec 4 16:43:06 2025 -0600

        Change message in summary table to say default threshold values. Remove unneeded path insertion.

    commit cc5332d
    Author: gartung <gartung@fnal.gov>
    Date:   Thu Dec 4 10:49:38 2025 -0600

        More black formatting

    commit ae25680
    Author: gartung <gartung@fnal.gov>
    Date:   Thu Dec 4 10:47:14 2025 -0600

        black formatting

    commit 4f8aeec
    Author: gartung <gartung@fnal.gov>
    Date:   Thu Dec 4 10:43:42 2025 -0600

        Make error threshold 80MB

    commit bd3977b
    Author: gartung <gartung@fnal.gov>
    Date:   Thu Dec 4 10:40:58 2025 -0600

        Add script directory to PYTHONPATH so maxmem_threshold.py can be found. Use consistent units of (1024*1024) bytes per MB.

    commit b8413cf
    Author: gartung <gartung@fnal.gov>
    Date:   Thu Dec 4 10:21:58 2025 -0600

        Put thresholds in their own file imported by the scripts that use them.

    commit 27c4e28
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 15:47:04 2025 -0600

        Send message is exceeds error threshold appears in *.err file

    commit 17bd472
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 15:45:31 2025 -0600

        Send message is exceeds appear in *.err file

    commit 83ff660
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 13:05:29 2025 -0600

        Add units to err message

    commit 340b386
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 12:54:35 2025 -0600

        Link to maxmem_summary.html

    commit 9964fb0
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 12:47:29 2025 -0600

        Add @cms-sw/core2 to message sent to PR

    commit ad79813
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 11:28:22 2025 -0600

        Fix error message

    commit aec4220
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 11:23:27 2025 -0600

        Fix indent problem

    commit a2f4385
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 11:20:07 2025 -0600

        Scale max memory used diff before applying threshold.

    commit cce8d1e
    Author: gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 11:19:29 2025 -0600

        Fix table borders

    commit 1643523
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Wed Dec 3 07:28:46 2025 -0600

        Put % in correct location. Rename index.html maxmem_summary.html

    commit 790acfa
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Tue Dec 2 16:58:26 2025 -0600

        Add definition of mem_prof_pdiffs_dicts

    commit 157a440
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Tue Dec 2 16:51:14 2025 -0600

        Black formatting

    commit 94b40dd
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Tue Dec 2 16:47:01 2025 -0600

        Add changes suggested by CoPilot to report which workflows exceed threshold.

    commit 82d77ee
    Merge: 901627b f270b07
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Tue Dec 2 14:53:44 2025 -0600

        Merge remote-tracking branch 'origin/master'

    commit 901627b
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Tue Dec 2 14:52:30 2025 -0600

        Remove FAILED_ from MAXMEM_COMPARISON

    commit e27117c
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Tue Dec 2 14:51:24 2025 -0600

        Fix type in compare-maxmem.py

    commit db380dc
    Merge: bbca0eb 8335960
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Tue Dec 2 12:00:48 2025 -0600

        Merge remote-tracking branch 'refs/remotes/upstream/master'

    commit f270b07
    Merge: bbca0eb 8335960
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Tue Dec 2 11:56:53 2025 -0600

        Merge branch 'cms-sw:master' into master

    commit bbca0eb
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Nov 21 12:57:15 2025 -0600

        Report diff not abs(diff)

    commit fa38d77
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Nov 21 12:55:18 2025 -0600

        Change comparison to real value diff not percentage diff

    commit 4bc2b7e
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Nov 21 12:43:19 2025 -0600

        Black formatting

    commit 725349d
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Nov 21 12:40:50 2025 -0600

        Fix another indent

    commit d7bd165
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Nov 21 12:39:25 2025 -0600

        Fix another indent

    commit e7e0b40
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Nov 21 12:37:33 2025 -0600

        Fix indent

    commit 0a5c9ea
    Merge: 249ddeb 245504f
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Nov 21 12:30:18 2025 -0600

        Merge remote-tracking branch 'upstream'

    commit 249ddeb
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Fri Nov 21 12:24:21 2025 -0600

        Increase memory thresholds for warnings and errors in compare_maxmem_summary.py and use diff of real value not percentage value.

    commit 1ecd417
    Author: Patrick Gartung <gartung@fnal.gov>
    Date:   Thu Nov 20 13:36:13 2025 -0600

        Increase the max memory difference threshold to 100MB. Add failed/FAILED to message to .
@cmsbuild
Contributor

Pull request #2622 was updated.

@gartung
Member Author

gartung commented Dec 12, 2025

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-20c417/49935/summary.html
COMMIT: 44e8aa8
CMSSW: CMSSW_16_0_X_2025-12-12-1100/el8_amd64_gcc13
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cms-bot/2622/49935/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 2 lines from the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 12 differences found in the comparisons
  • Reco comparison had 4 failed jobs
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4273241
  • DQMHistoTests: Total failures: 9
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4273212
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 52 files compared)
  • Checked 227 log files, 198 edm output root files, 53 DQM output files
  • TriggerResults: no differences found

@makortel
Contributor

Thanks. I don't have further comments. @smuzaffar do you?

@smuzaffar
Contributor

smuzaffar commented Dec 15, 2025

I am fine with the summary message. Since we know this is for the max memory comparison exceeding the threshold, I would have even reduced it to just

and also remove the extra new line from each list.

I guess we can do that later if needed.
If no other changes are needed, we can get this in.

@gartung
Member Author

gartung commented Dec 15, 2025

I removed at least one newline with the latest changes.

@smuzaffar
Contributor

+externals

We can merge this, and if more formatting changes are needed, we can do them in a separate PR.

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @mandrenguyen, @ftenchini (and backports should be raised in the release meeting by the corresponding L2)

@smuzaffar
Contributor

@gartung, see cms-sw/cmssw#49636 (comment), where the max memory summary contains multiple

Error
Error
Error
Error

@gartung
Member Author

gartung commented Dec 16, 2025

There are diff values of max memory usage greater than 80 MiB:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_16_0_X_2025-12-15-2300+3595a2/72375/maxmem-comparison/maxmem_summary.html
I will have to debug why this unusual message was produced.
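
For readers following the thread, here is a minimal sketch of the kind of check this PR describes: the difference in max memory used is scaled to MiB and compared against a fixed 80 MiB threshold instead of a percentage. The function name `flag_maxmem_diffs`, the dictionary layout, the byte units, and the choice to flag changes in both directions are assumptions for illustration, not the actual code in compare-maxmem.py.

```python
MIB = 1024 * 1024
THRESHOLD_MIB = 80.0  # value taken from the PR title; the real script may differ


def flag_maxmem_diffs(baseline, pr, threshold_mib=THRESHOLD_MIB):
    """Return {workflow: diff_in_MiB} for workflows over the threshold.

    `baseline` and `pr` map a workflow name to max memory used in bytes
    (a hypothetical layout chosen for this sketch).
    """
    flagged = {}
    for workflow, base_bytes in baseline.items():
        if workflow not in pr:
            continue  # nothing to compare against
        # Scale to MiB before applying the threshold; keep the sign so the
        # report shows whether memory went up or down.
        diff_mib = (pr[workflow] - base_bytes) / MIB
        # Assumption: a large change in either direction is flagged.
        if abs(diff_mib) > threshold_mib:
            flagged[workflow] = diff_mib
    return flagged


# Made-up numbers: wf_a grows by 50 MiB, wf_b grows by 100 MiB.
baseline = {"wf_a": 2000 * MIB, "wf_b": 3000 * MIB}
pr = {"wf_a": 2050 * MIB, "wf_b": 3100 * MIB}
result = flag_maxmem_diffs(baseline, pr)
print(result)
```

With these made-up inputs only `wf_b` is flagged, since its 100 MiB increase exceeds the threshold while the 50 MiB increase of `wf_a` does not. A percentage-based check would treat the same absolute increase differently depending on the baseline size, which is the behaviour the PR title says it replaces.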

@gartung
Member Author

gartung commented Dec 16, 2025

@smuzaffar I found the bug and submitted #2634


Development

Successfully merging this pull request may close these issues.

Add an alert message to the cms-bot's PR test summary message
