add HLT vertexing resolution monitoring (by split vertex method) #48980

Merged
cmsbuild merged 1 commit into cms-sw:master from mmusich:pvresolution_for_HLT
Sep 26, 2025


@mmusich
Contributor

@mmusich mmusich commented Sep 24, 2025

PR description:

Add HLT vertexing resolution monitoring (using the "split vertex" method) for all the vertex types available at HLT (in both Run 3 and Phase-2).
This leverages the existing module PrimaryVertexResolution (which is lightly adapted to tolerate events in which the input collections are missing) and changes the configuration of both the DQM and Harvesting steps to be run in the @HLTMon sequence.
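
For readers unfamiliar with the "split vertex" method, the idea can be illustrated with a toy sketch. This is not the PrimaryVertexResolution code: the function name, parameters, and numbers below are hypothetical, and the vertex fit is replaced by a plain average of per-track position measurements. The tracks of each event are split into two halves, a vertex is built from each half, and the width of the distribution of the difference between the two, divided by sqrt(2), estimates the resolution of a vertex built from half of the tracks without using the true position.

```python
import math
import random
import statistics


def split_vertex_resolution(n_events=4000, n_tracks=40, sigma_track=100.0, seed=1):
    """Toy split-vertex estimate of the resolution of a vertex made of n_tracks // 2 tracks.

    All names and defaults are illustrative assumptions; units are arbitrary
    (the same as sigma_track).
    """
    rng = random.Random(seed)
    half = n_tracks // 2
    diffs = []
    for _ in range(n_events):
        # True vertex position: generated here, but never used by the estimate itself.
        true_z = rng.uniform(-5e4, 5e4)
        # Each track measures the vertex position with Gaussian smearing.
        tracks = [rng.gauss(true_z, sigma_track) for _ in range(n_tracks)]
        # Split the tracks randomly into two disjoint halves.
        rng.shuffle(tracks)
        z1 = statistics.fmean(tracks[:half])
        z2 = statistics.fmean(tracks[half:2 * half])
        diffs.append(z1 - z2)
    # The two sub-vertices are independent with equal resolution,
    # so the width of their difference is sqrt(2) times that resolution.
    return statistics.pstdev(diffs) / math.sqrt(2)


estimate = split_vertex_resolution()
expected = 100.0 / math.sqrt(20)  # sigma_track / sqrt(tracks per half) in this toy model
print(f"estimated: {estimate:.1f}, expected: {expected:.1f}")
```

In this toy model the estimate can be compared with the known per-half resolution, which is what makes the method attractive on real data, where no true vertex position is available.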

PR validation:

Successfully ran both a Run 3 and a Phase-2 workflow (running non-trivial HLT menus):

runTheMatrix.py -l 17034.0,29634.0 -t 4 -j 8 --ibeos -i all --ibeos

and checked that the output files have the expected output in terms of monitor elements.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Not a backport; it will be backported to CMSSW_15_0_X for testing in the 2025 data-taking release.

@mmusich
Contributor Author

mmusich commented Sep 24, 2025

@cmsbuild ping

@iarspider
Contributor

@mmusich cmsrep (which receives github webhook requests for cms-bot to run) is overloaded, I'm trying to fix it.

@cmsbuild
Contributor

cmsbuild commented Sep 24, 2025

cms-bot internal usage

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-48980/46160

@cmsbuild
Contributor

A new Pull Request was created by @mmusich for master.

It involves the following packages:

  • DQM/TrackingMonitor (dqm)
  • DQMOffline/Configuration (dqm)
  • DQMOffline/Trigger (dqm)

@antoniovagnerini, @cmsbuild, @ctarricone, @gabrielmscampos, @nothingface0, @rseidita can you please review it and eventually sign? Thanks.
@Fedespring, @HuguesBrun, @VinInn, @VourMa, @arossi83, @cericeci, @fioriNTU, @idebruyn, @jandrea, @jhgoh, @missirol, @mmusich, @mtosi, @richa2710, @rociovilar, @sroychow, @threus, @trocino this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@mmusich
Contributor Author

mmusich commented Sep 24, 2025

@cmsbuild, please test

@mmusich
Contributor Author

mmusich commented Sep 25, 2025

@cmsbuild, please test

@cmsbuild
Contributor

-1

Failed Tests: RelVals
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-47a777/48263/summary.html
COMMIT: 68d7859
CMSSW: CMSSW_16_0_X_2025-09-23-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/48980/48263/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

  • 2024.0070001: DAS Error

@cmsbuild
Contributor

+1

Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-47a777/48268/summary.html
COMMIT: 68d7859
CMSSW: CMSSW_16_0_X_2025-09-24-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/48980/48268/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 53
  • DQMHistoTests: Total histograms compared: 4142809
  • DQMHistoTests: Total failures: 26
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4142763
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 133855.709 KiB( 52 files compared)
  • DQMHistoSizes: changed ( 16834.0,... ): 8735.531 KiB HLT/Vertexing
  • DQMHistoSizes: changed ( 2024.0000001,... ): 8695.922 KiB HLT/Vertexing
  • DQMHistoSizes: changed ( 29634.0,... ): 5821.711 KiB HLT/Vertexing
  • Checked 226 log files, 198 edm output root files, 53 DQM output files
  • TriggerResults: no differences found

@nothingface0
Contributor

nothingface0 commented Sep 26, 2025

@mmusich

DQMHistoSizes: Histogram memory added: 133855.709 KiB( 52 files compared)

Is this all due to the new sequences?

PS: This is offline only, right?

@mmusich
Contributor Author

mmusich commented Sep 26, 2025

@nothingface0

Is this all due to the new sequences?

yes.

PS: This is offline only, right?

right.

@nothingface0
Contributor

DQMHistoSizes: Histogram memory added: 133855.709 KiB( 52 files compared)

@LinaresToine Could we request your input on this increase, from the T0 side? Will this create a problem?

@LinaresToine
Contributor

LinaresToine commented Sep 26, 2025

Thanks for your question @nothingface0. The memory problem is already there; however, with DQM harvesting jobs using 3 GB, we have managed to reduce the number of paused jobs to around 1 per week. This tells me that a 4% increment should not make the existing problem a much bigger one. Nonetheless, I would like to ask that we not push it much further.
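
As a back-of-the-envelope check, a figure of about 4% is roughly what one obtains when the total reported by the comparison summary is set against a 3 GB job budget. The sketch below assumes that "3 GB" means 3 GiB and that the reported total is a sum over the 52 compared files; neither assumption is stated in the thread, so the per-file average in particular is only indicative.

```python
KIB_PER_MIB = 1024

total_added_kib = 133855.709  # "Histogram memory added" from the comparison summary
files_compared = 52           # from the same summary line
job_memory_mib = 3 * 1024     # assumption: the quoted "3 GB" is read as 3 GiB

total_added_mib = total_added_kib / KIB_PER_MIB
# Assumption: the total is summed over the compared files.
mean_per_file_kib = total_added_kib / files_compared
fraction_of_job = total_added_mib / job_memory_mib

print(f"total added: {total_added_mib:.1f} MiB")
print(f"mean per compared file: {mean_per_file_kib:.0f} KiB")
print(f"total as a fraction of 3 GiB: {fraction_of_job:.1%}")
```

Because the total aggregates many test workflows, it is not the increase seen by any single harvesting job.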

@mmusich
Contributor Author

mmusich commented Sep 26, 2025

This tells me that a 4% increment should not make the existing problem a much bigger one. Nonetheless, I would like to ask that we not push it much further.

I think there is a misunderstanding here. At Tier0 this PR would change only the DQM sequence for the HLTMonitor stream and nothing else. I don't think we are experiencing any problem with that recently.

@LinaresToine
Contributor

Right @mmusich HLTMonitor is fine.

@nothingface0
Contributor

+dqm

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @ftenchini, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)

@ftenchini

+1

@cmsbuild cmsbuild merged commit 9aa7268 into cms-sw:master Sep 26, 2025
10 checks passed
@mmusich mmusich deleted the pvresolution_for_HLT branch September 26, 2025 16:07
@mandrenguyen
Contributor

@mmusich It looks like this PR is causing some IB crashes in heavy-ion workflows. Would you please have a look?
E.g., https://cmssdt.cern.ch/SDT/cgi-bin/logreader/el8_amd64_gcc12/CMSSW_16_0_X_2025-09-27-1100/pyRelValMatrixLogs/run/161.4_ZEE_5362_HI_2024/step3_ZEE_5362_HI_2024.log#/

@mmusich
Contributor Author

mmusich commented Sep 28, 2025

Would you please have a look?

A proper fix would require a HLT menu update (to be discussed elsewhere). I can prepare a patch to overcome the issue, likely tomorrow.

@mandrenguyen
Contributor

Would you please have a look?

A proper fix would require a HLT menu update (to be discussed elsewhere). I can prepare a patch to overcome the issue, likely tomorrow.

Sounds good, thanks.

@mmusich
Contributor Author

mmusich commented Sep 28, 2025

I can prepare a patch to overcome the issue, likely tomorrow.

done at #49014.


7 participants