
update phase-2 HLT timing script and use alpaka modifier in NGT workflows #49821

Open
mmusich wants to merge 2 commits into cms-sw:master from cms-ngt-hlt:mm_alpaka_in_ngt_scouting


mmusich commented Jan 14, 2026

PR description:

Following the discussion at the NGT meeting of Jan 13 2025, this PR:

  • explicitly uses the alpaka process modifier in all NGT scouting-related workflows, to offload part of the HGCal reconstruction to GPUs
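
As a rough illustration of what adding a process modifier to a workflow step could look like, the sketch below merges a modifier name into a step's `--procModifiers` value. The `add_proc_modifier` helper, the `step` dictionary and the `someOtherModifier` name are hypothetical and only mimic the style of cmsDriver option dictionaries; the actual change in this PR may be structured differently.

```python
def add_proc_modifier(step_options, modifier):
    """Return a copy of step_options with `modifier` added to '--procModifiers'.

    Hypothetical helper: modifiers are kept as a comma-separated string,
    and a modifier that is already present is not added twice.
    """
    updated = dict(step_options)
    current = updated.get("--procModifiers", "")
    names = [name for name in current.split(",") if name]
    if modifier not in names:
        names.append(modifier)
    updated["--procModifiers"] = ",".join(names)
    return updated


# Placeholder step options, for illustration only
step = {"--procModifiers": "someOtherModifier"}
step_with_alpaka = add_proc_modifier(step, "alpaka")
```

The helper returns a new dictionary rather than mutating its input, so a base step definition shared by several workflows is left untouched.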

PR validation:

To be tested by the bot.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Not a backport, no backport needed.


mmusich commented Jan 14, 2026

type ngt


cmsbuild commented Jan 14, 2026

cms-bot internal usage

@cmsbuild

A new Pull Request was created by @mmusich for master.

It involves the following packages:

  • Configuration/PyReleaseValidation (pdmv)
  • HLTrigger/Configuration (hlt)

@AdrianoDee, @DickyChant, @Martin-Grunewald, @antoniovagnerini, @cmsbuild, @miquork, @mmusich can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @SohamBhattacharya, @VourMa, @fabiocos, @makortel, @missirol, @rovere, @slomeo this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here


mmusich commented Jan 14, 2026

test parameters:

  • enable = hlt_p2_integration, hlt_p2_timing
  • workflows = ph2_hlt


mmusich commented Jan 14, 2026

@cmsbuild, please test

@cmsbuild

-1

Failed Tests: RelVals
Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-480d8a/50620/summary.html
COMMIT: 43410e9
CMSSW: CMSSW_16_1_X_2026-01-13-2300/el8_amd64_gcc13
Additional Tests: HLT_P2_INTEGRATION,HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49821/50620/install.sh to create a dev area with all the needed externals and cmssw changes.

HLT P2 Timing: chart

Failed RelVals

----- Begin Fatal Exception 14-Jan-2026 14:46:42 CET-----------------------
An exception of category 'OutOfBound' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 4 stream: 0
   [1] Running path 'HLTriggerFinalPath'
   [2] Prefetching for module TriggerSummaryProducerAOD/'hltTriggerSummaryAOD'
   [3] Prefetching for module L1HPSPFTauProducer/'l1tHPSPFTauProducer'
   [4] Prefetching for module L1TPFCandMultiMerger/'l1tLayer1'
   [5] Prefetching for module L1TCorrelatorLayer1Producer/'l1tLayer1HGCal'
   [6] Calling method for module HGCalBackendLayer2Producer/'l1tHGCalBackEndLayer2Producer'
Exception Message:
TC X1 = 0.0713466 out of the seeding histogram bounds 0.076 - 0.58
----- End Fatal Exception -------------------------------------------------
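
The exception message above reports a value falling outside a fixed range. A minimal sketch of that kind of range check, reproducing the reported numbers, is shown here; the function name `check_seeding_bounds` is hypothetical, treating the bounds as inclusive is an assumption, and this is not the code of `HGCalBackendLayer2Producer`.

```python
def check_seeding_bounds(x1, lower, upper):
    """Return x1 if it lies within [lower, upper], otherwise raise ValueError.

    Assumption: both bounds are treated as inclusive.
    """
    if not (lower <= x1 <= upper):
        raise ValueError(
            f"TC X1 = {x1} out of the seeding histogram bounds {lower} - {upper}"
        )
    return x1


# Values taken from the log above: 0.0713466 is below the lower bound 0.076
try:
    check_seeding_bounds(0.0713466, 0.076, 0.58)
    message = None
except ValueError as err:
    message = str(err)
```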


mmusich commented Jan 15, 2026

The updated timing result, after adding the alpaka process modifier, is odd:

image


mmusich commented Jan 15, 2026

@cmsbuild, please test

@cmsbuild

+1

Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-480d8a/50652/summary.html
COMMIT: 43410e9
CMSSW: CMSSW_16_1_X_2026-01-14-2300/el8_amd64_gcc13
Additional Tests: HLT_P2_INTEGRATION,HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/49821/50652/install.sh to create a dev area with all the needed externals and cmssw changes.

HLT P2 Timing: chart

Comparison Summary

Summary:

  • You potentially added 5 lines to the logs
  • Reco comparison results: 10 differences found in the comparisons
  • DQMHistoTests: Total files compared: 71
  • DQMHistoTests: Total histograms compared: 4580671
  • DQMHistoTests: Total failures: 1010
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4579641
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 70 files compared)
  • Checked 285 log files, 240 edm output root files, 71 DQM output files
  • TriggerResults: found differences in 3 / 69 workflows
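
The DQMHistoTests counts in this summary appear to add up: successes, failures and skipped histograms sum to the total number compared. The short check below verifies that arithmetic for the numbers reported above. The function name is hypothetical, and because nulls and missing objects are both 0 here, the sketch does not show whether they are also part of the total.

```python
def counts_are_consistent(total, successes, failures, skipped):
    """Check that the per-outcome counts sum to the reported total."""
    return successes + failures + skipped == total


# Numbers copied from the comparison summary above
first_run_ok = counts_are_consistent(
    total=4580671, successes=4579641, failures=1010, skipped=20
)
```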


cmsbuild commented Mar 2, 2026

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-49821/48331


cmsbuild commented Mar 2, 2026

Pull request #49821 was updated. @AdrianoDee, @DickyChant, @Martin-Grunewald, @antoniovagnerini, @cmsbuild, @miquork, @mmusich can you please check and sign again.


mmusich commented Mar 2, 2026

@cmsbuild, please test


cmsbuild commented Mar 2, 2026

-1

Failed Tests: HLTP2Timing
Size: This PR adds an extra 16KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-480d8a/51704/summary.html
COMMIT: 60e4b39
CMSSW: CMSSW_16_1_X_2026-03-02-1100/el8_amd64_gcc13
Additional Tests: HLT_P2_INTEGRATION,HLT_P2_TIMING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/49821/51704/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 5 lines to the logs
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 72
  • DQMHistoTests: Total histograms compared: 4736055
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 4736035
  • DQMHistoTests: Total skipped: 20
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 71 files compared)
  • Checked 292 log files, 246 edm output root files, 72 DQM output files
  • TriggerResults: found differences in 3 / 70 workflows

Max Memory Comparisons exceeding threshold

@cms-sw/core-l2 , I found 2 workflow step(s) with memory usage exceeding the error threshold:

  • Error: Workflow 34434.77_TTbar_14TeV+Run4D121_NGTScouting step2 max memory diff 152.3 exceeds +/- 90.0 MiB
  • Error: Workflow 34434.772_TTbar_14TeV+Run4D121_NGTScoutingWithNano step2 max memory diff 153.1 exceeds +/- 90.0 MiB
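
The two errors above follow from a symmetric threshold on the change in maximum memory. A minimal sketch of such a check, using the reported differences, could look like this; the function name is hypothetical, and using a strict comparison against the 90.0 MiB threshold is an assumption rather than the bot's documented behaviour.

```python
def exceeds_memory_threshold(diff_mib, threshold_mib=90.0):
    """Return True if the memory difference lies outside +/- threshold_mib.

    Assumption: a difference exactly equal to the threshold is not flagged.
    """
    return abs(diff_mib) > threshold_mib


# Max memory differences (MiB) for step2 of the two workflows reported above
reported = {"34434.77": 152.3, "34434.772": 153.1}
flagged = [wf for wf, diff in reported.items() if exceeds_memory_threshold(diff)]
```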


cmsbuild commented Apr 4, 2026

Milestone for this pull request has been moved to CMSSW_17_0_X. Please open a backport if it should also go in to CMSSW_16_1_X.
