New Heterogeneous Memory Pool by VinInn · Pull Request #37952 · cms-sw/cmssw

VinInn · 2022-05-15T13:57:34Z

This PR replaces the old "notcub" cache allocator with a memory pool featuring:

- lock-free operations
- a backend-agnostic implementation

The data interface is based on a simple Buffer that is completely backend agnostic.
The allocation interface (makeBuffer) currently depends on cudaStream_t, which can easily be hidden behind a void * or a light opaque struct.
A new feature is a "Bundle deleter": buffers can be bundled together and then freed in a single operation, which reduces the number of CUDA calls.
All previous users of the cache allocator (at least for the Pixel workflow) have been migrated.
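
To illustrate the two ideas in the description (lock-free allocation and freeing several buffers in one operation), here is a minimal host-only C++ sketch. It is not the code in this PR: `SlotPool`, `Bundle`, `freeSlot`, `inUse` and `demoBundle` are hypothetical names chosen for the example, and the pool hands out fixed-size slots of host memory rather than device buffers tied to a `cudaStream_t`.

```cpp
// Illustrative sketch only: every name here is hypothetical, not the PR's API.
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Fixed-size pool. Slots are claimed with an atomic compare-and-swap,
// so allocation and release never take a mutex.
class SlotPool {
public:
  SlotPool(std::size_t nSlots, std::size_t slotBytes)
      : used_(nSlots), storage_(nSlots * slotBytes), slotBytes_(slotBytes) {
    for (auto& u : used_)
      u.store(false, std::memory_order_relaxed);
  }

  // Returns the index of a free slot, or -1 if the pool is exhausted.
  int alloc() {
    for (std::size_t i = 0; i < used_.size(); ++i) {
      bool expected = false;
      if (used_[i].compare_exchange_strong(expected, true, std::memory_order_acquire))
        return static_cast<int>(i);
    }
    return -1;
  }

  // Marks a slot as free again.
  void freeSlot(int i) { used_[i].store(false, std::memory_order_release); }

  // Address of the memory backing slot i.
  void* data(int i) { return storage_.data() + static_cast<std::size_t>(i) * slotBytes_; }

  // Number of slots currently claimed.
  std::size_t inUse() const {
    std::size_t n = 0;
    for (auto const& u : used_)
      n += u.load(std::memory_order_acquire) ? 1 : 0;
    return n;
  }

private:
  std::vector<std::atomic<bool>> used_;
  std::vector<unsigned char> storage_;
  std::size_t slotBytes_;
};

// Collects slots and returns all of them to the pool in one step.
// Assumption: in a GPU backend this single release could be issued as one
// stream-ordered operation instead of one call per buffer.
class Bundle {
public:
  explicit Bundle(SlotPool& pool) : pool_(&pool) {}
  Bundle(Bundle const&) = delete;
  Bundle& operator=(Bundle const&) = delete;
  ~Bundle() { release(); }

  void add(int slot) { slots_.push_back(slot); }

  void release() {
    for (int s : slots_)
      pool_->freeSlot(s);
    slots_.clear();
  }

private:
  SlotPool* pool_;
  std::vector<int> slots_;
};

// Allocates three slots, bundles them, and lets the bundle free them together.
// Returns the number of slots still in use afterwards.
std::size_t demoBundle() {
  SlotPool pool(8, 256);
  {
    Bundle bundle(pool);
    for (int k = 0; k < 3; ++k) {
      int s = pool.alloc();
      assert(s >= 0);
      bundle.add(s);
    }
    assert(pool.inUse() == 3);
  }  // bundle goes out of scope: all three slots are released here
  return pool.inUse();
}
```

The linear scan over slots keeps the sketch short; a production pool would more likely keep size-binned free lists, and would defer reuse of a buffer until the work queued on its stream has completed.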

Tests pass: it is not slower than the previous implementation. A free machine is needed to make definitive tests.

Some cleanup is still required to remove debug statements.

Purely technical; no regression expected.

Draft slides for a possible presentation are available at https://cernbox.cern.ch/index.php/s/Ax4NHYGLHbG8N1C

cmsbuild · 2022-05-15T14:03:58Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37952/30020

This PR adds an extra 232KB to repository
Found files with invalid states:
- HeterogeneousCore/CUDAUtilities/src/cudaMemoryPool.cu:
  - Added: f37385a
  - Modified: 75ca0db, c429d13, c5e35f0
  - Deleted: 21a646e
- CUDADataFormats/TrackingRecHit/interface/TrackingRecHit2DHeterogeneousImpl.h:
  - Added: 5291489
  - Modified: 1a43ba7, e7d8632, c8d553a
  - Deleted: 521d4c0
- HeterogeneousCore/CUDAUtilities/interface/cudaMemoryPoolImpl.h:
  - Added: 5291489
  - Modified: 1a43ba7, 59bcb2b, 29df6e2, 849da8c, e7d8632, b4f4d46, 8b149ed
  - Deleted: 1487b88
- CUDADataFormats/SiPixelDigi/interface/SiPixelDigisCUDAImpl.h:
  - Added: 3ae45f7
  - Modified: b4f4d46, c8d553a
  - Deleted: 9402cb7
- CUDADataFormats/TrackingRecHit/src/TrackingRecHit2DHeterogeneous.cc:
  - Modified: 6b050bd, 0e49a36, e8e9c0f, 88da3bc, b4f4d46, 521d4c0
  - Deleted: 21a646e
  - Added: e7d8632
There are other open Pull requests which might conflict with changes you have proposed:
- File HeterogeneousCore/CUDAServices/src/CUDAService.cc modified in PR(s): Implement ResourceInformationService #37831
- File HeterogeneousCore/CUDAUtilities/test/BuildFile.xml modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoLocalTracker/SiPixelRecHits/plugins/PixelRecHitGPUKernel.cu modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoLocalTracker/SiPixelRecHits/plugins/SiPixelRecHitSoAFromLegacy.cc modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletGeneratorKernels.cc modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletGeneratorKernels.cu modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletGeneratorKernels.h modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletGeneratorKernelsAlloc.cc modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713

cmsbuild · 2022-05-15T14:04:21Z

A new Pull Request was created by @VinInn (Vincenzo Innocente) for master.

It involves the following packages:

CUDADataFormats/BeamSpot (heterogeneous, reconstruction)
CUDADataFormats/Common (heterogeneous)
CUDADataFormats/SiPixelDigi (heterogeneous, reconstruction)
CUDADataFormats/Track (heterogeneous, reconstruction)
CUDADataFormats/TrackingRecHit (heterogeneous, reconstruction)
CUDADataFormats/Vertex (heterogeneous, reconstruction)
EventFilter/SiPixelRawToDigi (reconstruction)
HeterogeneousCore/CUDACore (heterogeneous)
HeterogeneousCore/CUDAServices (heterogeneous)
HeterogeneousCore/CUDAUtilities (heterogeneous)
RecoLocalTracker/SiPixelRecHits (reconstruction)
RecoPixelVertexing/PixelTrackFitting (reconstruction)
RecoPixelVertexing/PixelTriplets (reconstruction)
RecoPixelVertexing/PixelVertexFinding (reconstruction)
RecoVertex/BeamSpotProducer (reconstruction, alca)

@malbouis, @yuanchao, @makortel, @slava77, @clacaputo, @cmsbuild, @fwyzard, @jpata, @tvami, @francescobrivio can you please review it and eventually sign? Thanks.
@tvami, @makortel, @felicepantaleo, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @Martin-Grunewald, @missirol, @OzAmram, @tocheng, @ferencek, @mtosi, @gpetruc, @mmusich, @dkotlins, @threus, @dgulhan, @francescobrivio this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

VinInn · 2022-05-15T15:47:35Z

@cmsbuild , please test

VinInn · 2022-05-15T15:47:40Z

enable gpu

cmsbuild · 2022-05-15T20:02:21Z

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/24728/summary.html
COMMIT: b8d0837
CMSSW: CMSSW_12_4_X_2022-05-15-0000/slc7_amd64_gcc10
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37952/24728/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test cpuVertexFinderByDensity_t had ERRORS
---> test cpuVertexFinderIterative_t had ERRORS

GPU Comparison Summary

Summary:

No significant changes to the logs found
Reco comparison results: 24 differences found in the comparisons
DQMHistoTests: Total files compared: 4
DQMHistoTests: Total histograms compared: 19874
DQMHistoTests: Total failures: 1171
DQMHistoTests: Total nulls: 1
DQMHistoTests: Total successes: 18702
DQMHistoTests: Total skipped: 0
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
Checked 12 log files, 9 edm output root files, 4 DQM output files
TriggerResults: found differences in 3 / 3 workflows

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

/data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-651b42/11634.301_TTbar_14TeV+2021_Run3FS+TTbar_14TeV_TuneCP5_GenSim+HARVESTNano

Summary:

No significant changes to the logs found
Reco comparison results: 2 differences found in the comparisons
DQMHistoTests: Total files compared: 50
DQMHistoTests: Total histograms compared: 3741432
DQMHistoTests: Total failures: 92
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3741318
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
Checked 208 log files, 45 edm output root files, 50 DQM output files
TriggerResults: no differences found

VinInn · 2022-05-16T06:55:30Z

@cmsbuild , please test

cmsbuild · 2022-05-16T07:02:23Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37952/30028

This PR adds an extra 236KB to repository
Found files with invalid states:
- HeterogeneousCore/CUDAUtilities/src/cudaMemoryPool.cu:
  - Added: f37385a
  - Modified: 75ca0db, c429d13, c5e35f0
  - Deleted: 21a646e
- CUDADataFormats/TrackingRecHit/interface/TrackingRecHit2DHeterogeneousImpl.h:
  - Added: 5291489
  - Modified: 1a43ba7, e7d8632, c8d553a
  - Deleted: 521d4c0
- HeterogeneousCore/CUDAUtilities/interface/cudaMemoryPoolImpl.h:
  - Added: 5291489
  - Modified: 1a43ba7, 59bcb2b, 29df6e2, 849da8c, e7d8632, b4f4d46, 8b149ed
  - Deleted: 1487b88
- CUDADataFormats/SiPixelDigi/interface/SiPixelDigisCUDAImpl.h:
  - Added: 3ae45f7
  - Modified: b4f4d46, c8d553a
  - Deleted: 9402cb7
- CUDADataFormats/TrackingRecHit/src/TrackingRecHit2DHeterogeneous.cc:
  - Modified: 6b050bd, 0e49a36, e8e9c0f, 88da3bc, b4f4d46, 521d4c0
  - Deleted: 21a646e
  - Added: e7d8632
There are other open Pull requests which might conflict with changes you have proposed:
- File HeterogeneousCore/CUDAServices/src/CUDAService.cc modified in PR(s): Implement ResourceInformationService #37831
- File HeterogeneousCore/CUDAUtilities/test/BuildFile.xml modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoLocalTracker/SiPixelRecHits/plugins/PixelRecHitGPUKernel.cu modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoLocalTracker/SiPixelRecHits/plugins/SiPixelRecHitSoAFromLegacy.cc modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletGeneratorKernels.cc modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletGeneratorKernels.cu modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletGeneratorKernels.h modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713
- File RecoPixelVertexing/PixelTriplets/plugins/CAHitNtupletGeneratorKernelsAlloc.cc modified in PR(s): Use cooperative groups to populate Associations (Histograms) in Pixel Patatrack #35713

smuzaffar · 2024-02-12T20:08:20Z

ping

cmsbuild · 2024-08-27T08:08:44Z

Milestone for this pull request has been moved to CMSSW_14_2_X. Please open a backport if it should also go in to CMSSW_14_1_X.

antoniovilela · 2024-09-03T09:44:05Z

ping (to make bot change milestone)

cmsbuild · 2024-11-22T13:10:51Z

Milestone for this pull request has been moved to CMSSW_15_0_X. Please open a backport if it should also go in to CMSSW_14_2_X.

cmsbuild · 2025-02-07T08:31:45Z

Milestone for this pull request has been moved to CMSSW_15_1_X. Please open a backport if it should also go in to CMSSW_15_0_X.

cmsbuild · 2025-02-07T08:35:37Z

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/44257/summary.html
COMMIT: a97d64e
CMSSW: CMSSW_15_0_X_2025-02-06-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37952/44257/install.sh to create a dev area with all the needed externals and cmssw changes.

This pull request cannot be automatically merged, could you please rebase it?
You can see the log for git cms-merge-topic here: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/44257/git-merge-result

cmsbuild · 2025-09-10T06:54:32Z

Milestone for this pull request has been moved to CMSSW_16_0_X. Please open a backport if it should also go in to CMSSW_15_1_X.

cmsbuild · 2025-12-18T12:58:07Z

Milestone for this pull request has been moved to CMSSW_16_1_X. Please open a backport if it should also go in to CMSSW_16_0_X.

mandrenguyen · 2026-02-24T13:30:57Z

-1
Clearing from our queue, feel free to close if no longer needed.

cmsbuild · 2026-04-04T12:39:19Z

Milestone for this pull request has been moved to CMSSW_17_0_X. Please open a backport if it should also go in to CMSSW_16_1_X.

cmsbuild · 2026-04-04T12:41:59Z

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/52469/summary.html
COMMIT: a97d64e
CMSSW: CMSSW_16_1_X_2026-04-04-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37952/52469/install.sh to create a dev area with all the needed externals and cmssw changes.

This pull request cannot be automatically merged, could you please rebase it?
You can see the log for git cms-merge-topic here: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/52469/git-merge-result

cmsbuild added this to the CMSSW_12_4_X milestone May 15, 2022

cmsbuild added alca-pending code-checks-pending heterogeneous-pending orp-pending pending-signatures reconstruction-pending tests-pending labels May 15, 2022

cmsbuild added code-checks-approved and removed code-checks-pending labels May 15, 2022

cmsbuild added tests-started and removed tests-pending labels May 15, 2022

cmsbuild mentioned this pull request May 15, 2022

BTV DQM Updates #37832

Merged

cmsbuild added tests-rejected code-checks-pending tests-pending and removed tests-started tests-rejected code-checks-approved labels May 15, 2022

cmsbuild added tests-started and removed tests-pending labels May 16, 2022

cmsbuild added code-checks-approved and removed code-checks-pending labels May 16, 2022

cmsbuild mentioned this pull request Feb 14, 2024

Alpaka vs CUDA DQM compare modules for pixel tracks objects #43964

Closed

cmsbuild mentioned this pull request May 3, 2024

Introduce edm::Async service, and use it in CUDA and Alpaka modules #44901

Merged

cmsbuild mentioned this pull request Jun 7, 2024

Include HIon type traits for Alpaka pixel tracking #45151

Merged

cmsbuild mentioned this pull request Jul 25, 2024

Mark ScopedContextAcquire destructor as noexcept(false) #45560

Merged

cmsbuild mentioned this pull request Sep 1, 2024

Remove legacy CUDA modules for pixel track and vertex reconstruction #45853

Closed

cmsbuild mentioned this pull request Sep 20, 2024

Remove the configuration of the legacy CUDA workflows #46076

Closed

cmsbuild mentioned this pull request Dec 3, 2024

Removing CUDA/gpu from Pixel code configs and dropping all CUDA wfs #46853

Merged

cmsbuild mentioned this pull request Dec 20, 2024

add fillDescriptions to a bunch of plugins in the EventFilter and Reco areas used at HLT (1/N) #47017

Merged

cmsbuild mentioned this pull request Jan 10, 2025

Fix for HI Alpaka Pixel Configs #47078

Merged

cmsbuild mentioned this pull request Mar 17, 2025

A More Flexible And Lightweight CA #47611

Merged

cmsbuild mentioned this pull request May 28, 2025

Remove cms::cuda::ScopedContextTask as unused #48191

Merged

cmsbuild mentioned this pull request Jun 20, 2025

Updated SoA View accessors from raw pointers to span #48377

Merged

cmsbuild mentioned this pull request Oct 23, 2025

[NGT] Extension of CA Pixel Tracking to Phase 2 Outer Tracker barrel #48921

Merged

cmsbuild mentioned this pull request Dec 17, 2025

Remove most dictionaries of CUDADataFormats #49656

Merged

This was referenced Jan 10, 2026

Remove legacy CUDA pixel local reconstruction #49761

Merged

Remove legacy CUDA pixel track and vertex reconstruction #49794

Merged

Remove legacy CUDA pixel EventSetup modules #49799

Merged

cmsbuild mentioned this pull request Jan 21, 2026

Remove legacy CUDA framework and tests #49891

Merged

cmsbuild mentioned this pull request Feb 10, 2026

Drop legacy CUDA utilities #50108

Merged


10 participants
