New Heterogeneous Memory Pool #37952

Open
VinInn wants to merge 39 commits into cms-sw:master from VinInn:NewMemoryPoolV1

@VinInn
Contributor

VinInn commented May 15, 2022

This PR replaces the old "notcub" cache allocator with a memory pool featuring

  • lock-free operations
  • a backend-agnostic implementation

The data interface is based on a simple Buffer that is completely backend agnostic.
The allocation interface (makeBuffer) currently depends on cudaStream_t, which can easily be hidden behind a void * or a light opaque struct.
A new feature is a "Bundle deleter": buffers can be bundled together and then freed in a single operation, which reduces the number of CUDA calls.
All previous users of the cache allocator (at least for the Pixel workflow) have been migrated.
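
To illustrate the "Bundle deleter" idea for readers who have not opened the diff, here is a minimal host-only sketch. It is not the code in this PR: `CountingBackend`, `Bundle`, `BundleDeleter`, `makeBundledBuffer` and `releaseCallsForThreeBuffers` are hypothetical names, `malloc`/`free` stand in for the device runtime, and no stream or lock-free handling is shown. The only point is that each buffer's deleter keeps a shared bundle alive, and the memory is handed back in one batch when the last reference disappears.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <vector>

// Hypothetical backend: stands in for the device runtime and counts how
// many release calls are issued, so the effect of bundling is visible.
struct CountingBackend {
  std::atomic<int> releaseCalls{0};

  char* allocate(std::size_t bytes) { return static_cast<char*>(std::malloc(bytes)); }

  // A single call hands back every pointer in the batch.
  void releaseBatch(std::vector<char*> const& ptrs) {
    ++releaseCalls;
    for (char* p : ptrs)
      std::free(p);
  }
};

// Hypothetical bundle: records the pointers it owns and releases them
// together in its destructor.
class Bundle {
public:
  explicit Bundle(CountingBackend& backend) : backend_(backend) {}
  Bundle(Bundle const&) = delete;
  Bundle& operator=(Bundle const&) = delete;
  ~Bundle() {
    if (!ptrs_.empty())
      backend_.releaseBatch(ptrs_);
  }

  // Not thread-safe in this sketch.
  void adopt(char* p) { ptrs_.push_back(p); }

private:
  CountingBackend& backend_;
  std::vector<char*> ptrs_;
};

// Deleter that frees nothing itself: it only keeps the bundle alive until
// the buffer holding it is destroyed.
struct BundleDeleter {
  std::shared_ptr<Bundle> bundle;
  void operator()(char*) const noexcept {}
};

using Buffer = std::unique_ptr<char, BundleDeleter>;

Buffer makeBundledBuffer(CountingBackend& backend, std::shared_ptr<Bundle> const& bundle, std::size_t bytes) {
  assert(bundle && "a bundled buffer needs a live bundle");
  char* p = backend.allocate(bytes);
  bundle->adopt(p);
  return Buffer(p, BundleDeleter{bundle});
}

// Usage: three buffers share one bundle, so only one release call is made.
int releaseCallsForThreeBuffers() {
  CountingBackend backend;
  {
    auto bundle = std::make_shared<Bundle>(backend);
    Buffer a = makeBundledBuffer(backend, bundle, 64);
    Buffer b = makeBundledBuffer(backend, bundle, 128);
    Buffer c = makeBundledBuffer(backend, bundle, 256);
  }  // buffers and the bundle handle go out of scope here
  return backend.releaseCalls.load();
}
```

With one deleter per buffer the same three buffers would cost three release calls; sharing the bundle collapses them into one, at the price of keeping all the memory alive until the last buffer is gone.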

Tests pass, and it is not slower than the previous implementation. A free machine is needed to run definitive tests.

Some cleanup is still required to remove debug statements.

Purely technical; no regression expected.

Draft slides for a possible presentation are available at https://cernbox.cern.ch/index.php/s/Ax4NHYGLHbG8N1C

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37952/30020

@cmsbuild
Contributor

cmsbuild commented May 15, 2022

A new Pull Request was created by @VinInn (Vincenzo Innocente) for master.

It involves the following packages:

  • CUDADataFormats/BeamSpot (heterogeneous, reconstruction)
  • CUDADataFormats/Common (heterogeneous)
  • CUDADataFormats/SiPixelDigi (heterogeneous, reconstruction)
  • CUDADataFormats/Track (heterogeneous, reconstruction)
  • CUDADataFormats/TrackingRecHit (heterogeneous, reconstruction)
  • CUDADataFormats/Vertex (heterogeneous, reconstruction)
  • EventFilter/SiPixelRawToDigi (reconstruction)
  • HeterogeneousCore/CUDACore (heterogeneous)
  • HeterogeneousCore/CUDAServices (heterogeneous)
  • HeterogeneousCore/CUDAUtilities (heterogeneous)
  • RecoLocalTracker/SiPixelRecHits (reconstruction)
  • RecoPixelVertexing/PixelTrackFitting (reconstruction)
  • RecoPixelVertexing/PixelTriplets (reconstruction)
  • RecoPixelVertexing/PixelVertexFinding (reconstruction)
  • RecoVertex/BeamSpotProducer (reconstruction, alca)

@malbouis, @yuanchao, @makortel, @slava77, @clacaputo, @cmsbuild, @fwyzard, @jpata, @tvami, @francescobrivio can you please review it and eventually sign? Thanks.
@tvami, @makortel, @felicepantaleo, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @Martin-Grunewald, @missirol, @OzAmram, @tocheng, @ferencek, @mtosi, @gpetruc, @mmusich, @dkotlins, @threus, @dgulhan, @francescobrivio this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@VinInn
Contributor Author

VinInn commented May 15, 2022

@cmsbuild , please test

@VinInn
Contributor Author

VinInn commented May 15, 2022

enable gpu

@cmsbuild
Contributor

-1

Failed Tests: UnitTests
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/24728/summary.html
COMMIT: b8d0837
CMSSW: CMSSW_12_4_X_2022-05-15-0000/slc7_amd64_gcc10
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37952/24728/install.sh to create a dev area with all the needed externals and cmssw changes.

Unit Tests

I found errors in the following unit tests:

---> test cpuVertexFinderByDensity_t had ERRORS
---> test cpuVertexFinderIterative_t had ERRORS

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 24 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19874
  • DQMHistoTests: Total failures: 1171
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 18702
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: found differences in 3 / 3 workflows

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-651b42/11634.301_TTbar_14TeV+2021_Run3FS+TTbar_14TeV_TuneCP5_GenSim+HARVESTNano

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3741432
  • DQMHistoTests: Total failures: 92
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3741318
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@VinInn
Contributor Author

VinInn commented May 16, 2022

@cmsbuild , please test

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37952/30028

@smuzaffar
Contributor

ping

@cmsbuild
Contributor

Milestone for this pull request has been moved to CMSSW_14_2_X. Please open a backport if it should also go in to CMSSW_14_1_X.

@antoniovilela
Contributor

ping (to make bot change milestone)

@cmsbuild
Contributor

Milestone for this pull request has been moved to CMSSW_15_0_X. Please open a backport if it should also go in to CMSSW_14_2_X.

@cmsbuild
Contributor

cmsbuild commented Feb 7, 2025

Milestone for this pull request has been moved to CMSSW_15_1_X. Please open a backport if it should also go in to CMSSW_15_0_X.

@cmsbuild
Contributor

cmsbuild commented Feb 7, 2025

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/44257/summary.html
COMMIT: a97d64e
CMSSW: CMSSW_15_0_X_2025-02-06-2300/el8_amd64_gcc12
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37952/44257/install.sh to create a dev area with all the needed externals and cmssw changes.

This pull request cannot be automatically merged, could you please rebase it?
You can see the log for git cms-merge-topic here: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/44257/git-merge-result

@cmsbuild
Contributor

Milestone for this pull request has been moved to CMSSW_16_0_X. Please open a backport if it should also go in to CMSSW_15_1_X.

@cmsbuild
Contributor

Milestone for this pull request has been moved to CMSSW_16_1_X. Please open a backport if it should also go in to CMSSW_16_0_X.

@mandrenguyen
Contributor

-1
Clearing from our queue, feel free to close if no longer needed.

@cmsbuild
Contributor

cmsbuild commented Apr 4, 2026

Milestone for this pull request has been moved to CMSSW_17_0_X. Please open a backport if it should also go in to CMSSW_16_1_X.

@cmsbuild
Contributor

cmsbuild commented Apr 4, 2026

-1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/52469/summary.html
COMMIT: a97d64e
CMSSW: CMSSW_16_1_X_2026-04-04-1100/el8_amd64_gcc13
Additional Tests: GPU,AMD_MI300X,AMD_W7900,NVIDIA_H100,NVIDIA_L40S
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/37952/52469/install.sh to create a dev area with all the needed externals and cmssw changes.

This pull request cannot be automatically merged, could you please rebase it?
You can see the log for git cms-merge-topic here: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-651b42/52469/git-merge-result

10 participants