Digi Morphing for HLT #48734
|
This PR contains too many commits (644 >= 240) and will not be processed. |
cf734a9 to
69c506d
|
cms-bot internal usage |
|
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-48734/45791 ERROR: Build errors found during clang-tidy run. |
There was a problem hiding this comment.
please remove these changes
| } | ||
|
|
||
| ALPAKA_FN_ACC | ||
| bool isMorphingModule(uint32_t moduleId, const uint32_t* morphingModules, uint32_t nMorphingModules) { |
There was a problem hiding this comment.
if there are more than about 10 modules, it would be much more efficient to
- sort the modules (once, on the cpu, in the constructor of the module)
- use a binary search to check if a module is in the list
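A minimal sketch of the two steps above, written as plain host-side C++ so it can be compiled on its own. The names `containsSorted` and `prepareModuleList` are hypothetical and not part of the PR; the lookup uses only a loop and integer arithmetic, so the same body could presumably be marked `ALPAKA_FN_ACC` for device use.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns true if `id` is present in the ascending-sorted
// array `sortedIds` of length `n`. Classic lower-bound binary search, O(log n).
inline bool containsSorted(uint32_t id, const uint32_t* sortedIds, uint32_t n) {
  uint32_t lo = 0;
  uint32_t hi = n;  // search the half-open range [lo, hi)
  while (lo < hi) {
    const uint32_t mid = lo + (hi - lo) / 2;
    if (sortedIds[mid] < id) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  // lo is now the first position whose value is >= id (or n if none)
  return lo < n && sortedIds[lo] == id;
}

// Hypothetical host-side preparation, done once (e.g. in the module
// constructor): sort the list and drop duplicates before copying it to device.
inline void prepareModuleList(std::vector<uint32_t>& ids) {
  std::sort(ids.begin(), ids.end());
  ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
}
```

With the list sorted once on the host, each lookup costs O(log n) comparisons instead of a linear scan over all morphing modules.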
There was a problem hiding this comment.
Thanks!
Could you make the implementation generic and move it to a central place, like HeterogeneousCore/AlpakaInterface/interface/alpakastdAlgorithm.h ?
You may be able to reuse some of the functionality from that file.
See also HeterogeneousCore/CUDAUtilities/interface/cudastdAlgorithm.h for the old CUDA-only version.
There was a problem hiding this comment.
Has this been addressed?
There was a problem hiding this comment.
No, but I guess it can be done later.
| // FIXME: this is just an estimate, to be studied and optimised | ||
| //static constexpr uint32_t maxFakesInModule = TrackerTraits::maxPixInModule * 2 / 5; |
| uint32_t thisModuleId = digi_view[firstPixel].moduleId(); | ||
| uint32_t rawModuleId = digi_view[firstPixel].rawIdArr(); | ||
| applyDigiMorphing = | ||
| applyDigiMorphing && pixelStatus::isMorphingModule(rawModuleId, morphingModules, nMorphingModules); |
There was a problem hiding this comment.
I think this is wrong: a kernel block may loop over different modules (line 196). Once it encounters a module that is not in the list, it sets applyDigiMorphing to false, and from that point on morphing is skipped for all remaining modules handled by that block, including those that are in the list.
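To illustrate the problem, here is a small host-side simulation, not the actual kernel. `inList`, `decideSticky` and `decidePerModule` are hypothetical names; `inList` stands in for `pixelStatus::isMorphingModule`, and each function returns the morphing decision taken for every module that one block iterates over.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for pixelStatus::isMorphingModule (linear scan for brevity).
inline bool inList(uint32_t id, const std::vector<uint32_t>& morphingModules) {
  for (uint32_t m : morphingModules) {
    if (m == id)
      return true;
  }
  return false;
}

// Problematic pattern: the flag passed to the kernel is overwritten inside the
// module loop, so the first module that is not in the list disables morphing
// for every module processed after it.
inline std::vector<bool> decideSticky(bool applyDigiMorphing,
                                      const std::vector<uint32_t>& modulesSeenByBlock,
                                      const std::vector<uint32_t>& morphingModules) {
  std::vector<bool> decisions;
  for (uint32_t id : modulesSeenByBlock) {
    applyDigiMorphing = applyDigiMorphing && inList(id, morphingModules);
    decisions.push_back(applyDigiMorphing);
  }
  return decisions;
}

// Fixed pattern: the kernel argument stays read-only and the per-module
// decision lives in a variable local to each loop iteration.
inline std::vector<bool> decidePerModule(bool enableDigiMorphing,
                                         const std::vector<uint32_t>& modulesSeenByBlock,
                                         const std::vector<uint32_t>& morphingModules) {
  std::vector<bool> decisions;
  for (uint32_t id : modulesSeenByBlock) {
    const bool applyDigiMorphing = enableDigiMorphing && inList(id, morphingModules);
    decisions.push_back(applyDigiMorphing);
  }
  return decisions;
}
```

For a block that visits modules 10, 20, 30 with only 10 and 30 in the list, the first version stops morphing at module 20 and never resumes, while the second still applies it to module 30.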
| ALPAKA_FN_ACC void operator()(Acc1D const& acc, | ||
| SiPixelDigisSoAView digi_view, | ||
| SiPixelDigisSoAView fakes_view, | ||
| bool applyDigiMorphing, |
There was a problem hiding this comment.
| bool applyDigiMorphing, | |
| bool enableDigiMorphing, |
| applyDigiMorphing = | ||
| applyDigiMorphing && pixelStatus::isMorphingModule(rawModuleId, morphingModules, nMorphingModules); |
There was a problem hiding this comment.
| applyDigiMorphing = | |
| applyDigiMorphing && pixelStatus::isMorphingModule(rawModuleId, morphingModules, nMorphingModules); | |
| bool applyDigiMorphing = | |
| enableDigiMorphing && pixelStatus::isMorphingModule(rawModuleId, morphingModules, nMorphingModules); |
| uint32_t rawModuleId = digi_view[firstPixel].rawIdArr(); | ||
| applyDigiMorphing = | ||
| applyDigiMorphing && pixelStatus::isMorphingModule(rawModuleId, morphingModules, nMorphingModules); | ||
| //applyDigiMorphing = applyDigiMorphing && isMorphingModule; |
| std::vector<std::string> limits; | ||
| boost::split(limits, r, boost::is_any_of("-")); | ||
| try { | ||
| if (limits.size() > 1) { |
There was a problem hiding this comment.
maybe add a check that there are exactly 0 or 1 hyphens, and not more than 1 ?
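One possible shape for such a check, as a standalone sketch that uses only the standard library rather than `boost::split`. The helper names `toUInt` and `parseRange` are hypothetical, and rejecting reversed ranges such as "12-1" is an extra assumption, not something requested in the review.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <limits>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: convert a string of decimal digits to uint32_t.
// Returns std::nullopt for empty, non-numeric, or out-of-range input.
inline std::optional<uint32_t> toUInt(const std::string& s) {
  if (s.empty() ||
      !std::all_of(s.begin(), s.end(), [](unsigned char c) { return std::isdigit(c) != 0; }))
    return std::nullopt;
  try {
    const unsigned long v = std::stoul(s);
    if (v > std::numeric_limits<uint32_t>::max())
      return std::nullopt;
    return static_cast<uint32_t>(v);
  } catch (const std::out_of_range&) {
    return std::nullopt;
  }
}

// Hypothetical helper: parse "N" or "N-M" into an inclusive [first, last] range.
// Anything with two or more hyphens is rejected up front.
inline std::optional<std::pair<uint32_t, uint32_t>> parseRange(const std::string& r) {
  const auto nHyphens = std::count(r.begin(), r.end(), '-');
  if (nHyphens > 1)
    return std::nullopt;
  if (nHyphens == 0) {
    const auto v = toUInt(r);
    if (!v)
      return std::nullopt;
    return std::make_pair(*v, *v);  // single value: first == last
  }
  const auto pos = r.find('-');
  const auto lo = toUInt(r.substr(0, pos));
  const auto hi = toUInt(r.substr(pos + 1));
  if (!lo || !hi || *lo > *hi)  // assumption: reversed ranges are invalid
    return std::nullopt;
  return std::make_pair(*lo, *hi);
}
```

Returning `std::optional` lets the caller decide whether a malformed entry should raise a configuration exception or simply be skipped.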
|
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-48734/45792
Code check has found code style and quality issues which could be resolved by applying following patch(s)
|
| auto morphingModules_d = | ||
| cms::alpakatools::make_device_buffer<uint32_t[]>(queue, digiMorphingConfig.morphingModules.size()); | ||
| auto morphingModules_h = cms::alpakatools::make_host_view(digiMorphingConfig.morphingModules.data(), | ||
| digiMorphingConfig.morphingModules.size()); | ||
| alpaka::memcpy(queue, morphingModules_d, morphingModules_h); |
There was a problem hiding this comment.
This is going to stay the same for the whole job.
Can you move the allocation of the buffer and the copy outside of the event loop ?
One option would be to use the cms::alpakatools::MoveToDeviceCache functionality (see the README.md).
There was a problem hiding this comment.
Hi, I couldn't manage to use the MoveToDevice functionality, as the struct I used in place of TestAlgo::UpdateInfo (as done in the example) contains a std::vector, which is not trivially destructible, and there are compilation issues with that.
Instead, I moved this part to where the cabling map changes, so that it's not repeated for every event.
Please let me know if this is fine or if there is a better way.
| #if 0 | ||
| alpaka::exec<Acc1D>(queue, | ||
| workDivMaxNumModules, | ||
| FindClus<TrackerTraits>{}, | ||
| digis_view, | ||
| unused.view(), | ||
| clusters_d->view(), | ||
| numDigis); | ||
| #endif |
| tTopo_ = &iSetup.getData(trackerTopologyToken_); | ||
| digiMorphingConfig_.morphingModules.clear(); | ||
| for (const auto& connection : cablingMap_->det2fedMap()) { | ||
| auto rawId = connection.first; | ||
| if (rawId == 0) | ||
| continue; | ||
| DetId detId(rawId); | ||
| if (!skipDetId(tTopo_, detId, theBarrelRegions_, theEndcapRegions_)) { | ||
| digiMorphingConfig_.morphingModules.push_back(rawId); | ||
| } | ||
| } |
There was a problem hiding this comment.
(sorry, it is already done only if there are changes, ignore my previous comment)
There was a problem hiding this comment.
However, the tracker topology is in a different record (TrackerTopologyRcd) than the cabling map (SiPixelFedCablingMapRcd), so the check needs to be split ?
There was a problem hiding this comment.
Sorry, I am not sure what you mean by the check needs to be split... Could you please clarify?
|
please test |
|
tests here seem stuck. Are we expecting them to finish in a finite amount of time? |
|
@mmusich I have restarted the unit test |
|
+1 Size: This PR adds an extra 24KB to repository Comparison Summary:
AMD_MI300X Comparison Summary:
NVIDIA_T4 Comparison Summary:
|
|
@cms-sw/heterogeneous-l2 @cms-sw/reconstruction-l2 do you have any other comments and / or requests for this? |
|
I will not be able to re-check the code in the near future, so let's assume that all comments have been addressed 🤷🏻♂️ |
|
+heterogeneous |
|
+1 |
|
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @mandrenguyen, @sextonkennedy, @ftenchini (and backports should be raised in the release meeting by the corresponding L2) |
|
+1 |
|
I suspect this PR is causing failures in the GPU relvals since when it was merged in Also FWIW, running the |
|
Also:
doesn't look very sane. |
| trackerTopologyToken_(esConsumes<TrackerTopology, TrackerTopologyRcd>()), | ||
| includeErrors_(iConfig.getParameter<bool>("IncludeErrors")), | ||
| useQuality_(iConfig.getParameter<bool>("UseQualityInfo")), | ||
| verbose_(iConfig.getParameter<bool>("verbose")), |
This is because this PR has unintentionally undone #48494. To be fixed. |
PR description:
Morphing Algorithm for Fake Digi Addition
This PR introduces an alternative alpaka-based algorithm for pixel digi morphing. This implementation is equivalent in principle to the legacy implementation and to the other alpaka-based implementation in PR #48343, but supersedes the latter, which has throughput issues due to its memory usage. Like the other implementations, digi morphing is applied here only to specific detector regions, which can be configured as described below.
Configuration Options
The morphing behavior can be controlled in the EDProducer configuration with the following options:
Regional Morphing
Regions can be specified in the configuration as follows:
Barrel Regions: use LAYER,LADDER,MODULE coordinates, with support for individual values or ranges (e.g., 1-12 for ladders 1 through 12).
Endcap Regions: use DISK,BLADE,SIDE,PANEL coordinates, with support for ranges.
PR validation:
Validation results can be found here: https://indico.cern.ch/event/1567986/contributions/6647897/attachments/3116797/5527819/HLT_digi_morphing_validation.pdf