Adapt template pixel cpe algo to better handle shortened or broken clusters #48356
A new Pull Request was created by @mroguljic for master. It involves the following packages:
@AdrianoDee, @Moanwar, @antoniovilela, @atpathak, @cmsbuild, @davidlange6, @DickyChant, @fabiocos, @francescobrivio, @jfernan2, @mandrenguyen, @miquork, @perrotta, @rappoccio, @srimanob, @subirsarkar can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
|
@cmsbuild, please test |
|
+1 Size: This PR adds an extra 40KB to the repository. Comparison Summary:
|
|
There seem to be many differences in the Reco comparison results, are they understood? Thanks |
As noted in the PR description, we suspect the differences arise from the modified footprint of the SiPixelTemplate class. This draft PR introduces a minimal change that produces a similar number of discrepancies in the Reco comparisons. We would appreciate suggestions from experts on how to further investigate or isolate the impact of this change. |
|
For reference: the new algorithm requires proper CPE templates. In the case of EEoR3, we could use the 140X_mcRun3_2024_realistic_EEOR3_Pix_HV800_Tr2000 GT as the baseline. With goodEdgeAlgo turned on, we just replace the 1D template payload with: |
|
Let's see if @makortel or @Dr15Jones have anything to say or suggest |
|
@makortel is on vacation for the next 2 weeks |
|
@mroguljic I would not expect changes to the size of an algorithm to affect the results returned by the algorithm if the code is strictly C++ compliant. By compliant I mean it doesn't have any undefined behavior or race conditions. With undefined behavior we have seen slight variations of an algorithm causing large differences in results as the compiler makes a different assumption. There could be a slight change of numerical round-off, but with IEEE floating point (which is what we use for almost all the code) that should be extremely negligible (unless the calculations are continuously hitting underflows/overflows). |
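To illustrate the scale of the round-off effect mentioned above, here is a minimal sketch (the function names are hypothetical and not part of CMSSW) showing that merely regrouping an IEEE double-precision sum changes the result by about one unit in the last place. It assumes the code is compiled without value-changing options such as `-ffast-math`.

```cpp
#include <cmath>

// Hypothetical helpers for illustration only; not CMSSW code.

// Sum three values grouping the first two terms together.
inline double sumLeftToRight(double a, double b, double c) { return (a + b) + c; }

// Sum the same three values grouping the last two terms together.
inline double sumRightToLeft(double a, double b, double c) { return a + (b + c); }

// Absolute difference between the two groupings. Any nonzero value here is
// pure round-off, since the two sums are mathematically identical.
inline double roundOffGap(double a, double b, double c) {
  return std::fabs(sumLeftToRight(a, b, c) - sumRightToLeft(a, b, c));
}
```

For example, `roundOffGap(0.1, 0.2, 0.3)` is nonzero but below `1e-15`, which is far smaller than differences that would normally show up in physics comparisons.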
|
Taking a quick look at the code, I see tons of variable declarations without initialization, e.g. cmssw/RecoLocalTracker/SiPixelRecHits/src/SiPixelTemplateReco.cc Lines 162 to 175 in f7cd851. It is highly likely that an uninitialized variable is causing the problem. The change in algorithm size could move where in the stack the variable is placed, so that it now sits over memory holding a different value than the one that previously happened to be there. |
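One way to remove this class of problem is to give every local a defined starting value. The sketch below is an assumption-laden illustration: `FitScratch` and its members are hypothetical stand-ins, not the variables actually declared in SiPixelTemplateReco.cc. It groups the scratch variables in a struct with default member initializers so that no read can ever see an indeterminate value.

```cpp
#include <array>

// Hypothetical stand-in for a block of fit-related locals; these are NOT the
// actual variables declared in SiPixelTemplateReco.cc.
struct FitScratch {
  int binLow = 0;
  int binHigh = 0;
  float chi2Min = 0.f;
  float sigma = 0.f;
  std::array<float, 4> weights{};  // value-initialized: every element is 0.f
};

// Returns a sum that depends on every member, so a member left indeterminate
// would make the result depend on whatever was previously on the stack.
inline float scratchSum(const FitScratch& s) {
  float sum = static_cast<float>(s.binLow + s.binHigh) + s.chi2Min + s.sigma;
  for (float w : s.weights) {
    sum += w;
  }
  return sum;
}

// A freshly constructed FitScratch always starts from the defaults above,
// independent of stack layout or object size.
inline float freshScratchSum() {
  FitScratch s;
  return scratchSum(s);
}
```

With this pattern `freshScratchSum()` returns the same value regardless of how the surrounding class or stack frame changes in size. Plain declarations such as `float sigma;` carry no such guarantee, because reading them before assignment is undefined behavior.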
|
Thank you @Dr15Jones for the consultation. Given what you found, I think @mroguljic should embark on a revision campaign of this code so that all currently uninitialized variables get initialized, for the sake of reproducibility at least. |
|
+db |
|
+1 |
|
+Upgrade |
We will follow it up with a separate PR |
|
+pdmv
|
|
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @sextonkennedy, @mandrenguyen, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
|
+1 |
PR description:
While validating PR#47966, it was found that the proposed changes to the generic CPE algorithm are not beneficial. This PR contains only the proposed changes to the template CPE algorithm, which are simpler and beneficial for long clusters in heavily irradiated pixel sensors (right-most columns in slides 23 onwards). The proposed changes to the algorithm are gated behind a process modifier.
PR validation:
The following two plots show the RecHit-SimHit distribution in the extended-end-of-Run-3 muon gun simulation. As expected, the effect is more pronounced in the high eta region.
Although the logic of the code does not change when the goodEdgeAlgo flag is set to false, minute differences w.r.t. the reference were observed in many workflows. We hypothesize that they originate from the changed footprint of the SiPixelTemplate class. This draft PR contains the minimum edit that produces similar differences.
The PR passes the basic battery of tests.
Backport
To be backported to 15.0.X.