Skip to content

Adapt template pixel cpe algo to better handle shortened or broken clusters#48356

Merged
cmsbuild merged 1 commit intocms-sw:masterfrom
CMSTrackerDPG:updated_cpe_algo_tpl
Jun 24, 2025
Merged

Adapt template pixel cpe algo to better handle shortened or broken clusters#48356
cmsbuild merged 1 commit intocms-sw:masterfrom
CMSTrackerDPG:updated_cpe_algo_tpl

Conversation

@mroguljic
Copy link
Copy Markdown
Contributor

PR description:

While validating PR#47966, it was found that the proposed changes to generic CPE algorithm are not beneficial. This PR contains only the proposed changes to the template CPE algorithm that are simpler, and beneficial for long clusters in heavily irradiated pixel sensors (right-most columns in slides 23 onwards). The proposed changes to the algorithm are gated behind a process modifier.

PR validation:

The following two plots are showing the RecHit-SimHit distribution in extended-end-of-run-3 muon gun simulation. As expected, the effect is more pronounced in the high eta region.

recHitYResLayer1_tracking_central_Template
recHitYResLayer1_tracking_forward_Template

Although the logic of the code does not change when the goodEdgeAlgo flag is set to false, minute differences w.r.t reference were observed in many workflows. We hypothesize it originates from the changed footprint of the SiPixelTemplate class. This draft PR contains minimum edit that produces similar differences.

PR passes the basic battery of tests

50 49 48 42 21 1 1 1 1 1 1 tests passed, 0 0 0 0 0 0 0 0 0 0 0 failed

Backport

To be backported to 15.0.X.

@cmsbuild
Copy link
Copy Markdown
Contributor

cmsbuild commented Jun 18, 2025

cms-bot internal usage

@cmsbuild
Copy link
Copy Markdown
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-48356/45235

@cmsbuild
Copy link
Copy Markdown
Contributor

A new Pull Request was created by @mroguljic for master.

It involves the following packages:

  • CondFormats/SiPixelTransient (reconstruction, db)
  • Configuration/ProcessModifiers (operations)
  • Configuration/PyReleaseValidation (pdmv, upgrade)
  • RecoLocalTracker/SiPixelRecHits (reconstruction)

@AdrianoDee, @Moanwar, @antoniovilela, @atpathak, @cmsbuild, @davidlange6, @DickyChant, @fabiocos, @francescobrivio, @jfernan2, @mandrenguyen, @miquork, @perrotta, @rappoccio, @srimanob, @subirsarkar can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @JanChyczynski, @Martin-Grunewald, @PonIlya, @VinInn, @VourMa, @dkotlins, @fabiocos, @felicepantaleo, @ferencek, @gpetruc, @makortel, @missirol, @mmusich, @mroguljic, @mtosi, @rovere, @rsreds, @seemasharmafnal, @slomeo, @threus, @tsusa, @tvami, @yuanchao this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@mmusich
Copy link
Copy Markdown
Contributor

mmusich commented Jun 18, 2025

@cmsbuild, please test

@cmsbuild
Copy link
Copy Markdown
Contributor

+1

Size: This PR adds an extra 40KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-291a7b/46810/summary.html
COMMIT: f7cd851
CMSSW: CMSSW_15_1_X_2025-06-17-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/48356/46810/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

@jfernan2
Copy link
Copy Markdown
Contributor

There seem to be many differences in the Reco comparison results, are they understood? Thanks

@mroguljic
Copy link
Copy Markdown
Contributor Author

There seem to be many differences in the Reco comparison results, are they understood? Thanks

As noted in the PR description, we suspect the differences arise from the modified footprint of the SiPixelTemplate class. This draft PR introduces a minimal change that produces similar number of discrepancies in Reco comparisons.

We would appreciate suggestions from experts on how to further investigate or isolate the impact of this change.

@mroguljic
Copy link
Copy Markdown
Contributor Author

For reference: the new algorithm requires proper CPE templates. In case of EEoR3, we could use 140X_mcRun3_2024_realistic_EEOR3_Pix_HV800_Tr2000 GT as the baseline. With the goodEdgeAlgo turned on, we just replace the 1D template payload to: SiPixelTemplateDBObject_phase1_38T_EEoR3_HV800_Tr2000_bugfix3b. Other conditions stay the same.

@perrotta
Copy link
Copy Markdown
Contributor

Let see if @makortel or @Dr15Jones have anything to say or suggest

@Dr15Jones
Copy link
Copy Markdown
Contributor

@makortel is on vacation for the next 2 weeks

@Dr15Jones
Copy link
Copy Markdown
Contributor

@mroguljic I would not expect changes to the size of an algorithm to affect the results returned by the algorithm if the code is strictly C++ compliant. By compliant I mean it doesn't have any undefined behavior or race conditions. With undefined behavior we have seen slight variations of an algorithm causing large differences in results as the compiler makes a different assumption. There could be a slight change of numerical round-off, but with IEEE floating point (which is what we use for almost all the code) that should be extremely negligible (unless the calculations are continuously hitting underflows/overflows).

@Dr15Jones
Copy link
Copy Markdown
Contributor

So taking a quick look at the code and I see tons of variable declarations without initialization, e.g.

int i, j, k, minbin, binl, binh, binq, midpix, fypix, lypix, logypx;
int fxpix, lxpix, logxpx, shifty, shiftx, nyzero[TYSIZE];
int nclusx, nclusy;
int deltaj, jmin, jmax, fxbin, lxbin, fybin, lybin, djy, djx;
//int fypix2D, lypix2D, fxpix2D, lxpix2D;
float sythr, sxthr, rnorm, delta, sigma, sigavg, pseudopix, qscale, q50;
float ss2, ssa, sa2, ssba, saba, sba2, rat, fq, qtotal, qpixel, fbin[3];
float originx, originy, qfy, qly, qfx, qlx, bias, maxpix, minmax;
double chi2x, meanx, chi2y, meany, chi2ymin, chi2xmin, chi21max;
double hchi2, hndof, prvav, mpv, sigmaQ, kappa, xvav, beta2;
float ytemp[41][BYSIZE], xtemp[41][BXSIZE], ysum[BYSIZE], xsum[BXSIZE], ysort[BYSIZE], xsort[BXSIZE];
float chi2ybin[41], chi2xbin[41], ysig2[BYSIZE], xsig2[BXSIZE];
float yw2[BYSIZE], xw2[BXSIZE], ysw[BYSIZE], xsw[BXSIZE];
bool yd[BYSIZE], xd[BXSIZE], anyyd, anyxd, calc_probQ, use_VVIObj;

It is highly likely that an uninitialized variable is causing the problem. The change in algorithm size could move the where in the stack the variable is being defined and it now happens to sit somewhere which previously was being filled with a set value but with it movement is now over a different value.

@perrotta
Copy link
Copy Markdown
Contributor

Thank you @Dr15Jones for the consultation. Given what you found, I think that @mroguljic should better imbark in a revision campaign of this code that allows all uninitialized variables being initialized, for the sake of reproducibility, at least.
In the meanwhile I understand the origin of the observed discrepancies, and I can sign this PR

@perrotta
Copy link
Copy Markdown
Contributor

+db

@jfernan2
Copy link
Copy Markdown
Contributor

+1
Pending initialization of local variables (introduced 6 years ago) in this or another PR

@Moanwar
Copy link
Copy Markdown
Contributor

Moanwar commented Jun 24, 2025

+Upgrade

@mroguljic
Copy link
Copy Markdown
Contributor Author

+1 Pending initialization of local variables (introduced 6 years ago) in this or another PR

We will follow it up with a separate PR

@AdrianoDee
Copy link
Copy Markdown
Contributor

+pdmv

  • all good wf-wise

@cmsbuild
Copy link
Copy Markdown
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @sextonkennedy, @mandrenguyen, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@mandrenguyen
Copy link
Copy Markdown
Contributor

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants