Skip to content

Adapt pixel cpe algo to better handle broken clusters [15.0.X]#48008

Closed
mroguljic wants to merge 4 commits intocms-sw:CMSSW_15_0_Xfrom
mroguljic:updated_cpe_algo-150X
Closed

Adapt pixel cpe algo to better handle broken clusters [15.0.X]#48008
mroguljic wants to merge 4 commits intocms-sw:CMSSW_15_0_Xfrom
mroguljic:updated_cpe_algo-150X

Conversation

@mroguljic
Copy link
Copy Markdown
Contributor

Backport
Backport of #47966 to CMSSW 15.0.X to request data rereco for validation

PR description:

At the end of 2024, cluster breakage at high eta became significant. This caused issues when trying to improve calibrations for pixel cluster (position) parameter estimation, CPE.

This fix changes the CPE algorithms by relying on the "good" cluster edge, instead of both edges. This alternative algorithm is only used if the clusters are shorter than expected (e.g. broken). The change affects template reconstruction, and generic reconstruction both at CPU and GPU (alpaka). The fix is gated behind process modifiers for testing. The new versions of CPE algorithms require corresponding condition updates.

PR validation:

Undergoing runTheMatrix.py -l limited -i all --ibeos. We don't expect any workflow to be affect since the changes are protected by process modifiers. The PR was opened before the validation to allower others to comment early.

@cmsbuild
Copy link
Copy Markdown
Contributor

cmsbuild commented May 5, 2025

A new Pull Request was created by @mroguljic for CMSSW_15_0_X.

It involves the following packages:

  • CondFormats/SiPixelTransient (db, reconstruction)
  • Configuration/ProcessModifiers (operations)
  • RecoLocalTracker/SiPixelRecHits (reconstruction)

@antoniovilela, @atpathak, @cmsbuild, @consuegs, @davidlange6, @fabiocos, @francescobrivio, @jfernan2, @mandrenguyen, @perrotta, @rappoccio can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @JanChyczynski, @Martin-Grunewald, @PonIlya, @VinInn, @VourMa, @dkotlins, @fabiocos, @felicepantaleo, @ferencek, @gpetruc, @makortel, @missirol, @mmusich, @mroguljic, @mtosi, @rovere, @rsreds, @seemasharmafnal, @threus, @tsusa, @tvami, @yuanchao this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Copy link
Copy Markdown
Contributor

cmsbuild commented May 5, 2025

cms-bot internal usage

@mroguljic mroguljic changed the title change pixel cpe algo to handle broken clusters Adapt pixel cpe algo to better handle broken clusters [15.0.X] May 5, 2025
Copy link
Copy Markdown
Contributor

@mmusich mmusich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some naive questions / comments from a non-reviewer.

Comment thread CondFormats/SiPixelTransient/src/SiPixelTemplate.cc Outdated
Comment thread CondFormats/SiPixelTransient/src/SiPixelTemplate.cc Outdated
Comment thread RecoLocalTracker/SiPixelRecHits/python/PixelCPEClusterRepair_cfi.py
@mmusich
Copy link
Copy Markdown
Contributor

mmusich commented May 5, 2025

assign heterogeneous

  • for the pixel local reco on GPU part

@cmsbuild
Copy link
Copy Markdown
Contributor

cmsbuild commented May 5, 2025

New categories assigned: heterogeneous

@fwyzard,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks

@mmusich
Copy link
Copy Markdown
Contributor

mmusich commented May 5, 2025

backport of #47966

@mmusich
Copy link
Copy Markdown
Contributor

mmusich commented May 5, 2025

enable gpu

@mroguljic
Copy link
Copy Markdown
Contributor Author

runTheMatrix finished with: 50 48 24 19 14 1 1 1 1 1 1 tests passed, 0 1 22 1 0 0 0 0 0 0 0 failed
I'm going through the logs to see what the issues are.

@mroguljic mroguljic force-pushed the updated_cpe_algo-150X branch from 70746c0 to 680f6cb Compare May 6, 2025 05:55
@cmsbuild
Copy link
Copy Markdown
Contributor

cmsbuild commented May 6, 2025

Pull request #48008 was updated. @antoniovilela, @atpathak, @cmsbuild, @consuegs, @davidlange6, @fabiocos, @francescobrivio, @fwyzard, @jfernan2, @makortel, @mandrenguyen, @perrotta, @rappoccio can you please check and sign again.

@mroguljic
Copy link
Copy Markdown
Contributor Author

runTheMatrix finished with 50 49 46 41 19 1 1 1 1 1 1 tests passed, 0 0 1 0 0 0 0 0 0 0 0 failed. Workflow 1000.0 fails at step3 with

----- Begin Fatal Exception 06-May-2025 07:46:12 CEST-----------------------
An exception of category 'FatalRootError' occurred while
   [0] Calling EventProcessor::runToCompletion (which does almost everything after beginJob and before endJob)
   Additional Info:
      [a] Fatal Root Error: @SUB=TStorageFactorySystem::Unlink
Unsupported

----- End Fatal Exception -------------------------------------------------

Not sure if this is related to the changes.

Copy link
Copy Markdown
Contributor

@perrotta perrotta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review and code development should be done in the development (master) release, and only backported when stabilized. However, since some review already started in ths backport PR, let me add a few comments here.
It would be preferable, however, that since now the two PRs get synchronized, and then the review continue in the master.

Comment on lines +156 to +161
// define the centers of the first last last pixel coordinates
float x1 = pitch * (upper_edge_first_pix - 0.5 * pitchfraction_first);
float x2 = pitch * (lower_edge_last_pix + 0.5 * pitchfraction_last);

if (delta / pitch > delta_length_cut) {
// observed cluster is much shorter than expected, use one-sided reco
if (w_pred > 0.f) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// define the centers of the first last last pixel coordinates
float x1 = pitch * (upper_edge_first_pix - 0.5 * pitchfraction_first);
float x2 = pitch * (lower_edge_last_pix + 0.5 * pitchfraction_last);
if (delta / pitch > delta_length_cut) {
// observed cluster is much shorter than expected, use one-sided reco
if (w_pred > 0.f) {
if (delta / pitch > delta_length_cut) {
// define the centers of the first and last pixel coordinates
float x1 = pitch * (upper_edge_first_pix - 0.5 * pitchfraction_first);
float x2 = pitch * (lower_edge_last_pix + 0.5 * pitchfraction_last);
// observed cluster is much shorter than expected, use one-sided reco
if (w_pred > 0.f) {


edm::ParameterSet pset_;
bool useErrorsFromTemplates_;
bool GoodEdgeAlgo_;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming rule nr 9 in https://cms-sw.github.io/cms_coding_rules.html

Suggested change
bool GoodEdgeAlgo_;
bool goodEdgeAlgo_;

auto const& myname = p.getParameter<std::string>("ComponentName");
auto const& magname = p.getParameter<edm::ESInputTag>("MagneticFieldRecord");
useErrorsFromTemplates_ = p.getParameter<bool>("UseErrorsFromTemplates");
GoodEdgeAlgo_ = p.getParameter<bool>("GoodEdgeAlgo");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GoodEdgeAlgo_ = p.getParameter<bool>("GoodEdgeAlgo");
goodEdgeAlgo_ = p.getParameter<bool>("GoodEdgeAlgo");

genErrorDBObjectProduct,
lorentzAngleWidthProduct);
lorentzAngleWidthProduct,
GoodEdgeAlgo_);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GoodEdgeAlgo_);
goodEdgeAlgo_);

const SiPixelGenErrorDBObject* genErrorDBObject,
const SiPixelLorentzAngle* lorentzAngleWidth)
const SiPixelLorentzAngle* lorentzAngleWidth,
const bool GoodEdgeAlgo)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const bool GoodEdgeAlgo)
const bool goodEdgeAlgo)

cout << "(int)useErrorsFromTemplates_ = " << (int)useErrorsFromTemplates_ << endl;
cout << "truncatePixelCharge_ = " << (int)truncatePixelCharge_ << endl;
cout << "IrradiationBiasCorrection_ = " << (int)IrradiationBiasCorrection_ << endl;
cout << "GoodEdgeAlgo_ = " << (int)GoodEdgeAlgo_ << endl;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
cout << "GoodEdgeAlgo_ = " << (int)GoodEdgeAlgo_ << endl;
cout << "goodEdgeAlgo_ = " << (int)GoodEdgeAlgo_ << endl;

the_size_cutX); // cut for eff charge width &&&
the_size_cutX, // cut for eff charge width &&&
delta_length_cut,
GoodEdgeAlgo_);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GoodEdgeAlgo_);
goodEdgeAlgo_);

shifty = templ.cytemp() - midpix;
} //else

shifty = BHY - midpix;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overwrites L479

Comment on lines +483 to +484
// midpix = (fypix + lypix) / 2;
// shifty = templ.cytemp() - midpix;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// midpix = (fypix + lypix) / 2;
// shifty = templ.cytemp() - midpix;

Comment thread CondFormats/SiPixelTransient/src/SiPixelTemplate.cc
@mmusich
Copy link
Copy Markdown
Contributor

mmusich commented May 6, 2025

Not sure if this is related to the changes.

@mroguljic, do you happen to have run your tests on eos?
This looks related to #44369, thus unrelated.

@mroguljic
Copy link
Copy Markdown
Contributor Author

Not sure if this is related to the changes.

@mroguljic, do you happen to have run your tests on eos? This looks related to #44369, thus unrelated.

Thank you Marco. Indeed, I ran this on eos

@mandrenguyen
Copy link
Copy Markdown
Contributor

please test

@mroguljic
Copy link
Copy Markdown
Contributor Author

please test

@mandrenguyen This backport is not yet synchronized to master. If there are no more comments on master, I can do it now.

@cmsbuild
Copy link
Copy Markdown
Contributor

cmsbuild commented May 9, 2025

Pull request #48008 was updated. @Martin-Grunewald, @antoniovilela, @atpathak, @cmsbuild, @davidlange6, @fabiocos, @francescobrivio, @fwyzard, @jfernan2, @makortel, @mandrenguyen, @mmusich, @perrotta, @rappoccio can you please check and sign again.

@elusian
Copy link
Copy Markdown

elusian commented May 10, 2025

Hello, these are the new timing tests with the updated code, with the same conditions as before but with the addition of a test with the PR merged without the customization applied:

Of the increase in timing, we have:

  • +1.5 in doublet recovery track building,
  • +O(0.1) ms in doublet recovery track fitting
  • +O(0.1) ms in iter0 track building and fitting
  • +O(0.1) ms in pixel pair producers
  • +O(0.1) ms in pixel tracks/cluster conversion
  • -3.5 ms in other

The small changes are probably within error, I'm reporting them because in principle we can expect some changes there.

@mtosi
Copy link
Copy Markdown
Contributor

mtosi commented May 10, 2025

thanks @elusian !
so, now we are missing the modifier for the offline and online reconstruction in order to have the physics validation performed, right ?

@mroguljic
Copy link
Copy Markdown
Contributor Author

thanks @elusian ! so, now we are missing the modifier for the offline and online reconstruction in order to have the physics validation performed, right ?

Hi Mia, we did add modifiers for both offline and online in these files:

Configuration/ProcessModifiers/python/siPixelGoodEdgeAlgo_cff.py
RecoLocalTracker/SiPixelRecHits/python/PixelCPEClusterRepair_cfi.py
RecoLocalTracker/SiPixelRecHits/python/PixelCPEESProducers_cff.py
RecoLocalTracker/SiPixelRecHits/python/PixelCPEGeneric_cfi.py
RecoLocalTracker/SiPixelRecHits/python/PixelCPETemplateReco_cfi.py

@fwyzard
Copy link
Copy Markdown
Contributor

fwyzard commented May 12, 2025

+heterogeneous

@mtosi
Copy link
Copy Markdown
Contributor

mtosi commented May 12, 2025

thank you, @mroguljic

Configuration/ProcessModifiers/python/siPixelGoodEdgeAlgo_cff.py
RecoLocalTracker/SiPixelRecHits/python/PixelCPEClusterRepair_cfi.py
RecoLocalTracker/SiPixelRecHits/python/PixelCPEESProducers_cff.py
RecoLocalTracker/SiPixelRecHits/python/PixelCPEGeneric_cfi.py
RecoLocalTracker/SiPixelRecHits/python/PixelCPETemplateReco_cfi.py

I'm afraid I don't see the customization for HLT, though
@mmusich @Martin-Grunewald correct me if I'm wrong, but we typically collect such kind of customization function in
HLTrigger/Configuration/python/customizeHLTforCMSSW.py

the customization function should have the following

process.hltESPPixelCPEGeneric.GoodEdgeAlgo = cms.bool(True)
process.hltESPPixelCPEFastParamsPhase1.GoodEdgeAlgo = cms.bool(True)

@Martin-Grunewald
Copy link
Copy Markdown
Contributor

Martin-Grunewald commented May 12, 2025

Typically we'd like a CMSHLT Jira ticket to change HLT. The fillDescriptions default should be set as to not use the new feature, ie, preserve the current behaviour (at least at HLT).

@mmusich
Copy link
Copy Markdown
Contributor

mmusich commented May 12, 2025

I'm afraid I don't see the customization for HLT, though, correct me if I'm wrong, but we typically collect such kind of customization function in HLTrigger/Configuration/python/customizeHLTforCMSSW.py

I'd rather put the customization in a RecolLocalTracker package and then call it directly in the relval machinery.

@cmsbuild
Copy link
Copy Markdown
Contributor

@elusian
Copy link
Copy Markdown

elusian commented May 22, 2025

Hello, I ran some HLT tracking validation and the plots can be found here.
The tests were done on part of the Winter25 TTbar sample produced with GT 142X_mcRun3_2025_realistic_v7.

To repeat the tests, the steps are (for the reference)

cmsDriver.py step2_Tk25 --conditions 142X_mcRun3_2025_realistic_v7 --datatier GEN-SIM-DIGI-RAW --era Run3_2025 --eventcontent FEVTDEBUGHLT --filein file:TTbar_FEVTDEBUGHLT.root --fileout file:HLT.root --geometry DB:Extended --nStreams 1 --nThreads 1 --no_exec --number -1 --step HLT:Reasonable_2025_menu --process reHLT --customise HLTrigger/Configuration/customizeHLTfor2025Studies.customizeHLTfor2024L1TMenu,HLTrigger/Configuration/customizeHLTfor2025Studies.customizeHLTfor2025Studies

Reasonable_2025_menu was created with

hltGetConfiguration /frozen/2025/2e34/v1.0/HLT/V1 --cff --data --type GRun --path MC_ReducedIterativeTracking_v* > src/HLTrigger/Configuration/python/HLT_Reasonable_2025_menu_cff.py

A new GRun menu should also suffice.
Then the validation is done with

cmsDriver.py val --filein file:HLT.root -s VALIDATION:hltMultiTrackValidation --conditions 142X_mcRun3_2025_realistic_v7 --eventcontent DQM --datatier DQMIO --fileout file:DQMIO.root --nThreads 1 -n -1 --geometry DB:Extended --era Run3_2025 --mc --no_exec --hltProcess reHLT

and the harvesting with

harvestTrackValidationPlots.py DQMIO.root -o DQM.root

Finally, plots are made with

makeTrackValidationPlots.py DQM_ref.root DQM_target1.root DQM_target2.root ... --outputDir www/folder

To test the PR it's enough to modify the conditions following Dinko's comment and use either the procModifier or the customizer.
In the validation step the CPE is not used so in principle the updated conditions are not needed (but just in case this was tested with the black vs orange points in the plots)

@mmusich
Copy link
Copy Markdown
Contributor

mmusich commented Jun 19, 2025

-hlt

@mroguljic
Copy link
Copy Markdown
Contributor Author

Closing the PR because it is superseeded by: #48356

@mroguljic mroguljic closed this Jun 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.