Add custom plugin to force particle decays using products defined in an external file #50156

Open
Ma128-bit wants to merge 3 commits into cms-sw:master from Ma128-bit:plutoIntegration_v16
@Ma128-bit
Contributor

PR description:

This plugin is required for the central MC production of η′ → 4μ. Since the η′ decay is not implemented in Pythia, an external generator must be used.
This PR introduces a custom plugin that forces a particle (used here for the η′) to decay using kinematic information read from an external ROOT file. The decay is generated in the particle’s center-of-mass frame by an external event generator (in this case, Pluto).
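
Because the decay products are stored in the parent's rest frame, each one presumably has to be boosted into the lab frame with the parent's momentum before it is attached to the event. A minimal sketch of that step in plain Python, not taken from the plugin: the function name `boost_to_lab`, the tuple layout `(E, px, py, pz)`, and the two-body toy input are assumptions made for illustration.

```python
import math


def boost_to_lab(p_rest, parent_p, parent_mass):
    """Boost a four-vector (E, px, py, pz) from the parent's rest frame
    to the lab frame, where the parent has three-momentum parent_p."""
    E, px, py, pz = p_rest
    Px, Py, Pz = parent_p
    parent_E = math.sqrt(Px * Px + Py * Py + Pz * Pz + parent_mass ** 2)

    # Velocity of the parent in the lab frame: beta = P / E, gamma = E / M.
    bx, by, bz = Px / parent_E, Py / parent_E, Pz / parent_E
    b2 = bx * bx + by * by + bz * bz
    if b2 == 0.0:
        return p_rest  # parent at rest: nothing to do
    gamma = parent_E / parent_mass

    # General Lorentz boost along beta:
    #   E' = gamma * (E + beta.p)
    #   p' = p + [(gamma - 1) * (beta.p) / beta^2 + gamma * E] * beta
    bp = bx * px + by * py + bz * pz
    k = (gamma - 1.0) * bp / b2 + gamma * E
    return (gamma * (E + bp), px + k * bx, py + k * by, pz + k * bz)


# Toy input (assumption): a two-body back-to-back decay in the rest frame.
M, m = 0.95778, 0.10566  # parent and daughter masses in GeV
p = math.sqrt((M / 2) ** 2 - m ** 2)
rest_products = [(M / 2, 0.0, 0.0, p), (M / 2, 0.0, 0.0, -p)]
lab_products = [boost_to_lab(q, (1.0, 2.0, 3.0), M) for q in rest_products]
```

A useful sanity check on any such boost is that the boosted products sum to the parent's lab four-momentum and that each product keeps its invariant mass.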

PR validation:

I successfully compiled the code in CMSSW_16_1_X_2026-02-15-2300 without any issues and privately produced a small MC sample for validation.

@cmsbuild
Contributor

cmsbuild commented Feb 16, 2026

cms-bot internal usage

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-50156/48101

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

A new Pull Request was created by @Ma128-bit for master.

It involves the following packages:

  • GeneratorInterface/PlutoInterface (****)

The following packages do not have a category, yet:

GeneratorInterface/PlutoInterface
Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

@cmsbuild can you please review it and eventually sign? Thanks.
@alberto-sanchez, @mkirsano this is something you requested to watch as well.
@ftenchini, @mandrenguyen, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

New categories assigned: generators

@lviliani,@mkirsano,@sensrcn,@theofil you have been requested to review this Pull request/Issue and eventually sign? Thanks

@civanch
Contributor

civanch commented Feb 17, 2026

assign generator

@civanch
Contributor

civanch commented Feb 17, 2026

@Ma128-bit, the directory structure should be defined by the generators L2. My only comment is that the new plugin is declared as a "stream" module, which means it is meant to run multi-threaded. In that case std::cout should be replaced by edm::LogVerbatim or edm::LogInfo, and the random number generator TRandom3 should not be used. If instead the plugin is meant to be used in sequential mode only, it should not be a stream module.
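
The point about random numbers in a multi-threaded module can be illustrated outside CMSSW. The sketch below is an analogy in plain Python, not CMSSW code: the class `StreamWorker`, the function `run_streams`, the seed scheme, and the idea that the plugin uses random numbers to pick an entry from the input file are all assumptions. Each worker owns a private, seeded engine instead of drawing from one shared generator, so the sequence a given stream sees does not depend on how threads are scheduled.

```python
import random
from concurrent.futures import ThreadPoolExecutor


class StreamWorker:
    """Hypothetical stand-in for one stream instance of a module."""

    def __init__(self, stream_id, base_seed=12345):
        self.stream_id = stream_id
        # Private engine: no state is shared with other streams.
        self.rng = random.Random(base_seed + stream_id)

    def pick_entries(self, n_entries, n_events):
        # Choose one input-file entry index per event.
        return [self.rng.randrange(n_entries) for _ in range(n_events)]


def run_streams(n_streams, n_entries=1000, n_events=5):
    workers = [StreamWorker(i) for i in range(n_streams)]
    with ThreadPoolExecutor(max_workers=n_streams) as pool:
        futures = [pool.submit(w.pick_entries, n_entries, n_events) for w in workers]
        return [f.result() for f in futures]


first = run_streams(4)
second = run_streams(4)
```

With per-stream engines, `first` and `second` are identical regardless of thread scheduling. A single shared generator would interleave draws between threads in a scheduling-dependent way.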

@smuzaffar
Contributor

@Ma128-bit @cms-sw/generators-l2, do we really need a separate CMSSW package for this plugin? Can we add it under the plugin directory of an existing package?

@Ma128-bit
Contributor Author

@smuzaffar I wasn’t sure where it should go, so I created a separate package. However, it can definitely be moved under an existing package’s plugin directory if that’s preferred. Just let me know where you think it would be more appropriate to place it.

@lviliani
Contributor

I'm not sure I understand the use case of this plugin (e.g. how do you pass the input ROOT file in central production).

Has this ever been discussed within GEN? I can't remember any presentation during a GEN meeting, but please share some pointers in case I'm wrong.

@Ma128-bit
Contributor Author

@lviliani I haven’t discussed this in a GEN meeting. I discussed it with the BPH MC contacts, and they told me it was ok and to submit a pull request to CMSSW. I wasn’t aware that it needed to be presented at a GEN meeting beforehand.

Regarding how the input ROOT file would be passed in central production, I’m not entirely sure about the practical details. I was told that this should be possible, but I’m not very familiar with how the MC production machines are configured or what access they have. My initial idea was to have the file read via xrootd, but I agree that this aspect probably needs to be clarified more carefully.

If you think it would be better to discuss the use case in more detail during a GEN meeting before moving forward, I would be happy to do so.

@lviliani
Contributor

I think it's a good idea to have a dedicated discussion in a GEN meeting.

In particular I would be interested to know if we could consider alternative approaches to handle this process, for example:

  • Use Pluto (which I don't know much about) to produce LHE files, which we can later pass to Pythia easily in central production.
  • Implement this custom decay (if it's not already there) directly in Pythia or EvtGen.

@smuzaffar
Contributor

@Ma128-bit @cms-sw/generators-l2, do we really need a separate CMSSW package for this plugin? Can we add it under the plugin directory of an existing package?

@cms-sw/generators-l2, are we OK with adding this new package?

@Kiarendil

Hi @smuzaffar ,
I believe this needs follow-up from PdmV-Offline Samples + CompOps-PnR to understand how we can read a file in a central production before we can converge on this PR. This likely needs some solution with limitations on which sites such workflows can run ...

FYI @DickyChant

Best,
Kirill for PdmV-OS and PnR

@vlimant
Contributor

vlimant commented Mar 11, 2026

assign generators

@cmsbuild
Contributor

New categories assigned: generators

@lviliani,@mkirsano,@sensrcn,@theofil you have been requested to review this Pull request/Issue and eventually sign? Thanks

@vlimant
Contributor

vlimant commented Mar 11, 2026

My 2 cents: external files like these, used to parametrise generation/simulation/reconstruction, have to be either in the global tag or in the release (in cms-data, which ends up in cvmfs); that removes the need for discussion about where and how to run this type of generator in production.

@lviliani
Contributor

One option to consider could be to store them in cvmfs as we did for some NPS special cases, for example here:
/cvmfs/cms-griddata.cern.ch/phys_generator/model_config/SlhaTrees_pMSSM/

What's the size of these ROOT files?

@Ma128-bit
Contributor Author

@lviliani With the number of events I have generated (10M), I currently have two files, each about 14 GB in size (but they can be split into several smaller files). It might also be possible to remove some branches or change the data type to reduce their size if necessary.
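
The numbers quoted above translate into a per-event size and a file count after splitting. A small sketch of that arithmetic, where the helper names and the 2 GB per-file target are assumptions chosen for illustration (GB is taken as 10^9 bytes):

```python
import math


def bytes_per_event(n_files, file_size_gb, n_events):
    """Average on-disk size of one stored decay, in bytes."""
    return n_files * file_size_gb * 1e9 / n_events


def files_after_split(n_files, file_size_gb, max_file_gb):
    """Number of files needed if no file may exceed max_file_gb."""
    return math.ceil(n_files * file_size_gb / max_file_gb)


per_event = bytes_per_event(2, 14, 10_000_000)  # two 14 GB files, 10M events
n_split = files_after_split(2, 14, 2)           # hypothetical 2 GB target
print(f"{per_event:.0f} bytes per event, {n_split} files after splitting")
```

This works out to roughly 2.8 kB per stored decay, and 14 files under the assumed 2 GB cap.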

@Kiarendil

Hi @Ma128-bit ,
looking at the overall size of /cvmfs/cms-griddata.cern.ch/phys_generator/ and what's inside, I guess it is indeed a good idea to try to put your files there. This should solve the problem we are worried about.
Can you please play with your branches etc. to see how small you can make the files? Maybe try some compression as well ...

@cmsbuild
Contributor

Pull request #50156 was updated. @cmsbuild, @lviliani, @mkirsano, @sensrcn, @theofil can you please check and sign again.

@Ma128-bit
Contributor Author

Ma128-bit commented Mar 21, 2026

I noticed a small bug, so I made another commit to fix it.

Regarding the status of the PR:

  • I saw a comment mentioning that the plugin is not suitable for multithreaded use, and indeed that’s the case. I also tried implementing multithreading, but with poor results: I observed strange behavior, such as the MC efficiency decreasing as the number of threads increased. So, although it’s not very elegant, I think we can leave it as it is for now, with the constraint that it should only be used in single-thread mode, unless someone can help me adapt it.
  • Regarding the file size: I could reduce it by changing the type from double to float for the kinematics of the final-state particles, but honestly I would prefer to keep it as double to maintain as much precision as possible and avoid potential issues. If strictly necessary it could be changed, but from what I see, the /cvmfs/cms-griddata.cern.ch/phys_generator folder is quite large, so I don’t think space will be a problem, please correct me if I’m wrong. In any case, the files could be removed after the GEN-SIM production step. If this is fine, please let me know exactly how and where to copy them.
  • Regarding the correct functioning of the plugin: I did some tests, and the produced MC events seem to describe the data much better than a simple phase-space model.
  • Regarding the plugin location: at the moment it is in the new package I created, GeneratorInterface/PlutoInterface, but as mentioned above, it could be moved elsewhere if needed. Please let me know if there is a preferred location.
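
The double-versus-float trade-off mentioned in the list above can be quantified with the standard library alone. This is a minimal sketch, with the helper names `to_float32` and `relative_error` and the sample value being assumptions: storing a value as a 32-bit float halves its size but limits the relative precision to about 6e-8.

```python
import struct


def to_float32(x):
    """Round-trip a Python float (64-bit) through 32-bit storage."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def relative_error(x):
    """Relative precision lost by storing x as a 32-bit float."""
    return abs(to_float32(x) - x) / abs(x)


size_double = struct.calcsize("<d")  # bytes per value stored as double
size_float = struct.calcsize("<f")   # bytes per value stored as float
sample_px = 12.3456789012345         # hypothetical momentum component in GeV
loss = relative_error(sample_px)
print(f"{size_double} -> {size_float} bytes, relative error {loss:.2e}")
```

The relative error of a single float conversion is bounded by 2^-24. Whether that matters depends on how the stored kinematics are used downstream, which the thread does not settle.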

@lviliani
Contributor

Regarding the multithreading, @civanch was mentioning that the plugin should not be of type stream if it's supposed to run in sequential mode, so I guess you still need to fix that.

And about the file size, I don't think the problem is the available space in the cvmfs directory, but rather copying/opening large files in the production jobs.
To me this looks similar to pLHE campaigns, but reading a ROOT file instead of LHE. I think CompOps can comment on whether the file size and this way of reading the ROOT file can be a problem in central production or not.

@Kiarendil

I feel quite unhappy with the prospect of generating 10M events with a filter efficiency of 10^-5 (meaning we need to generate 10^12 events ...) in single-thread mode only. It would be very painful for central production, while running this with 16/32 cores at opportunistic sites could work quite well.
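
The estimate above is the requested number of events divided by the filter efficiency. As a short sketch (the function name is an assumption):

```python
def events_to_generate(n_selected, filter_efficiency):
    """Events the generator must produce so that n_selected pass the filter."""
    return n_selected / filter_efficiency


needed = events_to_generate(10_000_000, 1e-5)  # 10M events at efficiency 10^-5
print(f"about {needed:.0e} generated events")
```

The same function shows how strongly the total scales with the efficiency: a tenfold gain in filter efficiency cuts the generated events tenfold.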

So I strongly suggest taking another look at multithreading. Otherwise, even if we get everything else done, the timescale and feasibility of the production remain unclear...

@Ma128-bit
Contributor Author

Ma128-bit commented Mar 26, 2026

@Kiarendil I managed to slightly improve the efficiency with some changes in the fragment (it is now ~10⁻⁴). Considering that, at least with the current preliminary selections, the analysis efficiency is around 5%, I think that, if needed, we could slightly reduce the total number of events, for example to 1M–2M, and check whether that is sufficient. This would correspond to a total of about 10¹⁰ events.

Could this work in single-thread mode? If so, I can proceed to modify the plugin to run sequentially only.

If not, do you have a similar plugin that works correctly in multithreading that I could use as a reference? I can try again starting from that.

@Kiarendil

A factor of 100 is clearly an improvement... though it is still hard to estimate (at least for me). Maybe some private tests with CRAB could be useful to check how such a production could go.

@Ma128-bit
Contributor Author

@Kiarendil I ran a private production with CRAB. I submitted 5000 jobs with 10^5 events each, and the average runtime per job is about 1.5–2 hours. Therefore, with 5000 jobs available, it is easily possible to produce 10^10 events in about 40 hours. At that point, one could also produce 10^11 events in “only” 400 hours, or much less if more jobs are available.
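
The timing estimate above can be reproduced with a simple model. This sketch assumes that all jobs in a batch run concurrently and that batches run back to back, which is an idealisation of real grid scheduling; the function name is hypothetical.

```python
import math


def wall_hours(total_events, n_jobs, events_per_job, hours_per_job):
    """Wall-clock hours if n_jobs run at once and batches run one after another."""
    batches = math.ceil(total_events / (n_jobs * events_per_job))
    return batches * hours_per_job


# 5000 concurrent jobs of 10^5 events each, taking 1.5 to 2 hours per job
for total in (10**10, 10**11):
    low = wall_hours(total, 5000, 10**5, 1.5)
    high = wall_hours(total, 5000, 10**5, 2.0)
    print(f"{total:.0e} events: {low:.0f} to {high:.0f} hours")
```

Under this model, 10^10 events need 20 batches and 10^11 events need 200, which matches the 40-hour and 400-hour figures at 2 hours per job.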

@cmsbuild
Contributor

cmsbuild commented Apr 4, 2026

Milestone for this pull request has been moved to CMSSW_17_0_X. Please open a backport if it should also go in to CMSSW_16_1_X.

@cmsbuild cmsbuild modified the milestones: CMSSW_16_1_X, CMSSW_17_0_X Apr 4, 2026

7 participants