Skip to content

Add microclumping#275

Open
AMG3141 wants to merge 79 commits intodevelopfrom
microclumping
Open

Add microclumping#275
AMG3141 wants to merge 79 commits intodevelopfrom
microclumping

Conversation

@AMG3141
Copy link
Copy Markdown
Contributor

@AMG3141 AMG3141 commented Jan 14, 2026

Microclumping

Function volume_filling_factor takes time and radial velocity and returns a volume filling factor $f_v$ (in the interval $(0,1]$). These are used in update_grid to set the clumping factors $(1/f_v)$.
Volume filling factors can instead be read from a file if READ_CLUMIPNG_FACTORS_FROM_FILE is true. File "clumping-factors.txt" is a table of floats formatted with %.6e. It has (num timesteps) rows and (num cells) cols. Will terminate immediately if file is expected but not found in any of the data folders.

Storing Clumping Factors

New grid property grid::clumpfactor_allcells. If USE_MICROCLUMPING is set to false it is never allocated and a global clumping factor of 1. is assumed.

Using Clumping Factors

If microclumping is turned on grid::apply_clumping applies the given clumping factor to the value. The processes impacted by microclumping are:

  • Collisional (de)excitation
  • Collisional ionisation
  • Collisional recombination
  • Radiative recombination
  • Collisional capture
  • Stimulated recombination
  • Free-free heating
  • Free-free cooling

Other processes are physically affected by clumping (e.g. non-thermal processes) but the effects balance eachother out so no changes are required.

Misc

Add function grid::modelcell_mean_radial_vel to calculate the radial velocity of a model grid cell.

Add function grid::check_mgi_is_nonempty to determine if the cell with the given mgi is non-empty, and return the nonemptymgi if so.

A (probably incomplete) list of things left to do

  • Carry out more rigorous testing of everything implemented so far.
  • Actually use the clumping factors in the calculations.
    • Free-free, rpkt, kpkt, macroatom need looked at
  • If it is required, check clumping-factors.txt exists in check_microclumping_options() and terminate if it doesn't.
    • Probably change the "clumping factors file" (and all associated variables/constants) to a "volume filling factors file" to make the naming consistent.
  • Update artisoptions_doc.md once implementation is done.
  • Store $1/f_v$ as that is what we actually use in calculations.
  • Change oneoverfv to something easier to read.

Things which might be nice to do

  • Option to only start using microclumping after a certain number of time steps.
  • Make script to generate clumping-factors.txt.

@AMG3141 AMG3141 self-assigned this Jan 14, 2026
@lukeshingles lukeshingles changed the title Microclumping Add microclumping Feb 2, 2026
@lukeshingles
Copy link
Copy Markdown
Member

lukeshingles commented Feb 2, 2026

Still in draft, but mostly for my own curiosity:

  • Where are the files that contain clumping factors for every cell and timestep coming from? If these are algorithmically generated, then couldn't the whole design be simplified by defining a new function in artistioptions that takes time_days, radial_velocity, etc and returns the clumping factor? That would avoid the complexity and performance cost of the file I/O, additional configuration options, and allow the same clumping function to be applied to different models and timestep choices.
  • Why is "fv" used for the variable names? Is that a volume filling factor? Why not something self-descriptive like "clumpfactor"?
  • Is it necessary to pass clumping factors down the call graph or could you simply pass the clumped free electron density to the existing functions, col_excitation_ratecoeff, rad_recombination_ratecoeff, col_recombination_ratecoeff, etc?

Some of the recent updates on the develop branch are showing up in the diff on this PR, which means something weird happened during rebasing. Shouldn't affect merging (since we squash merge anyway) but makes it a bit harder to read the diff.

@AMG3141
Copy link
Copy Markdown
Contributor Author

AMG3141 commented Feb 2, 2026

It never occurred to me to have a function in artisoptions. I don’t think that would restrict our ability to set the clumping factors in the way Stuart was thinking and it's obviously much more efficient. I will discuss it with him.

Fv is for volume filling factor but was more of a place holder. Something more descriptive is probably the way to go.

I think passing the clumping factor is easier, because it doesn’t require one to think about it each time those functions are used. But I have just noticed that right now I am applying the clumping factor at the return statement(s), but it would require less calls to grid::apply_clumping and would probably be easier to read to just have something like nne = grid::apply_clumping(nne, clumpfactor) at the start of the functions which require it. So the clumped free electron density is what is used, but doesn't have to be calculated manually each time the function is called.

And yes I messed up rebasing at some point, sorry about that!

@lukeshingles
Copy link
Copy Markdown
Member

Ok, sounds good. For now, I recommend implementing it in the simplest way with the minimal code changes to give you the highest confidence that the solution is correct. After that, we can lock the checksums and make sure that the default case of constexpr float clumping_factor() {return 1.0} is being fully optimised out. The rate coefficients are called in tight loops in the macro atom and level pop solver are and fairly critical to performance.

See ION_NLEVELS_EXCITED_NLTE() and NLEVELS_REQUIRETRANSITIONS() for examples of the constexpr functions. You should be able to pass in whatever you need (time_days, radial_velocity) as an argument, but if that's too limiting, you could use a #define that can call anything that is defined in the code. Let me know if you need any help with getting that to work.

@AMG3141
Copy link
Copy Markdown
Contributor Author

AMG3141 commented Feb 4, 2026

The way I have implemented it currently the clumping factors are read in during update_grid and stored as a grid property. get_oneoverfv is then a getter for oneoverfv_allcells. Is it best to keep this structure (but instead of reading we calculate using the new function) or to calculate the clumping factors as we need them? I guess it’s a trade off between increased memory usage vs more operations?

@lukeshingles
Copy link
Copy Markdown
Member

The way I have implemented it currently the clumping factors are read in during update_grid and stored as a grid property. get_oneoverfv is then a getter for oneoverfv_allcells. Is it best to keep this structure (but instead of reading we calculate using the new function) or to calculate the clumping factors as we need them? I guess it’s a trade off between increased memory usage vs more operations?

I don't think it's obvious which way to go on that, so choose with whichever you find easier to implement. There could be a performance benefit to storing and communicating the clumping factor of each cell if calculating them requires fetching several other numbers (e.g. the cell radial velocity is not stored directly, so has to be calculated from x-y-z coords, in case you want a radial dependence in the clumping factor), but I don't think it's easy to predict which will be faster. One nice thing about calculating it each time is that the compiler can see if your clump factor function always returns 1.00 and will just optimise it out without you having to have to define bool USE_MICROCLUMPING = false, whereas it has no idea what numbers will be in the array of clumping factors.

Copilot AI review requested due to automatic review settings April 1, 2026 11:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 2 comments.

@AMG3141
Copy link
Copy Markdown
Contributor Author

AMG3141 commented Apr 1, 2026

There are a few things copilot has brought up. I left the unresolved for you to have a look at. A couple of them I've left replies on explaining my thinking behind it. There are also a few things I want to check:

  • grid::get_modelcell_mean_radial_vel: Any preference on which version to use? I think the commented out one is cleaner, but the one I’ve left in should be slightly more efficient.
  • ratecoeff.cc:652 have I done that correctly? The ternary operator is just there to check for empty vs nonempty cells right?
  • sn3d.cc:560 I’ve added a check that all of the clumping factors are physically sensible values. The thinking is to catch quickly if any clumping factors are problematic, though it does require looping over all non-empty cells. Is it fine to keep it in production or should it be moved to test mode?

One last thing is because copilot automatically reviews, it comments on the same things multiple times and causes a lot of clutter and makes things difficult to follow. Is there a way to disable the automatic reviews?

@AMG3141 AMG3141 marked this pull request as ready for review April 1, 2026 14:29
@AMG3141 AMG3141 requested a review from lukeshingles as a code owner April 1, 2026 14:29
Copilot AI review requested due to automatic review settings April 1, 2026 14:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

@lukeshingles
Copy link
Copy Markdown
Member

I’m on leave until April 20. If you’re reasonably confident that the implementation is correct then you can update the checksums to avoid breakage as you work on simplifying the design.

Copilot AI review requested due to automatic review settings April 2, 2026 15:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

Comment on lines +389 to +391
if constexpr (READ_VOLUME_FILLING_FACTORS_FROM_FILE) {
vol_filling_factors_file >> vol_filling_factor;

Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When READ_VOLUME_FILLING_FACTORS_FROM_FILE is enabled, vol_filling_factors_file >> vol_filling_factor is not checked for success. If the file has too few entries or the stream is in a bad state (e.g., a failed seekg()), vol_filling_factor will remain NaN and the subsequent set_clumpfactor() assertion will fail without clearly indicating an I/O/formatting problem. Consider asserting the stream state after the read (and/or before the read) to fail fast with a more diagnostic condition.

Suggested change
if constexpr (READ_VOLUME_FILLING_FACTORS_FROM_FILE) {
vol_filling_factors_file >> vol_filling_factor;
if constexpr (READ_VOLUME_FILLING_FACTORS_FROM_FILE) {
// Ensure the volume filling factors file stream is usable before reading
assert_always(vol_filling_factors_file.is_open());
assert_always(!vol_filling_factors_file.bad());
vol_filling_factors_file >> vol_filling_factor;
// Fail fast if the read did not succeed (e.g. EOF, format error, or prior bad state)
assert_always(!vol_filling_factors_file.fail());
assert_always(!std::isnan(vol_filling_factor));

Copilot uses AI. Check for mistakes.
grid.h Outdated
Comment on lines +133 to +138
template <typename T>
inline auto apply_clumping(const T val, const float clumpfactor) -> T {
if constexpr (USE_MICROCLUMPING) {
return val * clumpfactor;
}
return val;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to define a new function just to multiply two numbers together. Remove

grid.h Outdated
[[nodiscard]] auto get_propcell_random_position_tmin(int cellindex) -> Vec3d;
[[nodiscard]] DEVICE_FUNC auto boundary_distance(const Vec3d& dir, const Vec3d& pos, double tstart, int cellindex)
-> std::tuple<double, int>;
void set_clumpfactor(int nonemptymgi, float vol_filling_factor);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is called set_clumpfactor but it takes a volume filling factor instead. Rename to 'set_vol_filling_factor' or take the clumping factor as an argument to be consistent.

Comment on lines +1551 to +1560
[[gnu::pure]] [[nodiscard]] DEVICE_FUNC auto get_modelcell_mean_radial_vel(const int modelgridindex, const double tmin)
-> double {
// TODO: both versions give the same thing. uncommented one should give ever so slightly better performance

const int assoc_cells = grid::get_numpropcells(modelgridindex);
return modelgrid_input[modelgridindex].initial_radial_pos_sum / tmin / assoc_cells;
// ---------------
// return get_modelcell_mean_radial_pos(modelgridindex, tratmid) / tmid;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trivial function used only once. Inline at the call site.

Comment on lines +1726 to +1734
[[nodiscard]] DEVICE_FUNC auto check_mgi_is_nonempty(const int mgi, int& nonemptymgi) -> bool {
assert_testmodeonly(get_nonempty_npts_model() > 0);
assert_testmodeonly(mgi >= 0);
assert_testmodeonly(mgi < get_npts_model());

nonemptymgi = nonemptymgi_of_mgi[mgi];
return nonemptymgi >= 0;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other parts of the code check if a cell is non-empty using the expression: grid::get_numpropcells(mgi) > 0


constexpr bool USE_MICROCLUMPING = false;

constexpr bool READ_VOLUME_FILLING_FACTORS_FROM_FILE = false;
Copy link
Copy Markdown
Member

@lukeshingles lukeshingles Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we had eliminated the need for file I/O using the artisoptions function? Can that be removed now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants