Added kwarg to convert_hdf5_to_fits to consistently trim too-long column names#842
Added kwarg to convert_hdf5_to_fits to consistently trim too-long column names#842Stargrazer82301 wants to merge 4 commits intoBEAST-Fitting:masterfrom
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #842 +/- ##
==========================================
- Coverage 40.55% 40.42% -0.13%
==========================================
Files 107 107
Lines 10320 10358 +38
==========================================
+ Hits 4185 4187 +2
- Misses 6135 6171 +36 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Doc test fail from ACCESS, as with prior PR, and some ADS timeouts? Anyways, not at our end. Unsure what the coverage issue is... Is that just because the new kwarg isn't True by default, and isn't in the tests? |
karllark
left a comment
There was a problem hiding this comment.
Please see my comments. Just trying to get my head around this PR.
| autotrim : bool, optional (default=False) | ||
| Whether to automatically trim column names that are too long for FITS files, | ||
| and will therefore make this function crash otherwise |
There was a problem hiding this comment.
If this function will crash if this variable is false, why not make the default true? And issue a statement of what is happening? This sounds like a better solution than having a known crash.
There was a problem hiding this comment.
When this docstring means is that if the already-extisting st_file function will crash if the columns names are too long.
So shouldn't crash is this is false. Rather, if false, then the function replace_largest_common_substring simply won't get called. In that way, the behaviour of all existing code is unchanged. But now this exists for uses to enable if they want.
| # Identify and replace too-long column names, if requested | ||
| if autotrim: | ||
| set0_trim, set1_trim, set2_trim = replace_largest_common_substring( | ||
| set0, set1, set2 | ||
| ) |
There was a problem hiding this comment.
I am feeling dim right now. Where are these variables used? I don't see anything before the function ends, so it is not clear to me where the new set0_trim, etc. variables are used to update the column names.
There was a problem hiding this comment.
They're used in lines 91-99, to update the column names with the new, shorter, programmatically-generated column names produced by replace_largest_common_substring (if that's found to be necessary).
This means the test coverage has decreased. Maybe because there is no test for kwarg? Or even many no test for this entire script. |
Yep. Might have to give up on having that URL in the docs. |
|
Ping. |
Discovered
convert_hdf5_to_fitsfails if the column names in.phot.hdf5file are too long. These column names are typically set by the names of the files from MAST on which the photometry was done.This fix adds an
autotrimkwarg to the function (default to False).This option, if column names are too long, finds the largest contiguous substring common to the column names that are too long for FITS key names. It then makes a shorter version of those column names. it first attempts to just remove
-and_characters from the column names; and after that, it lops off the beginning characters of the largest contiguous common substring (as these are most likely to be generic fluff). The shortened column name substring is then propagated to all column names that contain the original string.