WIP: improve exception handling and prepare for code dump#196
WIP: improve exception handling and prepare for code dump#196MircoValentiniECMWF wants to merge 1 commit intodevelopfrom
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## develop #196 +/- ##
===========================================
- Coverage 62.33% 61.80% -0.54%
===========================================
Files 303 303
Lines 11601 11704 +103
Branches 1036 1037 +1
===========================================
+ Hits 7232 7234 +2
- Misses 4369 4470 +101 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This draft PR proposes a new exception-handling strategy for the mars2grib encoder: consistently nesting/wrapping exceptions across layers, writing a structured stack trace to disk at the API boundary, and introducing dictionary-traits support for an Options struct plus a mock “reproducer” dictionary intended to emit a C encoding reproducer.
Changes:
- Updated exception stack printing to write structured frames into an output file stream (instead of debug logging).
- Introduced dictionary access traits for
metkit::mars2grib::Optionsand a newMockDictionaryintended for reproducer generation. - Added API-boundary try/catch blocks that attempt to write a stack log file and rethrow an API-level
eckit::Exception.
mars2grib documentation sync (files under src/metkit/mars2grib/**):
- Impacted files:
src/metkit/mars2grib/utils/mars2gribExceptions.hsrc/metkit/mars2grib/utils/dictionary_traits/dictaccess_options.hsrc/metkit/mars2grib/utils/dictionary_traits/dictaccess_mock_reproducer.hsrc/metkit/mars2grib/debug/reproducer.hsrc/metkit/mars2grib/debug/MockupDictionary.hsrc/metkit/mars2grib/CoreOperations.hsrc/metkit/mars2grib/api/Mars2Grib.cc
- Documentation status in
src/metkit/mars2grib/docs/**: ❌ outdated (no mention of the new API-level stack file behavior / naming / location, nor the reproducer dictionary workflow).
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| src/metkit/mars2grib/utils/mars2gribExceptions.h | Switches structured stack printing from LOG_DEBUG to writing to a file stream; adds a new core exception type. |
| src/metkit/mars2grib/utils/dictionary_traits/dictaccess_options.h | Adds dictionary-traits support for Options (has/get/set/missing/to_json). |
| src/metkit/mars2grib/utils/dictionary_traits/dictaccess_mock_reproducer.h | Adds dictionary-traits adapter for a mock write-only dictionary used to record encoding operations. |
| src/metkit/mars2grib/debug/reproducer.h | Adds the recorder and C code generator for producing reproducers (with optional binary payloads). |
| src/metkit/mars2grib/debug/MockupDictionary.h | Defines MockDictionary wrapper around the recorder to capture “set” operations. |
| src/metkit/mars2grib/CoreOperations.h | Wraps internal failures into Mars2GribCoreException and adds stack-log filename generation + file output helper. |
| src/metkit/mars2grib/api/Mars2Grib.cc | Wraps API boundary with try/catch, writes stack log, rethrows API-level exception. |
| template <> | ||
| struct DictHas<metkit::mars2grib::Options> { | ||
| static bool has(const metkit::mars2grib::Options& opt, std::string_view key) noexcept(false) { | ||
| return detail::get_options_ptr(opt, key) != nullptr; |
There was a problem hiding this comment.
detail::get_options_ptr is declared in metkit::mars2grib::utils::detail, but this file is now in metkit::mars2grib::utils::dict_traits, so detail::... won’t resolve and will fail to compile. Qualify the call (e.g., metkit::mars2grib::utils::detail::get_options_ptr) or move/alias the helper namespace inside dict_traits.
| return detail::get_options_ptr(opt, key) != nullptr; | |
| return metkit::mars2grib::utils::detail::get_options_ptr(opt, key) != nullptr; |
|
|
||
| // Starts the timeline | ||
| static std::unique_ptr<MockDictionary> make_from_sample_or_throw(std::string_view name) { | ||
| auto rec = std::make_shared<Reproducer::Recorder>(); |
There was a problem hiding this comment.
Reproducer::Recorder is not a known type/namespace here (the recorder is defined as metkit::mars2grib::debug::reproducer::Recorder). This looks like a compile-time error; construct the shared pointer using the correct fully-qualified name (or reuse the Recorder type from MockDictionary).
| auto rec = std::make_shared<Reproducer::Recorder>(); | |
| auto rec = std::make_shared<metkit::mars2grib::debug::reproducer::Recorder>(); |
| #include <memory> | ||
| #include <vector> | ||
|
|
||
| #include "metkit/mars2grib/debug/reproducer.h" | ||
| #include "metkit/codes/api/CodesTypes.h" |
There was a problem hiding this comment.
This header is missing an include guard/#pragma once, which can lead to ODR/multiple-include issues. Also, it uses std::string but doesn’t include <string> directly (currently relies on transitive includes). Add #pragma once and include <string> explicitly.
| if (const auto* me = dynamic_cast<const Mars2GribGenericException*>(&e)) { | ||
| me->printFrame(pad); | ||
| me->printFrame(out, pad); | ||
| } |
There was a problem hiding this comment.
Mars2GribCoreException::printFrame() is never used because printExtendedStack() only recognizes Mars2GribGenericException via dynamic_cast. Either make Mars2GribCoreException derive from Mars2GribGenericException, or extend printExtendedStack() to also detect Mars2GribCoreException so core-layer frames include file/function/line info.
| } | |
| } | |
| else if (const auto* ce = dynamic_cast<const Mars2GribCoreException*>(&e)) { | |
| ce->printFrame(out, pad); | |
| } |
| catch (const std::exception& nested) { | ||
| printExtendedStack(nested, level + 1, frame + 1); | ||
| printExtendedStack(nested, out, level + 1, frame + 1); | ||
| } |
There was a problem hiding this comment.
printExtendedStack() rethrows nested exceptions but only catches std::exception. If a non-std::exception is nested, the function will throw while printing and can mask the original failure. Consider adding a catch (...) branch (similar to printExceptionStack()) to print an “unknown non-std exception” frame and stop recursion safely.
| } | |
| } | |
| catch (...) { | |
| const std::string nestedPad = indent(level + 1); | |
| out << nestedPad << "+ " << std::string(lineSize, '=') << std::endl | |
| << nestedPad << "+ frame " << frame + 1 << std::endl | |
| << nestedPad << "+ " << std::string(lineSize, '-') << std::endl | |
| << nestedPad << "+ message: unknown non-std exception" << std::endl | |
| << nestedPad << "+ " << std::string(lineSize, '+') << std::endl; | |
| } |
| out << "double(h, \"" << key_ << "\", " << std::setprecision(17) << val_ << "), 0);\n"; | ||
| } | ||
| else if constexpr (std::is_same_v<T, std::string>) { | ||
| out << "string(h, \"" << key_ << "\", \"" << val_ << "\", NULL), 0);\n"; | ||
| } | ||
| } |
There was a problem hiding this comment.
String values are emitted into C code without escaping. If val_ contains quotes, backslashes, newlines, or non-printable bytes, the generated reproducer will be invalid C (and potentially confusing to run). Escape string literals properly (and consider emitting binary-safe payloads for arbitrary strings).
| } | ||
|
|
||
| void emitC_Logic(std::ostream& out, int /*id*/) const override { | ||
| if constexpr (std::is_same_v<T, std::string> || std::is_same_v<T, uint8_t>) return; |
There was a problem hiding this comment.
For std::vector<uint8_t> / std::vector<std::string> above BINARY_THRESHOLD, OpLargeArray returns early and emitC_Logic() becomes a no-op, silently dropping recorded operations. At minimum emit a warning comment in the generated C (like OpSmallArray does) or explicitly reject these types so the reproducer can’t be generated in a partial/incorrect state.
| if constexpr (std::is_same_v<T, std::string> || std::is_same_v<T, uint8_t>) return; | |
| if constexpr (std::is_same_v<T, std::string> || std::is_same_v<T, uint8_t>) { | |
| out << " /* WARNING: large array for key '" << key_ | |
| << "' could not be emitted by reproducer: unsupported element type */\n"; | |
| return; | |
| } |
| template <> | ||
| struct DictTraits<metkit::mars2grib::Options> { | ||
|
|
||
| static constexpr bool support_checks = true; | ||
|
|
||
| static std::unique_ptr<metkit::mars2grib::Options> make_from_sample_or_throw(std::string_view /*name*/) { | ||
| // Options has no samples, just return default constructed | ||
| return std::make_unique<metkit::mars2grib::Options>(); | ||
| } | ||
|
|
||
| static std::unique_ptr<metkit::mars2grib::Options> clone_or_throw(const metkit::mars2grib::Options& h) { | ||
| // Struct has implicit copy constructor | ||
| return std::make_unique<metkit::mars2grib::Options>(h); | ||
| } | ||
| }; |
There was a problem hiding this comment.
New dictionary access behavior for metkit::mars2grib::Options is introduced here (has/get/set/missing/to_json) but there are existing dictionary_traits tests for other dictionary types under tests/mars2grib/utils/dictionary_traits. Add analogous unit tests to cover valid keys, invalid keys, and type conversions for Options to prevent regressions.
| template <> | ||
| struct DictToJsonTraits<MockDictionary> { | ||
| static std::string to_json(const MockDictionary&) { | ||
| return std::string{"[to_json not supported for MockDictionary]"}; | ||
| } | ||
| static void dump_or_ignore(const MockDictionary& d, const std::string& fname) { | ||
| d.dump(fname); | ||
| } | ||
| }; | ||
|
|
||
| // ----------------------------------------------------------------------------- | ||
| // DictTraits (Linear Shared Architecture) | ||
| // ----------------------------------------------------------------------------- | ||
| template <> | ||
| struct DictTraits<MockDictionary> { | ||
| static constexpr bool support_checks = false; | ||
|
|
||
| // Starts the timeline | ||
| static std::unique_ptr<MockDictionary> make_from_sample_or_throw(std::string_view name) { | ||
| auto rec = std::make_shared<Reproducer::Recorder>(); | ||
| rec->record_from_sample(std::string(name)); | ||
| return std::make_unique<MockDictionary>(rec); | ||
| } | ||
|
|
||
| // Injects a "flush" operation into the shared timeline | ||
| static std::unique_ptr<MockDictionary> clone_or_throw(const MockDictionary& other) { | ||
| other.recorder->record_clone(); | ||
| return std::make_unique<MockDictionary>(other.recorder); | ||
| } | ||
| }; |
There was a problem hiding this comment.
This PR adds a new debug/recovery dictionary type intended to produce a C reproducer, but there are no tests asserting that recording + dumping works and that the emitted code contains the expected operations (sample creation, set calls, missing, clone barriers). Given the existing dictionary_traits test suite, add basic tests for MockDictionary traits and dump_or_ignore() output.
| static std::string generateStack( const std::exception& e, const std::string& msg, const eckit::CodeLocation& loc ){ | ||
|
|
||
| using metkit::mars2grib::utils::exceptions::printExtendedStack; | ||
| using metkit::mars2grib::utils::exceptions::lineSize; | ||
|
|
||
| const std::string logName = detail::generateEncodingLogFilename(); | ||
|
|
There was a problem hiding this comment.
These changes introduce new observable debugging behavior (per-thread/per-process stack log files and a potential reproducer path), but src/metkit/mars2grib/docs/doxygen/* currently doesn’t describe this API contract. Please update the mars2grib docs to mention: where the stack files are written, naming scheme, and how users should find/consume the reproducer output.
exception-handling improvements
This is a draft PR intended to describe the direction I want to take for the encoder exception mechanism. It is not meant to be a final implementation yet, but rather a concrete proposal for discussion.
The main idea is that exception handling should be treated as part of the API contract.
At the API boundary, I want to make failures explicit by catching all exceptions and rethrowing them with a clear API-level message. At the same stage, I also want to write a file containing the full encoder error stack, so that debugging is easier and the complete failure context is preserved.
Inside the encoder, every function should be consistently guarded with try/catch blocks and use nested exceptions, so that errors can be propagated without losing context as they move up through the call stack.
In addition, at API level, if a CodesHandle set error is caught, I want to attempt a second encoding pass using a different out_dict implementation. The purpose of this alternative dictionary is to record all set operations performed during encoding, so that we can generate a C reproducer for the full encoding sequence.
So, in summary, this draft PR is meant to outline three main changes:
make exceptions an explicit part of the API contract;
enforce systematic guarded propagation with nested exceptions throughout the encoder;
introduce a recovery/debug path for CodesHandle set failures that can produce a minimal C reproducer of the encoding process.
Contributor Declaration
By opening this pull request, I affirm the following: