Skip to content

WIP: improve exception handling and prepare for code dump#196

Draft
MircoValentiniECMWF wants to merge 1 commit intodevelopfrom
feature/mars2grib-debug-improvements
Draft

WIP: improve exception handling and prepare for code dump#196
MircoValentiniECMWF wants to merge 1 commit intodevelopfrom
feature/mars2grib-debug-improvements

Conversation

@MircoValentiniECMWF
Copy link
Copy Markdown
Contributor

exception-handling improvements

This is a draft PR intended to describe the direction I want to take for the encoder exception mechanism. It is not meant to be a final implementation yet, but rather a concrete proposal for discussion.

The main idea is that exception handling should be treated as part of the API contract.

At the API boundary, I want to make failures explicit by catching all exceptions and rethrowing them with a clear API-level message. At the same stage, I also want to write a file containing the full encoder error stack, so that debugging is easier and the complete failure context is preserved.

Inside the encoder, every function should be consistently guarded with try/catch blocks and use nested exceptions, so that errors can be propagated without losing context as they move up through the call stack.

In addition, at API level, if a CodesHandle set error is caught, I want to attempt a second encoding pass using a different out_dict implementation. The purpose of this alternative dictionary is to record all set operations performed during encoding, so that we can generate a C reproducer for the full encoding sequence.

So, in summary, this draft PR is meant to outline three main changes:

make exceptions an explicit part of the API contract;
enforce systematic guarded propagation with nested exceptions throughout the encoder;
introduce a recovery/debug path for CodesHandle set failures that can produce a minimal C reproducer of the encoding process.

Contributor Declaration

By opening this pull request, I affirm the following:

  • All authors agree to the Contributor License Agreement.
  • The code follows the project's coding standards.
  • I have performed self-review and added comments where needed.
  • I have added or updated tests to verify that my changes are effective and functional.
  • I have run all existing tests and confirmed they pass.

@codecov-commenter
Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 3.70370% with 156 lines in your changes missing coverage. Please review.
✅ Project coverage is 61.80%. Comparing base (03c7e0f) to head (277ca41).
⚠️ Report is 28 commits behind head on develop.

Files with missing lines Patch % Lines
src/metkit/mars2grib/api/Mars2Grib.cc 6.25% 90 Missing ⚠️
src/metkit/mars2grib/CoreOperations.h 0.00% 39 Missing ⚠️
src/metkit/mars2grib/utils/mars2gribExceptions.h 0.00% 27 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #196      +/-   ##
===========================================
- Coverage    62.33%   61.80%   -0.54%     
===========================================
  Files          303      303              
  Lines        11601    11704     +103     
  Branches      1036     1037       +1     
===========================================
+ Hits          7232     7234       +2     
- Misses        4369     4470     +101     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This draft PR proposes a new exception-handling strategy for the mars2grib encoder: consistently nesting/wrapping exceptions across layers, writing a structured stack trace to disk at the API boundary, and introducing dictionary-traits support for an Options struct plus a mock “reproducer” dictionary intended to emit a C encoding reproducer.

Changes:

  • Updated exception stack printing to write structured frames into an output file stream (instead of debug logging).
  • Introduced dictionary access traits for metkit::mars2grib::Options and a new MockDictionary intended for reproducer generation.
  • Added API-boundary try/catch blocks that attempt to write a stack log file and rethrow an API-level eckit::Exception.

mars2grib documentation sync (files under src/metkit/mars2grib/**):

  • Impacted files:
    • src/metkit/mars2grib/utils/mars2gribExceptions.h
    • src/metkit/mars2grib/utils/dictionary_traits/dictaccess_options.h
    • src/metkit/mars2grib/utils/dictionary_traits/dictaccess_mock_reproducer.h
    • src/metkit/mars2grib/debug/reproducer.h
    • src/metkit/mars2grib/debug/MockupDictionary.h
    • src/metkit/mars2grib/CoreOperations.h
    • src/metkit/mars2grib/api/Mars2Grib.cc
  • Documentation status in src/metkit/mars2grib/docs/**: ❌ outdated (no mention of the new API-level stack file behavior / naming / location, nor the reproducer dictionary workflow).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
src/metkit/mars2grib/utils/mars2gribExceptions.h Switches structured stack printing from LOG_DEBUG to writing to a file stream; adds a new core exception type.
src/metkit/mars2grib/utils/dictionary_traits/dictaccess_options.h Adds dictionary-traits support for Options (has/get/set/missing/to_json).
src/metkit/mars2grib/utils/dictionary_traits/dictaccess_mock_reproducer.h Adds dictionary-traits adapter for a mock write-only dictionary used to record encoding operations.
src/metkit/mars2grib/debug/reproducer.h Adds the recorder and C code generator for producing reproducers (with optional binary payloads).
src/metkit/mars2grib/debug/MockupDictionary.h Defines MockDictionary wrapper around the recorder to capture “set” operations.
src/metkit/mars2grib/CoreOperations.h Wraps internal failures into Mars2GribCoreException and adds stack-log filename generation + file output helper.
src/metkit/mars2grib/api/Mars2Grib.cc Wraps API boundary with try/catch, writes stack log, rethrows API-level exception.

template <>
struct DictHas<metkit::mars2grib::Options> {
static bool has(const metkit::mars2grib::Options& opt, std::string_view key) noexcept(false) {
return detail::get_options_ptr(opt, key) != nullptr;
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detail::get_options_ptr is declared in metkit::mars2grib::utils::detail, but this file is now in metkit::mars2grib::utils::dict_traits, so detail::... won’t resolve and will fail to compile. Qualify the call (e.g., metkit::mars2grib::utils::detail::get_options_ptr) or move/alias the helper namespace inside dict_traits.

Suggested change
return detail::get_options_ptr(opt, key) != nullptr;
return metkit::mars2grib::utils::detail::get_options_ptr(opt, key) != nullptr;

Copilot uses AI. Check for mistakes.

// Starts the timeline
static std::unique_ptr<MockDictionary> make_from_sample_or_throw(std::string_view name) {
auto rec = std::make_shared<Reproducer::Recorder>();
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reproducer::Recorder is not a known type/namespace here (the recorder is defined as metkit::mars2grib::debug::reproducer::Recorder). This looks like a compile-time error; construct the shared pointer using the correct fully-qualified name (or reuse the Recorder type from MockDictionary).

Suggested change
auto rec = std::make_shared<Reproducer::Recorder>();
auto rec = std::make_shared<metkit::mars2grib::debug::reproducer::Recorder>();

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +5
#include <memory>
#include <vector>

#include "metkit/mars2grib/debug/reproducer.h"
#include "metkit/codes/api/CodesTypes.h"
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This header is missing an include guard/#pragma once, which can lead to ODR/multiple-include issues. Also, it uses std::string but doesn’t include <string> directly (currently relies on transitive includes). Add #pragma once and include <string> explicitly.

Copilot uses AI. Check for mistakes.
if (const auto* me = dynamic_cast<const Mars2GribGenericException*>(&e)) {
me->printFrame(pad);
me->printFrame(out, pad);
}
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mars2GribCoreException::printFrame() is never used because printExtendedStack() only recognizes Mars2GribGenericException via dynamic_cast. Either make Mars2GribCoreException derive from Mars2GribGenericException, or extend printExtendedStack() to also detect Mars2GribCoreException so core-layer frames include file/function/line info.

Suggested change
}
}
else if (const auto* ce = dynamic_cast<const Mars2GribCoreException*>(&e)) {
ce->printFrame(out, pad);
}

Copilot uses AI. Check for mistakes.
catch (const std::exception& nested) {
printExtendedStack(nested, level + 1, frame + 1);
printExtendedStack(nested, out, level + 1, frame + 1);
}
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

printExtendedStack() rethrows nested exceptions but only catches std::exception. If a non-std::exception is nested, the function will throw while printing and can mask the original failure. Consider adding a catch (...) branch (similar to printExceptionStack()) to print an “unknown non-std exception” frame and stop recursion safely.

Suggested change
}
}
catch (...) {
const std::string nestedPad = indent(level + 1);
out << nestedPad << "+ " << std::string(lineSize, '=') << std::endl
<< nestedPad << "+ frame " << frame + 1 << std::endl
<< nestedPad << "+ " << std::string(lineSize, '-') << std::endl
<< nestedPad << "+ message: unknown non-std exception" << std::endl
<< nestedPad << "+ " << std::string(lineSize, '+') << std::endl;
}

Copilot uses AI. Check for mistakes.
Comment on lines +81 to +86
out << "double(h, \"" << key_ << "\", " << std::setprecision(17) << val_ << "), 0);\n";
}
else if constexpr (std::is_same_v<T, std::string>) {
out << "string(h, \"" << key_ << "\", \"" << val_ << "\", NULL), 0);\n";
}
}
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String values are emitted into C code without escaping. If val_ contains quotes, backslashes, newlines, or non-printable bytes, the generated reproducer will be invalid C (and potentially confusing to run). Escape string literals properly (and consider emitting binary-safe payloads for arbitrary strings).

Copilot uses AI. Check for mistakes.
}

void emitC_Logic(std::ostream& out, int /*id*/) const override {
if constexpr (std::is_same_v<T, std::string> || std::is_same_v<T, uint8_t>) return;
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For std::vector<uint8_t> / std::vector<std::string> above BINARY_THRESHOLD, OpLargeArray returns early and emitC_Logic() becomes a no-op, silently dropping recorded operations. At minimum emit a warning comment in the generated C (like OpSmallArray does) or explicitly reject these types so the reproducer can’t be generated in a partial/incorrect state.

Suggested change
if constexpr (std::is_same_v<T, std::string> || std::is_same_v<T, uint8_t>) return;
if constexpr (std::is_same_v<T, std::string> || std::is_same_v<T, uint8_t>) {
out << " /* WARNING: large array for key '" << key_
<< "' could not be emitted by reproducer: unsupported element type */\n";
return;
}

Copilot uses AI. Check for mistakes.
Comment on lines +57 to +71
template <>
struct DictTraits<metkit::mars2grib::Options> {

static constexpr bool support_checks = true;

static std::unique_ptr<metkit::mars2grib::Options> make_from_sample_or_throw(std::string_view /*name*/) {
// Options has no samples, just return default constructed
return std::make_unique<metkit::mars2grib::Options>();
}

static std::unique_ptr<metkit::mars2grib::Options> clone_or_throw(const metkit::mars2grib::Options& h) {
// Struct has implicit copy constructor
return std::make_unique<metkit::mars2grib::Options>(h);
}
};
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New dictionary access behavior for metkit::mars2grib::Options is introduced here (has/get/set/missing/to_json) but there are existing dictionary_traits tests for other dictionary types under tests/mars2grib/utils/dictionary_traits. Add analogous unit tests to cover valid keys, invalid keys, and type conversions for Options to prevent regressions.

Copilot uses AI. Check for mistakes.
Comment on lines +75 to +104
template <>
struct DictToJsonTraits<MockDictionary> {
static std::string to_json(const MockDictionary&) {
return std::string{"[to_json not supported for MockDictionary]"};
}
static void dump_or_ignore(const MockDictionary& d, const std::string& fname) {
d.dump(fname);
}
};

// -----------------------------------------------------------------------------
// DictTraits (Linear Shared Architecture)
// -----------------------------------------------------------------------------
template <>
struct DictTraits<MockDictionary> {
static constexpr bool support_checks = false;

// Starts the timeline
static std::unique_ptr<MockDictionary> make_from_sample_or_throw(std::string_view name) {
auto rec = std::make_shared<Reproducer::Recorder>();
rec->record_from_sample(std::string(name));
return std::make_unique<MockDictionary>(rec);
}

// Injects a "flush" operation into the shared timeline
static std::unique_ptr<MockDictionary> clone_or_throw(const MockDictionary& other) {
other.recorder->record_clone();
return std::make_unique<MockDictionary>(other.recorder);
}
};
Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds a new debug/recovery dictionary type intended to produce a C reproducer, but there are no tests asserting that recording + dumping works and that the emitted code contains the expected operations (sample creation, set calls, missing, clone barriers). Given the existing dictionary_traits test suite, add basic tests for MockDictionary traits and dump_or_ignore() output.

Copilot uses AI. Check for mistakes.
Comment on lines +342 to +348
static std::string generateStack( const std::exception& e, const std::string& msg, const eckit::CodeLocation& loc ){

using metkit::mars2grib::utils::exceptions::printExtendedStack;
using metkit::mars2grib::utils::exceptions::lineSize;

const std::string logName = detail::generateEncodingLogFilename();

Copy link

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes introduce new observable debugging behavior (per-thread/per-process stack log files and a potential reproducer path), but src/metkit/mars2grib/docs/doxygen/* currently doesn’t describe this API contract. Please update the mars2grib docs to mention: where the stack files are written, naming scheme, and how users should find/consume the reproducer output.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants