Skip to content

deps: add ata JSON Schema validator for node.config.json#62603

Open
mertcanaltin wants to merge 5 commits intonodejs:mainfrom
mertcanaltin:feat/json-schema-validation
Open

deps: add ata JSON Schema validator for node.config.json#62603
mertcanaltin wants to merge 5 commits intonodejs:mainfrom
mertcanaltin:feat/json-schema-validation

Conversation

@mertcanaltin
Copy link
Copy Markdown
Member

@mertcanaltin mertcanaltin commented Apr 5, 2026

Adds ata-validator (v0.8.0) to validate node.config.json against its JSON Schema before parsing.

Clear error messages for invalid config instead of generic "Invalid value" errors.

Example output:

Invalid configuration in node.config.json: /nodeOptions/import: expected exactly one oneOf match, got 0

What's included:

  • deps/ata/ with ata-validator built using ATA_NO_RE2 (no extra dependencies, only simdjson which is already in core)
  • Schema validation step in node_config_file.cc before existing parsing
  • process.versions.ata

Spec compliance:

  • 94.2% Draft 2020-12 test suite (1156/1227)
  • 95.3% @exodus/schemasafe test suite
  • $dynamicRef / $anchor: 42/42 (100%)

Security:

  • Recursion depth guard for circular $ref
  • __proto__ safe const/enum comparison
  • Date format range validation
  • OSS-Fuzz submitted, 283/283 JSONTestSuite passing

refs: #62598

@nodejs-github-bot
Copy link
Copy Markdown
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/security-wg

@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Apr 5, 2026
@mertcanaltin mertcanaltin marked this pull request as draft April 5, 2026 20:03
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 5, 2026

Codecov Report

❌ Patch coverage is 80.00000% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.80%. Comparing base (726b220) to head (1bf468e).
⚠️ Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
src/node_config_file.cc 78.94% 0 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #62603      +/-   ##
==========================================
- Coverage   89.81%   89.80%   -0.01%     
==========================================
  Files         699      699              
  Lines      216331   216436     +105     
  Branches    41355    41389      +34     
==========================================
+ Hits       194294   194367      +73     
+ Misses      14156    14150       -6     
- Partials     7881     7919      +38     
Files with missing lines Coverage Δ
src/node_metadata.cc 92.85% <100.00%> (+0.10%) ⬆️
src/node_config_file.cc 77.62% <78.94%> (-3.88%) ⬇️

... and 54 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@juanarbol
Copy link
Copy Markdown
Member

IMHO this is a bit too much, an extra dep for a schema parser inside Node.js (not to mention that is not even v1.x)

This is not a blocker, this is just my personal opinion

@mertcanaltin
Copy link
Copy Markdown
Member Author

That's a fair concern. The dependency is intentionally minimal, it only uses simdjson which is already in core (no new external dependencies since it's built with ATA_NO_RE2).

On the version, you're right it's pre-1.0, but it already passes 96.9% of the official JSON Schema Test Suite. Happy to work toward 1.0 if that's a requirement.

@TheOneTheOnlyJJ
Copy link
Copy Markdown
Contributor

Maybe ata could also be exposed to JS userland as an API if it ends up being vendored in?

@mertcanaltin
Copy link
Copy Markdown
Member Author

Yes, that's the plan! The initial PR focuses on node.config.json validation as a starting point, but exposing it as a public API is the natural next step. We discussed this in #62598, Joyee mentioned it would be the strongest use case.

The main prerequisite is getting the security story solid first, we've already submitted to OSS-Fuzz and all 283 nst/JSONTestSuite tests are passing.

@ChALkeR
Copy link
Copy Markdown
Member

ChALkeR commented Apr 7, 2026

@mertcanaltin check also https://www.npmjs.com/package/@exodus/schemasafe testsuite

@mertcanaltin
Copy link
Copy Markdown
Member Author

Thanks @ChALkeR, I ran the schemasafe test suite against ata (draft2020-12):

95.3% pass rate, 1103/1157, 59 skipped (remote refs).

Main gaps are unevaluatedProperties/Items annotation tracking, some $ref URI edge cases (URN, absolute-path), grapheme counting in minLength/maxLength, and a few optional format semantics.

Running the suite also helped me catch a few things I fixed since: recursion depth guard for circular $ref, proto handling in const/enum, and date format range checks.

$dynamicRef is fully passing now (42/42). Working on the rest.

@mertcanaltin mertcanaltin force-pushed the feat/json-schema-validation branch from edc0176 to 85e4bca Compare April 10, 2026 20:26
@mertcanaltin mertcanaltin marked this pull request as ready for review April 10, 2026 20:28
@Qard
Copy link
Copy Markdown
Member

Qard commented Apr 11, 2026

CI says:

../../src/node_config_file.cc:2:10: fatal error: 'ata.h' file not found
    2 | #include "ata.h"
      |          ^~~~~~~

Copy link
Copy Markdown
Member

@Qard Qard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it is a non-trivial amount of C++, it is all neatly contained in the external ata dependency where we can trivially iterate on any correctness requirements if there's any of the remaining spec features folks think are essential. The actual integration surface area is extremely minimal and straightforward.

I think it looks great. Excellent work! 👏🏻

@mertcanaltin mertcanaltin force-pushed the feat/json-schema-validation branch 3 times, most recently from 33202b7 to 32ba0e9 Compare April 11, 2026 06:33
@mertcanaltin mertcanaltin requested a review from Qard April 11, 2026 06:34
@mertcanaltin
Copy link
Copy Markdown
Member Author

thanks for review @Qard, I some conflicts solved, if have available time, can you review again ?

Add ata (v0.8.0) as a bundled dependency for JSON Schema Draft
2020-12 validation. Config files are now validated against a
generated schema before option parsing, providing clear type
error messages to users.

- deps: add ata v0.8.0 JSON Schema validator
- src: validate config after simdjson parse, before option loop
- build: add shared library support and nix fileset for ata
- test: add schema validation and process.versions.ata tests
@mertcanaltin mertcanaltin force-pushed the feat/json-schema-validation branch from 32ba0e9 to 5bdb5f5 Compare April 11, 2026 07:51
Copy link
Copy Markdown
Member

@lemire lemire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me but I recommending adding more tests. Currently, the builtin tests are somewhat mininal.

@mertcanaltin
Copy link
Copy Markdown
Member Author

Thanks @lemire, added more test coverage, type mismatches for boolean, number, array, and nested array items, plus positive tests for valid configs and empty objects.

Copy link
Copy Markdown
Member

@avivkeller avivkeller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add an update script?

@mertcanaltin mertcanaltin force-pushed the feat/json-schema-validation branch from d72a1af to 1bf468e Compare April 12, 2026 05:29
@mertcanaltin
Copy link
Copy Markdown
Member Author

Can you add an update script?

I added thanks @avivkeller

@ChALkeR
Copy link
Copy Markdown
Member

ChALkeR commented Apr 12, 2026

@mertcanaltin could it notice mistakes like #61595?

Adds compile-time warnings for misplaced keywords (e.g. minItems
inside a string-typed schema). Addresses review feedback about
detecting schema mistakes like nodejs#61595.
@mertcanaltin
Copy link
Copy Markdown
Member Author

@mertcanaltin could it notice mistakes like #61595?

Yes, updated ata to v0.9.0 which now warns during compile() when keywords are placed at incompatible type levels. The exact case from #61595 (minItems inside a string-typed schema) is detected and reported as a compile warning.

@avivkeller
Copy link
Copy Markdown
Member

Can you add an update script?

I added thanks @avivkeller

Remember to update the GitHub workflow to call your added script

@mertcanaltin
Copy link
Copy Markdown
Member Author

mertcanaltin commented Apr 12, 2026

Can you add an update script?

I added thanks @avivkeller

Remember to update the GitHub workflow to call your added script

Thanks for helps, I added GitHub workflow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants