Document read_auto operator #366
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4d7d284ceb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Add the reference entry for automatic reader detection, including strict detection behavior, fallback modes, probe limits, and examples. Assisted-by: GPT-5 (pi)
State that fallback=all chooses text or binary mode from the current probe bytes, not from the entire stream. Point users with binary payloads that start with a UTF-8 prefix to a larger probe or direct read_all binary mode. Assisted-by: GPT-5 (pi)
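The probe-bound choice described above can be illustrated with a small Python sketch. It is not Tenzir's implementation: the function name `choose_fallback_mode` and the use of UTF-8 validity as the text test are assumptions made for illustration only. It shows why a binary payload that starts with a UTF-8 prefix is classified as text when the probe is too short to reach the binary bytes.

```python
import codecs


def choose_fallback_mode(probe: bytes) -> str:
    """Hypothetical helper: classify the probe bytes only, not the whole stream."""
    # An incremental decoder tolerates a multi-byte character cut off at the
    # end of the probe, but still rejects bytes that can never be valid UTF-8.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(probe, final=False)
    except UnicodeDecodeError:
        return "binary"
    return "text"


# A binary payload that happens to start with a readable UTF-8 header.
payload = b"HEADER v1\n" + bytes([0xFF, 0xFE, 0x00, 0x01])

small_probe = payload[:8]   # sees only the UTF-8 prefix
large_probe = payload       # reaches the non-UTF-8 bytes

print(choose_fallback_mode(small_probe))
print(choose_fallback_mode(large_probe))
```

In this sketch the short probe yields text mode and the longer probe yields binary mode, which matches the advice to raise the probe size or read such payloads in binary mode directly.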
Replace the invalid load snippets with from_file subpipelines, matching the documented file-reading syntax for parsing byte streams. Assisted-by: GPT-5 (pi)
Document the default probe limit as 1Mi to match the TQL spelling users can configure. Assisted-by: GPT-5 (pi)
Add guide examples for rapid prototyping, mixed file drops, and TCP intake endpoints that accept several input formats. Expand the operator reference with guidance about when to choose automatic detection versus a concrete reader. Assisted-by: ChatGPT (pi)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f5c39acde6
Document that fallback selection waits until the probe is final. This makes the long-lived stream behavior explicit and points users to lower probe limits or concrete readers when they need immediate text parsing. Assisted-by: GPT-5 (Codex)
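A rough Python sketch of the "probe is final" condition follows. The `Probe` class, its `limit` parameter, and its methods are hypothetical names, not part of Tenzir; the sketch only assumes that a probe becomes final once it reaches its byte limit or the stream ends.

```python
class Probe:
    """Hypothetical probe buffer that becomes final at its limit or at end of stream."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.buffer = bytearray()
        self.final = False

    def feed(self, chunk: bytes) -> None:
        # Keep only as many bytes as still fit under the limit.
        room = self.limit - len(self.buffer)
        self.buffer += chunk[:room]
        if len(self.buffer) >= self.limit:
            self.final = True

    def end_of_stream(self) -> None:
        # A finite input finalizes the probe even if the limit was not reached.
        self.final = True


# A long-lived stream that trickles small chunks keeps a large probe open.
large = Probe(limit=1024 * 1024)
large.feed(b"hello\n")

# A small limit finalizes after very little data.
small = Probe(limit=16)
small.feed(b"hello\n")
small.feed(b"more text here!!\n")

print(large.final, small.final)
```

Under these assumptions, a slow stream with a large limit can delay fallback selection for a long time, which is why a lower probe limit or a concrete reader is the suggested remedy.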
Describe the two detection layers in the description: capability via dry runs of the actual parsers, and a specificity order that picks the most precise format among capable readers. Document that SSV and PRI-less Syslog never auto-detect, and that output keeps the selected reader's schema name. Assisted-by: Fable 5 (Claude Code)
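The two layers can be sketched in Python as follows. This is an illustration under stated assumptions, not the operator's code: the reader names, the dry-run heuristics, and the specificity order in `SPECIFICITY` are all hypothetical. The sketch first collects every reader whose dry run accepts the probe, then picks the most specific one, and only uses a fallback when it is given explicitly.

```python
import json
from typing import Optional


def parses_as_json(probe: bytes) -> bool:
    """Dry run: does the probe parse as a single JSON object?"""
    try:
        return isinstance(json.loads(probe), dict)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return False


def looks_like_csv(probe: bytes) -> bool:
    """Dry run: do all non-empty lines share the same non-zero comma count?"""
    try:
        lines = [line for line in probe.decode("utf-8").splitlines() if line]
    except UnicodeDecodeError:
        return False
    counts = {line.count(",") for line in lines}
    return len(lines) > 0 and len(counts) == 1 and counts.pop() >= 1


# Hypothetical specificity order: most precise reader first.
SPECIFICITY = [("json", parses_as_json), ("csv", looks_like_csv)]


def select_reader(probe: bytes, fallback: Optional[str] = None) -> str:
    # Layer 1: capability, determined by dry runs.
    capable = [name for name, dry_run in SPECIFICITY if dry_run(probe)]
    # Layer 2: specificity, the first capable reader in the order wins.
    if capable:
        return capable[0]
    # Strict behavior: unknown input needs an explicit fallback.
    if fallback in ("lines", "all"):
        return fallback
    raise ValueError("no reader detected and no fallback given")


print(select_reader(b'{"a": 1, "b": 2}'))
print(select_reader(b"a,b\n1,2\n"))
print(select_reader(b"hello world", fallback="lines"))
```

The first input passes both dry runs in this sketch, so the specificity order is what decides in favor of the JSON reader.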
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
Assisted-by: Fable 5 (Claude Code)
Document that YAML auto-detection requires a map document that read_yaml would turn into an event. Assisted-by: GPT-5 Codex (OpenAI Codex)
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f9f322198c
Set a smaller probe limit in the TCP read_auto example and point known long-lived plain-text streams to read_lines directly. Assisted-by: GPT-5 Codex (OpenAI Codex)
## 🔍 Problem

- Users currently need to choose a concrete reader up front, even when the input format is obvious from the first bytes.
- Generic readers such as `read_lines` and `read_all` are too weak to be safe defaults for automatic parsing.

## 🛠️ Solution

- Add `read_auto` as a strict detector-driven reader selector for chunk input.
- Add detector variants for the first supported JSON, text-line, delimited, and magic-byte formats.
- Require an explicit `fallback="lines"` or `fallback="all"` for unknown input.
- Add a read-detection extension point for reader plugins.
- Add a changelog entry and focused integration coverage.

## 💬 Review

- Focus on detector precedence and ambiguity behavior, especially JSON object vs NDJSON and explicit fallbacks.
- Verified with `scripts/build.sh tenzir-unit-test` and `uvx tenzir-test --root test --match read_auto`.

<sub> 📚 Docs PR: tenzir/docs#366 </sub>
🔍 Problem
- `read_auto` adds user-facing automatic reader detection, but docs.tenzir.com has no operator reference for it.

🛠️ Solution
- Add a `read_auto` reference page with strict detection behavior, fallbacks, probe limits, and examples.
- Add `read_auto` to the operator reference index.

💬 Review
- Check the fallback semantics and supported format list against "Add automatic reader detection" (tenzir#6191).

⚙️ Code PR: tenzir/tenzir#6191