Skip to content

CLI binary download: always use HTTP file handler instead of IncludeBinary#5887

Open
n0nick wants to merge 5 commits intoviamrobotics:mainfrom
n0nick:sm/cli-download
Open

CLI binary download: always use HTTP file handler instead of IncludeBinary#5887
n0nick wants to merge 5 commits intoviamrobotics:mainfrom
n0nick:sm/cli-download

Conversation

@n0nick
Copy link
Copy Markdown
Member

@n0nick n0nick commented Mar 25, 2026

  • Removes the IncludeBinary=true gRPC path from binary data export entirely. The API service rejects large binary payloads, causing a wasted round-trip before falling back to HTTP download. Now we always fetch metadata only and download files via the HTTP file handler URL.
  • Eliminates redundant metadata fetches in the dataset download path by passing full BinaryData objects through the worker pool instead of re-fetching by ID.
  • Replaces downloadBinary() with downloadSingleBinaryFile() (HTTP-only) and adds downloadBinaryDataFromFilter() parallel worker pool.

This is a redo of #5587 except verified this time :)

Manual verification

  • viam data export binary filter with small files (< gRPC limit)
  • viam data export binary filter with large files (> gRPC limit)
  • viam data export binary ids with specific binary data IDs
  • viam data export binary filter with gzipped files
  • viam dataset export with --only-jsonl flag
  • viam dataset export without --only-jsonl (full download + jsonl)
  • Verify metadata JSON files are written correctly
  • Verify --parallel flag works (try with 1 and with default 100)
  • Verify --timeout flag works

🤖 Generated with Claude Code

…inary

The API service rejects large binary payloads when IncludeBinary=true,
causing the CLI to make a wasted round-trip before falling back to HTTP.
This removes the IncludeBinary path entirely and always downloads binary
files via the HTTP file handler URL from metadata, eliminating the
rejected-request overhead.

Key changes:
- Replace downloadBinary() with downloadSingleBinaryFile() (HTTP-only)
- Extract getMatchingBinaryData() to return full BinaryData with URIs
- Add downloadBinaryDataFromFilter() parallel worker pool
- Refactor dataset download to use metadata already in hand
- Remove isLargeFileError and IncludeBinary codepaths

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Mar 25, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 25, 2026
n0nick and others added 3 commits March 25, 2026 16:55
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…taset JSONL

Clone the BinaryMetadata proto before setting the resolved FileName,
so that downstream consumers (binaryDataToJSONLinesFromMetadata) see
the original metadata and don't double-prepend the timestamp.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Mar 25, 2026
@n0nick n0nick requested a review from gloriacai01 March 27, 2026 13:40
@n0nick n0nick marked this pull request as ready for review March 27, 2026 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

safe to test This pull request is marked safe to test from a trusted zone

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants