
[WIP] A Dataloader for Webdataset #26

Open
dead-water wants to merge 4 commits into main from webdataset-dataloader

Conversation

@dead-water

Member

The earlier WebDataset generation work assumed an index for looking up sharded tar files. The decision was made to make this friendlier by compressing the data per day, so that shuffling can be done at day granularity.

  • Making Webdatasets code
  • Ingestion code
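
As a rough illustration of the per-day layout described above, here is a minimal sketch using only the Python standard library. It is not the code in this PR: the function names, the `YYYY-MM-DD.tar` naming scheme, the `.bin` member extension, and the use of uncompressed tar are all assumptions made for the example. It writes one tar archive per calendar day and then reads the archives back in a shuffled day order, while samples inside a day stay in time order.

```python
import io
import random
import tarfile
from collections import defaultdict
from pathlib import Path


def write_daily_shards(samples, out_dir):
    """Group (timestamp, payload_bytes) pairs by day and write one tar per day.

    Hypothetical helper: the shard naming and member naming are assumptions.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Bucket samples by calendar day.
    by_day = defaultdict(list)
    for ts, payload in samples:
        by_day[ts.strftime("%Y-%m-%d")].append((ts, payload))

    paths = []
    for day, items in sorted(by_day.items()):
        path = out_dir / f"{day}.tar"  # assumed naming scheme, one shard per day
        with tarfile.open(path, "w") as tar:
            # Keep samples within a day in chronological order.
            for ts, payload in sorted(items):
                info = tarfile.TarInfo(name=f"{ts.strftime('%Y%m%dT%H%M%S')}.bin")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
        paths.append(path)
    return paths


def iter_shuffled_by_day(shard_paths, seed=0):
    """Yield (day, member_name, payload) with the day order shuffled.

    No index is needed: the shard file name alone identifies the day.
    """
    order = list(shard_paths)
    random.Random(seed).shuffle(order)  # shuffle whole days, not single samples
    for path in order:
        with tarfile.open(path, "r") as tar:
            for member in tar:
                yield path.stem, member.name, tar.extractfile(member).read()
```

Because each shard holds exactly one day, shuffling the list of shard paths is enough to shuffle at day granularity. A finer shuffle within a day would need an extra in-memory buffer, which this sketch leaves out.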

@dead-water dead-water added this to the Marshall Meeting milestone Aug 6, 2024
@dead-water dead-water self-assigned this Aug 6, 2024
@dead-water dead-water linked an issue Aug 6, 2024 that may be closed by this pull request
Base automatically changed from vertexai to virtual-eve-camera-ready May 22, 2026 13:29
Base automatically changed from virtual-eve-camera-ready to main May 22, 2026 13:30


Development

Successfully merging this pull request may close these issues.

Extend Webdataset creation to Live SDOML

2 participants