The v3.0 workloads {flux|dlrm|retinanet}_{b200|mi355} are currently missing S3 storage parameters. We would need to create new {flux|dlrm|retinanet}_{b200|mi355} workload YAML files for the currently supported S3 libraries {s3dlio|minio|s3torch} and add the following fields (similar to unet3d):
storage:
  storage_type: s3
  storage_root: <s3-bucket-name>
  storage_library: <{s3dlio|minio|s3torch}>
  storage_options:
    endpoint_url: <S3_URL>
    region: us-east-1
    s3_force_path_style: true
checkpoint:
  checkpoint_folder: s3://<s3-bucket-name>/s3dlio/llama3-8b
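Since the same fields repeat across 3 workloads x 2 accelerators x 3 libraries, the new files could be stamped out by a small script rather than written by hand. The following is only a sketch under assumptions: the file naming scheme `<workload>_<accelerator>_<library>.yaml`, the `<library>/<workload>` checkpoint prefix, and the helper names `s3_overrides`, `to_yaml`, and `planned_files` are all hypothetical and not taken from the repository. It renders only the S3-related fields listed above, which would still need to be merged into the existing workload definitions.

```python
from itertools import product

WORKLOADS = ("flux", "dlrm", "retinanet")
ACCELERATORS = ("b200", "mi355")
LIBRARIES = ("s3dlio", "minio", "s3torch")


def s3_overrides(bucket, library, endpoint_url, checkpoint_prefix):
    """Build the S3 storage and checkpoint fields for one workload file."""
    return {
        "storage": {
            "storage_type": "s3",
            "storage_root": bucket,
            "storage_library": library,
            "storage_options": {
                "endpoint_url": endpoint_url,
                "region": "us-east-1",
                "s3_force_path_style": True,
            },
        },
        "checkpoint": {
            # Assumption: checkpoints live under <library>/<workload> in the bucket.
            "checkpoint_folder": f"s3://{bucket}/{checkpoint_prefix}",
        },
    }


def to_yaml(mapping, indent=0):
    """Render a nested dict of plain scalars as YAML (no quoting or lists)."""
    pad = "  " * indent
    lines = []
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(to_yaml(value, indent + 1))
        else:
            if isinstance(value, bool):
                value = "true" if value else "false"
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


def planned_files(bucket, endpoint_url):
    """Map each hypothetical file name to its rendered S3 fields."""
    files = {}
    for workload, accel, library in product(WORKLOADS, ACCELERATORS, LIBRARIES):
        name = f"{workload}_{accel}_{library}.yaml"  # hypothetical naming scheme
        prefix = f"{library}/{workload}"
        files[name] = to_yaml(s3_overrides(bucket, library, endpoint_url, prefix))
    return files


if __name__ == "__main__":
    # Placeholder bucket and endpoint; substitute real values before use.
    files = planned_files("my-bucket", "http://localhost:9000")
    print(f"{len(files)} files planned")
    print(files["flux_b200_s3dlio.yaml"])
```

The tiny `to_yaml` helper keeps the sketch dependency-free; it does not quote values, so a real implementation would more likely load the existing workload file with a YAML library and merge these fields into it.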
FYI: @russfellows