Add S3 storage and multi-hop transfer tutorial section #766

Open
alessio94 wants to merge 3 commits into rucio:main from alessio94:alessio94-s3-tutorial-v3

@alessio94

This PR supersedes #744 and #739.

Addresses feedback from PR #744:

  • Remove Jupyter notebook
  • Inline script content directly as code blocks
  • Replace inline script comments with prose descriptions
  • Remove docker exec wrapping; assume reader is in a Rucio admin environment
  • Narrow scope to S3 RSE setup only; remove environment initialization steps
  • Thanks for the feedback, @voetberg

alessio94 and others added 3 commits March 23, 2026 14:13

This tutorial covers how to register S3-compatible storage (MinIO) as Rucio Storage Elements (RSEs), configure credentials for both Rucio and FTS, and set up RSE distances to enable multi-hop transfers between S3 and XRootD endpoints.

The examples use a Docker Compose playground environment with two MinIO instances (MINIO1, MINIO2) and three XRootD servers (XRD1, XRD2, XRD3). The commands assume you are already operating within a Rucio admin environment with the `rucio` and `rucio-admin` CLI tools available.
Contributor

For clarity, `rucio` and `rucio-admin` have been merged since Rucio 38; the admin role is now handled by the permission policies. I'd change this to:

Suggested change
The examples use a Docker Compose playground environment with two MinIO instances (MINIO1, MINIO2) and three XRootD servers (XRD1, XRD2, XRD3). The commands assume you are already operating within a Rucio admin environment with the `rucio` and `rucio-admin` CLI tools available.
The examples use a Docker Compose playground environment with two MinIO instances (MINIO1, MINIO2) and three XRootD servers (XRD1, XRD2, XRD3). The commands assume you already have a Rucio instance with an admin account.


Register both MinIO instances as RSEs with S3 protocol configuration. The `gfal.NoRename` implementation is used because S3 does not support server-side rename operations.
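
Because both MinIO RSEs take identical settings, the per-RSE commands can be wrapped in a loop. The following is a minimal sketch that uses only the `rucio rse attribute add` invocation shown in full in this diff. The `run` and `tag_minio_rses` helpers and the `DRY_RUN` switch are hypothetical names introduced here for illustration; they are not part of Rucio.

```bash
# Hypothetical helper: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: set sign_url=s3 on every RSE name passed as an argument.
tag_minio_rses() {
    for rse in "$@"; do
        run rucio rse attribute add "$rse" --key sign_url --value s3
    done
}
```

Calling `DRY_RUN=1 tag_minio_rses MINIO1 MINIO2` prints the two commands without contacting the server, which is a cheap way to review them first. Without `DRY_RUN=1` the commands are executed. The protocol registration could presumably be looped the same way once its full argument list is known.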

```bash
Contributor

This can be reduced to a `for` loop over MINIO1 and MINIO2, which would make it easier to read.

--prefix /rucio/ \
--impl rucio.rse.protocols.gfal.NoRename \
--domain-json '{"lan": {"read": 1, "write": 1, "delete": 1}, "wan": {"read": 1, "write": 1, "delete": 1, "third_party_copy_read": 1, "third_party_copy_write": 1}}'
rucio rse attribute add MINIO1 --key sign_url --value s3
Contributor

These could use some inline comments to explain each of the options, or at least a link to the config params page in the section description

```bash
ID1=$(rucio rse show MINIO1 | grep '^ id:' | awk '{print$2}')
ID2=$(rucio rse show MINIO2 | grep '^ id:' | awk '{print$2}')
cat >/opt/rucio/etc/rse-accounts.cfg <<JSON
Contributor

This doesn't render as anything special. I'd just use the `console` tag instead and show:

$ cat /opt/rucio/etc/rse-accounts.cfg
{
  "$ID1": {
    "access_key": "admin",
    "secret_key": "password",
    "signature_version": "s3v4",
    "region": "us-east-1"
  },
  "$ID2": {
    "access_key": "admin",
    "secret_key": "password",
    "signature_version": "s3v4",
    "region": "us-east-1"
  }
}
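
One way to produce a file of this shape is a here-document whose delimiter is left unquoted, so that the shell substitutes the RSE ids before writing. This is a sketch only: `write_rse_accounts` is a hypothetical helper, and the credentials are the playground values shown above, not values to reuse elsewhere.

```bash
# Hypothetical helper: write S3 credentials for two RSEs.
# $1 = output file, $2 and $3 = RSE ids (for example as reported by `rucio rse show`).
write_rse_accounts() {
    # The delimiter is unquoted (<<JSON, not <<'JSON'), so $2 and $3 are expanded.
    cat >"$1" <<JSON
{
  "$2": {
    "access_key": "admin",
    "secret_key": "password",
    "signature_version": "s3v4",
    "region": "us-east-1"
  },
  "$3": {
    "access_key": "admin",
    "secret_key": "password",
    "signature_version": "s3v4",
    "region": "us-east-1"
  }
}
JSON
}
```

With `ID1` and `ID2` set as in the diff, `write_rse_accounts /opt/rucio/etc/rse-accounts.cfg "$ID1" "$ID2"` would write the file. If Python is available, `python3 -m json.tool /opt/rucio/etc/rse-accounts.cfg` is a quick check that the result parses as JSON.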


### Configuring RSE Distances for Multi-Hop

RSE distances tell Rucio which transfer paths are available and their relative cost. Setting a distance of 1 between MINIO RSEs and XRD3 establishes the multi-hop path: transfers from MinIO to XRD1 or XRD2 will route through XRD3 as an intermediate.
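
To make the routing concrete, here is a toy model of this topology. It is not Rucio's path-finding code. It assumes that only four links exist (each MinIO RSE to XRD3, and XRD3 to XRD1 and XRD2), that each has distance 1, and that a route's cost is the sum of its hop distances. `EDGES` and `route` are hypothetical names, and the search is limited to one intermediate hop.

```bash
# Assumed link table: source, destination, distance.
EDGES='MINIO1 XRD3 1
MINIO2 XRD3 1
XRD3 XRD1 1
XRD3 XRD2 1'

# Print the cheapest direct or one-intermediate route from $1 to $2.
route() {
    printf '%s\n' "$EDGES" | awk -v src="$1" -v dst="$2" '
        { dist[$1, $2] = $3 + 0; nodes[$1]; nodes[$2] }
        END {
            best = -1
            # Direct link, if one is defined.
            if ((src, dst) in dist) {
                best = dist[src, dst]
                path = src " -> " dst
            }
            # Any two-hop route that is cheaper than what we have so far.
            for (mid in nodes) {
                if (((src, mid) in dist) && ((mid, dst) in dist)) {
                    cost = dist[src, mid] + dist[mid, dst]
                    if (best < 0 || cost < best) {
                        best = cost
                        path = src " -> " mid " -> " dst
                    }
                }
            }
            if (best < 0) {
                print "no route from " src " to " dst
            } else {
                print path " (cost " best ")"
            }
        }'
}
```

With these edges, `route MINIO1 XRD1` prints `MINIO1 -> XRD3 -> XRD1 (cost 2)`: no direct MINIO1 to XRD1 link is defined, so the only available path goes through XRD3.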
Contributor

This explanation is technically correct, but a little overly verbose. I recommend putting in a mermaid chart instead to make this more visual.

Suggested change
RSE distances tell Rucio which transfer paths are available and their relative cost. Setting a distance of 1 between MINIO RSEs and XRD3 establishes the multi-hop path: transfers from MinIO to XRD1 or XRD2 will route through XRD3 as an intermediate.
RSE distances establish transfer paths between RSEs. Setting a distance of 1 between the source and an intermediate ensures that the route through the intermediate is always preferred over a longer direct transfer.
```mermaid
graph TD
MINIO[MINIO RSE]
XRD1[XRD1 RSE]
XRD2[XRD2 RSE]
XRD3[XRD3 RSE]
MINIO -.->|distance=1| XRD3
XRD3 -.->|distance=1| XRD1
XRD3 -.->|distance=1| XRD2
```

@voetberg
Contributor

voetberg commented May 1, 2026

Hi @alessio94, have you gotten a chance to look at these comments?

@alessio94
Author

Hi @voetberg, thanks for the follow-up and apologies for the slow response.

I'll address the remaining comments, but I want to be transparent: this PR is now at its third or fourth iteration and I'm finding it difficult to prioritize given that my ATLAS qualification task effectively concluded weeks ago. I'd appreciate if you could consolidate any remaining feedback in one pass once I push the next update, so we can move toward a final review without further back-and-forth.

A couple of questions before I start: for the suggested changes you left inline, should I simply accept them as-is, or are there parts you'd like me to rework independently? Also, given how many commits have landed on main since this branch was created, would it make more sense to open a fresh PR on a rebased branch rather than continuing here?

I'll aim to push the changes within the next few days.

alessio94 added a commit to alessio94/documentation-rucio that referenced this pull request May 13, 2026
Added detailed instructions for configuring S3 storage and multi-hop transfers in Rucio, including setting up MinIO instances, registering RSEs, and verifying the setup.

This PR supersedes rucio#744, rucio#739, and rucio#766.

Addresses feedback from PR rucio#766.

Thanks for the review, @voetberg. I have implemented the requested documentation changes:

- Updated the introduction to reflect that `rucio` and `rucio-admin` have been merged since Rucio 38, and that admin access is now handled through permission policies.
- Reduced the duplicated MinIO RSE registration commands to a loop over `MINIO1` and `MINIO2`.
- Added explanatory comments for the RSE attributes used in the S3 protocol configuration.
- Reworked the `rse-accounts.cfg` section to separate the commands used to generate the file from the rendered configuration output, using a `console` block for the inspected file content.
- Simplified the explanation of RSE distances and added a Mermaid diagram to make the multi-hop topology clearer.
- Removed the Jupyter notebook material from the documentation, keeping the demo setup focused on the operator workflow.
@voetberg
Contributor

> I'll address the remaining comments, but I want to be transparent: this PR is now at its third or fourth iteration and I'm finding it difficult to prioritize given that my ATLAS qualification task effectively concluded weeks ago. I'd appreciate if you could consolidate any remaining feedback in one pass once I push the next update, so we can move toward a final review without further back-and-forth.

I will do my best on this, but if you make changes that need additional comments after this, I cannot make promises.

> A couple of questions before I start: for the suggested changes you left inline, should I simply accept them as-is, or are there parts you'd like me to rework independently?

It makes more sense to modify this existing PR with the suggested changes; you can simply use `git commit --amend` to add those changes.

> Also, given how many commits have landed on main since this branch was created, would it make more sense to open a fresh PR on a rebased branch rather than continuing here?

All the content on this page is new, so it shouldn't have to be rebased. The PR doesn't show any conflicts so making a new PR or rebasing won't do anything.
