Replace CAS with CS and CLS #347

Draft
meroton-benjamin wants to merge 2 commits into buildbarn:main from meroton:cdc-support-step-2

@meroton-benjamin

Contributor

This commit builds on top of our split and splice blob support to make
it a mandatory first-class feature in Buildbarn. With this commit, the
Content Addressable Storage (CAS) is built from two storage
configurations that work in tandem: a Chunk Storage (CS), which is
content addressed and contains chunks of blobs, and a Chunk List Storage
(CLS), which is addressed by a blob digest and contains a manifest
describing the chunks that make up the blob.

All API calls are automatically translated to use chunk lists created
with RepMaxCDC. Effectively, this means that large blobs no longer exist
in the storage layer. Individual chunks of the large blobs are in turn
deduplicated, so that each chunk is stored only once.

The automatic translation ensures that clients that are not CDC aware
can continue to use the storage backend without any changes. Clients
that support RepMaxCDC also get a significant reduction in the amount
of data to transfer, as they only need to transfer modified chunks
rather than the entire blob.
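
The relationship between the CS and the CLS can be illustrated with a minimal in-memory sketch. It is not the code in this PR: the `Store` type, its methods and `digestOf` are hypothetical names, and fixed-size cutting is used as a stand-in for RepMaxCDC to keep the example short. It only shows that a blob is stored as a manifest plus chunks, and that chunks shared between blobs are stored once.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestOf returns the hex-encoded SHA-256 of data. It stands in for a
// real REv2 digest, which also carries the size of the object.
func digestOf(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// Store is a hypothetical in-memory model of the CS/CLS pair.
type Store struct {
	chunks     map[string][]byte   // CS: chunk digest -> chunk contents
	chunkLists map[string][]string // CLS: blob digest -> ordered chunk digests
}

// NewStore returns an empty Store.
func NewStore() *Store {
	return &Store{
		chunks:     map[string][]byte{},
		chunkLists: map[string][]string{},
	}
}

// Put cuts blob into chunks of at most chunkSize bytes (chunkSize must
// be positive), stores every distinct chunk once in the CS, and records
// the manifest in the CLS under the digest of the whole blob.
// Fixed-size cutting is an assumption made for brevity; it is not RepMaxCDC.
func (s *Store) Put(blob []byte, chunkSize int) string {
	list := make([]string, 0, len(blob)/chunkSize+1)
	for start := 0; start < len(blob); start += chunkSize {
		end := start + chunkSize
		if end > len(blob) {
			end = len(blob)
		}
		chunk := blob[start:end]
		d := digestOf(chunk)
		if _, ok := s.chunks[d]; !ok {
			s.chunks[d] = append([]byte(nil), chunk...)
		}
		list = append(list, d)
	}
	blobDigest := digestOf(blob)
	s.chunkLists[blobDigest] = list
	return blobDigest
}

// Get reassembles a blob by following its chunk list. It reports false
// if the chunk list or any referenced chunk is missing.
func (s *Store) Get(blobDigest string) ([]byte, bool) {
	list, ok := s.chunkLists[blobDigest]
	if !ok {
		return nil, false
	}
	var out bytes.Buffer
	for _, d := range list {
		chunk, ok := s.chunks[d]
		if !ok {
			return nil, false
		}
		out.Write(chunk)
	}
	return out.Bytes(), true
}

func main() {
	s := NewStore()
	a := s.Put([]byte("aaaabbbbcccc"), 4)
	s.Put([]byte("aaaabbbbdddd"), 4)

	blob, ok := s.Get(a)
	fmt.Println(string(blob), ok)
	fmt.Println("distinct chunks stored:", len(s.chunks))
}
```

The two blobs in `main` share their first two chunks, so the model CS ends up holding four chunks for six chunk references. With content-defined chunking the same effect also survives insertions and deletions in the middle of a blob, which fixed-size cutting does not handle.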

This commit adds support for the SplitBlob and SpliceBlob methods from
the Remote Execution v2 (REv2) API. SplitBlob and SpliceBlob can be used
to facilitate uploads and downloads of large files, but a naïve
implementation like this one also has some major drawbacks.

The blobs must exist in both their chunked and non-chunked form,
which may significantly increase storage requirements for large blobs.
The protocol gives no guarantee that a large blob stored in the CAS
exists in its chunked form, which forces you to perform a fairly heavy
Split call that loads the entire large blob in order to decompose it
into its chunks.

This implementation mostly exists as a stepping stone for a different
implementation where Buildbarn internally manages all blobs as chunked
blobs.
Comment thread cmd/bb_storage/main.go

var parameterCache *cdc.TTLCache[cdc.Parameters]
if casConfiguration.ContentDefinedChunkingParameterCache != nil {
parameterCacheConfiguraiton := casConfiguration.ContentDefinedChunkingParameterCache

Member

Typo: Configuraiton

Comment thread cmd/bb_storage/main.go
if err != nil {
return err
}
parameterCache = cdc.NewTTLCache[cdc.Parameters](

Member

So this is a cache that maps instance names to CDC parameters, right? I don't think it's correct to create such a cache globally. If you use DemultiplexingBlobAccess, you can rewrite instance names. Wouldn't that lead to potential collisions?

Also, in the case of centralized storage nodes I don't think it makes sense to have any caching of CDC parameters. There's nothing to cache, as the storage node would just have a single configuration globally. It's only the gRPC client backend that needs a cache.

Maybe better to just extend BlobAccess to have a new GetCDCParameters() method or something? Then let individual storage backends be responsible for caching this information (or not).

Comment thread cmd/bb_storage/main.go
clock.SystemClock,
evictionSet,
int(parameterCacheConfiguraiton.GetCacheSize()),
parameterCacheConfiguraiton.CacheDuration.AsDuration(),

@EdSchouten EdSchouten Jun 22, 2026

Member

Missing .CacheDuration.CheckValid()?

Comment thread cmd/bb_storage/main.go
)
if err != nil {
return util.StatusWrap(err, "Failed to create Content Addressable Storage")
return util.StatusWrap(err, "Failed to create Content Addressable Storage: Failed to create Chunk Storage")

Member

"Failed to create Chunk Storage for Content Addressable Storage"?


// Chunk list is marked for validation bypass, push it directy to
// downstream blob store.
if cdc.ChunkListValidationBypassed(ctx) {

Member

To me the notion of whether a chunk list is known to be valid isn't really a property of the calling context. It's more of a property of the chunk list that's passed in. Maybe better to either change the signature of BlobAccess.Put(), or add this to buffer.Buffer?

I take it that this logic is added to make sure that if a client performs a legacy Write() for a large object, that the frontend doesn't re-read the objects just to make sure that the ChunkList is valid, right? If so, how does this actually relate to buffer.Source? Maybe we should just treat ChunkLists created by bb-frontend itself as being buffer.BackendProvided(), and use that to skip validation?

// demands on the state of the Content Addressable Storage (CAS)
// after those methods have been called.
//
// SplitBlob requires that that the blob as well as all chunks of

Member

s/that that/that/

// that the supplied chunk list composes into the blob. Notably it
// does not require the chunks to follow any particular chunking
// algorithm but our implementation ensures that after any call a
// proper rep max cdc chunk list is verified even if the caller

Member

s/rep max cdc/RepMaxCDC/

// Storage (CAS).
ScannableBlobAccessConfiguration content_addressable_storage = 17;
// Optional: Blobstore configurations for the Content
// AddressableContentAddressa Storage (CAS).

Member

Typo :)


// Was 'chunk_list_storage'. Has been moved into the
// content_addressable_storage.
reserved 22;

Member

This can be removed, right?

Comment thread cmd/bb_storage/main.go
Comment on lines +66 to +71
if casConfiguration.ChunkStorage == nil {
return status.Error(codes.InvalidArgument, "The Chunk Storage is a mandatory part of the Content Addressable Storage.")
}
if casConfiguration.ChunkListStorage == nil {
return status.Error(codes.InvalidArgument, "The Chunk List Storage is a mandatory part of the Content Addressable Storage.")
}

Member

Is this really a necessary requirement? I can imagine that for frontends that implement the full REv2 API it's necessary that both are provided. But for individual shards of our storage backends there is no requirement that each node provides both a CS and CLS.
