Support encryption of redactSalt in capture/derivation specs #2963

@jwhartley

Context

Surfaced in #2950 (review comment) while documenting the redactSalt override. The docs tell users to "treat the salt as sensitive", but unlike endpoint configs, the salt is not automatically encrypted today, so it is stored as plain text in the spec.

Current state

flowctl's encrypt_configs (crates/flowctl/src/draft/encrypt.rs) only encrypts three things on publish:

  1. Capture endpoint.config
  2. Materialization endpoint.config
  3. Materialization triggers (with HMAC-excluded fields handling)

Encryption is driven by JSON Schemas with secret: true annotations (the connector endpoint spec schema for endpoints, triggers_schema() for triggers).

The top-level redactSalt field on CaptureDef (crates/models/src/captures.rs:47) and DerivationDef (crates/models/src/derivation.rs:32) is Option<bytes::Bytes> and is not walked by encrypt_configs. The runtime reads redact_salt.to_vec() directly off the protobuf CaptureSpec/DerivationSpec (crates/runtime/src/capture/task.rs:124, crates/runtime/src/derive/task.rs:99), so there is no decryption boundary either.

Net: a user-supplied redactSalt is stored and shipped as plain bytes.

Why it matters

Anyone who obtains the stored salt can take a candidate plaintext value (email, phone, SSN, etc.) and compute the corresponding hash directly, which lets them test guesses against redacted values. The docs' "treat the salt as sensitive" guidance is hard to follow when the value sits as plain text in the spec.
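
To make the risk concrete, here is a minimal sketch of the guess-and-check attack that a leaked salt enables. It assumes only that the redacted output is a deterministic function of the salt and the value. The FNV-1a hash is a dependency-free stand-in, not the hash the redaction feature actually uses, and `salted_hash` and `recover` are hypothetical names.

```rust
/// Stand-in salted hash: FNV-1a over salt || 0x00 || value.
/// Assumption: this is NOT the real redaction hash; any deterministic
/// salted hash is exposed to the same guess-and-check approach.
fn salted_hash(salt: &[u8], value: &str) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for b in salt.iter().chain(std::iter::once(&0u8)).chain(value.as_bytes()) {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3); // FNV-1a 64-bit prime
    }
    h
}

/// With the salt in hand, hash each candidate and compare it to the
/// observed redacted value. Returns the first candidate that matches.
fn recover<'a>(salt: &[u8], observed: u64, candidates: &[&'a str]) -> Option<&'a str> {
    candidates
        .iter()
        .copied()
        .find(|c| salted_hash(salt, c) == observed)
}

fn main() {
    let salt = b"leaked-salt";
    // A redacted value as it might appear in stored documents.
    let observed = salted_hash(salt, "bob@example.com");
    let guesses = ["alice@example.com", "bob@example.com"];
    match recover(salt, observed, &guesses) {
        Some(v) => println!("recovered: {v}"),
        None => println!("no match"),
    }
}
```

Without the salt, the same candidate list is of no use, which is why keeping the salt out of plain-text specs matters for low-entropy fields.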

Proposed direction

Extend flowctl encrypt_configs to handle the top-level redactSalt field on captures and derivations, with a sops envelope analogous to the triggers path:

  • Define a small schema (e.g., redact_salt_schema()) that marks the salt as secret: true.
  • At publish time, wrap redactSalt in a sops envelope if not already encrypted.
  • Decrypt at the same boundary that decrypts endpoint configs today, so the runtime CaptureSpec/DerivationSpec continues to carry plaintext bytes (no runtime change required).
  • Update the redaction docs (currently in #2950) to drop the "treat the salt as sensitive — flowctl won't help you" caveat.
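
The publish-time and decrypt-time steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not flowctl's actual code: the `RedactSalt` enum, `seal_if_needed`, and `unseal` are hypothetical names, and the `encrypt`/`decrypt` closures stand in for whatever sops integration the real implementation would call.

```rust
/// Hypothetical model of the spec field: raw bytes as supplied by the
/// user, or an envelope that has already been sealed.
#[derive(Debug, Clone, PartialEq)]
enum RedactSalt {
    Plain(Vec<u8>),
    Sealed(Vec<u8>),
}

/// Publish-time step: seal the salt unless it is already sealed, so
/// that re-publishing a spec never double-encrypts it.
fn seal_if_needed(salt: RedactSalt, encrypt: impl Fn(&[u8]) -> Vec<u8>) -> RedactSalt {
    match salt {
        RedactSalt::Plain(bytes) => RedactSalt::Sealed(encrypt(&bytes)),
        sealed @ RedactSalt::Sealed(_) => sealed,
    }
}

/// Decrypt-boundary step: hand plaintext bytes onward, so code that
/// consumes the salt after this point needs no changes.
fn unseal(salt: RedactSalt, decrypt: impl Fn(&[u8]) -> Vec<u8>) -> Vec<u8> {
    match salt {
        RedactSalt::Plain(bytes) => bytes,
        RedactSalt::Sealed(ciphertext) => decrypt(&ciphertext),
    }
}

fn main() {
    // Toy reversible transform: NOT encryption, only a placeholder
    // for the real sops encrypt/decrypt calls.
    let toy = |b: &[u8]| b.iter().map(|x| *x ^ 0x5a).collect::<Vec<u8>>();

    let once = seal_if_needed(RedactSalt::Plain(b"salt".to_vec()), &toy);
    let twice = seal_if_needed(once.clone(), &toy);
    assert_eq!(once, twice); // sealing is idempotent
    assert_eq!(unseal(twice, &toy), b"salt".to_vec());
}
```

Modeling "already encrypted" as a distinct variant makes the idempotence requirement explicit. A real implementation would presumably detect an existing envelope from the document's shape instead.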

Open questions for whoever picks this up:

  • Storage shape: encrypted-at-rest in the spec (like endpoint configs) vs. an envelope that persists end-to-end and is unsealed in the runtime.
  • Whether the auto-generated per-task salt (when no override is supplied) should also be encrypted at rest using the same mechanism, for consistency.

Customer demand

No active customer ask today. Filing as backlog so we can point to it when the question comes up; per the PR thread, for now the docs leave the encryption guidance vague.
