267 changes: 264 additions & 3 deletions 12-offer-encoding.md
@@ -10,6 +10,7 @@
* [Invoice Requests](#invoice-requests)
* [Invoices](#invoices)
* [Invoice Errors](#invoice-errors)
* [Payer Proofs](#payer-proofs)

# Limitations of BOLT 11

@@ -124,9 +125,9 @@ as the signature of H(`tag`,`msg`) using `key`.

Each form is signed using one or more *signature TLV elements*: TLV
types 240 through 1000 (inclusive). For these,
- the tag is "lightning" || `messagename` || `fieldname`, and `msg` is the
+ the tag is "lightning" || `messagename` || `fieldname`, and `msg` is usually the
Comment on lines -127 to +128
Contributor

Should we revert the addition of "usually"?

Merkle-root; "lightning" is the literal 9-byte ASCII string,
- `messagename` is the name of the TLV stream being signed (i.e. "invoice_request" or "invoice") and the `fieldname` is the TLV field containing the
+ `messagename` is the name of the TLV stream being signed (i.e. "invoice_request", "invoice" or "payer_proof") and the `fieldname` is the TLV field containing the
signature (e.g. "signature").
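
As a rough illustration of the tag construction described above, the sketch below computes H(`tag`,`msg`) using the definition quoted later in this thread, SHA256(SHA256(tag) || SHA256(tag) || msg). The function names `tagged_hash` and `signature_tag` are hypothetical, and the all-zero `merkle_root` is a placeholder, not a real root.

```python
import hashlib


def tagged_hash(tag: bytes, msg: bytes) -> bytes:
    """H(tag, msg) = SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_digest = hashlib.sha256(tag).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


def signature_tag(messagename: str, fieldname: str) -> bytes:
    """Build "lightning" || messagename || fieldname with no separators."""
    return b"lightning" + messagename.encode("ascii") + fieldname.encode("ascii")


# Placeholder value for illustration only; a real root comes from the Merkle tree.
merkle_root = bytes(32)

tag = signature_tag("invoice", "signature")
digest = tagged_hash(tag, merkle_root)  # the 32-byte value that gets signed
```

The BIP 340 signing step itself is left out, since it depends on the secp256k1 library in use.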

The formulation of the Merkle tree is similar to that proposed in
@@ -365,7 +366,7 @@ the onion message.

The second case is publishing an invoice request without an offer,
such as via QR code. It contains neither `offer_issuer_id` nor `offer_paths`, setting the
- `invreq_payer_id` (and possibly `invreq_paths`) instead, as it in the one paying: the
+ `invreq_payer_id` (and possibly `invreq_paths`) instead, as it is the one paying: the
other offer fields are filled by the creator of the `invoice_request`,
forming a kind of offer-to-send-money.

@@ -896,6 +897,266 @@ sender of the invoice would have to guess how many msat that was,
and could use the `invoice_error` to indicate if the recipient disagreed
with the conversion so the sender can send a new invoice.

# Payer Proofs

Payer proofs are proofs of invoice payment; the human-readable prefix for
payer proofs is `lnp`.

The non-signature elements of a payer proof are identical to the
`invoice` tlv_stream, with the exception that `invreq_metadata` cannot
be included. Various fields are omitted for privacy: numbers
Contributor

s/are/may be

Collaborator Author

No, really, "are". You can't produce a valid proof if you use different fields?

Contributor

Hmm... I guess "fields" seems ambiguous to me. I read it as "TLV record", which is why I suggested "may be". But is "fields" here referring to something else? Like the type of the TLV? There are some "MUST NOT include" requirements below, but they don't seem to be for privacy.

Later in this sentence "TLV" is used but isn't referring to one TLV record but an entire TLV stream. When I see "TLV" in isolation I think of it as one TLV record where a stream contains a sequence of records. At least that's my internal terminology. :)

Seems two things need to be conveyed: (1) TLV records can be left out for privacy reasons and (2) by doing so we need to include some information to allow verification without revealing which TLVs were left out. But the current wording isn't clear on this to someone unfamiliar with the proposal, IMO.

corresponding to (but not identical to) their position in the TLV are
included, as well as the minimal hashes for missing merkle branches,
to allow verification of the invoicing node's signature.

To prove that this `payer_proof` was created by someone who has the
secret key used to request the invoice in the first place, they
include a signature using the `invreq_payer_id`: this signs a text
note and the invoicing node's signature (which already commits to the
Comment on lines +914 to +915
Contributor

Should we drop the "text note" here given it is just another TLV? Seems out of place otherwise.

Collaborator Author

Yep.

other fields).

## TLV Fields for `payer_proof`

1. `tlv_stream`: `payer_proof`
2. types:
    1. type: 2 (`offer_chains`)
    2. data:
        * [`...*chain_hash`:`chains`]
    1. type: 4 (`offer_metadata`)
    2. data:
        * [`...*byte`:`data`]
    1. type: 6 (`offer_currency`)
    2. data:
        * [`...*utf8`:`iso4217`]
    1. type: 8 (`offer_amount`)
    2. data:
        * [`tu64`:`amount`]
    1. type: 10 (`offer_description`)
    2. data:
        * [`...*utf8`:`description`]
    1. type: 12 (`offer_features`)
    2. data:
        * [`...*byte`:`features`]
    1. type: 14 (`offer_absolute_expiry`)
    2. data:
        * [`tu64`:`seconds_from_epoch`]
    1. type: 16 (`offer_paths`)
    2. data:
        * [`...*blinded_path`:`paths`]
    1. type: 18 (`offer_issuer`)
    2. data:
        * [`...*utf8`:`issuer`]
    1. type: 20 (`offer_quantity_max`)
    2. data:
        * [`tu64`:`max`]
    1. type: 22 (`offer_issuer_id`)
    2. data:
        * [`point`:`id`]
    1. type: 80 (`invreq_chain`)
    2. data:
        * [`chain_hash`:`chain`]
    1. type: 82 (`invreq_amount`)
    2. data:
        * [`tu64`:`msat`]
    1. type: 84 (`invreq_features`)
    2. data:
        * [`...*byte`:`features`]
    1. type: 86 (`invreq_quantity`)
    2. data:
        * [`tu64`:`quantity`]
    1. type: 88 (`invreq_payer_id`)
    2. data:
        * [`point`:`key`]
    1. type: 89 (`invreq_payer_note`)
    2. data:
        * [`...*utf8`:`note`]
    1. type: 90 (`invreq_paths`)
    2. data:
        * [`...*blinded_path`:`paths`]
    1. type: 91 (`invreq_bip_353_name`)
    2. data:
        * [`u8`:`name_len`]
        * [`name_len*byte`:`name`]
        * [`u8`:`domain_len`]
        * [`domain_len*byte`:`domain`]
    1. type: 160 (`invoice_paths`)
    2. data:
        * [`...*blinded_path`:`paths`]
    1. type: 162 (`invoice_blindedpay`)
    2. data:
        * [`...*blinded_payinfo`:`payinfo`]
    1. type: 164 (`invoice_created_at`)
    2. data:
        * [`tu64`:`timestamp`]
    1. type: 166 (`invoice_relative_expiry`)
    2. data:
        * [`tu32`:`seconds_from_creation`]
    1. type: 168 (`invoice_payment_hash`)
    2. data:
        * [`sha256`:`payment_hash`]
    1. type: 170 (`invoice_amount`)
    2. data:
        * [`tu64`:`msat`]
    1. type: 172 (`invoice_fallbacks`)
    2. data:
        * [`...*fallback_address`:`fallbacks`]
    1. type: 174 (`invoice_features`)
    2. data:
        * [`...*byte`:`features`]
    1. type: 176 (`invoice_node_id`)
    2. data:
        * [`point`:`node_id`]
    1. type: 240 (`signature`)
    2. data:
        * [`bip340sig`:`sig`]
    1. type: 242 (`preimage`)
    2. data:
        * [`32*byte`:`preimage`]
    1. type: 244 (`omitted_tlvs`)
    2. data:
        * [`...*bigsize`:`missing`]
    1. type: 246 (`missing_hashes`)
    2. data:
        * [`...*sha256`:`hashes`]
    1. type: 248 (`leaf_hashes`)
    2. data:
        * [`...*sha256`:`hashes`]
    1. type: 250 (`payer_signature`)
    2. data:
        * [`bip340sig`:`sig`]
        * [`...*utf8`:`note`]
Comment thread
rustyrussell marked this conversation as resolved.
Comment on lines +1015 to +1029
Contributor

Are the choices of parity on these types deliberate? The previous version had them all even.

Also, are any TLVs between 1001 and 999999999 valid? Say odd, ones?

Collaborator Author

Parity of original fields is arbitrary, since you can't implement the proposal at all if you don't understand them.

And yes, I was thinking this should be generalized, to exclude all TLVs between 1001..1bn-1 inclusive, for futureproofing.


## Requirements

A writer of a payer_proof:
- MUST NOT include `invreq_metadata`.
- MUST include `invreq_payer_id`, `invoice_payment_hash`, `invoice_node_id`, `signature` and (if present) `invoice_features` from the invoice.
Comment thread
rustyrussell marked this conversation as resolved.
- MUST include `preimage` containing the `payment_preimage` returned from successful payment of this invoice.
- For each non-signature TLV in the invoice in ascending-type order:
Contributor

Somewhat related: it is still unclear to me how to treat TLV fields inside the payer proof signature range (240-1000). Should we apply the standard even-TLV rejection rule (reject unknown even fields)?

This came up with a review on the LDK implementation lightningdevkit/rust-lightning#4297 (comment)

Collaborator Author

Yes. If it's even, it means "you cannot process this if you can't understand it".

  - If the field is to be included in the payer_proof:
    - MUST copy it into the payer_proof.
    - MUST append the nonce (H("LnNonce"||TLV0,type)) to `leaf_hashes`.
Contributor

nit: `H("LnNonce"||TLV0,type)` is ambiguous. It could mean:

  • H(H("LnNonce"||TLV0) || H("LnNonce"||TLV0) || type) (tagged hash style), or
  • H("LnNonce" || TLV0 || type) (simple concatenation)

Contributor

@vincenzopalazzo I am confused why you're using H with only one parameter here. H is defined as a function with two parameters in Signature Calculation:

we define H(tag,msg) as SHA256(SHA256(tag) || SHA256(tag) || msg)

So I think this is meant:

SHA256(SHA256("LnNonce" || TLV0) || SHA256("LnNonce" || TLV0) || type)

Or maybe you mean something else?

Contributor

Right, H("LnNonce"||TLV0, type) uses the two-argument H(tag, msg) already defined in the Signature Calculation section: SHA256(SHA256(tag) || SHA256(tag) || msg) where tag = "LnNonce"||TLV0 and msg = type.

The LDK reference implementation follows this interpretation — the nonce tag is SHA256("LnNonce" || first_record_bytes) and then each per-TLV nonce hash is computed via the standard tagged hash construction with the type bytes as the message.

That said, the comma between TLV0 and type in H("LnNonce"||TLV0,type) is easy to misread as string concatenation rather than an argument separator, especially for someone implementing this fresh. Might be worth a small clarification in the text (e.g., explicitly noting this is the same H(tag,msg) from the Signature Calculation section).

Contributor

This line only applies to TLVs in the invoice. Shouldn't there now be a line later saying leaf_hashes must also contain hashes of payer_proof fields? Maybe before the line about populating missing_hashes? IIUC, the reader wouldn't be able to construct those leaf hashes without TLV0.

Collaborator

That's a good point, we need to figure out what to do for the leaf nonces of the payer proof tree.

If we do what you suggest (including in leaf_hashes nonces for the payer proof fields as well), it becomes a bit messier to rebuild the invoice merkle tree (we need to skip leaf_hashes that are for the payer proof tree), but the way we build the invoice merkle tree and the payer proof merkle tree is consistent and uses the same randomness.

Another alternative is to make the payer proof tree use different randomness for its leaf nonces (instead of the invreq_metadata), which better separates the invoice tree from the payer proof tree. I'm not sure what's best.

Collaborator Author

Good catch. I think the simplest approach will be to use an empty tlv0 for the payer proof (i.e. "0000" hex).

The reason for the nonce is to make fields unguessable when omitted. Making a proof of a proof seems out of scope!

  - otherwise, if the TLV type is not zero:
    - MUST append a *marker number* to `omitted_tlvs`
      - If the previous TLV type was included:
        - The *marker number* is that previous tlv type, plus one.
      - Otherwise, if `omitted_tlvs` is empty:
        - The *marker number* is 1.
      - Otherwise:
        - The *marker number* is one greater than the last `omitted_tlvs` entry.
- If `omitted_tlvs` is empty:
  - MAY omit `omitted_tlvs` from the payer_proof.
- MUST NOT include non-signature TLV elements which do not come from the invoice.
- MUST populate `missing_hashes` with the merkle hash of the omitted branch of each internal node that has exactly one branch entirely omitted, in depth-first smallest-to-largest TLV order.
- MUST copy `signature` into the payer_proof.
- MUST set `payer_signature`.`sig` as detailed in [Signature Calculation](#signature-calculation), signing with the `invreq_payer_id` key and using SHA256(`payer_signature`.`note` || merkle-root) as `msg`.
Collaborator

As currently specified, the payer_signature only signs the merkle root of the tree created from the invoice TLVs. It doesn't sign any of the payer_proof TLVs (preimage, omitted_tlvs, etc) which means that this signature doesn't guarantee authenticity: anyone can modify the payer proof without necessarily invalidating the payer_signature. I think this could be an issue.

I think it would be cleaner and safer if the payer_signature was instead a signature of the merkle root of the tree created from all of the payer_proof TLVs. We would exactly follow the steps of the "Signature Calculation" section, just like we do for invoices/invoice requests/offers, but using all of the payer_proof TLVs as leaves of the tree.

With that change, the payer_signature would still indirectly sign the invoice's merkle root, since it would sign omitted_tlvs, leaf_hashes and missing_hashes which allow re-building the invoice's merkle root. But it would also sign all fields of the payer proof, which means that it guarantees authenticity of the proof. Readers would start by validating this payer_signature to verify that the proof wasn't modified in-transit.

If we do that, we should extract the note field to a dedicated TLV instead of bundling it into the payer_signature TLV, this way it's included in the signature like every other TLV.

Happy to discuss this further during our next spec meeting: #1332

Contributor

@vincenzopalazzo, Apr 30, 2026

Concept ACK on this

The current scheme works for today's TLVs because the security is transitive: payer_signature only signs SHA256(payer_signature.note || merkle-root), so authenticity of preimage, omitted_tlvs, missing_hashes, and leaf_hashes only holds because each one has a separate binding. preimage is checked against invoice_payment_hash, and the rest reconstruct the invoice merkle root that the issuer's signature covers. It works only because every payer_proof TLV today is either part of the already-signed invoice or has an out-of-band binding to it. Any future payer-side TLV outside both categories silently loses authentication.

Extracting note to its own TLV is also good cleanup, since payer_signature is the only signature TLV in BOLT 12 that isn't a plain bip340sig.

On the rust-lightning side (lightningdevkit/rust-lightning#4297) this is a manageable refactor: compute the payer_proof merkle root over all payer_proof TLVs, add a payer_note TLV, drop the embedded note from payer_signature, and reorder verification (payer signature first, then reconstruct invoice root, then issuer signature). Happy to take it on once the spec lands.

See you at #1332.

Collaborator Author

Well, the signature commits to the preimage, as it commits to the preimage's hash. The ability to omit fields without invalidating the signature is a feature: the signature commits to the hashes, so you can replace a field with its hash, using omitted_tlvs, and vice-versa.

So, an attacker can reduce the proof, by omitting more fields. I was thinking in my implementation I could produce a complete proof as one of the return values from a successful pay command, and then the user could elect to conceal more.

However, the user can also just sign the damn thing again, and weird crypto corner cases tend to make for nasty surprises. The fact that I didn't think of this before @t-bast pointed it out is a red flag.

Collaborator

> weird crypto corner cases tend to make for nasty surprises.

That's exactly what I'm afraid of, corner cases we wouldn't foresee that would create issues or vulns for users...


A reader of a payer_proof:
- MUST reject the payer_proof if:
  - `invreq_payer_id`, `invoice_payment_hash`, `invoice_node_id`, `signature`, `preimage` or `payer_signature` are missing.
Contributor

s/payer_signature/proof_signature

Also in a couple places in the test vectors.

  - SHA256(`preimage`) does not equal `invoice_payment_hash`.
  - `omitted_tlvs` are not in strict ascending order (no duplicates).
  - `omitted_tlvs` contains 0.
  - `omitted_tlvs` contains signature TLV element number (240 through 1000 inclusive).
  - `omitted_tlvs` contains the number of an included TLV field.
  - `omitted_tlvs` is not one greater than:
    - an included TLV number, or
    - the previous `omitted_tlvs` or 0 if it is the first number.
  - `leaf_hashes` does not contain exactly one hash for each non-signature TLV field.
  - There are not exactly enough `missing_hashes` to reconstruct the merkle tree root using the `omitted_tlvs` values (with `0` implied as the first omitted TLV).
  - `signature` is not a valid signature using `invoice_node_id` as described in [Signature Calculation](#signature-calculation) (with `messagename` "invoice").
  - `payer_signature`.`sig` is not a valid signature using `invreq_payer_id` as described in [Signature Calculation](#signature-calculation), using `msg` SHA256(`payer_signature`.`note` || merkle-root).
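
The structural checks on `omitted_tlvs` in the reader requirements above might be sketched as follows. This is a minimal illustration, not a complete proof validator: `check_omitted_tlvs` and its parameters are hypothetical names, and the signature, Merkle and preimage checks are deliberately left out.

```python
def check_omitted_tlvs(omitted, included):
    """Return True if the marker numbers pass the structural checks.

    omitted:  marker numbers in the order they appear in `omitted_tlvs`.
    included: non-signature TLV types present in the payer_proof.
    """
    included = set(included)
    prev = 0  # TLV 0 is implied as the first omitted entry
    for marker in omitted:
        if marker <= prev:
            return False  # not strictly ascending (this also rejects 0)
        if 240 <= marker <= 1000:
            return False  # falls in the signature TLV range
        if marker in included:
            return False  # collides with an included TLV type
        if marker - 1 != prev and (marker - 1) not in included:
            return False  # neither follows the previous marker nor an included type
        prev = marker
    return True


print(check_omitted_tlvs([11, 12, 41, 42], {10, 40}))  # prints True
print(check_omitted_tlvs([11, 13], {10}))              # prints False
```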


### Rationale

We disallow including `invreq_metadata`: that is the hashing nonce, thus allowing brute-force of omitted fields.

`invreq_payer_id` is the key whose signature we have to attach to the proof, and `invoice_node_id` and `signature` are needed to validate the original invoice. `invoice_features` may indicate additional details in future which would require additional fields to be in the proof. Note that `invoice_amount` is not compulsory, though it would probably be very useful in most cases.

The `note` in the `payer_signature` field allows a challenge-response system to be implemented: someone requiring proof can ask for a signature with a particular note. It can also be empty.

## Example for Payer Proofs

Consider a trivial TLV construct (not a valid invoice), with the
following fields:

0 - Omitted
10 - Included
20 - Omitted
30 - Omitted
40 - Included
50 - Omitted
60 - Omitted
240 - Omitted (signature field)

Here is the full signature Merkle tree, with omitted nodes
marked with `(o)`:

```
                      ____x____
               ______/         \______
              /                       \
           __x__                     __x__
         _/     \_                 _/     \_
        /         \               /         \
       x           x*            x           \
      / \         / \           / \           \
     /   \       /   \         /   \           \
    /     \     /     \       /     \           \
  0(o)    10  20(o)  30(o)   40    50(o)       60(o)
```

Note that the signature TLV 240 is not included in the merkle tree.

`leaf_hashes` contains the nonce hashes for the present non-signature TLVs:

1. H("LnNonce"||TLV0,10)
2. H("LnNonce"||TLV0,40)
Comment on lines +1119 to +1122
Contributor

IIUC, this would need to include the payer_proof TLVs now?


Since two adjacent nodes (20 and 30) are both omitted, we can (and
must) simply provide the hash of the node above them, marked with an
asterisk.

Thus, `missing_hashes` contains the following hashes in left-to-right
order:

1. Merkle of H("LnLeaf",TLV0) and H("LnNonce"||TLV0,0)
2. Merkle of (Merkle of H("LnLeaf",TLV20) and H("LnNonce"||TLV0,20))
and (Merkle of H("LnLeaf",TLV30) and H("LnNonce"||TLV0,30))
3. Merkle of H("LnLeaf",TLV50) and H("LnNonce"||TLV0,50)
Comment thread
rustyrussell marked this conversation as resolved.
4. Merkle of H("LnLeaf",TLV60) and H("LnNonce"||TLV0,60)

The `omitted_tlvs` array is based on the omitted tlvs: [0, 20, 30, 50,
60]. It uses the minimal values that hide the real field numbers without changing their order; `0` is implied (as
it's always omitted) and so is not listed, giving an array of [11, 12, 41, 42].
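
The marker-number rules from the writer requirements can be sketched as below, applied to the example fields above. The function name `omitted_markers` and its parameters are hypothetical, chosen only for this illustration.

```python
SIGNATURE_TYPES = range(240, 1001)  # signature TLV elements, 240..1000 inclusive


def omitted_markers(invoice_types, included):
    """Compute the `omitted_tlvs` marker numbers.

    invoice_types: every TLV type present in the invoice.
    included:      the set of types copied into the payer_proof.
    """
    markers = []
    prev_type = None  # previous non-signature TLV type visited
    for tlv_type in sorted(invoice_types):
        if tlv_type in SIGNATURE_TYPES:
            continue  # signature TLVs are not part of the walk
        if tlv_type not in included and tlv_type != 0:
            if prev_type is not None and prev_type in included:
                markers.append(prev_type + 1)
            elif not markers:
                markers.append(1)
            else:
                markers.append(markers[-1] + 1)
        prev_type = tlv_type
    return markers


print(omitted_markers([0, 10, 20, 30, 40, 50, 60, 240], {10, 40}))
# prints [11, 12, 41, 42]
```

Type 0 never produces a marker, which is why the first entry is 11 rather than 1 here: the first omitted field after type 0 directly follows the included type 10.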

The algorithm for creating `missing_hashes` is most easily implemented
in a recursive fashion, traversing smallest-to-largest TLV
(left-to-right in the above representation). When you need to combine
two hashes where one side is entirely omitted and the other is not,
append that hash to `missing_hashes`. Note that this is not always the
same as having `missing_hashes` in ascending TLV order.

Collaborator

It seems like I don't understand what this sentence means, and neither does Claude :). I'm sure the last sentence is the key, but it doesn't explain the algorithm that well!

In my implementation, I have simply built the invoice merkle tree and recursively annotated each node with a boolean to mark whether the subtree is fully omitted or not. Then I have implemented the most basic DFS, thinking that this was what was expected since DFS is mentioned (in pseudo-code):

def compute_missing_hashes(tree) -> List<Bytes32>:
  - if this (sub)tree is fully omitted, return its hash
  - otherwise, return compute_missing_hashes(tree->left) and append compute_missing_hashes(tree->right)

This yields the same set of missing hashes as the test vector, but with the order swapped by pairs...so I'm confused about what DFS / "smallest-to-largest TLV" means in this context.

Collaborator

@t-bast, May 4, 2026

EDIT: I (well, Claude) managed to get it to match the spec test vectors, by always including hashes from lower depths of the tree before hashes from higher depths. If this is the intended behavior, it's not entirely intuitive, and unfortunately the example tree doesn't showcase this at all, so it's probably worth detailing this a bit more.

Note that I have only implemented the creation of a payer proof, not the algorithm in the other direction (validation of a payer proof). Maybe that's where it makes sense to have missing hashes in that order.

Contributor

A good point that t-bast brought up. We hit the same question while implementing this in rust-lightning, and after running our impl against the spec test vectors the conclusion lines up with t-bast's edit: this is post-order DFS, not pre-order.

Concretely, in lightningdevkit/rust-lightning#4297 (lightning/src/offers/merkle.rs::build_tree_dfs), we build missing_hashes like this:

  1. Recurse into the left subtree fully.
  2. Recurse into the right subtree fully.
  3. After both return, if exactly one side is fully omitted and the other is at least partially included, push the omitted side's hash.

Running this against bolt12/payer-proof-test.json (commit 0f2b026) produces byte-identical output to Rusty's updated vectors across proof_merkle_root, proof_signature, bech32, and the missing_hashes TLV (type 1003) for all four valid_vectors. Happy to post the regen diff if useful.

The "lower depths before higher depths" property t-bast noticed is just the post-order-traversal invariant: by the time we emit a missing-hash at depth d, every missing-hash from depth > d (anything inside that subtree) is already in the list. The "swap by pairs" most likely comes from emitting the hash on the way down (at the parent, before recursing into the visible sibling) instead of after the sibling has been processed.

Trace on the spec example (TLVs 0,10,20,30,40,50,60 with 10 and 40 included), to make the ordering unambiguous:

step 1: (0, 10)            ; 0 omitted, 10 included      -> push h0
step 2: (20, 30)           ; both omitted                -> (no push)
step 3: (0,10) | (20,30)   ; left visible, right omitted -> push h(20⊕30)
step 4: (40, 50)           ; 40 included, 50 omitted     -> push h50
step 5: (40,50) | 60       ; left visible, right omitted -> push h60

Final: [h0, h(20⊕30), h50, h60], which matches the spec text at lines 1131–1135.

What I see as worth clarifying in the spec text:

  • Either say post-order DFS explicitly, or
  • Add pseudocode that pins down the timing of the append (after both recursive calls have returned, not on the way down).

Right now "traversing smallest-to-largest TLV (left-to-right in the above representation)" reads as either pre-order or post-order depending on which sentence you anchor on, which is what t-bast hit.

Collaborator Author

Yes, it's post order, and indeed the test vectors (which @t-bast found) do test this.

I really hate traversal language: I would have called this "depth first", since you do children before parents :(

I'm delighted with any clarifications you can offer. Perhaps pseudocode is best here?

Collaborator

I think what would help best would be to change the example tree in the Example for Payer Proofs section to actually showcase this property, because the current one yields the same result with both DFS and post-order DFS. I think it's the example that would make it obvious to anyone which algorithm is used, rather than how we name/describe it.

Contributor

> I'm delighted with any clarifications you can offer. Perhaps pseudocode is best here?

I would like to have pseudocode for it too: it makes the algorithm easy to understand, and it is resistant to variation in wording.

Claude suggested the following pseudo-code that looks neat

# Returns (subtree_hash, subtree_has_any_present_leaf).
# Side effect: appends to the global `missing_hashes` list.
def build_proof_tree(tlvs):
    if len(tlvs) == 1:
        leaf = tlvs[0]
        return (leaf_hash(leaf), leaf is present)

    # Same TLV-ordered split that signing uses.
    left_tlvs, right_tlvs = split(tlvs)

    # Recurse FULLY into both children before deciding whether to emit.
    (left_hash,  left_present)  = build_proof_tree(left_tlvs)
    (right_hash, right_present) = build_proof_tree(right_tlvs)

    # Emit at the boundary where exactly one side is entirely omitted.
    if left_present and not right_present:
        missing_hashes.append(right_hash)
    elif right_present and not left_present:
        missing_hashes.append(left_hash)
    # If both sides have at least one present leaf -> no append
    #   (the verifier reconstructs both sides from disclosed data).
    # If both sides are fully omitted              -> no append here
    #   (the parent will emit one hash that covers this whole subtree).

    return (branch_hash(left_hash, right_hash), left_present or right_present)


# Top-level call:
missing_hashes = []
build_proof_tree(non_signature_tlvs_in_ascending_type_order)
# `missing_hashes` is now ready to encode as TLV 1003.

Reconstruction is the exact opposite: when you need to combine a hash
where one side is entirely omitted and the other is not, pull a hash
from `missing_hashes`. If there are insufficient `missing_hashes`, or
it isn't empty when you have completed the merkle tree, the number of
`missing_hashes` was incorrect.
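
To make the symmetry between creation and reconstruction concrete, here is a small runnable sketch on the example tree. It rests on several assumptions made only for illustration: `branch_hash` is a stand-in for the real branch hash from Signature Calculation, `left_size` assumes the left subtree takes the largest power of two below the leaf count (which reproduces the example tree's shape), and each per-TLV hash is faked. All names are hypothetical.

```python
import hashlib


def branch_hash(left: bytes, right: bytes) -> bytes:
    # Stand-in combiner; NOT the real branch hash from Signature Calculation.
    return hashlib.sha256(b"branch" + left + right).digest()


def left_size(n: int) -> int:
    # Assumed split: largest power of two strictly below n (7 -> 4, 3 -> 2).
    size = 1
    while size * 2 < n:
        size *= 2
    return size


def build(nodes, missing):
    """nodes: list of (hash, present). Returns (hash, any_present)."""
    if len(nodes) == 1:
        return nodes[0]
    k = left_size(len(nodes))
    left_hash, left_present = build(nodes[:k], missing)
    right_hash, right_present = build(nodes[k:], missing)
    # Post-order: append only after both children are fully processed.
    if left_present and not right_present:
        missing.append(right_hash)
    elif right_present and not left_present:
        missing.append(left_hash)
    return branch_hash(left_hash, right_hash), left_present or right_present


def rebuild(nodes, missing):
    """nodes: list of hash-or-None (None = omitted). Pops from `missing`."""
    if len(nodes) == 1:
        return nodes[0]
    k = left_size(len(nodes))
    left = rebuild(nodes[:k], missing)
    right = rebuild(nodes[k:], missing)
    if left is None and right is None:
        return None  # entirely omitted; an ancestor supplies one hash for it
    if left is None:
        left = missing.pop(0)  # IndexError here means too few hashes
    elif right is None:
        right = missing.pop(0)
    return branch_hash(left, right)


types = [0, 10, 20, 30, 40, 50, 60]
included = {10, 40}
h = {t: hashlib.sha256(b"tlv%d" % t).digest() for t in types}  # fake per-TLV hashes

missing = []
root, _ = build([(h[t], t in included) for t in types], missing)

remaining = list(missing)
rebuilt = rebuild([h[t] if t in included else None for t in types], remaining)
```

With these inputs `missing` holds the hashes for 0, the (20, 30) subtree, 50 and 60 in that order, `rebuilt` equals `root`, and `remaining` ends up empty. Leftover entries in `remaining`, or an `IndexError` from `pop`, would correspond to an incorrect number of `missing_hashes`.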

See the [Payer Proof Test Vectors](bolt12/payer-proof-test.json) for more
examples.

## Rationale

Using the invoice as a base enshrines information about the payment including important offer and invoice_request fields. However, many fields are not useful (such as payment paths), or may compromise privacy (such as invreq_payer_note containing delivery address information), so being able to elide them while still allowing signature validation is vital.

Avoiding including TLV0 (which is required to be unguessable), and publishing the nonce-leaf-hashes for each included TLV means that you cannot brute-force the values of any unknown leaves. For example, while you know the merkle of H("LnLeaf",TLV50) and H("LnNonce"||TLV0,50), you cannot determine H("LnNonce"||TLV0,50).
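
A sketch of the per-TLV nonce hash may help here. It follows the reading given in the review discussion, H(tag,msg) with tag = "LnNonce"||TLV0 and msg = type. Two points are assumptions rather than statements from the text: that `tlv0` is the full serialized first TLV record, and that the type is encoded as a BigSize. The names `nonce_hash` and `bigsize` are hypothetical.

```python
import hashlib
import os


def tagged_hash(tag: bytes, msg: bytes) -> bytes:
    """H(tag, msg) = SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_digest = hashlib.sha256(tag).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


def bigsize(n: int) -> bytes:
    """BigSize encoding of n (assumed here to be the encoding of the type)."""
    if n < 0xFD:
        return n.to_bytes(1, "big")
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "big")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "big")
    return b"\xff" + n.to_bytes(8, "big")


def nonce_hash(tlv0: bytes, tlv_type: int) -> bytes:
    """Nonce for one TLV: H("LnNonce" || TLV0, type)."""
    return tagged_hash(b"LnNonce" + tlv0, bigsize(tlv_type))


# Illustrative TLV0: type 0, length 16, then 16 random bytes of metadata.
tlv0 = bytes([0, 16]) + os.urandom(16)

nonce_50 = nonce_hash(tlv0, 50)
nonce_10 = nonce_hash(tlv0, 10)
```

Because `tlv0` is never published, a verifier holding only the combined hash for TLV 50 cannot recompute `nonce_50` to test guesses for the omitted value.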

The requirement to include minimal hashes (rather than one for every unknown leaf) minimizes the size, especially when many consecutive fields are omitted. As the exact TLV types of omitted TLVs are unimportant (as long as ordering is maintained), we renumber them to be minimal, as further obfuscation of values.

The `payer_signature` proves that the same key signed this proof as signed the invoice_request: the `note` field provides room for an arbitrary challenge or self-identification.

# FIXME: Possible future extensions:

1. The offer can require delivery info in the `invoice_request`.