extension-bolt: simple taproot channels (feature 80/81) by Roasbeef · Pull Request #995 · lightning/bolts

Roasbeef · 2022-05-30T16:37:35Z

This PR puts forth two concepts:

The concept of an "extension BOLT", which is a single documentation extension to the base BOLT spec that references existing "base" BOLTs with new slightly modified functionality. This presents an alternative to littering the main set of BOLTs with a series of "if" statements, which can be somewhat unwieldy for larger changes, and also harder to review/parse.
A new set of feature bits, channel type, funding output changes, commitment changes, and HTLC script changes under the umbrella of "simple taproot channels", a.k.a. the minimal amount of changes needed to get us to "taprooty level 1".

The extensions described in this document have purposefully excluded any gossip related changes, as there doesn't yet appear to be a predominant direction we'd all like to head in (nu nu gossip vs kick the can and add schnorr).

Most of the changes described here are pretty routine: use musig2 when relevant, and create simple tapscript trees to fold in areas where the script has multiple conditional paths. The main consideration with musig2 is ofc: how to handle nonces. This document takes a very conservative stance, and simply proposes that all nonces be 100% ephemeral, and forgotten, even after a connection has been dropped. This has some non-obvious implications w.r.t the retransmission flow. Beyond that, it's mostly: piggyback the set of nonces (4 public nonces total, since there're "two" messages) on a message to avoid having to add additional round trips.

The other "new" thing this adds is the generation/existence of a NUMs point, which is used to ensure that certain paths can only be spent via the script spend path (like the to remote output for the remote party, as this inherits anchor outputs semantics).

This is still marked as draft, as it's just barely to the point of being readable, and still has a lot of clean ups to be done w.r.t notation, clarity, wording, and full specification.

Roasbeef · 2022-05-30T18:02:44Z

Some things that came up in meatspace discussions:

Need to ensure we specify co-op close interaction re needing to check diff types of co-op close transactions (remote party trimmed an output, etc)
Maybe we should remove the NUMs point for the to_remote output, and just use the musig2 funding key there: in the best interest of the local party to just never sign for that
We may need to actually send the next nonce in the commit_sig message (only one of them?) to ensure that after a commitment dance, both parties are able to immediately send a sig.

Crypt-iQ · 2022-05-30T18:46:59Z

I think the commit_sig should contain the sender's "remote nonce" and the revoke_and_ack contain the sender's "local nonce".

Also since funding_locked will be sent repeatedly with scid-alias when that is merged and deployed, then there should probably be language to define that the nonces are only sent the first time?

instagibbs · 2022-05-30T19:12:21Z

let's try to pick naming conventions for nonces that doesn't make me cry over the asymmetry

ZmnSCPxj · 2022-05-31T16:31:17Z

Some points:

This interacts with the 2-of-3 goal of @moneyball . If one participant uses a 2-of-3 and owns ALL 3 keys, then it is fine and we can just have MuSig2 with both channel endpoints. But the 2-of-3 goal is that one channel endpoint is really a nodelet-like setup: there is one sub-participant with 2 keys and another "server" participant with 1 key, a la GreenWallet. This requires composable MuSig2. Now I think composable MuSig2, if it can be proven safe, just requires two Rs just like normal non-composable MuSig2, but we probably need to go pester the MuSig2 authors --- I think they wrote up how composable MuSig2 would work, but only internally and they never actually published the details. This is important because we may need to have variable number of Rs, not just two.

--

This interacts with VLS as well @ksedgwic . The nonce r behind R = r * G needs to be retained, since we pre-send the R and on the actual signing MUCH later we use the r. VLS cannot have the host store r since exfiltration of the r together with a complete R, s implies exfiltration of the private key. But this is a per-channel state and constrained devices might not have enough space for each channel. What could be done would be to put the per-channel states into a Merkle tree and have the VLS constrained device store only the Merkle tree root, then every time the r has to be rotated (at each reconnect or at each signing event) update the Merkle tree root in the constrained persistent storage --- it has to be persistent since the host could keep the connection alive while power-cycling the VLS device. You also have to be careful of "UPDATE IN PLACE is a poison apple" and replicated storage. Now as mentioned we cannot store r directly so what the VLS has to do would be something like: generate random scalar q, compute Q = q * G, compute r = ECDH(k, Q) where k is the private key held by the signer device, and then store Q on the host so it can recover r later without the host being able to recover r as well.

A similar technique may also be useful for the server in the 2-of-3 of @moneyball; rather than maintain a state for each channel of each client, the client could store the per-channel Q that the server generates and uses ECDH to get r and then R = r * G for each channel.
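The Q-on-host idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the pure-Python secp256k1 arithmetic is not constant time, the `derive_nonce` and `new_nonce_handle` names and the `"hypothetical/ecdh-nonce"` domain tag are made up for the example, and a real MuSig2 signer needs two nonces per signing session rather than the single one shown here.

```python
import hashlib
import secrets

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, P - 2, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def point_mul(k, pt):
    """Double-and-add scalar multiplication (illustrative, not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def derive_nonce(k, Q):
    """Signer device: recompute r from its private key k and the host-stored point Q."""
    shared_x = point_mul(k, Q)[0].to_bytes(32, "big")
    digest = hashlib.sha256(b"hypothetical/ecdh-nonce" + shared_x).digest()
    return int.from_bytes(digest, "big") % N


def new_nonce_handle(k):
    """Signer device: pick a random q, hand Q = q*G to the host, forget q and r."""
    q = secrets.randbelow(N - 1) + 1
    Q = point_mul(q, G)
    return Q, derive_nonce(k, Q)


k = 0x1F2E3D4C5B6A79881726354433221100FFEEDDCCBBAA99887766554433221100
Q, r = new_nonce_handle(k)
R = point_mul(r, G)             # public nonce that is pre-sent to the peer
assert derive_nonce(k, Q) == r  # later, the device recovers r from Q alone
```

The host only ever sees Q and R; recovering r from them would require computing k * Q without knowing k.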

ZmnSCPxj · 2022-05-31T21:18:01Z

So I talked to @jonasnick, and as I understand it, we can work with just two Rs even in the composition case, probably will also work in FROST, maybe. This should be safe but we do not have a written out proof, because it seems the proof is complicated. So at least it looks like we do not need a variable number of Rs, just two from each side should work.

Roasbeef · 2022-05-31T22:49:26Z

Re recursive musig2: I'm gonna give the implementation a shot (outside the LN context, just the musig-within-musig) just to double check my assumptions re not needing to modify the (revised) nonce exchange flow.

antonilol · 2022-06-01T07:42:28Z

i made a pull request on this pull request with script fixes

Roasbeef#1

antonilol · 2022-06-02T06:01:09Z

> Maybe we should remove the NUMs point for the to_remote output, and just use the musig2 funding key there: in the best interest of the local party to just never sign for that

Why not the revocation key? When i publish an old state, the remote party can claim my output and htlcs with the key path, but not his own output, and also has to wait a block. If we set the internal key to the revocation key it will give the remote party more privacy: nobody on chain can see which outputs were to local and to remote (and htlcs if they are swept along). It will also be more consistent with the other outputs, as they also have the revocation key as internal key.

it will also be cheaper (or get a higher fee rate with the same amount of sats): this only requires a signature from a taptweaked revocation key (65) instead of a signature (65), the script (36) and the controlblock (34) (incl length prefix)
this will save 70 wu (17.5 vB) (keep in mind this only applies to revoked commitments, for normal force closes we want to enforce the 1 OP_CSV)
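The arithmetic behind those numbers can be checked with a few lines. The byte counts are the ones quoted in the comment above (each including its length prefix); the variable names are just for illustration.

```python
# Witness element sizes in bytes, as quoted above (length prefix included).
SIG = 65            # Schnorr signature
SCRIPT = 36         # revealed tapscript leaf
CONTROL_BLOCK = 34  # control block

key_path_witness = SIG
script_path_witness = SIG + SCRIPT + CONTROL_BLOCK

# Each witness byte costs 1 weight unit; 4 weight units make 1 vbyte.
saved_wu = script_path_witness - key_path_witness
saved_vb = saved_wu / 4
print(saved_wu, saved_vb)  # 70 17.5
```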

antonilol · 2022-06-02T10:44:58Z

#995 (comment) makes it impossible for outside observers to identify the to_remote output in case of a revoked commitment. if there are some htlcs on it that are long expired and the second stage is broadcast (like in the fee siphoning attack), the funds go to the local delayed pubkey + relative timelock. outside observers can now see which output was the to_local one: just search for the output key of an htlc 2nd stage tx in the commitment transaction.

example ctx: 15c262aeaa0c5a44e9e5f25dd6ad51b4162ec4e23668d568dc2c6ad98ae31023 (testnet)

the transaction with the expired htlc reveals the to_local output. (it is already revealed by the script, but this wouldnt be the case with a revoked taproot ctx)

this can be fixed by tweaking the local delayed pubkey with the hash of vout of the htlc on the ctx (something you can see on-chain, to make restoring backups easier) and some secret (so only you can do this, not outside observers). this secret can be static across commitments and stored in a static channel backup (this is one way i came up with, but there are of course more ways to change this key and still make restoring backups easy enough)

EDIT: no secret is needed, instead a taptweak like tweak can be done. everywhere where a local delayed pubkey is used, it is tweaked with sha256(pubkey || output index) (or a tagged hash) where output index refers to the output index of the output on the commitment transaction, this way there are no duplicates because there can't be two outputs at the same index.

for clarity: htlc outputs that send funds to the local delayed pubkey use a tweaked local delayed pubkey where the output index of the htlc output on the commitment transaction is used, not the htlc success or timeout tx
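A rough sketch of the per-output tweak computation, using a BIP 340 style tagged hash. The tag string, the 4-byte big-endian index encoding, and the function names are assumptions made for illustration; nothing here is specified by the PR. The tweaked key itself would be local_delayed_pubkey + t * G, which is left out to keep the example free of curve arithmetic.

```python
import hashlib

# secp256k1 group order.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP 340 style tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


def delayed_key_tweak(local_delayed_pubkey: bytes, output_index: int) -> int:
    """Scalar tweak t for the output at `output_index` of the commitment tx."""
    msg = local_delayed_pubkey + output_index.to_bytes(4, "big")
    digest = tagged_hash("hypothetical/delayed-key-tweak", msg)  # made-up tag
    return int.from_bytes(digest, "big") % N


# Dummy 33-byte value standing in for a compressed pubkey (not a real key).
pubkey = bytes.fromhex("02" + "11" * 32)
t_htlc = delayed_key_tweak(pubkey, 2)      # htlc output at index 2
t_to_local = delayed_key_tweak(pubkey, 0)  # to_local output at index 0
assert t_htlc != t_to_local
```

Because the index is public and the hash is deterministic, the tweak can be recomputed from a backup plus on-chain data.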

Crypt-iQ · 2022-06-10T19:14:02Z

> for clarity: htlc outputs that send funds to the local delayed pubkey use a tweaked local delayed pubkey where the output index of the htlc output on the commitment transaction is used, not the htlc success or timeout tx

this would preserve privacy, but you'd also need to do this for the to_local and the local anchor output since if those are claimed, the delayed pubkey is also leaked. if the user doesn't claim their anchor, then only the counter-party would be able to claim their anchor (thereby leaking the local delayed pubkey) rather than anybody with the ability to watch the chain after the 16 CSV has elapsed. The keys could all be tweaked, but then perhaps there is more UTXO bloat if the anchors aren't claimed

antonilol · 2022-06-13T09:38:47Z

> for clarity: htlc outputs that send funds to the local delayed pubkey use a tweaked local delayed pubkey where the output index of the htlc output on the commitment transaction is used, not the htlc success or timeout tx
>
> this would preserve privacy, but you'd also need to do this for the to_local and the local anchor output since if those are claimed, the delayed pubkey is also leaked. if the user doesn't claim their anchor, then only the counter-party would be able to claim their anchor (thereby leaking the local delayed pubkey) rather than anybody with the ability to watch the chain after the 16 CSV has elapsed. The keys could all be tweaked, but then perhaps there is more UTXO bloat if the anchors aren't claimed

hmmm true, so it is either privacy, with no key reuse or no utxo set bloat.

btw another idea about anchors and less utxo set bloat:
with non taproot channels, anchors can always be claimed, because funding keys are used
with taproot, funding keys can also be used
party A and B both have a public key, the funding key becomes P = A * H(H(A || B) || A) + B (musig2 keyagg)
the anchor pubkey for A is A * H(H(A || B) || A) and for B is just its public key B
if one of the anchors is spent outside observers can calculate the other anchor because A = P - B and B = P - A (P = funding key)
the to_local and to_remote can also use these keys to ensure that if neither of the anchors is spent they can be cleaned up but that will cost some privacy
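The algebra above can be checked with a small pure-Python sketch. The hash (SHA-256 over compressed keys) and the toy secret keys are assumptions for illustration only; this is the simplified two-party formula from the comment, not the KeyAgg routine of the MuSig2 BIP. Note that subtracting B from P yields A's tweaked anchor key, coef * A, rather than the bare A.

```python
import hashlib

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, P - 2, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, P - 2, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def point_mul(k, pt):
    """Double-and-add scalar multiplication (illustrative, not constant time)."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, pt)
        pt = point_add(pt, pt)
        k >>= 1
    return result


def point_neg(pt):
    return (pt[0], (-pt[1]) % P)


def ser(pt):
    """33-byte compressed encoding."""
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")


a, b = 0x1111, 0x2222  # toy secret keys, for illustration only
A, B = point_mul(a, G), point_mul(b, G)

# coef = H(H(A || B) || A), reduced to a scalar.
inner = hashlib.sha256(ser(A) + ser(B)).digest()
coef = int.from_bytes(hashlib.sha256(inner + ser(A)).digest(), "big") % N

anchor_A = point_mul(coef, A)        # anchor key for A
anchor_B = B                         # anchor key for B
funding = point_add(anchor_A, B)     # P = A * coef + B

# Anyone who knows P and one anchor key can compute the other.
assert point_add(funding, point_neg(anchor_B)) == anchor_A
assert point_add(funding, point_neg(anchor_A)) == anchor_B
```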

Crypt-iQ · 2022-06-13T15:16:42Z

> P = A * H(H(A || B) || A) + B (musig2 keyagg) the anchor pubkey for A is A * H(H(A || B) || A) and for B is just its public key B if one of the anchors is spent outside observers can calculate the other anchor because A = P - B and B = P - A (P = funding key) the to_local and to_remote can also use these keys to ensure that if neither of the anchors is spent they can be cleaned up but that will cost some privacy

The KeyAgg routine here specifies some tweaking so the aggregation above may not always be the same (https://github.com/jonasnick/bips/blob/musig2/bip-musig2.mediawiki#key-aggregation). I think in your example an observer has a 50% chance of identifying B's funding_pubkey since it isn't tweaked. I am not sure if knowledge of the funding_pubkey actually gives an observer anything as I think these would have sort of an ephemeral nature as they are generated for the funding flow. A user could generate them outside of the funding context (say in their bitcoind wallet and regularly use the key for receiving/sending payments) and use them in the funding flow, but I don't see why a user would do that

antonilol · 2022-06-13T15:28:11Z

> P = A * H(H(A || B) || A) + B (musig2 keyagg) the anchor pubkey for A is A * H(H(A || B) || A) and for B is just its public key B if one of the anchors is spent outside observers can calculate the other anchor because A = P - B and B = P - A (P = funding key) the to_local and to_remote can also use these keys to ensure that if neither of the anchors is spent they can be cleaned up but that will cost some privacy
>
> The KeyAgg routine here specifies some tweaking so the aggregation above may not always be the same (https://github.com/jonasnick/bips/blob/musig2/bip-musig2.mediawiki#key-aggregation). I think in your example an observer has a 50% chance of identifying B's funding_pubkey since it isn't tweaked. I am not sure if knowledge of the funding_pubkey actually gives an observer anything as I think these would have sort of an ephemeral nature as they are generated for the funding flow. A user could generate them outside of the funding context (say in their bitcoind wallet and regularly use the key for receiving/sending payments) and use them in the funding flow, but I don't see why a user would do that

if this is a problem B can tweak the key before using it without A even knowing (also A has to do this because the lexicographically smaller key is tweaked), but i dont think it is, lnd uses a separate bip32 tree for this (separate from the wallet)

(btw without taproot funding pubkeys were revealed every time a channel was closed)

antonilol · 2022-06-13T15:30:55Z

> The KeyAgg routine here specifies some tweaking so the aggregation above may not always be the same (https://github.com/jonasnick/bips/blob/musig2/bip-musig2.mediawiki#key-aggregation).

afaict the algorithm in this bip is generalized for 32 byte pubkeys and more than 2 signers. the 'simple' musig2 with pubkeys whose parity bit is known looks like the equation i used, btw i got it here https://github.com/t-bast/lightning-docs/blob/master/schnorr.md#musig2

instagibbs

Some old comments I forgot to submit

Most recent comment is noting that partial sigs are 32 bytes, so this needs explicit defining somewhere, since signature types seem to assume 64 (may have missed it).

antonilol · 2022-06-27T20:45:20Z

#995 (comment)

> P = A * H(H(A || B) || A) + B (musig2 keyagg) the anchor pubkey for A is A * H(H(A || B) || A) and for B is just its public key B if one of the anchors is spent outside observers can calculate the other anchor because A = P - B and B = P - A (P = funding key)

nvm this wouldn't work because keys are only revealed when swept without signature

to make this problem somewhat easier i suggest to remove the to_remote_anchor when a to_remote exists, and no longer 1 OP_CSV the to_remote output. the remote party can fee bump using his to_remote output. the only scenario where a to_remote_anchor would be needed is when there is at least 1 non dust htlc attached and the remote party has no (or below dust limit) to_remote output. this can happen when sending a non dust htlc directly after channel opening (the remote party always needs to have the channel reserve as balance, but doesn't have this just after opening)

now that the to_remote_anchor is out of the way (except for 1 case), things get easier
the to_local_anchor's internal pubkey will be local_delayed_pubkey, it is revealed after the csv delay when swept by the local party

this special case can of course be seen from both sides:

the local party who opens a channel, sends an htlc and force closes. both parties need an anchor output here. the remote party has no to_remote output because he has no balance and can't fee bump with that. the to_remote_anchor will have the remote_htlc_pubkey as internal key because it is revealed when the htlc is expired (a second sig is needed there for 2nd stage) or fulfilled
(i switch local and remote here) the local party who just got a channel opened by the remote party and sent an htlc. the local party force closes. there is no to_local to reveal the anchor key, so the to_local_anchor's internal key will be the local_htlc_pubkey, the remote party doesn't need an anchor because he has a to_remote output

even more rare: revocation

no anchor keys are revealed here because with the revocation key the taproot key path is used. i don't know how to make the anchors sweepable in this case

long story short:

extension-bolt: simple taproot channels (feature 80/81) #995 (comment) wont work
only a to_remote_anchor when no to_remote exists but there is at least 1 htlc; its internal pubkey is the htlc pubkey
anchors use the delayed pubkey as internal key
if no to_local output exists the htlc pubkey is used

Questions/feedback welcome!

See lightning/bolts#995

erickcestari · 2026-03-11T17:54:45Z

+        * `htlc_timeout`:
+        ```
+        <local_delayedpubkey> OP_CHECKSIG
+        <to_self_delay> OP_CHECKSEQUENCEVERIFY OP_DROP


The prod tapscript variants that use OP_CHECKSIGVERIFY instead of OP_CHECKSIG ... OP_DROP leave the time-lock value as the final (and only) stack element. Script success then depends on that value being non-zero (since 0 is falsy in Bitcoin script).

This affects three scripts:

to_delay_script: <key> OP_CHECKSIGVERIFY <to_self_delay> OP_CHECKSEQUENCEVERIFY - final stack is [to_self_delay]

Accepted HTLC timeout: <key> OP_CHECKSIGVERIFY 1 OP_CSV OP_VERIFY <cltv_expiry> OP_CLTV - final stack is [cltv_expiry]

2nd-level HTLC outputs: <key> OP_CHECKSIGVERIFY <delay> OP_CSV - final stack is [delay]

Note that to_remote_script is not affected since it hardcodes 1 OP_CHECKSEQUENCEVERIFY, which always leaves a truthy [1] on the stack.

In practice this is safe, since to_self_delay must be positive per BOLT 2 negotiation, CLTV expiries are always future block heights, and CSV delays are always > 0. But unlike the legacy OP_CHECKSIG ... OP_DROP scripts where the final stack element was always the signature check result (1), these scripts have an implicit invariant that the time-lock parameters must be non-zero for the script to succeed.
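To make the stack argument concrete, here is a toy evaluator for just the opcodes involved. It is a simplified model written for illustration: it does not verify signatures or the time-locks themselves, and `run`, `succeeds`, `legacy_script` and `taproot_script` are made-up names. It only tracks what is left on the stack and applies the tapscript success rule (exactly one element, and that element is truthy).

```python
def run(script, sig_valid=True):
    """Evaluate a tiny subset of script and return the final stack."""
    stack = ["<sig>"]  # witness: a single signature
    for op in script:
        if op == "OP_CHECKSIG":
            stack.pop()  # pubkey
            stack.pop()  # signature
            stack.append(1 if sig_valid else 0)
        elif op == "OP_CHECKSIGVERIFY":
            stack.pop()
            stack.pop()
            if not sig_valid:
                raise ValueError("signature check failed")
        elif op in ("OP_CHECKSEQUENCEVERIFY", "OP_CHECKLOCKTIMEVERIFY"):
            pass  # inspects the top item but leaves it on the stack
        elif op == "OP_DROP":
            stack.pop()
        else:
            stack.append(op)  # data push
    return stack


def succeeds(stack):
    """Tapscript rule: exactly one element left, and it must be truthy."""
    return len(stack) == 1 and stack[0] != 0


def legacy_script(delay):
    return ["<key>", "OP_CHECKSIG", delay, "OP_CHECKSEQUENCEVERIFY", "OP_DROP"]


def taproot_script(delay):
    return ["<key>", "OP_CHECKSIGVERIFY", delay, "OP_CHECKSEQUENCEVERIFY"]


print(run(legacy_script(144)))   # [1]
print(run(taproot_script(144)))  # [144]
print(succeeds(run(taproot_script(0))))  # False
```

With a delay of 0 the OP_CHECKSIGVERIFY variant ends with [0] and fails, while the legacy variant still ends with [1].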

Should we add a brief note in the spec calling out this invariant? Something like:

Note: because these scripts use OP_CHECKSIGVERIFY (which consumes the boolean result) followed by a terminal OP_CHECKSEQUENCEVERIFY or OP_CHECKLOCKTIMEVERIFY (which leaves its argument on the stack), the time-lock value serves as the final truthy stack element. Implementations MUST ensure these values are non-zero.

There is also a minor specification consistency issue previously pointed out by @sstone and currently under review in an LND PR by @gijswijs. The HTLC-Timeout second-level output description (under “HTLC-Timeout Transactions”) still shows the legacy OP_CHECKSIG … OP_DROP pattern, while the test vectors use OP_CHECKSIGVERIFY for both success and timeout second-level outputs. The prose likely needs updating to match the test vectors.

> Should we add a brief note in the spec calling out this invariant?

I think this is already explained in the accepted HTLCs section (https://github.com/Roasbeef/lightning-rfc/blob/simple-taproot-chans/bolt-simple-taproot.md#accepted-htlcs)?

> I think this is already explained in the accepted HTLCs section (https://github.com/Roasbeef/lightning-rfc/blob/simple-taproot-chans/bolt-simple-taproot.md#accepted-htlcs)?

That note explains why CSV is omitted from certain HTLC paths. It doesn't address the stack semantics point. The broader issue is that across all three CHECKSIGVERIFY + terminal timelock scripts, the timelock value itself ends up as the final stack element, so script success implicitly requires it to be non-zero. That invariant isn't called out anywhere currently.

It's not strictly necessary since these values are always non-zero in practice, but a brief note would help readers who are tracing the stack logic and wondering what the final truthy element is.

Roasbeef · 2026-03-12T00:25:01Z

Gijs noticed what looks to be a typo (?) in the spec (I think incomplete search and replace when we modified the scripts). lnd uses the same scripts for the second level HTLC, but right now the spec has a combo:

timeout:

<local_delayedpubkey> OP_CHECKSIGVERIFY
<to_self_delay> OP_CHECKSEQUENCEVERIFY

success:

<local_delayedpubkey> OP_CHECKSIG
<to_self_delay> OP_CHECKSEQUENCEVERIFY OP_DROP

We have interop, so safe to assume we're using the uniform version (the Miniscript generated OP_CHECKSIGVERIFY version).

Roasbeef · 2026-03-12T00:29:38Z

> these scripts have an implicit invariant that the time-lock parameters must be non-zero for the script to succeed.

We went back and forth a ton re this in the past. The scripts were modified to be slightly more compatible with Miniscript, which generated this variant.

Optimistically pushed a commit to fix this (note that the test vectors were generated assuming uniform scripts for success+timeout).

t-bast · 2026-03-12T09:11:33Z

> Gijs noticed what looks to be a typo (?) in the spec (I think incomplete search and replace when we modified the scripts). lnd uses the same scripts for the second level HTLC, but right now the spec has a combo:
>
> timeout:
>
> <local_delayedpubkey> OP_CHECKSIGVERIFY
> <to_self_delay> OP_CHECKSEQUENCEVERIFY
>
> success:
>
> <local_delayedpubkey> OP_CHECKSIG
> <to_self_delay> OP_CHECKSEQUENCEVERIFY OP_DROP
>
> We have interop, so safe to assume we're using the uniform version (the Miniscript generated OP_CHECKSIGVERIFY version).

Good catch, I missed that one! We're always using the first version in eclair:

<local_delayedpubkey> OP_CHECKSIGVERIFY
<to_self_delay> OP_CHECKSEQUENCEVERIFY

So it's likely indeed just an issue in the spec, not in the implementations. I think you forgot to push your fix @Roasbeef?

Roasbeef · 2026-03-13T00:38:33Z

> I think you forgot to push your fix

Indeed! Just pushed.

t-bast

I'm getting a match on commitment transactions with a3ea39c, but I'm unable to generate the same HTLC signatures. Can you share how you're generating those signatures? I'm using deterministic schnorr sigs in eclair (not providing any auxrand data to secp256k1), are you by any chance using randomized signatures in lnd, which would make those test vectors non deterministic?

Roasbeef · 2026-03-16T23:28:06Z

@t-bast yeah so lnd uses a version of RFC 6979 for nonce generation. I'll modify the test vectors to use the BIP 340 auxrand version instead.

EDIT: pushed up!
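For reference, the BIP 340 default nonce derivation is deterministic once the auxiliary randomness is fixed, which is what makes reproducible test vectors possible. The sketch below follows the BIP 340 signing steps for the nonce only. As assumptions for illustration: the secret key is taken to be already normalized for an even-y public key, the x-only public key is passed in as bytes rather than derived, and the input values are dummy bytes.

```python
import hashlib

# secp256k1 group order.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP 340 tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


def bip340_nonce(seckey: bytes, pubkey_x: bytes, msg: bytes, aux_rand: bytes) -> int:
    """Nonce scalar k' from the BIP 340 default signing algorithm."""
    # t = seckey XOR hash_aux(aux_rand)
    mask = tagged_hash("BIP0340/aux", aux_rand)
    t = bytes(x ^ y for x, y in zip(seckey, mask))
    # rand = hash_nonce(t || pubkey || msg), then reduce mod the group order
    rand = tagged_hash("BIP0340/nonce", t + pubkey_x + msg)
    return int.from_bytes(rand, "big") % N


# Dummy inputs, not real keys or sighashes.
seckey = bytes.fromhex("01" * 32)
pubkey_x = bytes.fromhex("02" * 32)
sighash = bytes.fromhex("03" * 32)

zero_aux = bytes(32)
k1 = bip340_nonce(seckey, pubkey_x, sighash, zero_aux)
k2 = bip340_nonce(seckey, pubkey_x, sighash, zero_aux)
k3 = bip340_nonce(seckey, pubkey_x, sighash, bytes.fromhex("ff" * 32))
assert k1 == k2  # same aux_rand: same nonce, hence same signature
assert k1 != k3  # different aux_rand: different nonce
```

Two implementations only produce byte-identical signatures if they agree on both the nonce function and the aux_rand value.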

t-bast

ACK 13b2110, the test vectors match what eclair generates in ACINQ/eclair#3144 🎉

I think this is ready to go 🚀

Roasbeef · 2026-04-06T20:16:42Z

lnd branch for interop: lightningnetwork/lnd#9985

sstone · 2026-04-08T09:13:29Z

> lnd branch for interop: lightningnetwork/lnd#9985

All basic functional interop tests pass between lightningnetwork/lnd#9985 at lightningnetwork/lnd@db32cbb and ACINQ/eclair#3144 at ACINQ/eclair@535ea49:

opening channels (initiated by eclair or lnd)
sending and receiving payments
mutual closing of channels (initiated by eclair or lnd)

t-bast · 2026-04-08T09:15:02Z

Nice! We should be good to go to merge this PR once the commits are squashed then!

t-bast · 2026-04-17T14:04:12Z

@Roasbeef now that lightningnetwork/lnd#9985 has been merged, can you squash and merge this PR?

In this extension BOLT, we specify the initial flavor of taproot channels to be deployed. This channel type uses musig2 aggregated keys and signatures for the funding output, making it a normal single signature key path spend. All outputs are then updated to use P2TR (segwit v1) outputs. The co-op close process has been simplified to always terminate, and the co-op close transaction now also flags RBF to make way for future schemes that enable the process to be restarted, which enables co-op close fee bumping. A top-level key spend is used for the revocation of HTLC outputs. The revocation for the local output uses a script path to ensure that information needed to sweep the anchors by 3rd parties is always revealed on chain.

ysangkok · 2026-05-04T20:10:41Z

bolt-simple-taproot.md now says "Extension BOLT XX: Simple Taproot Channels". Wasn't this supposed to get a BOLT number assigned?

Use the official feature bit and name for taproot channels and the corresponding channel types. Activate taproot channels support by default (without support for announcing such channels yet). See lightning/bolts#995

cdecker reviewed May 30, 2022
