
The principle of just-in-time xonly #32

Closed
LLFourn opened this issue Jun 29, 2022 · 21 comments


@LLFourn

LLFourn commented Jun 29, 2022

I believe that when specifying Bitcoin-related protocols we should only use xonly keys at the last moment, when we need to produce something that goes on chain, e.g. a key or signature. In retrospect xonly keys may have been a mistake, but I think if we take this approach we can minimise the complexity they introduce and maybe make that 1 byte we save worth it!

Of course this is not how the current spec is written. It takes xonly keys as input to key aggregation and outputs a key in a kind of hybrid state that can be used to sign with as an xonly key but can also be tweaked as an ordinary key or xonly key. Applying the JIT xonly approach here means taking a two-stage key generation approach. First, the inputs to key aggregation would be ordinary keys and the output would be an ordinary key. This ordinary key could then have bip32 tweaks applied to it. Eventually it could be turned into an xonly key that could have xonly tweaks applied to it (no bip32 tweaks) and could be used to start a signing session.
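As a sketch of what this two-stage keygen could look like, here is a minimal state-machine-style API (hypothetical class and method names, not part of any spec) that enforces the ordering: plain (bip32-style) tweaks are only possible before the xonly conversion, and only xonly tweaks and signing after it:

```python
# Illustrative sketch only: a two-stage keygen context enforcing "JIT xonly".
# All names here (OrdinaryAggKey, XOnlyAggKey, apply_plain_tweak, ...) are
# hypothetical, chosen just to show the intended state transitions.

class OrdinaryAggKey:
    """Stage 1: ordinary (33-byte) keys in, ordinary aggregate key out."""

    def __init__(self, pubkeys33):
        self.pubkeys = pubkeys33
        self.tweaks = []

    def apply_plain_tweak(self, t):
        # bip32-style tweak; the key stays ordinary
        self.tweaks.append(("plain", t))
        return self

    def to_xonly(self):
        # one-way transition into stage 2; the single negation flag (gacc)
        # would be initialized here
        return XOnlyAggKey(self)

class XOnlyAggKey:
    """Stage 2: only xonly tweaks and signing are allowed."""

    def __init__(self, inner):
        self.inner = inner
        self.tweaks = []

    def apply_xonly_tweak(self, t):
        self.tweaks.append(("xonly", t))
        return self

    def apply_plain_tweak(self, t):
        raise TypeError("plain (bip32) tweaks are not allowed after the xonly conversion")
```

The type system then does the negation bookkeeping for the caller: a bip32 tweak after the xonly conversion is a compile-time/runtime error rather than a subtle negation bug.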

Other than the complexity of all the negation accounting that has to go on with the way the spec is currently written, what concerns me most is that people are going to want to do things like "nested MuSig" where each MuSig key may be another MuSig key (or even a FROST key). I can't clearly see how badly this will compound the problem, but on the other hand I didn't expect xonly keys to introduce this much complexity in the first place.

So here's how I see pros and cons:

Pros:

  • Less likely to be sprawling complexity with nested schemes or composition with other schemes.
  • Only needs a single gacc negation flag (instead of the xor of three) applied at signing time. (see below)
  • BIP32 applied to keygen input and output is not ambiguous. Since MuSig creates an xonly key to sign with, how should I apply BIP32 to it? I was thinking I should just put 02 in front of the xonly key. The correct answer is that there is GetBytes33, which returns the full internal ordinary key. It feels like MuSig can't make up its mind whether it works with xonly keys or not.
  • Set the precedent for the JIT xonly approach early so others can avoid mistakenly thinking that xonly inputs to protocols are what the experts think is the way to go.

Cons:

  • Introduces a two-stage MuSig key state
  • Breaks association of xonly as the public key for all schnorr signatures (although FROST will do this anyway so this is a pro to me)
  • xonly keys would have to be converted to ordinary keys before being used as inputs in MuSig. You might need a different keypair type for MuSig than for Schnorr.

For me the pros overwhelm the cons. Of course I have left out the "con" that this will require some work to rewrite and re-engineer the spec. I had planned to implement all of this in the python spec and make a PR but I was informed that I should make this suggestion as soon as possible because it was looking to be finalised soon.

Less negation variables

It is obvious why this change would get rid of the need for the gp/has_even_y(P) flag. The less obvious result is how gacc and g can be merged into one. This works because you can set the initial value of gacc based on has_even_y(Q) when you go from the ordinary aggregate key to xonly. In fact you can do this simplification without changing the type of the input keys, so this suggestion is somewhat orthogonal. For example, here I implement the transition from an ordinary aggregate key to a BIP340 xonly aggregate key. This still uses xonly input keys. Though since this change implies a multi-stage keygen context anyway, it makes sense that these things go together.
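At the scalar level, the point is that the ordinary-to-xonly transition is the single place where the negation flag needs to be initialized. A toy check (minimal, non-constant-time secp256k1 arithmetic and an arbitrary secret; illustrative only, not spec code):

```python
# Toy secp256k1 arithmetic, for illustration only (not constant time).
FP = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(a, b):
    if a is None: return b
    if b is None: return a
    if a[0] == b[0] and (a[1] + b[1]) % FP == 0: return None  # P + (-P) = infinity
    lam = ((3 * a[0] * a[0]) * pow(2 * a[1], -1, FP) if a == b
           else (b[1] - a[1]) * pow(b[0] - a[0], -1, FP)) % FP
    x3 = (lam * lam - a[0] - b[0]) % FP
    return (x3, (lam * (a[0] - x3) - a[1]) % FP)

def ec_mul(k, pt):
    r = None
    while k:
        if k & 1: r = ec_add(r, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return r

def has_even_y(pt): return pt[1] % 2 == 0

q = 12345          # arbitrary toy secret behind some ordinary aggregate key
Q = ec_mul(q, G)   # the "ordinary" aggregate key

# ordinary -> xonly transition: the single place the negation flag is set
gacc = 1 if has_even_y(Q) else N - 1

# gacc*q is then the secret matching the xonly key Q.x with even y
assert has_even_y(ec_mul(gacc * q % N, G))
assert ec_mul(gacc * q % N, G)[0] == Q[0]
```

Every later xonly tweak can then fold its own sign flip into this one accumulated flag, which is why gacc and g collapse into one variable.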

Let me know what you think and whether it's worth pursuing the python implementation. Sorry again for suggesting big changes this late in the process.

@niftynei

niftynei commented Jul 5, 2022

fwiw "JIT xonly" is what I would have naively expected any signing implementation to do for all the reasons you've listed. Context: I've spent all of a few hours now (unsuccessfully) attempting to rewire Jesse Posner's python FROST impl to be xonly. JIT xonly is how I'm planning to do it (ordinary keys as inputs, xonly-ness only expressed as a property of the aggregated participant set).

I lowkey expected the other implementations (Nick Farrow's Rust PR/Posner's libsecp frost branch) would do it this way; I found it pretty confusing that they use x-only keys at the participant subkey level as well.

xonly keys may have been a mistake

nodding

Apologies if I've gotten something a bit wrong here, I'm not exactly an expert at key manipulation.

@jonasnick
Owner

Thanks for the detailed feedback @LLFourn. "Just-in-time xonly" is what we set out to do, but then realized:

  1. If some protocols only use x-only public keys for one reason or another, then it would be nice if we could support aggregating x-only keys. This is a violation of the JIT x-only principle. But it would be useful for protocols that pull public keys directly from the blockchain and aggregate them. Of course, I don't know if such a protocol would ever make sense. Also, one can imagine protocols that start their life with plain BIP-340 signing and therefore only send x-only keys around. In a new protocol version, nodes want to do MuSig but can't do that easily without changing the protocol to exchange full public keys. Thus, violating the principle is not unjustified. It's only a tiny increase in complexity (in the spec, not for users). So we did that.
  2. Applying plain tweaking after x-only tweaking is not something I can imagine being useful. But why artificially disallow it when we don't have to? It didn't even seem more complex (although Make keygen a two stage process #34 shows that allowing this adds minimal complexity to the negation logic in the spec), and therefore we did not disallow it.

Con:

  • xonly keys would have to be converted to ordinary keys before being used as inputs in MuSig. You might need a different keypair type for MuSig than for Schnorr.

In libsecp, we don't have an API for converting x-only keys into ordinary keys (see e.g. issue bitcoin-core/secp256k1#1097). That's mainly because there was no need for it so far, and x-only keys appear more understandable if only one direction (ordinary -> x-only) is possible. Besides allowing the key aggregator to convert x-only keys to ordinary keys, the signer must also negate the signing key if necessary before providing the key to the MuSig signing function. If we want to describe this "wrapper" algorithm somewhere in the BIP anyway, then why not directly include it in the main spec, i.e., allow x-only keys as input to KeyAgg?

I do understand the appeal of your proposal. But also, I can see someone complaining the week after we implement the changes that their favorite (niche) use case is broken or that keygen needlessly became more complex.

@real-or-random
Collaborator

Sigh yeah, x-only keys save a byte on chain, but it seems the price we pay is high engineering complexity. I think it's fair to say that no one had really anticipated this [1].

The purpose of x-only keys is really just to save a byte. It's a tiny optimization, taking into account that space on the blockchain is really expensive. But it's only expensive there. In other places, e.g., normal networks, we can typically afford a byte without even thinking about it. With that in mind, the idea that the sign byte should be thrown away only just before hitting the chain makes sense, and this has been my thinking for a while now. So I tend to agree with this approach philosophically. Practically speaking, keeping the sign bit is just helpful in a variety of settings. First of all, it's just more convenient to keep the sign along with the public key instead of keeping a separate variable that tracks whether you need to negate the secret key. Moreover, keeping the sign is algebraically the right thing to do, and you may run into problems if you want to do arithmetic with your keys [2].

On the engineering side of things, @jonasnick lists good reasons why we ended up with what we have. It boils down to these two aspects:

  1. taking x-only keys as an input
  2. supporting plain and x-only tweaks in an arbitrary order

Both of these provide the caller with more flexibility and are almost for free within the spec. The spec is just minimally more complex but the caller does not need to care about the complexity. MuSig will handle all the complexity internally.

So for me, the discussion here boils down to this question:

  • Do we want to have an "all bells and whistles" BIP that takes every possible use case into account because, as @jonasnick says, someone may ask for it in a week?
  • Or do we want to have an opinionated BIP that advocates the idea of JIT x-only and as @LLFourn says, "set the precedent for the JIT xonly approach early so others can avoid mistakenly thinking that xonly inputs to protocols is what the experts think is the way to go"?

I tend towards the latter approach: have an opinionated spec. If we take the "all bells and whistles" approach instead, we could still try to educate people about JIT. But from my experience, spec and API decisions (e.g., in secp256k1) have much more influence on what people will do than any prose text (no matter whether it's a mailing list post, an informational BIP, or API docs) [3].

For example, I suspect that much of the current perception of the purpose of x-only keys stems from the API in BIP340 and in secp256k1. All the algorithms take x-only keys, so this gives the impression that "x-only" is the new shiny thing that should be used. I still believe that BIP340 is a very good conservative signature scheme: As a signature scheme, it specifies public key generation and serialization, and if you use it standalone just as it is in BIP340, this is rock solid. But BIP340 tends to ignore that protocols often need more than a standalone signature scheme, and this requires advanced key manipulation, e.g., key derivation, tweaking and key aggregation. Though the "clean room" approach of BIP340 has its merits (I believe that consensus rules should be timeless and somewhat independent of current trends in the ecosystem), I admit it appears strange in hindsight because i) BIP32 tweaking is in common use and BIP340 acknowledges this, ii) BIP340 was proposed together with BIP341 and BIP342 as part of Taproot, and at least BIP341 supports tweaking.

As a side note, for me the discussion is 90% about item 1 ("taking x-only keys as an input"). When it comes to item 2, I don't think that disallowing or allowing plain tweaks after x-only tweaks will make much of a difference or change how engineers and protocol designers perceive x-only keys. If we disallow it now, it could easily be added later if someone needs it.

No matter what we decide on, I really believe we should invest the time now and get this right. It's not too late yet, but I fear once MuSig is in production, it will be.

Another thing we should do is to educate people. This has two aspects.

  • First, they should learn about the hard facts: We can't get rid of all of the complexity that x-only keys and tweaking introduced, but in the end, I don't think it's even that complicated. A simple way to approach it is: Look at the entire chain of key derivation (BIP32/plain tweaking, MuSig KeyAgg, Taproot/x-only tweaking, whatever you imagine). Whenever you negate a public key (which you do because you encounter an odd y but need an xonly key with even y), negate the secret key.
  • Second, we should provide guidance on how to design protocols. Should protocol designers use x-only keys in their protocols or should they use compressed keys? Good cryptographic practice recommends being conservative, and if all you ever need is a rock solid signature scheme, and you don't care about tweaking and key aggregation, then use BIP340. (One of our goals in BIP340 was to provide a good signature scheme that can be used beyond Bitcoin consensus.) But if you're more reckless, and I tend to think that this applies to almost every protocol in this quickly evolving and innovative space, you should typically use compressed keys simply because they offer more flexibility and avoid some of the complexity that comes with advanced key manipulation. It's a bit like when you specify a version field for a new network protocol. You'd probably take at least a 32-bit unsigned integer for simplicity. You probably don't expect that you'll ever need more than 10 versions, but you also don't care about "wasting" 3 bytes (or actually 3.5 bytes = 28 bits). In the same way you don't care about sending 33-byte keys instead of 32-byte keys. Yes, this is not a perfect analogy: you'd need to convert 33 bytes before you pass them to signing or verification, but that's just a single function call. And this single function call may make your life much easier in other places or in the future.
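The negation rule from the first bullet can be checked end to end on a toy derivation chain (minimal, non-constant-time secp256k1 arithmetic and arbitrary toy values; illustrative only): every time the public key is negated for evenness, the tracked secret is negated too, and the chain stays consistent.

```python
# Toy secp256k1 arithmetic, for illustration only (not constant time).
FP = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(a, b):
    if a is None: return b
    if b is None: return a
    if a[0] == b[0] and (a[1] + b[1]) % FP == 0: return None  # P + (-P) = infinity
    lam = ((3 * a[0] * a[0]) * pow(2 * a[1], -1, FP) if a == b
           else (b[1] - a[1]) * pow(b[0] - a[0], -1, FP)) % FP
    x3 = (lam * lam - a[0] - b[0]) % FP
    return (x3, (lam * (a[0] - x3) - a[1]) % FP)

def ec_mul(k, pt):
    r = None
    while k:
        if k & 1: r = ec_add(r, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return r

def has_even_y(pt): return pt[1] % 2 == 0

x = 7919                        # arbitrary toy secret
t_plain, t_xonly = 1111, 2222   # arbitrary tweaks

# plain (BIP32-style) tweak: no parity handling needed
d = (x + t_plain) % N
P = ec_mul(d, G)

# xonly (Taproot-style) tweak: negate the secret whenever the public key is
# negated for evenness, then tweak; the result's x-coordinate is the key
if not has_even_y(P):
    d = N - d
d = (d + t_xonly) % N
Q = ec_mul(d, G)
xonly_key = Q[0]

# signing with xonly_key would negate d one last time if Q has odd y
d_sign = d if has_even_y(Q) else N - d
assert ec_mul(d_sign, G)[0] == xonly_key
assert has_even_y(ec_mul(d_sign, G))
```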

No matter what the outcome here is, I'll volunteer to write a mailing list post or blog post. If there's real interest and others would like to join, this could be turned into an informational BIP.

[1] It's so bad, we can't even agree on a spelling. BIP340 spells it "X-only", some people write "x-only", and some write "xonly", probably in order to save a byte.
[2] See @ajtowns' TAPLEAF_UPDATE_VERIFY idea. Though I believe there's a nicer way to work around the problem described there. I'll reply to the email soon.
[3] For example Fedimint, where @elsirion says "Federation-internal keys that are used for schnorr sigs and MuSig2, they are always kept as x-only since that avoids conversions (MuSig takes x-only keys as input)."

@instagibbs

instagibbs commented Jul 7, 2022

One comment I have is that PSBT taproot fields only deal with x-only pubkeys, which may also frustrate a standardization of which keys are used at what stage, since PSBT is implicitly a "not yet on chain" format.

As-is this fits with @jonasnick saying:

But it would be useful for protocols that pull public keys directly from the blockchain and aggregate them. Of course, I don't know if such a protocol would ever make sense.

which means you can use PSBT to do this aggregation.

@LLFourn
Author

LLFourn commented Jul 8, 2022

Thanks @jonasnick for the response. It was very helpful to understand your thinking.
The crucial point for me is here:

In libsecp, we don't have an API for converting x-only keys into ordinary keys (see e.g. issue bitcoin-core/secp256k1#1097). That's mainly because there was no need for it so far, and x-only keys appear more understandable if only one direction (ordinary -> x-only) is possible.

I want to make the case that x-only -> ordinary has to be made easy and used liberally. If we only have ordinary -> x-only then it necessarily means that people will get "stuck" in x-only land, because you can only go in but never get out, like a roach motel. But x-only land sucks because it's not a cyclic group, so you need to do all this negation tracking to maintain the isomorphism to the scalars you are tracking. I may be biased, but I think crypto engineers have a hard enough time getting things right without this problem!

From the user's perspective (the most important one) what you said makes total sense to me. Torturing the implementor on behalf of the user is often the right way to go, especially in crypto engineering. In general, pushing the problem down to the crypto layer allows applications to be simpler and make fewer mistakes. I think that this case is an exception to that rule because:

  • (i) the degree of torture is significant. Implementing prime-order group crypto where the inputs and outputs can't be half the group elements is really annoying, while the complexity of calling x-only -> ordinary is not much to ask and seems hard to get wrong.
  • (ii) it will cause the proliferation of x-only which further exacerbates (i). If MuSig uses x-only keys because someone might want to use BIP340 keys as input then also everyone who might want to use MuSig key aggregation or BIP340 will now also have to use x-only keys as input and so on and so forth.
  • (iii) When x-only proliferates it compounds the complexity of composing those x-only schemes together. Take for example the "nested" MuSig key aggregation idea. It seems simple with ordinary key inputs and outputs (and two-stage keygen): just multiply the schnorr challenge by gacc of the outermost (x-only) key before doing anything and proceed as if x-only didn't exist. I fear that things will be far worse with the spec as it currently is, though I could be wrong.1

Besides allowing the key aggregator to convert x-only keys to ordinary keys, the signer must also negate the signing key if necessary before providing the key to the MuSig signing function. If we want to describe this "wrapper" algorithm somewhere in the BIP anyway, then why not directly include it in the main spec, i.e., allow x-only keys as input to KeyAgg?

Ah ok, now I recall that BIP340 doesn't define a kind of "keypair" structure or a function that independently allows you to negate the scalar if its image does not have an even y. Indeed this would at least require describing that algorithm so that people with secret keys in the form that BIP340 specifies can convert them to "standalone" xonly secret keys (i.e. a keypair). I think we have to embrace the idea of making as many conversion functions as it takes so people can leave x-only land as soon as possible.

@real-or-random wrote:

The purpose of x-only keys are really just to save a byte. It's a tiny optimization, taking into account that space on the blockchain is really expensive. But it's only expensive there. In other places, e.g., normal networks, we can typically afford a byte without even thinking about it. With that in mind, the idea that the sign byte should be thrown away only just before hitting the chain makes sense, and this has been my thinking for a while now. So I tend to agree with this approach philosophically. Practically speaking, keeping the sign bit is just helpful in a variety of settings. First of all, it's just more convenient to keep the sign along with the public key instead of keeping a separate variable that track if you need to negate the secret key. Moreover, keeping the sign is algebraically the right thing to and you may run into problems if you want to do arithmetic with your keys [2].

After thinking along the same lines I came to the conclusion that x-only on-chain wasn't a mistake. Taking x-only keys in BIP340 was the mistake. There would be almost no issue if, when Bitcoin deserialized an x-only key from the network, it simply prepended a 0x02 byte to it before doing anything with it. The exception might be taproot internal keys, where we should have found a bit somewhere to store y-evenness to allow the internal key to be ordinary.

First, they should learn about the hard facts: We can't get rid of all of the complexity that x-only keys and tweaking introduced, but in the end, I don't think it's even that complicated. A simple way to approach it is: Look at the entire chain of key derivation (BIP32/plain tweaking, MuSig KeyAgg, Taproot/x-only tweaking, whatever you imagine). Whenever you negate a public key (which you do because you encounter an odd y but need an xonly key with even y), negate the secret key.

I agree that if we can make the collateral damage of x-only just "follow this rule", then we've gotten out relatively unscathed. The main reason I like two-stage keygen is that you can express this principle clearly in code. For example, the code for applying an xonly tweak becomes:

g = 1 if has_even_y(Q_) else n - 1   # Q_ is the tweaked point; negate if it has odd y
gacc_ = g * gacc % n                 # fold the sign flip into the accumulated key flag
tacc_ = g * (tacc + t) % n           # ...and into the accumulated tweak

So when we tweak the key to get an xonly key, we also flip the sign of the pre-image components we are tracking (tweaks and key) if needed. My main problem with allowing ordinary tweaks after x-only tweaks is that it means you cannot stick closely to the principle (or at least my imagination of how it should be applied).

No matter what the outcome here is, I'll volunteer to write a mailing list post or blog post. If there's real interest and others would like to join, this could be turned into an informational BIP.

I think this would be really useful. Thanks!

[1] It's so bad, we can't even agree on a spelling. BIP340 spells it "X-only", some people write "x-only", and some write "xonly", probably in order to save a byte.

It looks like I have used at least two variations so far in this post.

@instagibbs wrote:

One comment I have is that PSBT taproot fields only deal with x-only pubkeys, which may also frustrate a standardization of which keys are used at what stage, since PSBT is implicitly a "not yet on chain" format.

Is there any field in PSBTs thus far that you think violates the JIT X-only principle? I think all the TR fields specified so far are for keys that must be x-only for consensus. PSBTs will need musig and frost fields, which I expect will handle their keys and sigs separately, so there may be no problem here.

Footnotes

  1. I'll try to implement nested MuSig with one or both schemes over the next week to see if I'm correct. I think there is no security proof for nested MuSig, but I guess we all expect that it will be secure?

@rustyrussell

I had assumed x-only maximalism: eschewing a byte which holds almost no cryptographic strength and has a weird representation (02 vs 03, really?). Avoiding ever transporting it would be a long-term win!

This assumption that x-only was The Future had two results on my work:

  1. bolt12 uses x-only pubkeys for node ids and payer keys.
  2. I implemented ECDH using x-only pubkeys (for centurymetadata.org).

Details:

  1. bolt12 node ids meant that we had to look at the public node graph and see how to get the compressed key (02 or 03), with the assumption that new hidden nodes could simply be 02 (or try both...). payer-keys are simply used to create signatures to prove various things, which just seems to work.

  2. ECDH(secret, x-only-pubkey): if secret would produce an 03 key, I negated it. But I also only did a SHA256 of the x-only part, not the compressed key. In a future world with x-only pubkeys everywhere and flying cars, explaining that you have to prefix 02 before doing SHA256 would be just weird, right?

I had assumed Lightning and the world would simply migrate to x-only, and maybe half the people would negate their secrets and be done.

But if x-only keys are a weird bitcoin-only optimization, this approach is wrong! Oops. It would be better, if so, to fix this soon.

@ajtowns

ajtowns commented Jul 11, 2022

  1. If some protocols only use x-only public keys for one reason or another, then it would be nice if we could support aggregating x-only keys.

You can trivially convert an x-only key to a compressed key by prefixing with 02 though? That gives you point(xonly)==cpoint(compressed), so you prefix with 02 if sending over the network, call cpoint() with data from the network, and just call point() if it's your own key, and everything should work consistently. Presumably if you've generated that x-only key as -secret*G, then you already remember somewhere that you've done that negation (or will figure it out next time you use the private key)...
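A sketch of the point(xonly) == cpoint(02||xonly) identity: lift_x below follows the BIP340 definition (the curve point with the given x and even y), and cpoint is a hypothetical 33-byte compressed-point decoder; illustrative only.

```python
# Illustrative sketch; not production code.
FP = 2**256 - 2**32 - 977  # secp256k1 field prime

def lift_x(x):
    # BIP340-style lift_x: the curve point with x-coordinate x and even y,
    # or None if x is not on the curve
    y_sq = (pow(x, 3, FP) + 7) % FP
    y = pow(y_sq, (FP + 1) // 4, FP)
    if y * y % FP != y_sq:
        return None
    return (x, y if y % 2 == 0 else FP - y)

def cpoint(b33):
    # decode a 33-byte compressed point (0x02 = even y, 0x03 = odd y)
    pt = lift_x(int.from_bytes(b33[1:], "big"))
    if pt is None:
        return None
    return pt if b33[0] == 2 else (pt[0], FP - pt[1])

# example: the generator's x-coordinate as an x-only key
xonly = bytes.fromhex(
    "79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798")
assert cpoint(b"\x02" + xonly) == lift_x(int.from_bytes(xonly, "big"))
```

So the 02-prefix convention really is just "interpret the x-only key as the even-y point", the same convention BIP340 already uses internally.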

For musig2, I would have thought the simplest way to spec would be:

  • input/transmit public keys as compressed points
  • choose a tweak, and generate an aggregate key as a point
  • generate/transmit the nonces as compressed points
  • do a bip340-compatible signature for the aggregate key, and either calculate s = r + H(R,P,m)*p or s = -r + H(R,P,m)*p depending on whether exactly one of the aggregate key or aggregate nonce turned out to be odd, and, if the aggregate key was odd, negate s when aggregating the partial sigs.
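The parity rule in the last bullet can be sanity-checked in the single-signer case (minimal, non-constant-time secp256k1 arithmetic and a BIP340-style tagged challenge hash; illustrative only, not a MuSig implementation): negate r iff exactly one of the key and nonce points is odd, then negate s iff the key is odd, and the result passes BIP340-style verification.

```python
import hashlib

# Toy secp256k1 arithmetic, for illustration only (not constant time).
FP = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(a, b):
    if a is None: return b
    if b is None: return a
    if a[0] == b[0] and (a[1] + b[1]) % FP == 0: return None  # P + (-P) = infinity
    lam = ((3 * a[0] * a[0]) * pow(2 * a[1], -1, FP) if a == b
           else (b[1] - a[1]) * pow(b[0] - a[0], -1, FP)) % FP
    x3 = (lam * lam - a[0] - b[0]) % FP
    return (x3, (lam * (a[0] - x3) - a[1]) % FP)

def ec_mul(k, pt):
    r = None
    while k:
        if k & 1: r = ec_add(r, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return r

def has_even_y(pt): return pt[1] % 2 == 0

def challenge(Rx, Px, msg):
    # BIP340-style tagged challenge hash over the x-only coordinates
    tag = hashlib.sha256(b"BIP0340/challenge").digest()
    data = tag + tag + Rx.to_bytes(32, "big") + Px.to_bytes(32, "big") + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

p_sec, r_sec = 31337, 271828        # arbitrary toy secret key and nonce
Pt, Rt = ec_mul(p_sec, G), ec_mul(r_sec, G)
e = challenge(Rt[0], Pt[0], b"hello")

# negate r iff exactly one of the key and nonce points is odd...
r = r_sec if has_even_y(Pt) == has_even_y(Rt) else N - r_sec
s = (r + e * p_sec) % N
# ...and negate s iff the key is odd ("when aggregating the partial sigs")
if not has_even_y(Pt):
    s = N - s

# BIP340-style verification: R' = s*G - e*P_even must have even y and Rt's x
P_even = Pt if has_even_y(Pt) else (Pt[0], FP - Pt[1])
R_prime = ec_add(ec_mul(s, G), ec_mul((N - e) % N, P_even))
assert R_prime is not None and has_even_y(R_prime) and R_prime[0] == Rt[0]
```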

You don't have x-only tweaks in this model, just tweaks. For taproot, if P is odd, you want Q = -P + H(P.x, m)G which is -Q = P + -H(P.x, m)G which just means you're negating the tweak before putting it into musig, and negating Q (before possibly negating Q again) after getting it out. (Alternatively, just allow passing in g=1 or g=-1 directly, letting the caller decide if it wants to make that conditional on has_even_y(Q) or not, which I think regains the "plain/xonly tweaks in any order" feature)

(I don't think there's any actual value to the ability to have odd nonces here; but seems better to be consistent with the pubkeys?)

  • Second, we should provide guidance on how to design protocols. Should protocol designers use x-only keys in their protocols or should they use compressed keys?

My feeling, that I can't entirely justify, is that anytime that you want to do maths on the points, you want them to be points; but if you're only doing a signature/lookup then xonly is fine. That's not "JIT x-only" though -- the way taproot works by starting with an x-only point, then revealing a parity bit works fine too. The problem that TLUV hit was (arguably) that we only added one parity bit, when there were actually two x-only points we were trying to do maths with when validating Q = P + H(P,M)*G.

For the maths Rusty's doing with ECDH, I don't think doing things as x-only or not makes any difference at any point -- the x-coord of x*Y, -x*Y, x*-Y or -x*-Y are all the same -- so doing all the points as x-only seems sensible. So maybe it's just something that depends on the details of the protocol being designed?
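The x-coordinate invariance claimed here is easy to check directly (toy, non-constant-time secp256k1 arithmetic with arbitrary values; illustrative only):

```python
# Toy secp256k1 arithmetic, for illustration only (not constant time).
FP = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(a, b):
    if a is None: return b
    if b is None: return a
    if a[0] == b[0] and (a[1] + b[1]) % FP == 0: return None  # P + (-P) = infinity
    lam = ((3 * a[0] * a[0]) * pow(2 * a[1], -1, FP) if a == b
           else (b[1] - a[1]) * pow(b[0] - a[0], -1, FP)) % FP
    x3 = (lam * lam - a[0] - b[0]) % FP
    return (x3, (lam * (a[0] - x3) - a[1]) % FP)

def ec_mul(k, pt):
    r = None
    while k:
        if k & 1: r = ec_add(r, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return r

x = 4242               # arbitrary toy ECDH secret
Y = ec_mul(97, G)      # arbitrary peer public key
neg_Y = (Y[0], FP - Y[1])

# the x-coordinate of the shared point is unchanged under any negation
shared = ec_mul(x, Y)[0]
assert ec_mul(N - x, Y)[0] == shared      # -x *  Y
assert ec_mul(x, neg_Y)[0] == shared      #  x * -Y
assert ec_mul(N - x, neg_Y)[0] == shared  # -x * -Y
```

This is exactly why x-only ECDH needs no negation bookkeeping: negating either input only negates the shared point, and its x-coordinate survives.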

@jonasnick
Owner

Since we're discussing whether to recommend x-only public keys or ordinary public keys for usage outside of Bitcoin consensus, it's worth noting that the latter has an unfortunate downside. If the verifier drops the first byte of the ordinary public key and runs BIP-340 verification, then signatures are valid with two different public keys (namely 02||x and 03||x). It's unclear to me where this would be an actual problem, but it's something to mention in @real-or-random's post.

Assuming that we switch to a JIT x-only world (at least for protocols that want more than BIP-340), a clarifying mailing list post / BIP could contain:

  1. A specification of the x-only -> ordinary conversion and an explanation of when (not) to use it
  2. A specification of an algorithm that produces a new secret key from a given secret key x as follows:
    f(x):
    * Let P = x⋅G
    * Let x' = x if has_even_y(P), otherwise let x' = n - x
    * Return x'
    

Then, crypto BIPs that work with ordinary public keys can refer to this document to cover the "parties only exchanged x-only keys" use case. Besides MuSig or FROST, one could imagine a ring signature BIP that only accepts ordinary keys but is intended for ring signatures over the UTXO set.
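The f(x) normalization can be sketched in code (minimal, non-constant-time secp256k1 arithmetic; note that the "otherwise" branch negates the secret mod n, so the resulting public key has even y; illustrative only):

```python
# Toy secp256k1 arithmetic, for illustration only (not constant time).
FP = 2**256 - 2**32 - 977  # field prime
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(a, b):
    if a is None: return b
    if b is None: return a
    if a[0] == b[0] and (a[1] + b[1]) % FP == 0: return None  # P + (-P) = infinity
    lam = ((3 * a[0] * a[0]) * pow(2 * a[1], -1, FP) if a == b
           else (b[1] - a[1]) * pow(b[0] - a[0], -1, FP)) % FP
    x3 = (lam * lam - a[0] - b[0]) % FP
    return (x3, (lam * (a[0] - x3) - a[1]) % FP)

def ec_mul(k, pt):
    r = None
    while k:
        if k & 1: r = ec_add(r, pt)
        pt = ec_add(pt, pt)
        k >>= 1
    return r

def has_even_y(pt): return pt[1] % 2 == 0

def f(x):
    # normalize a secret key so that its public key has even y
    Pt = ec_mul(x, G)
    return x if has_even_y(Pt) else N - x

x = 987654321  # arbitrary toy secret
assert has_even_y(ec_mul(f(x), G))        # normalized key has even y
assert ec_mul(f(x), G)[0] == ec_mul(x, G)[0]  # same x-only public key
assert f(f(x)) == f(x)                    # idempotent
```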

Another argument for preferring ordinary public keys over x-only keys (e.g. as input to key aggregation) is that crypto BIPs will be more similar to, or even compatible with, specifications in the non-bitcoin world. For example, if BIP-MuSig2 is not changed, a potential BIP-FROST would have to decide whether it wants to be closer to the FROST IETF spec and potentially share code, or closer to BIP-MuSig2 for consistency in the Bitcoin world. JIT x-only can improve that situation.

@LLFourn

There would be almost no issue if, when Bitcoin deserialized an x-only key from the network, it simply prepended a 0x02 byte to it before doing anything with it.

If I understand your proposal correctly, this would require two modes of signing. If has_even_y(P) is false for point P = x*G and secret key x and you create an on-chain signature, you need to negate your secret key. If you create an off-chain signature, you don't. That doesn't seem ideal either.

@ajtowns

You don't have x-only tweaks in this model, just tweaks.

I cannot follow. What exactly is the advantage of this?

@real-or-random
Collaborator

Since we're discussing whether to recommend x-only public keys or ordinary public keys for usage outside of Bitcoin consensus, it's worth noting that the latter has an unfortunate downside. If the verifier drops the first byte of the ordinary public key and runs BIP-340 verification, then signatures are valid with two different public keys (namely 02||x and 03||x). It's unclear to me where this would be an actual problem, but it's something to mention in @real-or-random's post.

It's not exactly elegant that each (signature, message) pair is valid under two public keys, and I agree it looks footgunny. But I pretty much doubt it's a problem in practice. ECDSA has the same property (pubkey recovery will give you two pubkeys), and I'm not aware of a single problem that this created.

But if I remember correctly, avoiding this "two valid pubkeys" property was one of the reasons why we designed BIP340 verification to accept x-only keys, instead of accepting ordinary keys and dropping a byte. Maybe accepting ordinary keys would have been the better choice. As @LLFourn pointed out, it may be instructive to think of BIP340 as working with ordinary keys, and simply see x-only as a storage optimization: if the first byte of the public key is not relevant, then why transmit it?

@real-or-random
Collaborator

For the maths Rusty's doing with ECDH, I don't think doing things as x-only or not makes any difference at any point -- the x-coord of x*Y, -x*Y, x*-Y or -x*-Y are all the same -- so doing all the points as x-only seems sensible. So maybe it's just something that depends on the details of the protocol being designed?

Indeed, ECDH is an exception because the situation is a little bit simpler here: If you take only the x-coord of the shared secret point, then the inputs don't really matter. This can even allow for faster algorithms sometimes. (For details, see bitcoin-core/secp256k1#994, but read the entire thread... I needed a few postings to understand the problem...)

I think ECDH is only marginally relevant for this discussion, which is complicated enough just with signatures. But this exception demonstrates that we can only give guidelines to protocol designers and users. Regardless of whether you agree with JIT x-only, there's no absolute rule for when to use ordinary keys and when to use x-only keys. Even drawing a line between "on chain" and "off chain" seems hard in some cases, such as PSBT.

@ajtowns

ajtowns commented Jul 12, 2022

Since we're discussing whether to recommend x-only public keys or ordinary public keys for usage outside of Bitcoin consensus, it's worth noting that the latter has an unfortunate downside. If the verifier drops the first byte of the ordinary public key and runs BIP-340 verification, then signatures are valid with two different public keys (namely 02||x and 03||x). It's unclear to me where this would be an actual problem, but it's something to mention in @real-or-random's post.

When you do a bip340 verification, the "public key" it's validating the signature against is the x-only value, and there's only one of those that will validate -- or at least, that's how I'd view things. You could then say "okay, but there's two possible inputs to musig2-bip340-sign that generate the same pubkey", but then I'd argue the same's true in lots of places: if you encode the pubkey as compressed, that gives you the same result as if you'd encoded it uncompressed; and if you start off with privkey s, you get the same result as if you'd started off with privkey -s.

Another argument for preferring ordinary public keys instead of x-only keys (e.g. as input to key aggregation) is that crypto BIPs will be more similar to or even compatible with specifications in the non-bitcoin world. For example, if BIP-MuSig2 is not changed, a potential BIP-FROST would have to decide whether it wants to be more close to the FROST IETF spec and potentially share code or more close to BIP-MuSig2 for consistency in the Bitcoin world. JIT x-only can improve that situation.

Specs that use JIT x-only would be closer to the security proofs for the algorithm they're speccing as well, perhaps?

You don't have x-only tweaks in this model, just tweaks.

I cannot follow. What exactly is the advantage of this?

Oh, I guess you could keep having x-only tweaks if you wanted, I was more thinking that you could leave it for the musig2 user to decide. The advantage is just that it reduces the special cases you need in the musig2 spec; but if you need to spec it out somewhere else anyway, maybe that isn't an advantage.

@robot-dreams
Copy link
Collaborator

Thanks for the helpful discussion so far. I'll add my current thinking (mostly for future reference):

  • I agree with the benefits of taking ordinary keys as input to key aggregation, in order to avoid having the API implicitly communicate "you should be using xonly keys everywhere".

  • I don't yet agree that we should make it easy to go xonly -> ordinary, as I really do believe ordinary -> xonly is a one way operation. I heard an analogy that going ordinary -> xonly is like going xpub -> ordinary (i.e. dropping the first byte compared to dropping the chain code).

  • Currently, an "xonly tweak" operation consists of (make_even, apply_tweak), which results in internal keys being ordinary keys (so we need a final make_even at the end). In contrast, in @LLFourn's approach, we need an initial make_even to move to the 2nd stage, and then an "xonly tweak" operation consists of (apply_tweak, make_even), which results in internal keys being xonly keys. Would keeping the previous approach where internal keys are ordinary keys, and the final make_even is a separate step, be more in line with the JIT philosophy?

  • I'm somewhat on the fence about the general idea of moving complexity out of the specification and assuming users can keep track of when negation is needed (though this would in fact be the "Worse is Better" approach :-P).

Overall, I think it does make sense to switch to ordinary keys as inputs to key aggregation. If a week later someone complains about a particular use case, that will give us helpful information about what additional guidance would be helpful to provide, and "ordinary keys as input + guidance for xonly edge cases" seems more helpful than "xonly keys as input, implicitly encourage people to use xonly everywhere" (or a crazier idea like "give maximum flexibility to users by allowing BOTH ordinary and xonly keys as input, but recommending ordinary keys by default, and just handle the resulting complexity within the spec").
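The two tweak orderings contrasted above end at the same final x-only key, which a toy sketch can check directly (minimal affine secp256k1 arithmetic, arbitrary tweak values; these are not the actual BIP algorithms):

```python
p = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None  # point at infinity
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def make_even(P):
    # negate P if its y coordinate is odd
    return P if P[1] % 2 == 0 else (P[0], p - P[1])

def tweak_current(P, tweaks):
    # current spec style: (make_even, apply_tweak) per tweak, internal keys
    # stay ordinary, one final make_even at the very end
    for t in tweaks:
        P = add(make_even(P), mul(t, G))
    return make_even(P)

def tweak_two_stage(P, tweaks):
    # two-stage style: make_even once up front, then (apply_tweak, make_even),
    # so internal keys are always x-only (even y)
    P = make_even(P)
    for t in tweaks:
        P = make_even(add(P, mul(t, G)))
    return P

P0 = mul(31415, G)  # toy starting key
assert tweak_current(P0, [7, 11, 13]) == tweak_two_stage(P0, [7, 11, 13])
```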

@Roasbeef
Copy link

A bit late in the discussion here (kinda having trouble following the discussion at this point tbh), but just wanted to provide my 2 sats. Just for reference we (Lightning Labs) have so far integrated musig2 into:

  • lnd itself: experimental API that exposes musig2 sessions
  • Pool (to be deployed): all accounts will use aggregated keys between them and the auctioneer with musig2 used for the happy path
  • Loop: new sub-swap type that uses P2TR and aggregated keys, musig2 used as a happy path to make the swaps cheaper and blend in more

In the BOLT spec draft for an initial version of taproot-enabled channels, "ordinary" pubkeys are used everywhere (most of the existing open_channel+accept_channel messages stay the same w/ the new nonce fields added), with both sides implicitly mapping to an x-only key when aggregation is needed. I have some draft code for this that isn't fully functional yet, but I haven't run into any major snags w/ this approach.

W.r.t the "added complexity" of the accumulator values in musig2 when handling tweaks: when it was first added to the spec I thought it was super confusing, but then when I went to code it up, it was actually pretty simple (accumulate these values during the main loop, etc).

In terms of where we stand w.r.t the "JIT x-only" approach, we already do this in a sense, depending on what type of boundary is drawn. Our boundary is any normal key can be used (btcec.PublicKey) as inputs into the set of routines in the new schnorr package. Once things cross that API boundary, a conversion happens to ensure we're always operating on the x-only version of the key. We went with this direction as it meant that existing code/logic to handle public keys could stay mostly the same (no new explicit btcec.XPubKey type or w/e). The main implication of this is that at times behind the scenes we do a full round trip (schnorrDecode(schnorrEncode(key))), so we can ensure we're always working w/ valid x-only keys. We don't ever go from "x-only to ordinary", and don't have an API at the btcec level for that (tho you can hack one together by just appending the byte and using the normal compressed decode/encode).

W.r.t Loop+Pool, from an API perspective, in order to maintain backwards compatibility, we always send the fully compressed key around between client and server. This lets us use the same proto field, with the map to x-only keys only happening when we need to stick something in a script, or aggregate keys (snip off the first byte and parse as an x-only coord). For the most part, the integration of musig into these two services was done by engineers that didn't directly work on any of the schnorr stuff, so they just saw it all as a black box w/ the provided API during their integration. FWIW, the distinction between x-only and ordinary keys at times only came up when writing serialization unit tests: "how should we compare these two values?".

@jesseposner
Copy link

A 32-byte x-only key is as well specified and unambiguous as a 33-byte compressed key, so it should always be possible to convert in either direction. It's the representation of the private key that creates the ambiguity.

If a keypair wants to start its life as a 33-byte compressed key, and then later be used as a 32-byte x-only key, the private key must also potentially change (i.e. negated) or a parity bit must be tracked. If the negated private key, or the parity bit, is stored, then bi-directional conversion at the public key level does not create any ambiguity at the private key level. Perhaps we should classify a keypair as a bip340 keypair or non-bip340 keypair, rather than thinking of a single secp256k1 keypair type as compatible with both x-only and ordinary keys and handling the negation in signing APIs.
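The parity-bit bookkeeping described above can be sketched as follows (toy affine secp256k1 arithmetic; the helper name to_xonly_keypair is made up for illustration):

```python
p = 2**256 - 2**32 - 977
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(P, Q):
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None  # point at infinity
    if P == Q:
        lam = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (lam * lam - P[0] - Q[0]) % p
    return (x, (lam * (P[0] - x) - P[1]) % p)

def mul(k, P):
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def to_xonly_keypair(d):
    # Convert an ordinary keypair (d, d*G) to x-only form, storing the
    # parity of the public key's y coordinate so the step can be undone.
    P = mul(d, G)
    parity = P[1] % 2                    # 0 = even y, 1 = odd y
    d_xonly = d if parity == 0 else n - d
    return d_xonly, P[0], parity

d = 271828  # toy secret key
d_xonly, x_coord, parity = to_xonly_keypair(d)
# The x-only secret always corresponds to the even-y lift of x_coord:
assert mul(d_xonly, G)[1] % 2 == 0
# The stored parity bit recovers the original ordinary secret key:
assert (d_xonly if parity == 0 else n - d_xonly) == d
```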

With FROST, we can guarantee that the private key is for the 32-byte x-only aggregate public key by handling the negation logic during keygen instead of at signing. This permits an x-only -> ordinary function without the private key shares needing to track a parity bit when we go ordinary -> x-only because the private key shares will always combine to the private key for the 32-byte x-only public key without any negation required.

@LLFourn
Copy link
Author

LLFourn commented Jul 19, 2022

@robot-dreams wrote:

I don't yet agree that we should make it easy to go xonly -> ordinary, as I really do believe ordinary -> xonly is a one way operation. I heard an analogy that going ordinary -> xonly is like going xpub -> ordinary (i.e. dropping the first byte compared to dropping the chain code)

Hmm I'm not sure about this analogy. There are ways of going from ordinary -> xpub (just make up a random chaincode). The difference is it's not useful to ever do that. Here it is useful, so I think it should be made easy. It is also better defined: we have an unambiguous choice (02) for the data to add in the inverse function.
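The unambiguous inverse is just a one-byte prepend; a trivial sketch (the function name xonly_to_ordinary is made up for illustration):

```python
def xonly_to_ordinary(xonly: bytes) -> bytes:
    # An x-only key is defined to be the even-y point, so 0x02 is the
    # unambiguous prefix for its compressed (ordinary) encoding.
    assert len(xonly) == 32
    return b"\x02" + xonly

xonly = bytes.fromhex(
    "79be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798")
ordinary = xonly_to_ordinary(xonly)
assert len(ordinary) == 33 and ordinary[0] == 0x02 and ordinary[1:] == xonly
```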

@jonasnick wrote:

If I understand your proposal correctly, this would require two modes of signing. If has_even_y(P) is false for point P=x*G and secret key x and you create a signature for on-chain you need to negate your secret key. If you create an off-chain signature you don't. That doesn't seem ideal either.

Yeah you're right, it's less ideal than I imagined. It does sound a bit better to me though. It feels like the right way to pay the cost of having x-only keys rather than muddling up the implementation of the core cryptographic schemes.

@real-or-random wrote:

But if I remember correctly, avoiding this "two valid pubkeys" property was one of the reasons why we have designed BIP340-verify to accept x-only keys, instead of accepting ordinary keys and dropping a byte. Maybe accepting ordinary keys would have been the better choice. As @LLFourn pointed out, it may be instructive to think of BIP340 as working with ordinary keys, and simply see x-only as a storage optimization: if the first byte of the public key is not relevant, then why transmit it?

IMO we shouldn't introduce the possibility for two valid public keys for one signature just to make the APIs more consistent. Like you said we just have to accept BIP340 the way it is and I think this is too much of a footgun.
Actually, I wonder what you'd think of pairing "JIT-xonly" with the principle of "never lie to the user" i.e. if we're just going to drop the y coord in the function always make it xonly in the API in the first place. This implies that ECDH which does "x-only" arithmetic should take x-only keys as well.

LLFourn added a commit to LLFourn/secp256kfun that referenced this issue Aug 16, 2022
@jonasnick
Copy link
Owner

My takeaway from this thread so far is:

  1. We want to switch from x-only to plain public keys as input to the MuSig2 algorithms, done in Switch from X-only to plain public key inputs #37 (merged).
  2. (does not affect the BIP): We want to recommend using x-only public keys if users plan only to use signatures. Otherwise, we recommend plain public keys.
  3. (does not affect the BIP): We should describe a process from x-only public keys to ordinary public keys. But this is only relevant for niche applications (others in this thread disagree on this).

The suggestion to make keygen a two-stage process hasn't been merged into the BIP so far (#34). I'm not convinced that this is worth doing. I agree that it would slightly simplify the negation logic. But more importantly, users would have to deal with multiple keygen contexts and understand what they're for. @LLFourn Do you still want to pursue this?

@LLFourn
Copy link
Author

LLFourn commented Sep 5, 2022

Hey @jonasnick. The proximate problem is that the spec tests specify the behavior of applying plain tweaks after x-only tweaks. This means I can't pass that test with my implementation. This is of course not really a big deal for me but I do think that the spec would be improved by not making a spectest for this behavior:

  1. If you are doing this then you are doing something wrong. Good implementations should error out rather than let you continue to waste your own time.
  2. The semantics of it don't make sense to me. What is getPlainPubkey meant to do after an xonly tweak? The spec has decided it returns the internal 33 byte form of the point. You could easily have expected it to return the even-y form of the key you are signing under (this is what I expected at an earlier revision).
  3. It necessitates a very difficult to understand section of the spec. I think this section can and should be removed if the changes in Make keygen a two stage process #34 are applied as the logic for negation becomes much more self-explanatory (it would still be great to have @real-or-random's great xonly explainer document and link to it though).

This doesn't mean that the spec has to be changed in the way I suggested in #34. It could simply be a sentence forbidding calling GetPlainPubkey and applying a plain tweak after an xonly tweak has been applied. Implementations themselves can figure out how to interpret that. So I recommend at least (i) changing the spec tests from being in the form:

{
    "tweak_indices": [0, 1],
    "is_xonly": [false, true]
}

to

{
    "plain_tweak_indices": [0],
    "xonly_tweak_indices": [1]
}

and (ii) adding a sentence or two explaining the implementor's decision here. Like:

Bitcoin protocols are naturally constructed such that plain tweaks (e.g. BIP32) are applied prior to x-only tweaks (e.g. BIP341), but for simplicity of exposition the algorithms we define here allow retrieving the plain public key via GetPlainPubkey and applying plain tweaks even after an x-only tweak has been applied. Implementations may prohibit this, as it probably represents a mistake by the programmer.

I still think that changing the code and spec like in #34 would be beneficial to remove that section from the spec but certainly understand if you don't think it's worth it at this point. I will live even if it remains in its current state.
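One way an implementation could enforce the "no plain tweak after an x-only tweak" rule is a simple up-front check on the tweak-mode flags; a sketch (hypothetical helper, mirroring the is_xonly flag list from the current test-vector format):

```python
def check_tweak_order(is_xonly: list) -> bool:
    # Valid iff no plain tweak (False) appears after an x-only tweak (True).
    seen_xonly = False
    for flag in is_xonly:
        if flag:
            seen_xonly = True
        elif seen_xonly:
            return False
    return True

assert check_tweak_order([False, False, True])   # plain then x-only: fine
assert not check_tweak_order([False, True, False])  # plain after x-only: rejected
```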

@jonasnick
Copy link
Owner

jonasnick commented Sep 8, 2022

@LLFourn Thanks again for your detailed feedback. That's very valuable.

  1. It necessitates a very difficult to understand section of the spec

Not sure if we could get rid of it with #34. You still accumulate state consisting of gacc and tacc, and it'd be good to explain how that yields the correct signing key.

(i) changing the spec tests from being in the form

What's the advantage of doing this?

(ii) adding a sentence or two explaining the implementors decision

Mentioning somewhere that not all implementations support every feature of the BIP seems fine. Variable length messages are similar; there may be implementations that only need 32-byte messages. In your suggested sentence, it's unclear how "plain tweaks are applied prior to x-only tweaks" implies that GetPlainPubkey after x-only tweaking is a mistake. The libsecp function secp256k1_xonly_pubkey_tweak_add outputs a plain public key. In #68, I tried a minimal change that adds a comment to the test vector for plain after x-only tweaking (and increases test coverage for implementations that do not support plain after x-only).

@LLFourn
Copy link
Author

LLFourn commented Sep 14, 2022

@LLFourn Thanks again for your detailed feedback. That's very valuable.

  1. It necessitates a very difficult to understand section of the spec

Not sure if we could get rid of it with #34. You still accumulate state consisting of gacc and tacc, and it'd be good to explain how that yields the correct signing key.

IMO #34 follows the rule that if you negate the image to maintain an even y-coordinate, you negate the pre-image components in the next lines of code to align the pre-image. I don't think you need a section explaining how following that rule yields a correct scheme in this spec. It could belong in a more general document about x-only that @real-or-random suggested.

(i) changing the spec tests from being in the form

What's the advantage of doing this?

To make invalid tweak combinations (plain after xonly) unrepresentable in the test vectors.

(ii) adding a sentence or two explaining the implementors decision

Mentioning somewhere that not all implementations support every feature of the BIP seems fine. Variable length messages are similar; there may be implementations that only need 32-byte messages. In your suggested sentence, it's unclear how "plain tweaks are applied prior to x-only tweaks" implies that GetPlainPubkey after x-only tweaking is a mistake.

The only reason to call GetPlainPubkey is to do plain tweaks, which I claim is always a mistake after doing an x-only tweak. For me it's quite confusing that GetPlainPubkey() is not necessarily equal to lift_x(GetXonlyPublickey()). This is why, when I implemented it, I didn't want the two methods to exist at the same time on the same type.

The libsecp function secp256k1_xonly_pubkey_tweak_add outputs a plain public key. In #68, I tried a minimal change that adds a comment to the test vector for plain after x-only tweaking (and increases test coverage for implementations that do not support plain after x-only).

Thanks! I think this can be closed with #68.

@real-or-random
Copy link
Collaborator

real-or-random commented Sep 16, 2022

It could belong in a more general document about x-only that @real-or-random suggested.

By the way, I still want to do this but I'll be traveling a lot during the next 1.5 months, so I don't think I'll find time for this before November. If anybody else wants to start a doc earlier, please feel free to go ahead, of course.

@jonasnick
Copy link
Owner

Closing with #68.
