Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the -sys crate optional #703

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Kixunil
Copy link
Collaborator

@Kixunil Kixunil commented Jul 2, 2024

Downstream projects, such as bitcoin are using types from this crate and need to compile the C code even if they only use the types for checking validity of the data and don't perform cryptographic operations. Compiling C code is slow and unreliable so there was desire to avoid this.

This commit introduces pure Rust implementation of basic non-cryptographic operations (is point on the curve, decompression of point, secret key constant time comparisons) and feature-gates cryptographic operations on secp256k1-sys which is now optional. To make use of this, downstream crates will have to deactivate the feature and possibly feature gate themselves.

The implementation requires a U256 type which was copied from the bitcoin crate. We don't depend on a common crate to avoid needless compilation when the feature is turned off.

This is a breaking change because users who did cryptographic operations with default-features = false will get compilation errors until they enable the feature.

@Kixunil Kixunil added enhancement 1.0 Issues and PRs required or helping to stabilize the API labels Jul 2, 2024
@Kixunil Kixunil force-pushed the no-crypto branch 7 times, most recently from f465167 to 6ee130a Compare July 2, 2024 18:25
@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 2, 2024

Build failures are unrelated, I'm going AFK and I'm not sure if I'll come back until tomorrow, so @tcharding if fixing the CI is kind of task you'd like to do go ahead with a separate PR that fixes those things and I'll rebase this on top of it.

@apoelstra
Copy link
Member

@Kixunil thanks so much!!! I will dig into this.

I wonder -- there are two ways we might go about this:

  • Have the pure-Rust impls live in rust-bitcoin (or bitcoin-primitives rather), which would have an optional rust-secp dep.
  • Have the pure-Rust impls live here; then rust-bitcoin (or bitcoin-primitives) would have an unconditional dep on rust-secp which in turn would have an optional dep on secp256k1-sys.

What do you think?

@apoelstra
Copy link
Member

Or a third option -- we introduce another crate bitcoin-crypto. But that feels too much like a "bitcoin-utils" crate to me. I think that the public key and signature types ought to count as primitives and live in bitcoin-primitives. I'd also like to keep script::Builder::push_key which means that the primitives crate needs to have access to the key type.

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 2, 2024

Having it here is more useful - it helps also people who only need this crate and not whole bitcoin.

  • have an unconditional dep on rust-secp

Yes, for keys at least. But disable signing&co if the feature is off.

public key and signature types ought to count as primitives and live in bitcoin-primitives

I didn't want this originally but since we decided to wrap the keys I'm pretty open to it. I'll need to think if I can find a good reason to split them up.

I'd also like to keep script::Builder::push_key which means that the primitives crate needs to have access to the key type.

primitives -> keys dep doesn't sound too bad to me

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 3, 2024

I've been thinking a bunch about the dependencies and I found a potential use cases for having keys separated. For xpub -> address "calculators" that just let users generate addresses without dealing with checking the chain (donations, trusting friends etc) it'd be nice to not have to build code related to transactions etc. I think bip32 should be independent of that and possibly even Address (have the script_pubkey() method conditioned on primitives). Also Address requires keys for validation so if you're building an RPC client for something like BTCPayServer that doesn't deal with chain transactions but pre-validates inputs to give quicker feedback you don't really want to build a bunch of unrelated stuff. Further, keys can be no-alloc which primitives can't. (Though maybe we should at least provide Script?)

So I think bitcoin-keys makes sense. But now back to the idea of putting pure-Rust implementation into them. It's a strong no from me since the keys are already used in other applications, in particular LN node IDs are keys and conversions from Bitcoin keys would be super weird.

But there's another thing: in a different issue I've suggested that the -sys crate should be only used for system building and vendoring with renaming&stuff should be separated. The only way to achieve this is to have vendoring code and system-bindings code in separate crates because the -sys crate needs to tell Cargo it's a system bindings library to make it exclusive while the vendored code is supposed to not be exclusive. The vendoring library then needs to optionally depend on the -sys one and disable vendoring when the -sys feature is enabled.

In principle secp256k1 could be the vendoring library. However that forces it to have a build script which is maybe not nice? We could get rid of that requirement by just having optional secp256k1-bindings library implementing vendoring and depending on the -sys crate (effectively renaming the secp256k1-sys feature in this crate to secp256k1-bindings). Though it could be too much to have an entirely new crate just for that. Thankfully build-dependencies can be optional, so at least we don't always have to use the cc crate.

Finally we need to somehow keep the vendoring and -sys crate API in sync. Maybe the vendoring crate can provide the externs and the -sys crate would only take care of building but I'm not sure if that works. Will also likely need exact version specifier. If it doesn't we might be forced to depend on another common crate (secp256k1-header?).

So ultimately the dependency graph can one of these:

  • primitives, address, bip32 -> bitcoin-keys -> secp256k1 -> secp256k1-sys
  • primitives, address, bip32 -> bitcoin-keys -> secp256k1 -> secp256k1-bindings -> secp256k1-sys
  • primitives, address, bip32 -> bitcoin-keys -> secp256k1 -> secp256k1-bindings -> secp256k1-headers, secp256k1-sys -> secp256k1-headers
  • primitives, address, bip32 -> bitcoin-keys -> secp256k1 -> secp256k1-headers, secp256k1-sys -> secp256k1-headers

@apoelstra
Copy link
Member

Ok, great, that sounds good to me. ACK having bitcoin-keys though despite the name I would also like it to have signatures in it (and sighash flags). With an unconditional secp dep. Depending where we put the sighash types themselves, may also need an optional hashes dep.

As for splitting -sys into -bindings and -sys, I'm fine with this in theory but I don't really understand what you're getting at. What is the difference between "system building" and "binding"? It seems to me that the only thing secp256k1-sys does is binding, and having the renames etc is inherently part of that. Do you mean the actual C code should be in a separate crate? The C code is intimately tied to the bindings so I don't see why. And I'm not sure what you mean by "vendored code is not supposed to be exclusive". It definitely needs to be exclusive because C has no namespacing so you can only have one copy (with a given rename prefix).

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 3, 2024

I would also like it to have signatures in it (and sighash flags). With an unconditional secp dep.

In my mind, those are a different layer. Not all applications working with keys work with signatures since keys are a fundamental building block of addresses. However splitting them seems like too much so I guess it'll be fine. Will do more thinking.

By exclusive I mean you can only have one version. You can only link to a system library once (precisely because C has no namespacing). Cargo should know this but currently doesn't because we're missing it from Cargo.toml because putting it there would break vendored builds with multiple secp256k1 versions.

@apoelstra
Copy link
Member

apoelstra commented Jul 3, 2024

Cargo should know this but currently doesn't because we're missing it from Cargo.toml because putting it there would break vendored builds with multiple secp256k1 versions.

I'm confused. I thought this is what the links key in Cargo.toml was for, and we definitely do populate that.i

In my mind, those are a different layer. Not all applications working with keys work with signatures since keys are a fundamental building block of addresses. However splitting them seems like too much so I guess it'll be fine. Will do more thinking.

Agreed. Signatures and sighash types are such small/simple types and they're mostly useless without keys (agreed that keys are useful without signatures, ofc) so giving them their own crate feels like overkill. Even bitcoin-keys we may get some complaining about, but I'm convinced by your point that it can be no-alloc which is pretty great.

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 3, 2024

we definitely do populate that.

With a fake versioned name. The real library is called libsecp256k1.

@apoelstra
Copy link
Member

Yes, because we intend to be exclusive with respect to any specific verison of libsecp, but not across versions, since different versions might as well be completely different libraries as far as we are concerned.

I'm still not understanding what is wrong with our current approach.

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 3, 2024

I'm still not understanding what is wrong with our current approach.

The field is wrong when we deactivate symbol renaming. In an ideal world we would've set it programmatically but we can't.

@apoelstra
Copy link
Member

The field is wrong when we deactivate symbol renaming.

We cannot deactivate symbol renaming because we are compiling a C library. If you don't rename the symbols they will collide from the symbols from other versions of libsecp256k1.

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 3, 2024

We cannot deactivate symbol renaming

We literally have a feature to do that, so that users can link to the system library (what the -sys suffix actually means).

@apoelstra
Copy link
Member

If you mean the rust_secp_no_symbol_renaming cfg flag (not a feature), this exists for linking with libbitcoinconsensus and linking into Bitcoin Core, which have their own static copies of secp256k1 and whose maintainers can be trusted to make sure they are choosing the correct version of the library.

This library has never been intended to be used with "the system library" (I guess, some random git commit of libsecp that some distro maintainer packaged against our wishes and without our input) and I am not aware of anybody using it that way.

We can rename -sys to something else if you'd like (though I'd like to see a citation for what it "actually" means) but if we want to support linking to the system library that deserves its own separate issue where we investigate whether major distributions are packaging released version of libsecp and separating their symbols properly, how we can distinguish the different versions, etc. I think we have such an issue, because this has come up before a few times. It's a much bigger and separate project from just making -sys optional, which is what this issue is about.

@apoelstra
Copy link
Member

We have one, #657, which you opened. We should move discussion there about supporting dynamic linking.

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 4, 2024

The reason I brought this up now is that it potentially affects the name of the feature so if we ever want to stabilize this crate it needs to be resolved.

But maybe what we should really do is separate feature namespace (in the future, when we bump MSRV) and name the feature crypto because if we end up with the vendored code in this crate it won't have a meaningful dependency to describe what it is. (cc is not meaningful IMO). We can also do it without namespace and say that the crypto feature is supposed to be stable but the sys one isn't. (It'll also make typing the feature less annoying.)

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 4, 2024

Rebased on top of #705 to get it to compile.

@apoelstra can we merge #705 soon? It's likely blocking a bunch of work.

@Kixunil Kixunil marked this pull request as draft July 4, 2024 16:40
@apoelstra
Copy link
Member

Yep -- I just need to get my local CI up to date (I did some major changes about a month ago and haven't updated all my repos..) then I can do it. Should be less than an hour.

Downstream projects, such as `bitcoin` are using types from this crate
and need to compile the C code even if they only use the types for
checking validity of the data and don't perform cryptographic
operations. Compiling C code is slow and unreliable so there was desire
to avoid this.

This commit introduces pure Rust implementation of basic
non-cryptographic operations (is point on the curve, decompression of
point, secret key constant time comparisons) and feature-gates
cryptographic operations on `secp256k1-sys` which is now optional. To
make use of this, downstream crates will have to deactivate the feature
and possibly feature gate themselves.

The implementation requires a U256 type which was copied from the
`bitcoin` crate. We don't depend on a common crate to avoid needless
compilation when the feature is turned on.

This is a breaking change because users who did cryptographic operations
with `default-features = false` will get compilation errors until they
enable the feature.
@apoelstra
Copy link
Member

Ok, looking more closely at this PR. Can you break it up into a series of commits?

  • Introduce the non_c module and add fuzztest(s) checking that it parses exactly the things that the C code does (in fact maybe this should be its own PR, since it will have a large set of "real" comments while the rest will have comments about code organization, feature-gating, etc)
  • A commit that makes secp-sys optional, disabling Context and all types without it
  • A commit that makes the Context impl C-optional
  • A commit that makes PublicKey impl C-optional
  • A commit that makes SharedSecret impl C-optional

Also we should double-back and expose the Signature types, which are much easier than the pubkeys.

Overall this looks great. The nits that jump out at me are things like:

  • We should have a Zp::SEVEN constant alongside ZERO and ONE
  • I think the Zp::is_even function is wrong -- it looks like it's doing a check whether the value is high or not but it should just be checking the parity of the LSB
  • We need to MIT-license anything that is copied from upstream

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 5, 2024

I'll be happy to split the first part and add the fuzz tests but honestly I think splitting the feature gating stuff is a bunch of useless work. It's just a bunch of one-line changes. Can I skip that part?

Also we should double-back and expose the Signature types, which are much easier than the pubkeys.

Oh yeah, just compare the parameters.

  • We should have a Zp::SEVEN constant alongside ZERO and ONE

Weird, I swear I remember adding it...

  • I think the Zp::is_even function is wrong -- it looks like it's doing a check whether the value is high or not but it should just be checking the parity of the LSB

I found it super hard to figure out what it really means. Using the LSB caused some test to fail while the current code works and it's what I got from some SO post. I guess I'll dig deeper, perhaps into the C code.

  • We need to MIT-license anything that is copied from upstream

Can we MIT-license just the copied functions? Also maybe @KanoczTomas would be willing to relicense his work, we know each other personally.

@apoelstra
Copy link
Member

I'll be happy to split the first part and add the fuzz tests but honestly I think splitting the feature gating stuff is a bunch of useless work. It's just a bunch of one-line changes. Can I skip that part?

I meant to have pretty-much all the feature-gating stuff in the first commit. The others are meaningfully changing the logic by replacing the implementation when the C code is off.

Oh yeah, just compare the parameters.

It's a little bit more work than that but not much. You can look at the secp logic for parse_der and parse_der_lax IIRC for how to do it for ECDSA signatures. Schnorr signatures might have no validation at all, not sure. But if they do it'd be checking that s is in range and r is a valid x coordinate, both already implemented.

I found it super hard to figure out what it really means. Using the LSB caused some test to fail while the current code works and it's what I got from some SO post. I guess I'll dig deeper, perhaps into the C code.

The C code definitely just does &1 in fe_impl_is_odd.

Can we MIT-license just the copied functions? Also maybe @KanoczTomas would be willing to relicense his work, we know each other personally.

Yes, fine to just MIT-license the copied functions. I don't know @KanoczTomas and AFAICT he hasn't written any of the libsecp code. If you got it from somewhere else I didn't realize that.

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 5, 2024

I meant to have pretty-much all the feature-gating stuff in the first commit.

I thought you wanted to just add the Rust code behind fuzzing feature in the first commit.

parse_der and parse_der_lax

Oh, yeah, I completely forgot that was the reason I didn't put it in the PR - I wasn't in the mood to implement DER. :D

The C code definitely just does &1 in fe_impl_is_odd.

Yeah, I managed to find it but wasn't able to figure out what's wrong yet.

AFAICT he hasn't written any of the libsecp code

I copied some Rust code here from him but he also copied some from somewhere else. Note that most of my code is based on toy-secp256k1 which is completely my own, from scratch (except for U256 type and multiplicative inverse).

@elichai
Copy link
Member

elichai commented Jul 6, 2024

@Kixunil could you maybe add a few sentences on the use case for a pure rust compilation? (ie why do some users don't want to link in libsecp if they're using pubkeys etc.)

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 6, 2024

@elichai I think the most obvious example is parsing addresses. Addresses may contain keys, so we need to check them but there's no need to produce or validate signatures. And even within context of a larger application, one might want to pre-validate addresses or other types on frontend to avoid latency of reporting an error. (Very good for UX.)

@elichai
Copy link
Member

elichai commented Jul 6, 2024

@elichai I think the most obvious example is parsing addresses. Addresses may contain keys, so we need to check them but there's no need to produce or validate signatures. And even within context of a larger application, one might want to pre-validate addresses or other types on frontend to avoid latency of reporting an error. (Very good for UX.)

And you don't want to rely on LTO for that?
(ie if this was a big rust library you'd want it separated by features?)

@apoelstra
Copy link
Member

@elichai LTO doesn't make the C code inherently memory-safe, nor does it help compiling with immature toolchains (e.g. wasm-pack), nor does it reduce the code/API surface.

@Kixunil
Copy link
Collaborator Author

Kixunil commented Jul 6, 2024

And you don't want to rely on LTO for that?

More like I don't want to incur the cost of compiling and then throw away the result. Try to cold-compile this PR with and without the flag. The time difference is huge.

@kayabaNerve
Copy link

I don't, or I do yet minimally, use secp256k1. I almost entirely use the Rust k256 library as I produce signatures/manage keys on my end (not via the bitcoin/secp256k1 libs). Accordingly, this'd remove a dependency and I always appreciate that.

@apoelstra
Copy link
Member

I think this is a fairly common case and I'd also like to fix it -- while it's "safe" to use non-production-ready crypto libs for verification, in the sense that you'll never leak secret key data, in general if your verification code differs from consensus code that can still put you in a dangerous situation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.0 Issues and PRs required or helping to stabilize the API enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants