Submission of Botan / Cryptography Community Project: Leg 1 Proposal #57

Merged

Conversation

ldillinger
Contributor

Per instructions, I have solicited feedback from the community on discourse (first draft, final draft) and reddit (first draft, final draft), and I am now officially submitting this community project proposal.

@david-christiansen
Contributor

Thank you very much for the proposal - I've been following your work with interest.

For those reading along, here's a link to the rendered proposal.

Contributor

@david-christiansen left a comment

This is a very exciting proposal!

It generally seems to be something that's fairly uncontroversially great. I have a few concerns from a due diligence perspective:

  • What are your skills and background that give you the expertise necessary to know that the bindings aren't undermining important properties? Even a link to a CV would be good.
  • What is the plan for keeping this maintained for the coming years? "Give it to the community" almost never works.
  • It would be really useful if you had a user lined up who would be using these bindings "in anger" while you write them. This would be a stakeholder. Having users makes libraries much better!
  • I worry about bugs here - what are the challenges to using some of the good Haskell techniques to make bugs less likely? Why is using NIST test vectors so difficult, rather than just the thing that is always done?

I think this sounds basically good, but could use some precision. What does the rest of the committee think?


Cryptography is an essential part of modern computing and information security, but it requires care, nuance, and knowledge to properly implement, and is extremely easy to mess up. It covers a diverse range of functions, with individual algorithms often having their own particular quirks that must be acknowledged to properly and safely use. Failure to properly implement cryptography carries the risk of severely compromising the security of any system in which it is used. Manually implementing common cryptographic primitives such as hashes, ciphers, digital signatures, and big-integer arithmetic is thus considered extremely unwise, both in terms of security and performance.

Cryptography is even harder for functional programming languages. Properly implementing techniques such as sanitizing memory and performing operations in constant time is already difficult enough in languages such as C, and in a lazy functional language it is made even more difficult by such things as the potential for references to be kept alive within thunks. As a result, the space of functional cryptography is relatively under-explored and under-developed, having favored other languages where such things are easier to manage.
Contributor

Nitpick: why are thunks worse than closures here? Is it just their ubiquity?

Contributor Author

Sorry, that is just colloquialism / lax terminology. I am not sure whether thunk vs closure matters - the concern is that secure data may be captured, whether itself as part of a closure, or because its cleanup was deferred in a thunk and not forced, and either way it can result in sensitive data staying in memory for longer than intended.

Lazy evaluation presents a challenge for managing and clearing sensitive data* from memory, as care must be taken to force finalizers / cleanup appropriately. This is in addition to the challenges that strict languages already face (e.g., evicting values from caches / registers).

* keys, secure texts, personally identifiable information

I should probably also emphasize that this effect pertains to lazy evaluation, not functional languages in general.
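
A minimal sketch of the thunk-retention concern, using only `base`. The names `withScrubbedBuffer` and `sumBytes` are hypothetical and are not part of the proposal or of Botan; the buffer contents stand in for a key. The point is that a result computed from a secret may be returned as an unevaluated thunk that still references copies of the secret bytes, so it is forced with `evaluate` before the buffer is overwritten.

```haskell
import Control.Exception (evaluate, finally)
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Utils (fillBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- | Hypothetical helper: run an action with a temporary n-byte buffer and
-- overwrite the buffer with zeros afterwards, even if the action throws.
withScrubbedBuffer :: Int -> (Ptr Word8 -> IO a) -> IO a
withScrubbedBuffer n action =
  allocaBytes n $ \ptr ->
    action ptr `finally` fillBytes ptr 0 n

-- | Stand-in for "derive something from the secret". The returned Int may
-- be an unevaluated thunk that keeps the list of secret bytes alive.
sumBytes :: Ptr Word8 -> Int -> IO Int
sumBytes ptr n = do
  bytes <- mapM (\i -> peekByteOff ptr i :: IO Word8) [0 .. n - 1]
  pure (sum (map fromIntegral bytes))

main :: IO ()
main = do
  total <- withScrubbedBuffer 4 $ \ptr -> do
    -- Write the bytes 1, 2, 3, 4 as a pretend secret.
    mapM_ (\i -> pokeByteOff ptr i (fromIntegral (i + 1) :: Word8)) [0 .. 3]
    s <- sumBytes ptr 4
    -- Force the result inside the scope, so that no thunk referencing
    -- copies of the secret bytes escapes to the caller.
    evaluate s
  print total -- prints 10
```

This only addresses what the Haskell program itself retains; it does nothing about copies in caches or registers, and the intermediate list on the Haskell heap is collected rather than zeroed.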


However, if we accept this downside as a part of a trade-off, and explore the ways that functional programming is beneficial to cryptography (monads, type systems, referential transparency), there is then also the potential for functional cryptography to be made *easier*. The functional machinery of Haskell is well-suited for adding important contextual control to cryptography, and can help prevent or eliminate many issues and errors in ways that languages such as C cannot. I believe that there is room for significant improvement, given a directed effort to develop the space.
Contributor

I'm not familiar with the term "contextual control" - it would be nice to expand that out a bit here.

Contributor Author

That is a good idea.

I am colloquially referring to the collective properties and design advantage of functional purity, type systems, and monadic interfaces, and how these provide a stricter (as in law, not as in evaluation) context to write functions in than the free-for-all that many imperative / OOP languages can offer. As a high-level example, we could separate off / restrict cryptographic operations into their own monad, in order to securely store and access sensitive values without allowing them to escape, not unlike how ST hides IO to safely provide locally mutable references.

I'll rework this a bit.
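
A minimal sketch of the ST-style idea, assuming a hypothetical `Crypto` monad and `Secret` type (neither exists in the proposal's libraries; both are invented here for illustration). A phantom region parameter `s` and a rank-2 `runCrypto` prevent a `Secret` from being returned out of the computation, in the same way `runST` prevents an `STRef` from escaping.

```haskell
{-# LANGUAGE RankNTypes #-}

-- | Hypothetical monad for computations over secrets. The parameter 's' is
-- a phantom region tag, as in 'ST s'. In a real library the constructor
-- and field would be hidden behind a module export list.
newtype Crypto s a = Crypto { unCrypto :: a }

instance Functor (Crypto s) where
  fmap f (Crypto a) = Crypto (f a)

instance Applicative (Crypto s) where
  pure = Crypto
  Crypto f <*> Crypto a = Crypto (f a)

instance Monad (Crypto s) where
  Crypto a >>= k = k a

-- | A secret value tagged with the region it was created in.
newtype Secret s = Secret String

-- | Bring key material into the region (toy stand-in for key loading).
loadKey :: String -> Crypto s (Secret s)
loadKey = pure . Secret

-- | Toy operation that consumes a secret and yields a non-secret result.
checksum :: Secret s -> Crypto s Int
checksum (Secret k) = pure (sum (map fromEnum k))

-- | Run a computation. Because 's' is universally quantified, the result
-- type 'a' cannot mention it, so no 'Secret s' can be returned.
runCrypto :: (forall s. Crypto s a) -> a
runCrypto c = unCrypto c

main :: IO ()
main = do
  print (runCrypto (loadKey "abc" >>= checksum)) -- prints 294
  -- The following is rejected by the type checker, since 's' would escape:
  -- let leaked = runCrypto (loadKey "abc")
```

The type-level restriction only controls where the value can flow; it does not by itself scrub memory, which would still need explicit cleanup.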


# Problem Statement

Cryptography in Haskell lacks significant capability beyond basic primitives. This places a significant burden on developers to properly implement various security techniques, and exposes end users to significant risk in the event of a lapse in security. Some companies have built their own solution to this, but there is no community-driven, community-owned solution.
Contributor

Suggested change
- Cryptography in Haskell lacks significant capability beyond basic primitives. This places a significant burden on developers to properly implement various security techniques, and exposes end users to significant risk in the event of a lapse in security. Some companies have built their own solution to this, but there is no community-driven, community-owned solution.
+ Cryptography in the Haskell library ecosystem lacks significant capability beyond basic primitives. This places a significant burden on developers to properly implement various security techniques, and it exposes end users to significant risk in the event of a lapse in security. Some companies have built their own solution to this, but there is no community-driven, community-owned solution.

What's wrong with a company building and supporting this? Is it a licensing issue, one of incentive alignment, or something else? I'd generally want to have someone get paid to do this kind of work, after all!

Contributor Author

@ldillinger Oct 2, 2023

There is nothing wrong with a company building it per se*, rather the issue is of ownership of the resulting ecosystem, and the resulting incentive misalignment:

  1. cryptography and security are a preventative measure and a cost center for most companies
  2. companies for which cryptography / security are a product have a strong incentive to marshal users into a walled garden at first in the name of security, but then later in the name of profit
  3. Developing an ecosystem is a significant up-front investment, and so investors may expect ownership of the ecosystem in order to recoup their cost
  4. There is an incentive to wait for someone else to develop an open ecosystem, instead of expending resources yourself to do it

As such, for-profit approaches are incentivized to either invest in a private ecosystem, or invest in an established open ecosystem, but are perversely incentivized against building a new open ecosystem.

For reference, technically I am a company here, as I have my own LLC set up in order to protect my intellectual property, due to a previous employer having built a product based on it behind my back while acting interested in investing in my work - it was a harsh lesson. Not all companies are like that, but the profit incentive is exceedingly strong, and so care must be taken to outline the ownership of the resulting ecosystem, in order to engender trust. I should make some revisions here to better elucidate this.


Cryptography in Haskell is also fragile and outdated, having seen considerable flux / churn over the years as various libraries have been developed and then deprecated or abandoned in favor of newer ones. This has placed the long-term stability of important libraries such as `tls` and `x509` at risk of falling behind as compilers are upgraded, old standards are updated, and new standards are implemented.
Contributor

Suggested change
- Cryptography in Haskell is also fragile and outdated, having seen considerable flux / churn over the years as various libraries have been developed and then deprecated or abandoned in favor of newer ones. This has placed the long-term stability of important libraries such as `tls` and `x509` at risk of falling behind as compilers are upgraded, old standards are updated, and new standards are implemented.
+ Haskell cryptographic libraries are also presently fragile and outdated, having seen considerable flux / churn over the years as various libraries have been developed and then deprecated or abandoned in favor of newer ones. This has placed the long-term stability of important libraries such as `tls` and `x509` at risk of falling behind as compilers are upgraded, old standards are updated, and new standards are implemented.

What does "fragile" mean here? I'd expect it to mean that the library doesn't work well (e.g. segfaults when given slightly bogus input), but it sounds to me like you're instead talking about the development process for the libraries.

Contributor Author

That is accurate. I have two factors in mind when I say 'fragile':

  1. Difficulty in ensuring adoption of new libraries when old ones are deprecated - the cryptography ecosystem in Haskell suffers from this extensively, as the organization of modules and libraries has changed over time while code has moved from cryptohash / cryptocipher to crypton and cryptonite. As a result, there are multiple cryptography libraries with almost the same interface, because of a shared lineage, and it is not always clear which should be used preferentially. This promotes rot as dependents of unmaintained libraries finally stop working.

  2. These libraries have highly concrete implementations / bad abstractions / interfaces which are strongly influenced by their implementation, but have propagated due to shared lineage. While the interfaces themselves haven't really changed much, the same code has been moved from library to library, in part because it violates the abstraction barrier such that if we change backends, it could require breaking the interface - which makes the logic itself brittle. Yet, there is still so much built on it, which also must change in order to move forward.


There are several reasons for this:

1) Cryptography involves a lot of stateful low-level bit twiddling and random IO, something Haskell is not known for being good at. Classes like `Bits` and `FiniteBits` are critical for cryptography, but are terribly awkward to implement (like `Num`). Also, `ByteString` does not implement them for nuanced reasons.
Contributor

This is random IO in the sense of "random access to mutable arrays" rather than "IO happens sometimes", right?

A link to the ByteString issue here would be convenient.

Contributor Author

I should rephrase this as cryptographic non-determinism rather than 'random IO', and differentiate it from 'worldly' non-determinism, which is your more traditional IO. This is probably way more of an answer than you expected, but there is also a very fine distinction between entropy and pseudorandom generators, and existing Haskell libraries do not do a very good job of separating it all out, especially when we use pseudorandom generators as a substitute for entropy sources.

Ideally, non-deterministic values should just be exposed as an input, and the resulting function would be pure and deterministic. However, nonce mismanagement has caused many security lapses, and so many functions / protocols / libraries remove user randomness / nonce handling in favor of hiding it as an implementation detail, either taking a random generator state as an argument, or even hiding it as an internal global / IO reference, which makes the resulting function non-deterministic / impure. A good example of this is bcrypt, which takes a random generator context as an argument.

Properly representing the difference between determinism, ideal entropy, and pseudorandom generators requires differentiating functions of the following forms, even though they are all related:

deterministicFoo :: Int -> a -> b
randomFoo :: (Random m) => a -> m b
pseudoRandomFoo :: Pseudorandom g => g -> a -> (g, b)
impureFoo :: a -> IO b

Note that these are not the same when it comes to cryptography and entropy. deterministicFoo represents a single sampling event, randomFoo represents an event that hasn't been sampled yet and will be different every time you do, and pseudoRandomFoo represents an event that has been sampled, so it's predetermined but you don't know the result yet. You can simulate Random with a secret state a la (State g, Pseudorandom g) => ... but there is a lawful difference between them.

For reference, the Botan FFI RNG objects are closer to entropy sources (although they internally use PRNG with automatic reseeding from system entropy sources).
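
A minimal sketch of how these shapes relate, using only `base`. The generator `Gen` is a toy linear congruential generator invented for illustration (it is not cryptographically secure and is not Botan's RNG), and `Rand` is a hand-rolled state monad standing in for the `(State g, Pseudorandom g)` simulation mentioned above. The function names follow the signatures listed earlier, specialised to concrete types as an assumption.

```haskell
import Data.Word (Word64)

-- | Toy generator state. NOT cryptographically secure; illustration only.
newtype Gen = Gen Word64

-- | One step of a 64-bit linear congruential generator.
next :: Gen -> (Gen, Word64)
next (Gen s) =
  let s' = s * 6364136223846793005 + 1442695040888963407
  in (Gen s', s')

-- | deterministicFoo: the sampled value (here a nonce) is an ordinary input,
-- so the function is pure and repeatable.
deterministicFoo :: Word64 -> String -> (Word64, String)
deterministicFoo nonce msg = (nonce, msg)

-- | pseudoRandomFoo: the nonce is drawn from an explicit generator, and the
-- updated generator is returned to the caller.
pseudoRandomFoo :: Gen -> String -> (Gen, (Word64, String))
pseudoRandomFoo g msg =
  let (g', nonce) = next g
  in (g', deterministicFoo nonce msg)

-- | Simulating the 'Random m' interface by hiding the generator in a state
-- monad, so callers cannot accidentally reuse an old generator.
newtype Rand a = Rand { runRand :: Gen -> (Gen, a) }

instance Functor Rand where
  fmap f (Rand x) = Rand (\g -> let (g1, a) = x g in (g1, f a))

instance Applicative Rand where
  pure a = Rand (\g -> (g, a))
  Rand f <*> Rand x =
    Rand (\g -> let (g1, h) = f g; (g2, a) = x g1 in (g2, h a))

instance Monad Rand where
  Rand x >>= k = Rand (\g -> let (g1, a) = x g in runRand (k a) g1)

randomFoo :: String -> Rand (Word64, String)
randomFoo msg = Rand (\g -> pseudoRandomFoo g msg)

main :: IO ()
main = do
  let g0 = Gen 42
      (g1, a) = pseudoRandomFoo g0 "hello"
      (_, b) = pseudoRandomFoo g0 "hello" -- generator reused: same nonce
      (_, c) = pseudoRandomFoo g1 "hello" -- generator threaded: new nonce
      (_, (x, y)) = runRand ((,) <$> randomFoo "hello" <*> randomFoo "hello") g0
  print (a == b) -- prints True
  print (a == c) -- prints False
  print (x == y) -- prints False
```

Reusing `g0` by hand reproduces the nonce, which is exactly the kind of mismanagement that hiding the state (as in `Rand`) is meant to rule out.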


Regarding Bits and FiniteBits, this is worth getting into and I made a very rough sketch of a Boolean class that may have spawned a conversation or two in the devlog and in a side thread. Essentially, Bits (and the fixed-width FiniteBits) is a class that combines two concepts:

  1. Boolean / Heyting algebra, for which there is no concept of any 'internal bits' or structure and you are just operating on some object as a whole
  2. Things that are constructed from / encoded into a set of 'internal bits' using some encoding, for which individual bits are indexable / addressable, and thus individually boolean-operable. There is also a link here to indexing / representable, but that is getting way out of scope.

Notably, the documentation itself further states that "The Bits class defines bitwise operations over integral types." Technically this means that only things that are integer numbers should be Bits and FiniteBits, but Gödel numbering rears its ugly head, and that just adds the question of whether Bits's boolean operations are intended to apply to an object or its encoding. As a result, Bits and FiniteBits are restricted to objects for which the boolean operations are isomorphic to boolean operations over their encoding.

Given this understanding, ByteString does not have a canonical Bits instance because it has no canonical integer representation - a given bytestring could represent many such integers, whether it be big endian or little endian, it could have a sign bit, or be two's complement. Its bits might not be contiguous or even ordered. It could easily satisfy a Boolean instance, but the issue of indexing individual bits and integral type requirement interferes with a full instance of Bits.

As if that wasn't confusing enough, we could go further and involve finite fields and bit-fields, and how bit-fields are fields of bits but not all fields of bits are bit-fields. Whether or not we should tie Bits and FiniteBits to Field and FiniteField is a whole extra conversation.
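
A minimal sketch of the split described above, assuming a hypothetical `Boolean` class (this is an illustration, not the author's actual sketch from the devlog). It carries only whole-object Boolean operations, with no bit indexing, shifting, or integer interpretation, which is why `ByteString` can implement it without committing to an endianness or sign convention.

```haskell
import Data.Bits (complement, xor, (.&.), (.|.))
import qualified Data.ByteString as B

-- | Hypothetical class: Boolean-algebra operations on a value as a whole.
-- Unlike 'Bits', there is no 'testBit', 'shift', or 'popCount'.
class Boolean a where
  band :: a -> a -> a
  bor  :: a -> a -> a
  bxor :: a -> a -> a
  bnot :: a -> a

-- | Bytewise instance. Assumption: for inputs of unequal length the result
-- is truncated to the shorter one, which is simply what 'B.zipWith' does.
instance Boolean B.ByteString where
  band a b = B.pack (B.zipWith (.&.) a b)
  bor  a b = B.pack (B.zipWith (.|.) a b)
  bxor a b = B.pack (B.zipWith xor a b)
  bnot     = B.map complement

main :: IO ()
main = do
  let a = B.pack [0xFF, 0x0F]
      b = B.pack [0x0F, 0x0F]
  print (B.unpack (bxor a b)) -- prints [240,0]
  print (B.unpack (bnot b))   -- prints [240,240]
```

The instance needs no decision about which bit is "bit 0" of the whole string, which is the decision a full `Bits` instance would force.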


- Single point of failure

This is a proposal for one full-time engineer; this constitutes a single point of failure. To combat this, the official GitHub repo will be owned by the Haskell Cryptography Group, and another member of the Haskell Cryptography Group will be selected to hold backup keys and permissions to the website, server, and official repositories as needed. In an emergency in which I am unreachable, this will allow others to take over as necessary.
Contributor

Shouldn't the HF staff also be able to help recover from sudden loss of maintainers?

Contributor Author

This is correct - I will add it.


- Extended X509 support
- Stream ciphers
- Test vectors for algorithms, especially [CAVP / FIPS / NIST-approved](https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program)
Contributor

This seems like something that could be extremely valuable in its own right. What if you create a general lib for testing crypto, and then use it in your test suite and also make it available for other tests?

Contributor Author

Yes - the value of test suites has been discussed and recognized elsewhere in this PR, and will be emphasized.
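
A minimal sketch of what a reusable known-answer-test helper could look like, assuming a hypothetical `Vector` record and `failures` function (neither is part of the proposal). To stay self-contained, the "algorithm under test" is a stand-in Base16 encoder checked against test vectors from RFC 4648; a real suite would instead load CAVP-style vector files and run them against the bindings.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import Data.Word (Word8)

-- | One known-answer test: a label, an input, and the expected output.
data Vector = Vector
  { vecName     :: String
  , vecInput    :: B.ByteString
  , vecExpected :: B.ByteString
  }

-- | Run a function against every vector and return the names of the
-- vectors it fails. An empty list means all vectors passed.
failures :: (B.ByteString -> B.ByteString) -> [Vector] -> [String]
failures f vs = [vecName v | v <- vs, f (vecInput v) /= vecExpected v]

-- | Stand-in algorithm under test: uppercase Base16 (hex) encoding.
base16 :: B.ByteString -> B.ByteString
base16 = B.concatMap (\w -> B.pack [digit (w `div` 16), digit (w `mod` 16)])
  where
    digit :: Word8 -> Word8
    digit n
      | n < 10    = 48 + n -- ASCII '0'..'9'
      | otherwise = 55 + n -- ASCII 'A'..'F'

-- | Base16 test vectors from RFC 4648, section 10.
vectors :: [Vector]
vectors =
  [ Vector "empty"  (C.pack "")       (C.pack "")
  , Vector "f"      (C.pack "f")      (C.pack "66")
  , Vector "foobar" (C.pack "foobar") (C.pack "666F6F626172")
  ]

main :: IO ()
main = do
  print (failures base16 vectors) -- prints []
  print (failures id vectors)     -- prints ["f","foobar"]
```

Keeping the runner independent of any particular backend is what would let other libraries reuse the same vectors.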

Comment on lines 324 to 327
- botanium
- botanite
- crypto-schemes
- crypto-schemes-botan
Contributor

I'm much less convinced of the case for funding these libraries - these sorts of wrappers seem to be something easier to build outside the project. Or am I missing something important?

Contributor Author

These libraries are being elided from this proposal in the name of clarity / focus / avoiding confusion.

- Tracked issues
- Unit tests

A project website will also be created.
Contributor

Why? What value does it bring beyond that in the Github repo?

Contributor Author

The idea was that it is low-effort and aids in discoverability. Many successful Haskell libraries have at least a minor web presence, and it could be as simple as a page or site hosted on github.io and linked to from the Haskell Cryptography Group page. It would also be a place to link to funding / donation mechanisms and to acknowledge supporters and contributors.

This can be moved to optional deliverables / nice-to-haves, or elided completely.


## Budget

The first leg of this proposal presents a minimum budget of $7000 USD per month, for one full-time engineer, for a duration of 3 months, for a total of $21,000 USD. This budget is based on cost-of-living and industry experience, and will cover housing, food, bills, taxes, and other life necessities for one engineer, as well as any project necessities such as website and server hosting. This budget is roughly equivalent to $40 / hr or $84k / yr at full-time of 40 hrs / wk. Industry rates for an engineer of the necessary skill are on average considerably higher, and so we consider this budget to be reasonable. The exact legal contract / arrangement is left to the Haskell Foundation.
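
As a quick check on these figures, assuming a year of 52 weeks at 40 hours per week and all amounts in USD:

$$
7000 \times 3 = 21{,}000, \qquad 7000 \times 12 = 84{,}000, \qquad \frac{84{,}000}{52 \times 40} \approx 40.38 \text{ per hour}
$$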
Contributor

What about ongoing project management? Do we need to line up a volunteer to make sure the project is staying on track?

Contributor Author

Having a point of contact to regularly check in would be of assistance in keeping the project on track. I will add something to this effect to the proposal.

@ldillinger
Contributor Author

ldillinger commented Sep 20, 2023

Thank you for all of the feedback and questions! I'll be responding to it today as I go through it all.

@ldillinger
Contributor Author

ldillinger commented Sep 22, 2023

Thanks for your patience - it takes me a bit of time to compose a reply to everything, and the response I've gotten is both wonderful and slightly terrifying, as to an ant that has suddenly found itself under a microscope.

None of this response is a binding statement; feel free to discuss any point for adjustment, as I am merely trying to lay out my intent.

Starting at the top:

> What are your skills and background that give you the expertise necessary to know that the bindings aren't undermining important properties? Even a link to a CV would be good.

I have uploaded my resume and cover letter for your perusal.

I don't have any particular schooling in cryptography, but I have had a lifelong interest, and a great deal of practical industry experience in applying it. I've been the most security conscious person on every team that I've been on, and I've done lots of practical work like setting up secure auth for APIs, building login systems from scratch and properly hashing passwords, and implementing secure key storage mechanisms, session tokens, 2FA, SSH, etc. You wouldn't ask me to build you a novel hash or cipher, but I can use primitives appropriately to implement things like sparse merkle proofs safely.

> What is the plan for keeping this maintained for the coming years? "Give it to the community" almost never works.

What is being given to the community is control - a pre-emptive fork to avoid any such churn or confusion as experienced with crypton/ite, as those events are in part what led to my starting this project.

I will still be developing from and maintaining my own private fork of this code, but I will keep it synced with the official repo for as long as I am developing and maintaining it. I do have a direct use case for these libraries (and the greater proposal) as well, and if they are successfully completed, I will have a personal interest in seeing them maintained for several years to come.

I intend this short term proposal to ensure the completion of bindings to the Botan libraries, while giving us time for the long-term discussion of future legs to occur - including such things as long-term maintenance and future work involving higher abstractions. We could add the development of a future roadmap to the deliverables, to emphasize this.

> It would be really useful if you had a user lined up who would be using these bindings "in anger" while you write them. This would be a stakeholder. Having users makes libraries much better!

I consider myself one such stakeholder (my own work has been slowed by many of the issues mentioned in this proposal, and I will be angrily using this library in the future to re-implement a bunch of old code), but I believe there are other potential such users: there are a few important libraries that are currently using crypton/ite (especially any X509 certificate stuff) that might see their dependencies switched to botan when it is complete: tls, hpack, and warp are examples. Part of the reason the hypothetical libraries like botanite are mentioned is to help explore what an answer to this question might look like.

> I worry about bugs here - what are the challenges to using some of the good Haskell techniques to make bugs less likely?

Ensuring prompt release and avoiding accidental retention of sensitive data due to lazy evaluation is arguably the thing that I am most concerned about. It makes it difficult to analyze, and the scary problems aren't solvable on the Haskell layer anyhow; they're deeper. We can zero something out and order it cleared from memory, but that's just a command given to the OS, and it doesn't necessarily do anything to clear values from higher caches and registers, which is where a lot of attacks now occur - and this isn't Haskell specific.

However, Haskell does have many highly-used and highly-tested resource-control mechanisms that we can use to ensure that the orders are sent out appropriately, and guarantee that at least from within the Haskell program itself there is no ill behavior. Safely wrapping a cryptography library will test my Haskell skills regarding strictness and FFI more strongly than it will test my cryptography knowledge.
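
A minimal sketch of one such mechanism, using `ForeignPtr` finalizers from `base`. The helper `newSecret` and its `onRelease` callback are hypothetical, added only so the example can observe when cleanup runs; this is not how the Botan bindings are necessarily implemented. The finalizer zeroes and frees the buffer, and `finalizeForeignPtr` runs it promptly instead of waiting for the garbage collector.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import Data.Word (Word8)
import Foreign.Concurrent (newForeignPtr)
import Foreign.ForeignPtr (ForeignPtr, finalizeForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (free, mallocBytes)
import Foreign.Marshal.Utils (fillBytes)
import Foreign.Storable (peek, poke)

-- | Hypothetical helper: allocate an n-byte buffer whose finalizer zeroes
-- it, frees it, and then runs the given callback.
newSecret :: Int -> IO () -> IO (ForeignPtr Word8)
newSecret n onRelease = do
  ptr <- mallocBytes n
  newForeignPtr ptr (fillBytes ptr 0 n >> free ptr >> onRelease)

main :: IO ()
main = do
  released <- newIORef False
  secret <- newSecret 32 (writeIORef released True)
  -- Use the buffer only inside withForeignPtr, which keeps it alive.
  withForeignPtr secret $ \p -> poke p 0xAB
  firstByte <- withForeignPtr secret peek
  print firstByte -- prints 171
  -- Run the finalizer now rather than at some later garbage collection.
  finalizeForeignPtr secret
  readIORef released >>= print -- prints True
```

After `finalizeForeignPtr` the pointer must not be used again, so a wrapper library would typically hide it behind a scoped `with...` style function.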


I am transcribing the rest of my responses from my notes, but I must halt for a bit as I have a medical appointment, and wanted to get these big ones out first.

@gbaz
Collaborator

gbaz commented Sep 25, 2023

I think we really should bring in other cryptography stakeholders and experts from the ecosystem here to judge some of the high-level goals more clearly.

I naively have some questions regarding why extending nacl work further is insufficient. The proposal details its lack of support for post-quantum algos. What else does it lack? In particular, the main (not only) use-case we have for this stuff is afaik A) jwt tokens and the like and B) SSL encryption. Is there another use case I'm missing? And for that use case alone, would extending and building on the nacl bindings suffice?

In particular, if we pick say the narrow goal of "replacing all crypton/ite dependencies in widely used libraries" is botan the shortest path to get us there, or can we go through nacl?

@ldillinger
Contributor Author

ldillinger commented Sep 25, 2023

> I naively have some questions regarding why extending nacl work further is insufficient.

NaCl / libsodium is not a general purpose cryptography library; it is a highly opinionated and minimalist cryptography suite designed for safe and ergonomic use at the expense of control.

  • It makes all of the decisions for you
  • It only implements a handful of common primitives
  • It only implements a single algorithm variant for each primitive
  • Interop with non-NaCl systems is difficult to impossible if NaCl doesn't support the algorithm or primitive.

In fact, NaCl only provides:

  • SHA-2 (or sometimes Blake2b) hashing
  • HMAC-SHA-512/256 message authentication
  • AES-GCM cipher encryption
  • Salsa20 stream encryption
  • Ed25519 / X25519 public key signatures / agreement
  • Poly1305-Wegman-Carter one-time authentication
  • Salsa20-Poly1305 AEAD
  • X25519-Salsa20-Poly1305 public key encryption

Granted, NaCl does a good job of choosing a commonly used set of algorithms, but it is unsuitable as a general purpose cryptography library because it does not support anything except for these few select algorithms and primitives. If you need SHA3 hashing, or CBC mode for decryption, or RSA public / private keys, or arbitrary named elliptic curves, then NaCl can't really help.

> In particular, if we pick say the narrow goal of "replacing all crypton/ite dependencies in widely used libraries" is botan the shortest path to get us there, or can we go through nacl?

This is speaking as a huge fan of NaCl who has used it preferentially for almost a decade. It is excellent at what it does, but it is limited, limited enough that we cannot use NaCl to replace crypton/ite, as it simply lacks the coverage.

On the other hand, Botan does provide nearly the same coverage as crypton/ite in terms of algorithms and features, though there are some differences / gaps. It is not exactly a drop-in replacement, but there would be a high level of compatibility possible depending on how much of the crypton/ite interface you wanted to keep.

You might be interested to know that the hypothetical botanium library that is mentioned in the proposal (under Optional / Future Technical Content) is effectively asking "What if NaCl / libsodium / saltine were re-implemented using Botan?"


I hope I've not left everyone hanging after my last response; it has taken me a few days to recover and start getting back to responding to everything here. I'm not gone though, just thinking about what I want to say!

@m5

m5 commented Sep 26, 2023

Hi! I work on security at a haskell fintech, and I'm excited to see this proposal.

In my mind, what I'd love to see for haskell are:

1. Really good sodium/nacl bindings.

Like @ldillinger said, these are well-respected, highly-opinionated libraries that guide developers towards building robust crypto systems.

I don't think it would make a lot of sense to extend our nacl bindings with more algorithms, because the idea behind nacl is that the developer doesn't need to worry about the algorithms -- they've chosen great defaults for most crypto use cases, and provided a high-level interface to help you implement it correctly.

I like @stouset's comment on libsodium that hits the same points eloquently.

> But the entire reason NaCl is recommended so frequently is because it makes it difficult for novice cryptographers to go too far off the rails (at least insofar as making obvious, known-bad choices like non-authenticated encryption, appending data to a key before hashing for authentication, or using weak primitives like MD5 or SHA1). Using something like NaCl obviously won't ensure that a novice will get it right, but it can absolutely prevent entire known classes of getting it wrong.

That said, I also think we need:

2. A really good crypto kitchen sink.

I really like this reddit comment on a thread asking "Best Python package to use for cryptography?", which says:

If you can choose the key agreement/cipher/signature protocols, then libsodium is a reasonable choice. If you need to interoperate with existing commonly used protocols (RSA, X.509, etc.) then Cryptography is a reasonable choice.

I think that'd be a really great position for haskell to be in as well.

I'd love to be able to use nacl for all our encryption needs, but we have to work with various partners who didn't make the same choices nacl did. For instance we need a SHA3 implementation for one partner, ECDSA for another, bcrypt because our existing passwords are hashed with it (and arguably it's still the best).

Cryptonite has served us well here, but it always seemed odd and slightly frightening that the primitives seemed to be custom C/Haskell written for the project.

This is the niche that I could see the botan-bindings project filling. I like the suggestion from arybczak that ideally, eventually the primitives from crypton would be switched over to the new library, so that everyone could benefit from them.

I would say, picking botan to back the crypto kitchen sink would be an unusual choice. A quick survey of languages I'm familiar with:

  • Python's sink is backed by OpenSSL
  • Ruby's sink is just raw OpenSSL bindings
  • Rust has native crypto efforts, but the popular Ring library is based on BoringSSL (an OpenSSL fork)
  • Go's crypto is native, the maintainer says it was done that way because their C cross-compilation story made OpenSSL bindings impractical.

It makes me think that OpenSSL bindings would at least be the "nobody gets fired for buying IBM" choice here.

That said, I do think the Botan bindings would put us in a better position than we're in today, and significant effort has already been made towards them. While Botan doesn't have as many eyes on it as OpenSSL, it certainly has more eyes than crypton's current implementation, and @droidmonkey lays out a good case for its security:

In 2017 it was reviewed and approved by the German BSI (Federal Office for Information Security) for government use and is used by open source projects such as strongSwan, ISC KEA, and Shadowsocks-qt5, and companies including Rockwell Automation, Panasonic, Mazda, IBM, Bosch, PSPDFKit, and Rohde & Schwarz among others

@droidmonkey

I was tagged, so I'm going to drop my 2 cents. I replaced GPG with the very comprehensive crypto library (botan) in KeePassXC several years ago.

NaCl / libsodium is not a general purpose cryptography library; it is a highly opinionated and minimalist cryptography suite designed for safe and ergonomic use at the expense of control.

This is very true and the reason we could not use NaCl. However, this reason was driven by previous decisions by KeePass (original) when picking their cryptography standards. If we could reinvent kdbx, I wouldn't pick anything outside of what NaCl provides.

Botan has a fantastic API, but it also suffers from some legacy baggage (don't we all). It also relies on string-based primitive selection (eg, "AES-256/CBC") instead of a builder pattern or enums. This essentially makes the API completely undiscoverable without careful reading of the documentation.

@gbaz
Collaborator

gbaz commented Sep 26, 2023

This is speaking as a huge fan of NaCl who has used it preferentially for almost a decade. It is excellent at what it does, but it is limited, limited enough that we cannot use NaCl to replace crypton/ite, as it simply lacks the coverage.

This is very helpful, thanks. Just to put a fine point on it, let me be more specific: Would it be possible to use what nacl provides to just replace cryptonite for serving/retrieving https content, or is that already too far outside of its capabilities (or somehow too limited for the universe of servers and clients that now exist)?

And I suppose I should follow up on the comment by @m5 -- OpenSSL wasn't considered in your list of possible alternatives (and I know there are bindings, of... ok... quality to it in haskell). Can you give an assessment of how it does or doesn't meet your criteria as well?

Thanks for spelling all this out!

@gbaz
Collaborator

gbaz commented Sep 26, 2023

In terms of getting wider feedback: it would be good if a new post could be made to the discourse on a new thread announcing that this is an official proposal seeking comment.

Additionally, I understand that you are a member of the haskell cryptography group. It would be good if we could get an enumerated list of the members (couldn't find one on the website) and make sure all are invited to comment explicitly.

I understand from a reddit thread that Mercury was interested in this. I would also be interested in their comments on here.

@m5

m5 commented Sep 26, 2023

👋 Mercury here 😄

In addition to what I wrote above, I would say we'd likely be early-but-cautious adopters of the bcrypt implementation, and we'd likely start pivoting our other needs to the new sink as the risk tradeoff began tipping in its favor.

@m5

m5 commented Sep 27, 2023

Would it be possible to use what nacl provides to just replace cryptonite for serving/retrieving https content

I really don't think so. Https includes a really broad set of primitives, and nacl provides a really small set of primitives.

I could imagine the primitives in NaCl could suffice to interact with some subset of https clients, but there's also this definitive "No" from tptacek on HN, and NaCl definitely doesn't implement everything used by hs-tls from crypton.

That said: I wouldn't recommend reimplementing hs-tls with botan either.

If we have someone with the time and energy to move hs-tls away from crypton, I'd strongly recommend moving to back it with openssl directly instead. Implementing TLS is the primary use case of openssl, and it provides high level APIs to make TLS easy to work with. Like check out http-client-openssl. With the HsOpenSSL bindings and 144 lines of haskell, Snoyman avoids the entire tls/crypton ecosystem.

My understanding is that the benefit of the current tls library is that it's self-contained within the haskell ecosystem, and doesn't require dynamic linking with any C libraries, while an openssl binding would, but the same would be true of a botan-based TLS reimplementation.

As far as our neighbors go:

  • Ruby: They use openssl directly
  • Python: They use openssl directly, rather than going through their crypto sink
  • Rust: They have a really nice native_tls crate that uses Windows or OSX's native tls implementations if available, falling back to OpenSSL. They also have a tls reimplementation backed by their BoringSSL crypto sink.
  • Go: They implemented TLS from scratch because they couldn't bind OpenSSL at the time.

Anyway. Major tangent. I still think we need a good crypto sink, but I'd suggest not thinking about this proposal as a way to solve any tls ecosystem issues.

@gbaz
Collaborator

gbaz commented Sep 27, 2023

I don't think the above is a tangent at all, and definitely appreciate the clarification!

The major use of crypton/ite through the haskell ecosystem is tls. I agree that even without this, there are many other needs for good crypto primitives. However, the scale of impact is very different, and that's worth understanding to have a proper discussion on this proposal -- i.e. I think the vast majority of transitive deps on crypton/ite are precisely for tls. It used to be that getting the linking for openssl libraries was very fiddly and not reliable across different platforms -- maybe it's less so now, and maybe we can just try to encourage users to migrate to that for their tls needs...

@ldillinger
Contributor Author

ldillinger commented Sep 27, 2023

@m5

I still think we need a good crypto sink, but I'd suggest not thinking about this proposal as a way to solve any tls ecosystem issues.

Agreed. In addition to this, the TLS support in Botan will need some C++ patching because the C FFI support is incomplete, additional work that is hard to estimate - hence why it is an optional deliverable. TLS is a nice extra to have in Botan, just as is having a Multiple Precision Integers implementation available, but they are not the primary goal. Edit: But they are still very much a goal!

@gbaz

It used to be that getting the linking for openssl libraries was very fiddly and not reliable across different platforms -- maybe it's less so now, and maybe we can just try to encourage users to migrate to that for their tls needs...

I have been giving more thought to the OpenSSL libraries; although this initial proposal is for Botan bindings, the higher abstractions that I am developing against it (eg, the crypto-schemes stuff that isn't directly related to Botan) could use OpenSSL as an alternative backend.

Botan was selected in part due to this thread about reviving dead botan bindings, from which the project was started, and for which this proposal was written.

And finally I missed this earlier:

@droidmonkey

Botan has a fantastic API, but it also suffers from some legacy baggage (don't we all). It also relies on string-based primitive selection (eg, "AES-256/CBC") instead of a builder pattern or enums. This essentially makes the API completely undiscoverable without careful reading of the documentation.

Boy do I know it. I have made a point of smoothing this over with proper data types in botan. It is still a work in progress, but the ergonomics are already significantly improved while still providing the exact specificity necessary. Botan primitive names are best understood not as single identifiers but as composite formulas, because the name is actually a parseable format that supports parameters.

For example, to specify a key-derivation function using HKDF with HMAC with SHA3 and not just SHA3 but specifically SHA3-512 variant, we specify HKDF $ HMAC $ SHA3 SHA3_512 and then that gets converted to "HKDF(HMAC(SHA-3(512)))" but we could have used a different MAC or Hash had we wanted to.

Haskell ADTs make it far easier to handle now that the formulae have been more or less decoded, but given the sheer number of algorithm combinations, this approach is flat out necessary.
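To make the "composite formula" idea concrete, here is a minimal sketch of how nested data types could be rendered into Botan's string names. The types and the `hashName` / `macName` / `kdfName` functions are hypothetical and written only for illustration; they are not the actual definitions in the botan packages. Only the resulting strings follow Botan's naming convention as described above.

```haskell
-- Hypothetical types for illustration only; these are NOT the data types
-- exported by the botan package.
data SHA3Variant = SHA3_224 | SHA3_256 | SHA3_384 | SHA3_512

data Hash
  = SHA3 SHA3Variant
  | SHA2_256

newtype MAC = HMAC Hash

newtype KDF = HKDF MAC

-- Render each layer of the formula as its Botan algorithm-name string.
sha3Bits :: SHA3Variant -> String
sha3Bits SHA3_224 = "224"
sha3Bits SHA3_256 = "256"
sha3Bits SHA3_384 = "384"
sha3Bits SHA3_512 = "512"

hashName :: Hash -> String
hashName (SHA3 v) = "SHA-3(" ++ sha3Bits v ++ ")"
hashName SHA2_256 = "SHA-256"

macName :: MAC -> String
macName (HMAC h) = "HMAC(" ++ hashName h ++ ")"

kdfName :: KDF -> String
kdfName (HKDF m) = "HKDF(" ++ macName m ++ ")"
```

With these definitions, `kdfName (HKDF $ HMAC $ SHA3 SHA3_512)` evaluates to `"HKDF(HMAC(SHA-3(512)))"`, and swapping the hash is a one-constructor change: `kdfName (HKDF $ HMAC SHA2_256)` gives `"HKDF(HMAC(SHA-256))"`. The compiler rejects malformed combinations, such as passing a hash where a MAC is expected, before any string reaches the C API.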

@m5

m5 commented Sep 27, 2023

Ah, so to clarify my position above with new information:

  1. I assumed it would be equally fiddly to link openssl, botan, and sodium. That still seems most likely to me, but if that's not the case, it'd weaken my argument that we should replace hs-tls with openssl.

  2. I hadn't realized Botan had a TLS implementation -- I assumed we'd be implementing from primitives like hs-tls does with crypton. If the FFI issues can be resolved, my "strong suggestion" to use openssl for tls is reduced back down to the level of my "openssl is very popular" thought from earlier.

@arybczak

I don't have much to add to what was already said, but since it was my posts on Discourse that initiated the whole effort, I'll say that I fully support the addition of botan bindings to the Haskell ecosystem as I think they are a promising candidate to be future foundations for Haskell cryptography and IMO they're very much needed, especially since cryptonite is now officially unmaintained.

For the record, I'm already using botan-low at work (for AES).

@Kleidukos

Kleidukos commented Sep 27, 2023

@ldillinger I took a look at the proposal and most of it seems reasonable, so for the sake of conciseness I shall only point out what really stood out to me.

I will be speaking in my own name, unless specified otherwise.

  • botan-bindings: Good scope, very similar in spirit to what @kozross and I did for libsodium-bindings.

  • Regarding the higher-level libraries, my advice would be that they are idiomatic but refrain from using typeclasses. Indeed my experience with other libraries of the space is that typeclasses are used to define an expected naming scheme for common operations, but there is an inevitable point where GHC can't handle the polymorphism and the user must resort to using type annotations. At this point we have come full circle back to monomorphic functions.
    My recommendation: If you still wish to use typeclasses everywhere, please expose the methods' implementation as normal functions too.

  • The future work section is a bit more problematic in my eyes, especially the compatibility layer with cryptonite and libsodium. Libsodium in particular is very opinionated, and you would rely on botan providing a superset of libsodium at all times, even in its later incarnations. For instance, Libsodium 1.0.19 exposes AEGIS-128L and AEGIS-256. Can Botan offer them?
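The typeclass recommendation in the list above can be sketched as follows. Everything here is hypothetical: `Checksum8`, `XorSum8`, and the `Digest` class are toy, non-cryptographic stand-ins chosen so the example runs with only `base`, and they do not correspond to anything in botan or libsodium-bindings. The point is only the shape: the monomorphic functions are the real implementations, and the class methods merely delegate to them.

```haskell
import Data.Bits (xor)
import Data.Word (Word8)

-- Toy result types (NOT cryptographic); hypothetical names.
newtype Checksum8 = Checksum8 Word8 deriving (Eq, Show)
newtype XorSum8   = XorSum8 Word8   deriving (Eq, Show)

-- Monomorphic implementations, exported as ordinary functions.
checksum8 :: [Word8] -> Checksum8
checksum8 = Checksum8 . sum            -- Word8 addition wraps modulo 256

xorSum8 :: [Word8] -> XorSum8
xorSum8 = XorSum8 . foldr xor 0

-- Optional class layered on top; each method is just the function above.
class Digest d where
  digest :: [Word8] -> d

instance Digest Checksum8 where
  digest = checksum8

instance Digest XorSum8 where
  digest = xorSum8
```

Because `digest` is polymorphic only in its result, a call such as `print (digest [1, 2, 3])` is ambiguous and needs an annotation like `(digest [1, 2, 3] :: XorSum8)`, whereas `xorSum8 [1, 2, 3]` does not. Exporting both lets users pick whichever reads better at the call site.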


With my Haskell Cryptography Group hat on:

We are honoured that this is done in collaboration with the group to ensure the sustainability of the botan project. Thank you for your trust, and we look forward to supporting you and the community.

@ldillinger
Contributor Author

@Kleidukos I think that clinches it, I've been feeling more and more that I should elide the higher-level general cryptography abstractions from this proposal; it is distracting from the intended focus and scope, which are the libraries botan-bindings, botan-low, and botan.

This means that the typeclasses, cryptonite, and libsodium stuff can wait for their own proposal when the time comes; I haven't done the best job of explaining their intent yet, and I don't wish to obfuscate, so for the purposes of this proposal, I think we should ignore botanium, botanite, crypto-schemes, and crypto-schemes-botan; they'll continue to exist, be worked on or split off from the repo, but consider them as you would a weather forecast - a prediction that is more accurate the closer it is made.

I do have a whole plan for them - I'm not just implementing bindings for idle curiosity, and they do give me a good working 'downstream user' to test with - but I think we can leave higher-level abstractions for later, and narrow the scope a little. Not going away, just set aside for the moment, which will give me time to explain them better too.

Re: botan-bindings and libsodium-bindings: I took inspiration from that. I've gone through the source code of just about every Haskell cryptography library over the course of this project, and I have been picking out all of the best pieces :)

@gbaz
Collaborator

gbaz commented Sep 28, 2023

speaking in terms of the difficulty of sometimes getting ffi to work, it would be great if a project goal were to have CI running on linux, os x, and windows all, and that it had clear instructions on os x (via brew i assume) and on windows (via cygwin) for users to ensure the botan library was available and could be linked to. making such efforts could very much mitigate the "cost" of migrating from a pure haskell library to one which depends on ffi.

@dpwiz

dpwiz commented Sep 28, 2023

Stakeholder missing: supply-chain attacker.

As an attacker I want to silently insert some compromised code in the codebase to help me with penetrating downstream codebases. I would submit PRs which propose some enhancements or solutions for open issues and try to sneak in changes that would make my life easier later.

@ldillinger
Contributor Author

@gbaz that is a very good suggestion. Is there a preferred solution for this, or would Github CI / Actions be sufficient for this purpose? I've never used them before, but I've built my own git server with a bunch of custom git hooks, so I can hardly imagine it being worse, though I'd need to consider cost as well.

@dpwiz That is an excellent point, I shall include it.

@gbaz
Collaborator

gbaz commented Sep 28, 2023

I can't speak to the details of how to do this all with github ci / actions, but it's certainly possible. one place the HF can definitely contribute is defraying or covering any expenses regarding this, or otherwise bringing the botan CI into our existing coverage of these setups.

@Bodigrim
Collaborator

@gbaz that is a very good suggestion. Is there a preferred solution for this, or would Github CI / Actions be sufficient for this purpose?

GHA are a good choice for a multiplatform CI. Even major Haskell projects fit well into the free tier.

@ldillinger
Contributor Author

Your patience in my recent absence is appreciated; I often struggle with social / communication even in the best of times, but recovering from my appointment took the wind out of my sails for a bit longer than I expected.

I have at last gotten through and responded to all of @david-christiansen 's and everyone else's feedback, and am midway through transcribing my thought-notes to update the PR to account for everything. I am aware that David has by now left the Haskell Foundation, so I would appreciate it if others could also go over my responses to his feedback, in addition to the pending PR update that I will do my best to finish tomorrow.

@Bodigrim
Collaborator

Bodigrim commented Oct 4, 2023

I am aware that David has by now left the Haskell Foundation, so I would appreciate it if others could also go over my responses to his feedback, in addition to the pending PR update that I will do my best to finish tomorrow.

To avoid any doubt: despite the temporary absence of an ED, the HF board tracks the proposal. Thanks for your efforts here! There is no particular rush, take your time.

@ldillinger
Contributor Author

This proposal has been significantly updated in response to all of the feedback provided.


# Abstract

This community project proposal is for full-time funding for the development of a suite of Haskell cryptography libraries and tooling suitable for a wide range of uses including data integrity, privacy, security, and networking. This is necessary to relieve the burden [of correctly implementing cryptography] from Haskell developers seeking to provide privacy and security to their users. We propose the development of a community-owned set of cryptography libraries of increasing capability, starting with bindings to a compatible open-source cryptography library. This is expected to require a considerable effort over time, and will be broken up into individual proposals, each of which will be accompanied by its own specific set of goals.
Collaborator

Was the text in square brackets supposed to be a link? The formatting seems a bit off.

Contributor Author

Oops that is a bit of my scratch note formatting that escaped containment. It means I was considering an alternate sentence fragment / rewording it.

Collaborator

@Bodigrim Bodigrim left a comment

Thanks @ldillinger, overall looks very good to me.


# Problem Statement

The Haskell cryptography ecosystem lacks significant capability beyond basic primitives. This places a significant burden on developers to properly implement various security techniques, and it exposes end users to significant risk in the event of a lapse in security. There are commercial solutions, but there is no community-driven, community-owned solution, and one is unlikely to be developed without deliberate community effort (see Appendix A - Commercial Incentive Misalignment).
Collaborator

Which commercial solutions do you have in mind? It would be nice to link such prior art if any.

Contributor Author

I was mostly thinking of Cardano, but also that there are commercial cryptography libraries that could probably be paid to develop Haskell bindings for their proprietary code. An alternative wording might be "There are ways to solve this commercially".

Collaborator

My impression of Haskell cryptography in industry is that the majority of companies just use cryptonite / crypton or other preexisting packages. Is Cardano any different? Could they possibly chime in and share their experience? @michaelpj @angerman

Cardano is all open source, the library of ours that contains crypto bindings is Apache licensed. But it is very geared towards our usage, e.g. it contains some bindings to functions that only exist in our fork of libsodium 🙈 So I don't think our stuff is a "commercial solution" in that it's not commercial and it's probably also not a solution to your problems!

I might have the history wrong, but I think we used to use cryptonite but now just write our own bindings to relevant C libraries. I don't know why we stopped using cryptonite, but probably some combination of worries about maintainership and it significantly increasing our audit surface.


As a result, there are multiple cryptography libraries with similar interfaces, and it is not always clear which should be used preferentially. This promotes rot as downstream users unknowingly rely on unmaintained libraries that slowly stop working.

Furthermore, a number of important libraries in the ecosystem (eg, `cryptonite`) have recently been abandoned by their author. Their backend is written in C, which presents a long-term maintenance burden with a high knowledge and skill requirement. NaCl bindings such as `saltine` are a high-quality alternative, but solve only a limited set of problems.
Collaborator

Are there more examples of abandoned cryptography libraries? The statement sounds unjustifiably negative.

Contributor Author

@ldillinger ldillinger Oct 9, 2023

I am not trying to call him out, but I am referring to Vincent Hanquez, the developer and maintainer of cryptonite and many other libraries, who has recently quit Haskell entirely. The crypton fork had already occurred prior to this, due to a lack of maintenance and refusing a request for transfer of ownership, and those events sparked a conversation that in turn started this project.

Collaborator

True, I just don't follow how the abandonment of Vincent's packages other than cryptonite is relevant to the proposal.

There is no harm in being specific about objective facts. But vague statements of this sort can create the wrong impression.

Contributor Author

I am unsure of how to bring it up properly without creating a strong impression. There is the original event earlier this year that led to the forking of crypton from cryptonite, but also his recent archiving of all of his repositories, which the community has not had time to react to.

Worse, cryptonite is not the only cryptography package of his. Most of those appear to be deprecated precursors to cryptonite that can be ignored, but there are several other cryptography-adjacent libraries such as asn1-encoding and memory which are not deprecated, which libraries such as tls and crypton-x509 currently rely on as dependencies.

Do you have a suggestion on how to word this? This is my best attempt so far:

Furthermore, a number of important libraries in the cryptography ecosystem (eg, cryptonite, asn1-encoding, and memory) have recently been archived by their author and will no longer be receiving updates.

Collaborator

Sounds good to me.

In the grand scheme of things, packages come and go all the time, it's a proper operation. If they are important enough, someone will care enough to maintain a fork.


The resulting brittleness has placed the long-term stability of important libraries such as `crypton`, `tls` and `x509` at risk of falling behind as compilers are upgraded, old standards are updated, and new standards are implemented. Additionally the instability creates an unattractive environment for developers, effectively driving them away to other ecosystems that can provide the necessary stability.
Collaborator

crypton or cryptonite? The former (and tls, and x509-crypton) seems to be doing well.

Contributor Author

@ldillinger ldillinger Oct 9, 2023

The brittleness I speak of is long-term. crypton is a fork of cryptonite, and it has inherited the design and burden of the original, and there is a difference between maintenance and development. There is no current issue with maintenance, but the development of significant improvements such as the addition of new algorithms has been made more difficult without the original author. tls and x509-crypton have made the jump to crypton, but if cryptographic standards were to progress further (eg, begin requiring post-quantum algorithms), it is probable that maintenance would no longer be sufficient.

(8.84 secs, 2,138,266,840 bytes)
```

It was quick and dirty testing in GHCi so your mileage may vary, but the results indicate that bindings to Botan may be significantly faster and consume less memory than `crypton/ite`, if other modules / functions are similarly performant. Further implementation, testing, and benchmarking will be necessary to confirm this.
Collaborator

TBH benchmarks in GHCi are meaningless. The proposal would benefit from benchmarking compiled code via tasty-bench or criterion.

Contributor Author

Good idea. I'll put together a proper benchmark of a few things (bcrypt, hashing).

Specific libraries worth noting are:

- Predecessors to `cryptonite` - all deprecated
- [cryptohash](https://hackage.haskell.org/package/cryptohash)
Collaborator

There is also a family of cryptohash-* packages, all kept on life support.

Contributor Author

@ldillinger ldillinger Oct 9, 2023

Yes, I'm not quite sure what their exact relationship is to cryptohash (eg, do they predate it or the other way around) but they are both by the same author (Vincent Hanquez) and so presumably share a lot of code.

Contributor Author

I looked into it and it appears that those are stripped-down, algorithm-specific versions of cryptohash to reduce the footprint. I've added brief mention of this.

- Unit tests
- CI tests for Linux, MacOS, and Windows

The official repo will be owned by the [Haskell Cryptography Group](https://github.com/haskell-cryptography) and maintained by myself as a member of the group. Another member of the Haskell Cryptography Group, or a member of the Haskell Foundation, will be selected to hold backup keys and permissions to the website, server, and official repositories as needed, in order to give project members access and to reduce any conflicts about ownership or maintenance.
Collaborator

I cannot be sure and I appreciate that it's a delicate matter, but if I read your intention right, I think it would be better served by explicitly granting ownership to HF and maintainership to HCG.

Contributor Author

That might be more appropriate, and aligns with my intent.

Essentially, `Bits`` (and the fixed-width `FiniteBits``) is a class that combines two concepts:

`Boolean / Heyting`` algebra, for which there is no concept of any 'internal bits' or structure and you are just operating on some object as a whole
Things that are constructed from / encoded into a set of 'internal bits' using some encoding, for which individual bits are indexable / addressable, and thus individually boolean-operable. There is also a link here to indexing / representable, but that is getting way out of scope.
Collaborator

Formatting seems off.


Notably, the documentation itself further states that "The `Bits` class defines bitwise operations over integral types." Technically this means that only things that are integer numbers should be `Bits` and `FiniteBits`, but Gödel numbering rears its ugly head, and that just adds the question of whether `Bits`'s boolean operations are intended to apply to an object itself or its encoding, and as a result `Bits` and `FiniteBits` are restricted to objects for which the boolean operations are isomorphic to boolean operations over its encoding.

Given this understanding, `ByteString` does not have a canonical `Bits` instance because it has no canonical integer representation - a given bytestring could represent many such integers, whether it be big endian or little endian, it could have a sign bit, or be two's complement. Its bits might not be contiguous or even ordered. It could easily satisfy a `Boolean` instance, but the issue of indexing individual bits and integral type requirement interferes with a full instance of `Bits`.
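The ambiguity described above can be illustrated with a short sketch. The function names `fromBigEndian`, `fromLittleEndian`, and `xorBytes` are hypothetical helpers written for this example, not part of `bytestring`, `base`, or any botan package. It shows that the same bytes decode to different integers depending on the chosen encoding, while a bytewise boolean operation needs no integer interpretation at all.

```haskell
import qualified Data.ByteString as BS
import Data.Bits (shiftL, xor, (.|.))

-- Interpret the bytes as an unsigned big-endian integer (first byte is
-- the most significant).
fromBigEndian :: BS.ByteString -> Integer
fromBigEndian = BS.foldl' (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Interpret the same bytes as an unsigned little-endian integer (first
-- byte is the least significant).
fromLittleEndian :: BS.ByteString -> Integer
fromLittleEndian = BS.foldr (\b acc -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- Bytewise xor works on the bytes as given, with no integer reading.
-- The result is truncated to the shorter input, as with zipWith.
xorBytes :: BS.ByteString -> BS.ByteString -> BS.ByteString
xorBytes a b = BS.pack (BS.zipWith xor a b)
```

For the two-byte string `BS.pack [0x01, 0x02]`, `fromBigEndian` yields 258 and `fromLittleEndian` yields 513, so a bit index such as "bit 0" refers to different physical bits under each reading. `xorBytes`, by contrast, gives the same answer regardless of which reading a caller has in mind.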
Collaborator

In my CLC hat I'm hugely sceptical about introducing class Boolean as a superclass of Bits. I mean, I absolutely agree with the motivation, but we just don't have a good migration story for changing type class hierarchy.

One solution would be to resurrect default superclasses proposal.

I recall that bitvec (which is morally a ByteString) has the audacity to define instance Bits. Is it of any help? Could we just wrap ByteString into a newtype and define whatever is missing?

Contributor Author

The proposal for class Boolean stems from the same work as this proposal but is not a part of it. David asked for clarification / a link to the topic, so I added an addendum just in case but it can be ignored entirely.

Re: bitvec I already have a lightweight class for sized byte vectors in crypto-schemes, but it is mostly for type safety and bitvec may be faster. I'll definitely be keeping that in mind.

@LaurentRDC
Contributor

This proposal is one of the clearest I've read, and I appreciate all of the effort that has been made to accommodate non-experts like myself.

I trust that the technical choice of extending Botan and the structure of the proposed packages is satisfactory to would-be direct users. What I am somewhat unclear on is how this particular project impacts the wider community. A few people have mentioned that they would start using botan directly, which is great. In the proposal, it is written:

The following are goals, but may not be complete by this proposal:

  • To provide a suitable alternative interface / backend (as an option) for crypton, tls, x509

@ldillinger Would you be amenable to making this an explicit goal, with an extended timeline? This could be mutually beneficial if you were going to do this anyways.

If an extended workload and timeline don't make sense, is there a desire within the Haskell Cryptography Group to take the work resulting from this proposal and run with it @Kleidukos ? If so, could we add this in a 'Future work' section in this proposal?

@Kleidukos

is there a desire within the Haskell Cryptography Group to take the work resulting from this proposal and run with it @Kleidukos ?

Oh we will definitely provide a roof and give administrative help, however it would be ideal if a community of contributors was fostered so that the project can fulfil its destiny. :)


3) There is a persistent notion that functional languages are too slow for cryptography, even though it rests on historical reasons that have since been mostly addressed. Well-written modern Haskell approaches C in terms of speed, and has a much higher ergonomic-use and type-safety factor, but the public image has not caught up yet, and so the ecosystem has not received the investment necessary to attract skilled engineers for developing either pure implementations or foreign bindings.

4) A fair number of cryptographic projects in recent years has been specifically towards *cryptocurrencies*, rather than *cryptography in general*, and their results are at large fairly useless if they cannot be used outside of their ecosystems.
Contributor

Maybe this could be rephrased slightly.

I.e. I think some of the cryptocurrency work has probably usefully affected some libraries (maybe?); it's just the scope of a lot of crypto work in cryptocurrencies can be idiosyncratic and not generally useful in scenario A, B, C ...

This could go hand-in-hand with a change to this section (above):

NaCl bindings such as saltine are a high-quality alternative, but solve only a limited set of problems.

to list the problems that saltine focuses on (I actually don't know!). For example, personally in my daily life I work with JWTs and Cookies, both of which require cryptography. I recently learned (from @Kleidukos ) that the clientsession library depends on cryptonite (likewise jose). So for me, I only need these libraries to improve, not the lower-level ones.

I.e. it would help me to understand a bit more what is the list of things that I can't do with saltine but can do with botan.


Following that, there is also `Z-Botan`, which has been apparently unmaintained for several years. Reviving it was considered, but would require a comprehensive and near-total rewrite in order to divest itself of `Z-Haskell` as a dependency. New bindings could instead draw significantly from `Z-Botan`, while presenting a much smaller maintenance and dependency surface.

Furthermore, none of these libraries contain implementations of post-quantum cryptography schemes; although existing quantum computers are of insufficient capability to break commonly used algorithms such as Ed25519, this will not remain true forever, and it would be best to have quantum-resistant implementations ready for adoption long before such attacked become practical.
Contributor

... before such attacked become ...

attacked -> attacks.

It's a nice point, and in this context it's often worth pointing out that post-quantum cryptography is necessary now, because an attacker only needs to capture the encrypted traffic now and then decode it later once there are powerful-enough quantum computers; i.e. data sent now is not safe against future quantum computers.

@silky
Contributor

silky commented Oct 13, 2023

I agree with @LaurentRDC that this is a lovely proposal; well written and well-researched, as well as being important and interesting!

Nice work @ldillinger ; I think this will stand as a great example to point people to for how these proposals can be done :)

As a member of the HF TWG, the main point I would like to discuss (as @LaurentRDC also suggested) is adding a goal to actually integrate this backend (botan?) into a more "consumer" facing library like hs-jose or clientsession. These are just two libraries that I have used myself personally, recently, and I would consider it extremely promising if this could be done; as well as being a good demonstration that the API that has been built is actually useful (i.e. would these use botan or botan-low? I can't tell at this point).

For me, I think it would be worth-while to add this as a step and just increase the time/money requested from the HF accordingly. It seems important enough to do in the first instance to be worth it, to me (this is of course with me having no say on whether or not that funding goes ahead!)

@gbaz
Collaborator

gbaz commented Oct 13, 2023

I am sympathetic to the desire to add an integration goal, but do not want to increase scope. It could be added to the future work or intentions section more explicitly. Additionally I want again to motivate a "future work" goal of eventually providing a tls layer for haskell web things, as this is where the cryptonite footprint is felt most strongly.

@LaurentRDC
Contributor

It could be added to the future work or intentions section more explicitly

That's a good compromise in my eyes

@Bodigrim
Collaborator

Bodigrim commented Oct 13, 2023

I'd advise against increasing the required time/money commitment at this stage.

Speaking from experience, integration goals are tricky to deliver, because you are at the mercy of third parties. Stating it as future work is a better option.

@Bodigrim
Collaborator

CC @kazu-yamamoto for crypton / tls story.

@ldillinger
Contributor Author

Thank you all for your responses! I am quite happy that my effort has been well received. I'll be making a few minor changes based on what has been mentioned, and plan to have a minor update to the proposal tomorrow reflecting this.


@LaurentRDC / @silky - I would be amenable to making this an explicit goal, but I concur with @Bodigrim that it should be in a future proposal. A crypton compatibility library was initially considered for inclusion in this proposal, but it was removed from scope to narrow the focus. It is a highly desirable goal, but it is difficult to make a good estimate of the effort required at this time due to implementation / API differences. I have done some investigation, and so I do know that while botan is not a 100% perfect drop-in replacement, it is a high match for algorithm coverage. We will be in a much better place to assess this goal at the end of this proposal, and it will be kept in mind for the duration.

I will add a Future Work section with this in mind.

@silky
Contributor

silky commented Oct 13, 2023

Let me provide a bit of reasoning behind my thoughts on asking for integration to be a goal immediately:

  • I think integration in some existing "high-level" library demonstrates the utility of some components of this proposal; insofar as they provide a space to test the various API designs. I feel without them being tested in action, it's hard to know if the proposal has even worked at all, in some important sense.
  • I don't think it needs to be 100% complete and released; maybe the most we could aim for is a PoC to demonstrate that the API design here works well enough, pending some large amount of busywork; in that sense I don't think it needs to be too tied to external dependencies like the maintainers of those other libraries merging PRs, for example.
  • I get the comment about not wanting to increase the time/money component. I wonder then if there's a way that the kind of "external" utility of this work can be tested in any other way, a bit earlier, at the cost of something else? Or am I missing another way that this kind of API will be tested?
  • I feel like leaving integration to an entirely different proposal commits the HF to more potential cost than if the integration were tested earlier. For example it'd be a shame if (I think pretty unlikely, but still possible) Leg 1 were completed, and then Leg 2 - Integration - began and immediately stalled for reasons that we would've found earlier, thereby forcing a significant re-work. As I said, I think it's unlikely, but from experience I also know that some kind of integration as a "first"-ish goal can be quite important in setting the scope and requirements for how things actually practically work.

I hope this provides a bit more context to my thoughts. I think all I'm after is some sign that integration is "tested" in this leg; I think there's lots of scope for what that could actually look like.

And I agree, certainly a good "Future Work" could be some of these larger-ticket full-integration items, that I agree could be quite hard to predict for various reasons.

@ldillinger
Contributor Author

@silky How do you feel about adding it as an explicit goal, but not as a deliverable? That confirms that it is something we wish to achieve, while acknowledging that it may not be achievable within the proposed time frame. I certainly have goals and aspirations beyond the current proposed scope, but it is difficult to forecast beyond what is already proposed.

There are I think two questions - one of a compatibility layer for drop-in replacement of crypton, and another of being an alternative backend for critical libraries such as tls et al, regardless of whether it requires adaptation / modification. Both are worth consideration, but both also are somewhat reliant on the success of the immediate proposal.

@silky
Contributor

silky commented Oct 14, 2023

@silky How do you feel about adding it as an explicit goal, but not as a deliverable?

I think if it's an explicit goal it should be a deliverable; but for me there are operational issues anyway, because I don't think the deliverables mean you won't be paid if you don't deliver them all, as that would be a bit unfair I think (but the operationality of the funds is up to the new HF ED/transitionary board in any case).

So I'd say at the moment the weight of opinion seems to be for leaving it as-is; and that's fine for now; I'll be curious to confer with the other members of the TWG next week. But just to be clear, our thoughts (and especially mine!) are only guidance anyway; ultimately the decision rests elsewhere.

There are I think two questions - one of a compatibility layer for drop-in replacement of
crypton, and another of being an alternative backend for critical libraries such as tls et al,
regardless of whether it requires adaptation / modification. Both are worth consideration,
but both also are somewhat reliant on the success of the immediate proposal.

Yes this makes sense to me and I agree.

I think the only thing I'm arguing for is something along the lines of "I've attempted some kind of integration and it's helped me learn x,y,z". Concretely, for example, it explains the utility of botan over botan-low; i.e. it seems to me to be an open question as to which one is best to integrate into other libraries with. But it could be I'm missing some views from my learned colleagues in the TWG, so look forward to conferring with them next week and seeing what comes out of it :)

@Bodigrim
Collaborator

I don't think the deliverables mean you won't be paid if you don't deliver them all, as that would be a bit unfair I think (but the operationality of the funds is up to the new HF ED/transitionary board in any case).

I’m afraid in an extreme scenario deliverables could mean pretty much this. There is of course some leeway in their interpretation and parties are expected to cooperate in good faith, but promising X and not delivering X at all is undesirable.

In my HF hat, I do recommend sticking to the current scope and duration. It’s easier to negotiate an extension once an initial contract has been fulfilled.

@ldillinger
Contributor Author

ldillinger commented Oct 15, 2023

Then it seems prudent to keep the scope and the deliverables as-is, and to instead outline some of what has been mentioned here in a 'Future Work' section that I will publish / update before EOD in the morning.

… statement, improved quantum computing mention, improved botan vs botan-low description, added Future Work section
@ldillinger
Contributor Author

I have updated the proposal again in response to the recent feedback:

  • Improved NaCl description
  • Improved cryptocurrency section in problem statement
  • Improved quantum computing mention
  • Improved botan vs botan-low description
  • Added Future Work section

Barring any additional feedback, I am satisfied with this as a final draft of this proposal.

@kazu-yamamoto

Good work.
If this really happens, I will try to migrate tls from crypton to botan.

@ldillinger
Contributor Author

As there has been no additional feedback since the last update, I suppose now it is time to move on to the final step of this proposal, and officially ask the TWG committee to give a recommendation.

@LaurentRDC
Contributor

As it happens, the TWG is meeting today to discuss this proposal. Stay tuned

@LaurentRDC
Contributor

@ldillinger The TWG has voted to approve this proposal.
The Haskell Foundation Board will meet next week to figure out the logistics; until then, we will keep this proposal open.

@goldfirere
Contributor

Thanks, TWG, for seeing this through!

Just to be clear in communication, especially to @ldillinger, the TWG approval is a recommendation to the HF, not a binding commitment. Of course I want to and expect to follow this recommendation, but we need to stage this among other priorities. We'll discuss at the board meeting coming up on Thursday, and I expect to have more clarity by the end of the week. I'll also follow up more with Leon directly over email.

@ldillinger
Contributor Author

@goldfirere That is understood - I read the proposals documentation thoroughly.

@arybczak

arybczak commented Nov 2, 2023

Any news?

@ldillinger
Contributor Author

The Haskell Foundation has accepted this proposal 🎉

@goldfirere
Contributor

Should this be merged? I'm not super familiar with this repo and don't want to misstep. @gbaz

But yes 🎉 !
