Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce HmacAuthed with secured request's content #57

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

michivi
Copy link
Contributor

@michivi michivi commented Mar 10, 2022

This PR introduces the new Servant combinator HmacAuthed. It is essentially the same as HmacAuth but adds the following:

  • Takes the request's body into account when generating the HMAC signature (HmacAuth ignores it and as a consequence doesn't guarantee the integrity of the request)
  • Ignore the () parameter introduced by HmacAuth used to represent the user.

Discussions

To achieve this, an entirely new module is introduced with a specific client hmacClient and runner runHmacClient. The implementation guidelines behind the module were:

  • An API as close to the original Servant API as possible. That's the reason why these hmacClient and runHmacClient are different from the old versions. It also means that to switch to and from the unsecured equivalent (client and runClientM), the user only need to rename the function call and add the required parameter.
  • Keep the old versions working for smoother transitions.
  • An API no longer using the experimental Servant authentication mechanism. The experimental aspect aside, it also wasn't suited to our needs: we only guarantee the identify of a single user and don't distinguish user from a set of users (though that would be technically possible, with each user having its own secret key), we also need to access the request body for authentication which poses its "consumption problem" in the regular request handling workflow...

Pros

  • Simpler API to use (though that is very subjective, at the very least the API is closer to the Servant API)
  • The integrity of the request's content is guaranteed by the signature.
  • Request hashing is only performed when authentication is required (as opposed to the previous workaround)

Caveats

  • As the new API hashes the request's body content to generate the HMAC signature, it needs to make to copy of it to give to the client / server backend. This copy lives in memory. This would be very expensive for very large requests and streaming. Streaming is also affected by the fact that the different body chunks are only transferred as soon as they have all been inspected for signature. This HMAC authentication implementation is therefore not really suited for streaming. Plus, care needs to be taken and protection put in place to prevent accidental (or voluntary and malicious) DoS attacks with very large requests sent to APIs protected by this library. A possible workaround would be to limit the size of the requests for the protected endpoints.

Breaking changes

  • During implementation, the existing API was kept as-is after all... almost. The forced addition of the Host and Accept-Encoding HTTP headers was removed as it was deemed unnecessary, even harmful for the test cases, for the new implementation in the shared module: the Host may be set by the client and we shouldn't change it and use the existing value for signature. If it's not set, then it should be ok to extract the value from the target URL. As for the Accept-Encoding HTTP header, it doesn't look like a good idea to tell the remote side that we can accept a Gzip encoding if the client didn't say so and if we actually don't (both the client and this library). Perhaps we should drop it off the whitelist of HTTP headers as well: if any intermediate reverse proxy accepts additional encodings, both the client and server shouldn't care about that and see the unharmed body content, no matter what happens in-between. If the client actually supports the Gzip encoding and the library doesn't, that actually doesn't matter as well: this library doesn't need to know that and the Gzip middleware should ensure that the library only sees and authenticates the decoded content with the new workflow.

Questions

  • As it is, the old version of the API (the one that doesn't take the request's content into account) is still functional but not very useful: it only guarantees the caller's identity and integrity of the requested URL (including the query string). There is no integrity check on the content. Though it can be kept in the library, it doesn't seem very useful in terms of security. It doesn't suffer the issue of the very large request / streaming though. What should be done about it?
  • Should we consider the possibility of distinguishing multiple users by their secret key? There are multiple ways of achieving this. But it was not a requirement for this library. Shall we continue with that? If we intend on adding support to that, it might be easier to take it on immediately.
  • See comment on the headers in the Breaking changes section. Can we still reproduce the reverse-proxy issue without the Accept-Encoding header?

Further actions required

  • Discuss this PR.
  • Update the documentation
  • Decide on the old API's future

@michivi
Copy link
Contributor Author

michivi commented Mar 14, 2022

As discussed with @arbus and @JolandaNava , this PR has been changed in the following way:
The client and server APIs have been changed to identify the user and use their secret key for authentication. As opposed to the first proposed version, the user is now provided to the server handler after the given context has identified and authenticated the user. The client API will only need to be provided with the user while the given context will select their secret key and add the necessary information in the HTTP request so that the server is able to identify the user and use their secret key as well.

@michivi
Copy link
Contributor Author

michivi commented Apr 15, 2022

If that's ok with everyone, we can merge this as experimental and create a new version out of it. There should be no real issue as the previous implementation is still available. If there's any feedback, we can still have PRs changing the experimental implementations.
/Cc @JolandaNava @arbus @alexterz

Base automatically changed from tle/hash-request to master April 21, 2022 07:34
@michivi michivi force-pushed the tle/use-request-in-signing-algorithm branch from 576b7aa to c7dcf3e Compare April 21, 2022 07:35
@michivi
Copy link
Contributor Author

michivi commented Apr 21, 2022

@kahlil29 What do you think we should do about this PR? There should be no breaking changes as the previous API is still here. The new proposed API is still experimental, though functional in our project for now.

@kahlil29
Copy link
Member

Hey @michivi
Given that there are no breaking changes, I am in favor of merging and cutting a new release, then we go ahead and test it out and figure out next steps.
I feel like the "older" (existing) API would always need to be kept, since it's being used in multiple projects, internally at Holmusk as well as by other external parties like @ocharles (CircuitHub).

Copy link
Member

@kahlil29 kahlil29 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed description that outlines things quite clearly 🚀

@ocharles
Copy link
Contributor

From my perspective, I wouldn't worry about maintaining an API just for us - we'll adapt our code quick enough!

@michivi michivi mentioned this pull request Apr 21, 2022
@kahlil29
Copy link
Member

@ocharles Thanks for that!
Any other thoughts/opinions on this PR are also appreciated and welcome 👍🏻 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants