Skip to content
This repository has been archived by the owner on May 2, 2023. It is now read-only.

Provide a TAP device to enclave application #43

Merged
merged 55 commits into from
Jan 24, 2023
Merged

Provide a TAP device to enclave application #43

merged 55 commits into from
Jan 24, 2023

Conversation

NullHypothesis
Copy link
Contributor

@NullHypothesis NullHypothesis commented Nov 21, 2022

This PR does the following:

  • Replace our HTTP-01 Let's Encrypt challenge with TLS-ALPN-01, which simplifies the code and obviates the need for opening port 80.
  • Allow applications to register a SHA-256 over their public key material. Together with the hash over our X509 certificate, this hash is embedded in attestation documents.
  • Update the documentation and add an architecture diagram.
  • Turn nitriding into a stand-alone executable and add it to cmd/.
  • Expose an Internet-facing Web server (for remote attestation, key synchronization, and nonce retrieval) and an enclave-internal Web server (for state operations, signalling readiness, and for requesting key synchronization).
  • Add a handler that allows the application to signal when it's ready, after which nitriding starts its Internet-facing Web server.
  • Add a reverse proxy that terminates HTTPS and forwards HTTP requests to the application (if desired).
  • Add a Python example application to example/.
  • Add networking code that creates an enclave-internal TAP interface which forwards network packets via the VSOCK interface. Most of this is provided by the package gvisor-tap-vsock.

Philipp Winter added 30 commits November 17, 2022 10:32
This fixes #26.
...because nitriding is no longer (just) a package.
Nitriding is going to be running two Web servers: one is public-facing
and meant to be accessed by clients and the other is enclave-facing and
meant to be accessed by the enclave application.
Nitriding is now a stand-alone application, obviating the need for an
API to add HTTP handlers.
The reverse proxy terminates TLS and forwards all but a select few HTTP
requests to the enclave application, which runs its own Web server.
Since the reverse proxy terminates TLS, the enclave application does not
need to bother with certificates and can expose a simple HTTP server.
Non-HTTP enclave applications need a way to link their key material to
the attestation document, which serves as our root of trust.  This
commit adds a new enclave-internal endpoint that allows applications to
register a hash over their public key material.  This hash (along with
a hash over nitriding's HTTPS certificate) is then embedded in
attestation documents.
This enclave-internal HTTP handler allows applications to signal their
readiness to nitriding.  When the handler is called, nitriding starts
its Internet-facing Web server.
Domain sockets are great for high-throughput applications but we don't
need troughput here; we need ease of use.
A simple Python client that retrieves its IP address by connecting to a
Web server.
The endpoint takes as input a SHA-256 hash, so "hash" better reflects
what's going on behind the scenes; "key" is too broad of a term.
This patch makes use of Let's Encrypt's tls-alpn-01 challenge which
is simpler than the http-01 challenge because it does not require a
separate listener on port 80.
Philipp Winter added 4 commits November 22, 2022 09:33
...otherwise, the operation fails.
This commit removes the chi import that's not part of its v5 API.  It
also initializes chi's middleware before we create routes because chi
requires that.
cmd/main.go Show resolved Hide resolved
attestation.go Outdated Show resolved Hide resolved
enclave.go Outdated Show resolved Hide resolved
Comment on lines +41 to +48
// The following paths are handled by nitriding.
pathRoot = "/enclave"
pathNonce = "/enclave/nonce"
pathAttestation = "/enclave/attestation"
pathState = "/enclave/state"
pathSync = "/enclave/sync"
pathHash = "/enclave/hash"
pathReady = "/enclave/ready"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected a subrouter pattern here, so the /enclave prefix is only specified once and it's easy to move all the enclave-related handlers around if necessary. But there aren't many paths and they're all in the same place, so I'm fine with this as-is.

handlers.go Show resolved Hide resolved
handlers.go Show resolved Hide resolved
Philipp Winter added 4 commits November 23, 2022 11:23
This commit 1) ensures that all required command line arguments are
present and 2) validates the arguments' values.  This commit also
changes the type of some configuration variables to more appropriate
types, e.g., uint16 for AF_INET port numbers.
This is going to facilitate the transition to a different hash function.
rillian
rillian previously approved these changes Nov 28, 2022
cmd/main.go Outdated Show resolved Hide resolved
cmd/main.go Outdated
Comment on lines 36 to 37
flag.Int64Var(&hostProxyPort, "host-proxy-port", 1024,
"Port of proxy application running on EC2 host.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be 64 bits because there's no flag.UIntVar? Seems unlikely we'll be running on a 32-bit host, where one would otherwise have to use negative numbers to specify ports with the high bit set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, there is flag.UintVar and we might as well use it for all three ports. I addressed this in a098ea3.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this better!

example/Dockerfile Outdated Show resolved Hide resolved
Philipp Winter added 2 commits November 29, 2022 08:32
There's no reason to allow negative ports and Go's spec guarantees that
a uint is at least 32 bits -- the size of an AF_VSOCK port.
Ralph measured that this saves around 20 MB of space.
rillian
rillian previously approved these changes Nov 29, 2022
Copy link
Member

@kdenhartog kdenhartog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few questions I just want some further clarification on before approving, but overall this is looking safe

nitriding->>nitriding: Set up enclave-internal Web server

nitriding->>+ec2: Packet forwarding
ec2->>+ca: Request HTTPS certificate (via HTTP-01)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non blocking: Is this still accurate now that TLS-ALPN-01 has replaced HTTP-01?

Good to see these docs here!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, that's a mistake. Good catch! Fixed in 94d7d99.

enclave.go Outdated
m.Get(pathAttestation, attestationHandler(e.hashes))
m.Get(pathNonce, nonceHandler(e))
m.Get(pathRoot, rootHandler(e.cfg))
m.Post(pathSync, respSyncHandler(e, time.Now))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking: How might OS time changing affect this? I've heard of a method that MITM network time protocol to make it possible to reuse expired certs and I'm wondering if that may be something we'd encounter here as well.

From what I could tell, there's no actual way to resolve this without redoing NTP so that's why I'm not blocking on it, but I want to bring up in case you have ideas that might help improve this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took another look at the code and noticed that the time argument is not actually used! I deleted it in c850d56.

go.mod Outdated
github.com/songgao/packets v0.0.0-20160404182456-549a10cd4091
github.com/songgao/water v0.0.0-20200317203138-2b4b6d7c09d8
github.com/vishvananda/netlink v1.2.1-beta.2
golang.org/x/crypto v0.3.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for switching from a specific hash for these to now just setting a version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go's tooling made this change because the Go folks finally started using tags for packages in golang.org/x/ (see golang/go#21324). In fact, I just updated to the latest dependencies in beed0b9, which bumps the x/crypto package to v0.5.0.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! good to know this is the direction they're heading to prevent supply chain attacks through redeploying a new package under the same version

func getStateHandler(e *Enclave) http.HandlerFunc {
return func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/octet-stream")
s, err := e.KeyMaterial()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a way to validate that this hasn't been changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Key material can be set in two ways:

  1. Within the enclave, by the trusted enclave application. In this case, we don't need validation because all involved parties are trusted.
  2. Across the Internet (or ideally: across our private Kubernetes network), when enclaves synchronize with each other. In this case, we do mutual authentication using attestation documents.

Does that answer your question? I hope I didn't miss anything.

// putStateHandler returns a handler that lets the enclave application set
// state that's synchronized with another enclave in case of horizontal
// scaling. The state can be arbitrary bytes.
func putStateHandler(e *Enclave) http.HandlerFunc {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do any validation for this endpoint first? I was thinking previously we were validating the attestation of the other server before allowing the state to be set but maybe thats for a different endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, you are probably thinking about the function respSyncHandler. The purpose of this endpoint is to let the (trusted) enclave application set whatever state it wants to be set. It's an enclave-internal, trusted endpoint. We allow the application to set arbitrary bytes, so we limit our validation to enforcing a length limit by taking advantage of newLimitReader.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that was the one I was originally thinking about. Ok, that makes sense for why getStateHandler and putStateHandler are not validating at the nitriding application layer here. I think the aspect that's throwing me off is at what level are we isolating the trust boundary for this. From the code it appears as if this is just a publicly available endpoint, but from the sounds of it this is being isolated at maybe(?) the network layer. Can you point me to where that's being done so I can understand how these two endpoints are being secured better? I don't see an issue with leaving these endpoints unvalidated at this point, but I just want to better understand the attack surface to make sure I'm not misunderstanding something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To isolate public from private endpoints, we are using two separate Web servers: one that binds to localhost (see here) and another (see here) that imposes no such restrictions. I'm going to add a comment to the affected endpoint handlers because this is more confusing than it should be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9edb2c4 adds a few comments, explaining that some of our endpoints are enclave-internal.

Philipp Winter added 2 commits January 20, 2023 08:20
Note that we have to run `go get gvisor.dev/gvisor/runsc@go` in addition
to `go get -u` because we need gvisor/runsc's Go branch:
https://github.com/google/gvisor#using-go-get
rillian
rillian previously approved these changes Jan 20, 2023
@NullHypothesis
Copy link
Contributor Author

@kdenhartog Are you happy with the changes I made? If so, we should be ready to merge!

Copy link
Contributor

@rillian rillian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see the clear comments about which handler is for which interface.

@kdenhartog
Copy link
Member

This looks great! Thanks for taking the time to add the additional comments.

Copy link
Member

@kdenhartog kdenhartog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@NullHypothesis NullHypothesis merged commit 4053741 into master Jan 24, 2023
@NullHypothesis NullHypothesis deleted the tun-tap branch January 24, 2023 21:06
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants