-
Notifications
You must be signed in to change notification settings - Fork 10
Provide a TAP device to enclave application #43
Conversation
This fixes #26.
...because nitriding is no longer (just) a package.
Nitriding is going to be running two Web servers: one is public-facing and meant to be accessed by clients and the other is enclave-facing and meant to be accessed by the enclave application.
Nitriding is now a stand-alone application, obviating the need for an API to add HTTP handlers.
The reverse proxy terminates TLS and forwards all but a select few HTTP requests to the enclave application, which runs its own Web server. Since the reverse proxy terminates TLS, the enclave application does not need to bother with certificates and can expose a simple HTTP server.
Non-HTTP enclave applications need a way to link their key material to the attestation document, which serves as our root of trust. This commit adds a new enclave-internal endpoint that allows applications to register a hash over their public key material. This hash (along with a hash over nitriding's HTTPS certificate) is then embedded in attestation documents.
This enclave-internal HTTP handler allows applications to signal their readiness to nitriding. When the handler is called, nitriding starts its Internet-facing Web server.
Domain sockets are great for high-throughput applications but we don't need troughput here; we need ease of use.
A simple Python client that retrieves its IP address by connecting to a Web server.
The endpoint takes as input a SHA-256 hash, so "hash" better reflects what's going on behind the scenes; "key" is too broad of a term.
This patch makes use of Let's Encrypt's tls-alpn-01 challenge which is simpler than the http-01 challenge because it does not require a separate listener on port 80.
...otherwise, the operation fails.
This commit removes the chi import that's not part of its v5 API. It also initializes chi's middleware before we create routes because chi requires that.
// The following paths are handled by nitriding. | ||
pathRoot = "/enclave" | ||
pathNonce = "/enclave/nonce" | ||
pathAttestation = "/enclave/attestation" | ||
pathState = "/enclave/state" | ||
pathSync = "/enclave/sync" | ||
pathHash = "/enclave/hash" | ||
pathReady = "/enclave/ready" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I expected a subrouter pattern here, so the /enclave
prefix is only specified once and it's easy to move all the enclave-related handlers around if necessary. But there aren't many paths and they're all in the same place, so I'm fine with this as-is.
This commit 1) ensures that all required command line arguments are present and 2) validates the arguments' values. This commit also changes the type of some configuration variables to more appropriate types, e.g., uint16 for AF_INET port numbers.
This is going to facilitate the transition to a different hash function.
cmd/main.go
Outdated
flag.Int64Var(&hostProxyPort, "host-proxy-port", 1024, | ||
"Port of proxy application running on EC2 host.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has to be 64 bits because there's no flag.UIntVar
? Seems unlikely we'll be running on a 32-bit host, where one would otherwise have to use negative numbers to specify ports with the high bit set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, there is flag.UintVar
and we might as well use it for all three ports. I addressed this in a098ea3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this better!
This adds clarity.
There's no reason to allow negative ports and Go's spec guarantees that a uint is at least 32 bits -- the size of an AF_VSOCK port.
Ralph measured that this saves around 20 MB of space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few questions I just want some further clarification on before approving, but overall this is looking safe
doc/architecture.md
Outdated
nitriding->>nitriding: Set up enclave-internal Web server | ||
|
||
nitriding->>+ec2: Packet forwarding | ||
ec2->>+ca: Request HTTPS certificate (via HTTP-01) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non blocking: Is this still accurate now that TLS-ALPN-01
has replaced HTTP-01
?
Good to see these docs here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, that's a mistake. Good catch! Fixed in 94d7d99.
enclave.go
Outdated
m.Get(pathAttestation, attestationHandler(e.hashes)) | ||
m.Get(pathNonce, nonceHandler(e)) | ||
m.Get(pathRoot, rootHandler(e.cfg)) | ||
m.Post(pathSync, respSyncHandler(e, time.Now)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-blocking: How might OS time changing affect this? I've heard of a method that MITM network time protocol to make it possible to reuse expired certs and I'm wondering if that may be something we'd encounter here as well.
From what I could tell, there's no actual way to resolve this without redoing NTP so that's why I'm not blocking on it, but I want to bring up in case you have ideas that might help improve this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took another look at the code and noticed that the time argument is not actually used! I deleted it in c850d56.
go.mod
Outdated
github.com/songgao/packets v0.0.0-20160404182456-549a10cd4091 | ||
github.com/songgao/water v0.0.0-20200317203138-2b4b6d7c09d8 | ||
github.com/vishvananda/netlink v1.2.1-beta.2 | ||
golang.org/x/crypto v0.3.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for switching from a specific hash for these to now just setting a version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go's tooling made this change because the Go folks finally started using tags for packages in golang.org/x/ (see golang/go#21324). In fact, I just updated to the latest dependencies in beed0b9, which bumps the x/crypto package to v0.5.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! good to know this is the direction they're heading to prevent supply chain attacks through redeploying a new package under the same version
func getStateHandler(e *Enclave) http.HandlerFunc { | ||
return func(w http.ResponseWriter, r *http.Request) { | ||
w.Header().Set("Content-Type", "application/octet-stream") | ||
s, err := e.KeyMaterial() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a way to validate that this hasn't been changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Key material can be set in two ways:
- Within the enclave, by the trusted enclave application. In this case, we don't need validation because all involved parties are trusted.
- Across the Internet (or ideally: across our private Kubernetes network), when enclaves synchronize with each other. In this case, we do mutual authentication using attestation documents.
Does that answer your question? I hope I didn't miss anything.
// putStateHandler returns a handler that lets the enclave application set | ||
// state that's synchronized with another enclave in case of horizontal | ||
// scaling. The state can be arbitrary bytes. | ||
func putStateHandler(e *Enclave) http.HandlerFunc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to do any validation for this endpoint first? I was thinking previously we were validating the attestation of the other server before allowing the state to be set but maybe thats for a different endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, you are probably thinking about the function respSyncHandler. The purpose of this endpoint is to let the (trusted) enclave application set whatever state it wants to be set. It's an enclave-internal, trusted endpoint. We allow the application to set arbitrary bytes, so we limit our validation to enforcing a length limit by taking advantage of newLimitReader
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, that was the one I was originally thinking about. Ok, that makes sense for why getStateHandler
and putStateHandler
are not validating at the nitriding application layer here. I think the aspect that's throwing me off is at what level are we isolating the trust boundary for this. From the code it appears as if this is just a publicly available endpoint, but from the sounds of it this is being isolated at maybe(?) the network layer. Can you point me to where that's being done so I can understand how these two endpoints are being secured better? I don't see an issue with leaving these endpoints unvalidated at this point, but I just want to better understand the attack surface to make sure I'm not misunderstanding something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9edb2c4 adds a few comments, explaining that some of our endpoints are enclave-internal.
Note that we have to run `go get gvisor.dev/gvisor/runsc@go` in addition to `go get -u` because we need gvisor/runsc's Go branch: https://github.com/google/gvisor#using-go-get
@kdenhartog Are you happy with the changes I made? If so, we should be ready to merge! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to see the clear comments about which handler is for which interface.
This looks great! Thanks for taking the time to add the additional comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This PR does the following:
cmd/
.example/
.