Skip to content

Serving files from Nexus -> Sled Agent for the update system #457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 15 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions common/src/api/external/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -569,6 +569,7 @@ pub enum ResourceType {
Project,
Dataset,
Disk,
DownloadArtifact,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we're adding / have added a bunch of stuff here that's only appropriate to the internal API. I'm guessing that's because of the way we try to construct external::Errors from diesel errors using the helper function that takes a resource type. Maybe we should have a different helper that uses a different enum for internal stuff? Doesn't have to be done here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Filed #532 to track.

Instance,
NetworkInterface,
Rack,
Expand Down
78 changes: 78 additions & 0 deletions nexus/src/internal_api/http_entrypoints.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at https://mozilla.org/MPL/2.0/.

use crate::nexus::BASE_ARTIFACT_DIR;
/**
* Handler functions (entrypoints) for HTTP APIs internal to the control plane
*/
Expand All @@ -19,13 +20,16 @@ use dropshot::HttpResponseUpdatedNoContent;
use dropshot::Path;
use dropshot::RequestContext;
use dropshot::TypedBody;
use http::{Response, StatusCode};
use hyper::Body;
use omicron_common::api::internal::nexus::DiskRuntimeState;
use omicron_common::api::internal::nexus::InstanceRuntimeState;
use omicron_common::api::internal::nexus::ProducerEndpoint;
use oximeter::types::ProducerResults;
use oximeter_producer::{collect, ProducerIdPathParams};
use schemars::JsonSchema;
use serde::Deserialize;
use std::path::PathBuf;
use std::sync::Arc;
use uuid::Uuid;

Expand All @@ -44,6 +48,7 @@ pub fn internal_api() -> NexusApiDescription {
api.register(cpapi_producers_post)?;
api.register(cpapi_collectors_post)?;
api.register(cpapi_metrics_collect)?;
api.register(cpapi_artifact_download)?;
Ok(())
}

Expand Down Expand Up @@ -280,3 +285,76 @@ async fn cpapi_metrics_collect(
.instrument_dropshot_handler(&request_context, handler)
.await
}

#[derive(Deserialize, JsonSchema)]
struct AllPath {
path: String,
}

/// Endpoint used by Sled Agents to download cached artifacts.
#[endpoint {
method = GET,
path = "/artifacts/{path}",
}]
async fn cpapi_artifact_download(
request_context: Arc<RequestContext<Arc<ServerContext>>>,
path: Path<AllPath>,
) -> Result<Response<Body>, HttpError> {
let context = request_context.context();
let nexus = &context.nexus;
let mut entry = PathBuf::from(BASE_ARTIFACT_DIR);

// TODO: Most of the below code is ready to accept a multi-component path,
// such as in:
// https://github.com/oxidecomputer/dropshot/blob/78be3deda556a9339ea09f3a9961fd91389f8757/dropshot/examples/file_server.rs#L86-L89
//
// However, openapi does *not* like that currently, so we limit the endpoint
// to only accepting single-component paths.
let path = vec![path.into_inner().path];

for component in &path {
// Dropshot should not provide "." and ".." components.
assert_ne!(component, ".");
assert_ne!(component, "..");
entry.push(component);

if entry.exists() {
// We explicitly prohibit consumers from following symlinks to prevent
// showing data outside of the intended directory.
let m = entry.symlink_metadata().map_err(|e| {
HttpError::for_bad_request(
None,
format!("Failed to query file metadata: {}", e),
)
})?;
if m.file_type().is_symlink() {
return Err(HttpError::for_bad_request(
None,
"Cannot traverse symlinks".to_string(),
));
}
}
}

// Note - at this point, "entry" may or may not actually exist.
// We try to avoid creating intermediate artifacts until we know there
// is something "real" to download, as this would let malformed paths
// create defunct intermediate directories.
if entry.is_dir() {
return Err(HttpError::for_bad_request(
None,
"Directory download not supported".to_string(),
));
}
let body = nexus.download_artifact(&entry).await?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a TOCTOU race here, if the path has changed since we checked it?

I think the traditional approach is to use open(2) and then fstat(2). I haven't dug into it but it looks like you can do this in Rust with File::open and then File::metadata.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I'm looking at the file server - I think that whole example may need a re-work, in addition to this "final entry" check. We also check that the intermediate paths are not symlinks, and I believe this suffers from the same TOCTTOU issue - they could be replaced by symlinks, since we're re-checking the whole path again.

In C, I would normally use openat to resolve this issue, but I don't see this function exposed from the Rust standard library (though I suppose I could use libc directly, or one of the wrapper crates).


// Derive the MIME type from the file name
let content_type = mime_guess::from_path(&entry)
.first()
.map_or_else(|| "text/plain".to_string(), |m| m.to_string());

Ok(Response::builder()
.status(StatusCode::OK)
.header(http::header::CONTENT_TYPE, content_type)
.body(body.into())?)
}
87 changes: 87 additions & 0 deletions nexus/src/nexus.rs
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@ use sled_agent_client::Client as SledAgentClient;
use slog::Logger;
use std::convert::TryInto;
use std::net::SocketAddr;
use std::path::Path;
use std::sync::Arc;
use std::time::Duration;
use steno::SagaId;
Expand Down Expand Up @@ -107,6 +108,8 @@ pub trait TestInterfaces {
) -> CreateResult<db::model::ConsoleSession>;
}

pub static BASE_ARTIFACT_DIR: &str = "/var/tmp/oxide_artifacts";

/**
* Manages an Oxide fleet -- the heart of the control plane
*/
Expand Down Expand Up @@ -2375,6 +2378,90 @@ impl Nexus {
pub async fn session_hard_delete(&self, token: String) -> DeleteResult {
self.db_datastore.session_hard_delete(token).await
}

/// Downloads a file from within [`BASE_ARTIFACT_DIR`].
pub async fn download_artifact<P: AsRef<Path>>(
&self,
path: P,
) -> Result<Vec<u8>, Error> {
let path = path.as_ref();
if !path.starts_with(BASE_ARTIFACT_DIR) {
return Err(Error::internal_error(
"Cannot access path outside artifact directory",
));
}

if !path.exists() {
info!(
self.log,
"Accessing {} - needs to be downloaded",
path.display()
);
// If the artifact doesn't exist, we should download it.
//
// TODO: There also exists the question of "when should we *remove*
// things from BASE_ARTIFACT_DIR", which we should also resolve.
// Demo-quality solution could be "destroy it on boot" or something?
// (we aren't doing that yet).

let file_name = path.strip_prefix(BASE_ARTIFACT_DIR).unwrap();
match file_name.to_str().unwrap() {
// TODO: iliana if you're reading this,
// 1. I'm sorry
// 2. We should probably do something less bad here
//
// At the moment, the only file we "know" how to download is a
// testfile, which is pulled out of thin air. Realistically, we
// should pull this from the DB + query an external server.
// Happy to delete this as soon as we can.
"testfile" => {
// We should only create the intermediate directories
// after validating that this is a real artifact that
// can (and should) be downloaded.
if let Some(parent) = path.parent() {
tokio::fs::create_dir_all(parent).await.map_err(|e| {
Error::internal_error(
&format!("Failed to create intermediate directory: {}", e)
)
})?;
}
tokio::fs::write(path, "testfile contents").await.map_err(
|e| {
Error::internal_error(&format!(
"Failed to write file: {}",
e
))
},
)?;
}
_ => {
return Err(Error::not_found_other(
ResourceType::DownloadArtifact,
file_name.display().to_string(),
));
}
}
} else {
info!(self.log, "Accessing {} - already exists", path.display());
}

// TODO: These artifacts could be quite large - we should figure out how to
// stream this file back instead of holding it entirely in-memory in a
// Vec<u8>.
//
// Options:
// - RFC 7233 - "Range Requests" (is this HTTP/1.1 only?)
// https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
// - "Roll our own". See:
// https://stackoverflow.com/questions/20969331/standard-method-for-http-partial-upload-resume-upload
let body = tokio::fs::read(&path).await.map_err(|e| {
Error::internal_error(&format!(
"Cannot read artifact from filesystem: {}",
e
))
})?;
Ok(body)
}
}

fn generate_session_token() -> String {
Expand Down
73 changes: 73 additions & 0 deletions nexus/tests/integration_tests/artifact_download.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at https://mozilla.org/MPL/2.0/.

use http::method::Method;
use http::StatusCode;
use nexus_test_utils::test_setup;

// Tests the "normal" case of downloading an artifact.
//
// This will typically be invoked by the Sled Agent, after instructed
// to access an artifact.
#[tokio::test]
async fn test_download_known_artifact_returns_ok() {
let cptestctx = test_setup("test_download_known_artifact_returns_ok").await;
let client = &cptestctx.internal_client;

// TODO: Can we replace this with a "real" small file that must be
// downloaded, instead of synthetically created?
let filename = "testfile";
let artifact_get_url = format!("/artifacts/{}", filename);

let response = client
.make_request_no_body(Method::GET, &artifact_get_url, StatusCode::OK)
.await
.unwrap();

assert_eq!(
hyper::body::to_bytes(response.into_body()).await.unwrap(),
"testfile contents"
);
cptestctx.teardown().await;
}

// Tests that missing artifacts return "NOT_FOUND".
#[tokio::test]
async fn test_download_bad_artifact_not_found() {
let cptestctx = test_setup("test_download_bad_artifact_not_found").await;
let client = &cptestctx.internal_client;

let filename = "not_a_real_artifact";
let artifact_get_url = format!("/artifacts/{}", filename);

client
.make_request_error(
Method::GET,
&artifact_get_url,
StatusCode::NOT_FOUND,
)
.await;

cptestctx.teardown().await;
}

// Tests that ".." paths are disallowed by dropshot.
#[tokio::test]
async fn test_download_with_dots_fails() {
let cptestctx = test_setup("test_download_with_dots_fails").await;
let client = &cptestctx.internal_client;

let filename = "hey/can/you/look/../../../../up/the/directory/tree";
let artifact_get_url = format!("/artifacts/{}", filename);

client
.make_request_error(
Method::GET,
&artifact_get_url,
StatusCode::BAD_REQUEST,
)
.await;

cptestctx.teardown().await;
}
1 change: 1 addition & 0 deletions nexus/tests/integration_tests/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
//! See the driver in the parent directory for how and why this is structured
//! the way it is.

mod artifact_download;
mod authn_http;
mod authz;
mod basic;
Expand Down
18 changes: 18 additions & 0 deletions openapi/nexus-internal.json
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,24 @@
"version": "0.0.1"
},
"paths": {
"/artifacts/{path}": {
"get": {
"description": "Endpoint used by Sled Agents to download cached artifacts.",
"operationId": "cpapi_artifact_download",
"parameters": [
{
"in": "path",
"name": "path",
"required": true,
"schema": {
"type": "string"
},
"style": "simple"
}
],
"responses": {}
}
},
"/disks/{disk_id}": {
"put": {
"description": "Report updated state for a disk.",
Expand Down
Loading