Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Factors] Change how incoming-request.authority is set. #2723

Merged
merged 3 commits into from
Aug 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions crates/factor-outbound-http/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@ mod wasi;
pub mod wasi_2023_10_18;
pub mod wasi_2023_11_10;

use std::net::SocketAddr;

use anyhow::Context;
use http::{
uri::{Authority, Parts, PathAndQuery, Scheme},
Expand Down Expand Up @@ -101,6 +103,16 @@ pub struct SelfRequestOrigin {
}

impl SelfRequestOrigin {
pub fn create(scheme: Scheme, addr: &SocketAddr) -> anyhow::Result<Self> {
Ok(SelfRequestOrigin {
scheme,
authority: addr
.to_string()
.parse()
.with_context(|| format!("address '{addr}' is not a valid authority"))?,
})
}

pub fn from_uri(uri: &Uri) -> anyhow::Result<Self> {
Ok(Self {
scheme: uri.scheme().context("URI missing scheme")?.clone(),
Expand Down
10 changes: 10 additions & 0 deletions crates/trigger-http2/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,10 @@ pub(crate) type InstanceState = ();

/// The Spin HTTP trigger.
pub struct HttpTrigger {
/// The address the server will listen on.
///
/// Note that this might not be the actual socket address that ends up being bound to.
/// If the port is set to 0, the actual address will be determined by the OS.
listen_addr: SocketAddr,
tls_config: Option<TlsConfig>,
router: Router,
Expand Down Expand Up @@ -131,6 +135,12 @@ impl Trigger for HttpTrigger {
.await
.with_context(|| format!("Unable to listen on {listen_addr}"))?;

// Get the address the server is actually listening on
// We can't use `self.listen_addr` because it might not
// be fully resolved (e.g, port 0).
let listen_addr = listener
.local_addr()
.context("failed to retrieve address server is listening on")?;
let server = Arc::new(HttpServer::new(
listen_addr,
trigger_app,
Expand Down
3 changes: 2 additions & 1 deletion crates/trigger-http2/src/outbound_http.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ use std::{
sync::Arc,
};

use http::uri::Scheme;
use spin_factor_outbound_http::{
HostFutureIncomingResponse, InterceptOutcome, OutgoingRequestConfig, Request, SelfRequestOrigin,
};
Expand Down Expand Up @@ -42,7 +43,7 @@ impl spin_factor_outbound_http::OutboundHttpInterceptor for OutboundHttpIntercep
let server = self.server.clone();
let resp_fut = async move {
match server
.handle_trigger_route(req, route_match, CHAINED_CLIENT_ADDR)
.handle_trigger_route(req, route_match, Scheme::HTTP, CHAINED_CLIENT_ADDR)
.await
{
Ok(resp) => Ok(Ok(IncomingResponse {
Comment on lines 43 to 49
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should be setting the SelfRequestOrigin unconditionally in this interceptor. It seems like self requests ought to work in service-chained handlers? 🤔

Expand Down
64 changes: 48 additions & 16 deletions crates/trigger-http2/src/server.rs
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ use crate::{
};

pub struct HttpServer {
/// The address the server is listening on.
listen_addr: SocketAddr,
trigger_app: TriggerApp,
router: Router,
Expand Down Expand Up @@ -74,7 +75,8 @@ impl HttpServer {
self.print_startup_msgs("http", &listener)?;
loop {
let (stream, client_addr) = listener.accept().await?;
self.clone().serve_connection(stream, client_addr);
self.clone()
.serve_connection(stream, Scheme::HTTP, client_addr);
}
}

Expand All @@ -88,28 +90,32 @@ impl HttpServer {
loop {
let (stream, client_addr) = listener.accept().await?;
match acceptor.accept(stream).await {
Ok(stream) => self.clone().serve_connection(stream, client_addr),
Ok(stream) => self
.clone()
.serve_connection(stream, Scheme::HTTPS, client_addr),
Err(err) => tracing::error!(?err, "Failed to start TLS session"),
}
}
}

/// Handles incoming requests using an HTTP executor.
///
/// This method handles well known paths and routes requests to the handler when the router
/// matches the requests path.
async fn handle(
self: &Arc<Self>,
mut req: Request<Body>,
scheme: Scheme,
server_scheme: Scheme,
client_addr: SocketAddr,
) -> anyhow::Result<Response<Body>> {
set_req_uri(&mut req, scheme, self.listen_addr)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be done at the top of instrumented_service_fn? The http_span! there emits the scheme for o11y. As a bonus it would avoid needing to thread the server_scheme param through here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting it in instrumented_service_fn moves the logic up the call stack where we need to move it down the call stack.

The call graph is: instrumented_service_fn -> handle -> handle_trigger_route. The handler for service chained calls calls handle_trigger_route which is why we moved set_req_uri to that function - the service chained request also needs its uri manipulated. If we were to move that logic up to instrumented_service_fn, the issue this change is trying to address would persist.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah gotcha. Perhaps it would make sense to split the scheme and host updates? 🤔

strip_forbidden_headers(&mut req);

spin_telemetry::extract_trace_context(&req);

tracing::info!("Processing request on URI {}", req.uri());

let path = req.uri().path().to_string();

tracing::info!("Processing request on path '{path}'");

// Handle well-known spin paths
if let Some(well_known) = path.strip_prefix(spin_http::WELL_KNOWN_PREFIX) {
return match well_known {
Expand All @@ -124,19 +130,22 @@ impl HttpServer {

match self.router.route(&path) {
Ok(route_match) => {
self.handle_trigger_route(req, route_match, client_addr)
self.handle_trigger_route(req, route_match, server_scheme, client_addr)
.await
}
Err(_) => Self::not_found(NotFoundRouteKind::Normal(path.to_string())),
}
}

/// Handles a successful route match.
pub async fn handle_trigger_route(
self: &Arc<Self>,
req: Request<Body>,
mut req: Request<Body>,
route_match: RouteMatch,
server_scheme: Scheme,
client_addr: SocketAddr,
) -> anyhow::Result<Response<Body>> {
set_req_uri(&mut req, server_scheme.clone())?;
let app_id = self
.trigger_app
.app()
Expand All @@ -155,9 +164,7 @@ impl HttpServer {
let mut instance_builder = self.trigger_app.prepare(component_id)?;

// Set up outbound HTTP request origin and service chaining
let uri = req.uri();
let origin = SelfRequestOrigin::from_uri(uri)
.with_context(|| format!("invalid request URI {uri:?}"))?;
let origin = SelfRequestOrigin::create(server_scheme, &self.listen_addr)?;
instance_builder
.factor_builders()
.outbound_http()
Expand Down Expand Up @@ -259,6 +266,7 @@ impl HttpServer {
fn serve_connection<S: AsyncRead + AsyncWrite + Unpin + Send + 'static>(
self: Arc<Self>,
stream: S,
server_scheme: Scheme,
client_addr: SocketAddr,
) {
task::spawn(async move {
Expand All @@ -267,7 +275,11 @@ impl HttpServer {
.serve_connection(
TokioIo::new(stream),
service_fn(move |request| {
self.clone().instrumented_service_fn(client_addr, request)
self.clone().instrumented_service_fn(
server_scheme.clone(),
client_addr,
request,
)
}),
)
.await
Expand All @@ -279,6 +291,7 @@ impl HttpServer {

async fn instrumented_service_fn(
self: Arc<Self>,
server_scheme: Scheme,
client_addr: SocketAddr,
request: Request<Incoming>,
) -> anyhow::Result<Response<HyperOutgoingBody>> {
Expand All @@ -291,7 +304,7 @@ impl HttpServer {
body.map_err(wasmtime_wasi_http::hyper_response_error)
.boxed()
}),
Scheme::HTTP,
server_scheme,
client_addr,
)
.await;
Expand Down Expand Up @@ -322,11 +335,30 @@ impl HttpServer {

/// The incoming request's scheme and authority
///
/// The incoming request's URI is relative to the server, so we need to set the scheme and authority
fn set_req_uri(req: &mut Request<Body>, scheme: Scheme, addr: SocketAddr) -> anyhow::Result<()> {
/// The incoming request's URI is relative to the server, so we need to set the scheme and authority.
/// The `Host` header is used to set the authority. This function will error if no `Host` header is
/// present or if it is not parsable as an `Authority`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on hyperium/hyper#1612 I believe this should also consider the req.uri().authority(), verifying that they are identical if both are present.

fn set_req_uri(req: &mut Request<Body>, scheme: Scheme) -> anyhow::Result<()> {
let uri = req.uri().clone();
let mut parts = uri.into_parts();
let authority = format!("{}:{}", addr.ip(), addr.port()).parse().unwrap();
let headers = req.headers();
let host_header = headers
.get(http::header::HOST)
.context("missing 'Host' header")?
.to_str()
.context("'Host' header is not valid UTF-8")?;
let authority = host_header
.parse()
.context("'Host' header contains an invalid authority")?;
// Ensure that if `req.authority` is set, it matches what was in the `Host` header
// https://github.com/hyperium/hyper/issues/1612
if let Some(a) = parts.authority.as_ref() {
if a != &authority {
return Err(anyhow::anyhow!(
"authority in 'Host' header does not match authority in URI"
));
}
}
Comment on lines +355 to +361
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I wasn't very clear. I believe it is valid to have just the host header, just the uri authority, or both set, where in the "both" case they must match.

Egads this seems complicated: https://www.rfc-editor.org/rfc/rfc9113.html#section-8.3.1-2.3.3

parts.scheme = Some(scheme);
parts.authority = Some(authority);
*req.uri_mut() = Uri::from_parts(parts).unwrap();
Expand Down
Loading