Reintroduce h2 work originally done in #286 (#307)
mtrudel authored Mar 14, 2024
1 parent 26dc5e1 commit fff06ef
Showing 29 changed files with 1,719 additions and 1,694 deletions.
27 changes: 27 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,30 @@
## 1.4.0 (TBD)

### Enhancements

* Complete refactor of HTTP/2. Improved process model is MUCH easier to
understand and yields about a 10% performance boost to HTTP/2 requests (#286)

### Changes

* **BREAKING CHANGE** The HTTP/2 header size limit options have been deprecated,
and have been replaced with a single `max_header_block_size` option. The setting
defaults to 50k bytes, and refers to the size of the compressed header block
as sent on the wire (including any continuation frames)
* We no longer log if processes that are linked to an HTTP/2 stream process
terminate unexpectedly. This has always been unspecified behaviour so is not
considered a breaking change
* Calls of `Plug.Conn` functions for an HTTP/2 connection must now come from the
stream process; any other process will raise an error. Again, this has always
been unspecified behaviour
* Reading the body of an HTTP/2 request after it has already been read will
return `{:ok, ""}` instead of raising a `Bandit.BodyAlreadyReadError` as it
previously did
* We now send RST_STREAM frames if we complete a stream and the remote end is
still open. This optimizes cases where the client may still be sending a body
that we never consumed and don't care about
* We no longer explicitly close the connection when we receive a GOAWAY frame
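
As a rough illustration of the breaking change above, a child spec that sets the new option might look like the sketch below. `MyApp.Router` is a hypothetical Plug module, and the value shown only restates the documented default; treat this as an assumption-laden example rather than a prescribed migration.

```elixir
# Earlier releases accepted three separate HTTP/2 header limits:
#   max_header_key_length, max_header_value_length, max_header_count
# From 1.4.0 a single limit applies to the compressed header block, in bytes.
children = [
  {Bandit,
   # MyApp.Router is a hypothetical Plug module used only for illustration
   plug: MyApp.Router,
   # 50_000 is the documented default, shown here only to make the option explicit
   http_2_options: [max_header_block_size: 50_000]}
]
```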

## 1.3.0 (8 Mar 2024)

### Enhancements
15 changes: 5 additions & 10 deletions lib/bandit.ex
@@ -124,12 +124,9 @@ defmodule Bandit do
Options to configure the HTTP/2 stack in Bandit
* `enabled`: Whether or not to serve HTTP/2 requests. Defaults to true
* `max_header_key_length`: The maximum permitted length of any single header key
(expressed as the number of decompressed bytes) in an HTTP/2 request. Defaults to 10_000 bytes
* `max_header_value_length`: The maximum permitted length of any single header value
(expressed as the number of decompressed bytes) in an HTTP/2 request. Defaults to 10_000 bytes
* `max_header_count`: The maximum permitted number of headers in an HTTP/2 request.
Defaults to 50 headers
* `max_header_block_size`: The maximum permitted length of a field block of an HTTP/2 request
(expressed as the number of compressed bytes). Includes any concatenated block fragments from
continuation frames. Defaults to 50_000 bytes
* `max_requests`: The maximum number of requests to serve in a single
HTTP/2 connection before closing the connection. Defaults to 0 (no limit)
* `default_local_settings`: Options to override the default values for local HTTP/2
@@ -142,9 +139,7 @@ defmodule Bandit do
"""
@type http_2_options :: [
enabled: boolean(),
max_header_key_length: pos_integer(),
max_header_value_length: pos_integer(),
max_header_count: pos_integer(),
max_header_block_size: pos_integer(),
max_requests: pos_integer(),
default_local_settings: Bandit.HTTP2.Settings.t(),
compress: boolean(),
@@ -199,7 +194,7 @@ defmodule Bandit do

@top_level_keys ~w(plug scheme port ip keyfile certfile otp_app cipher_suite display_plug startup_log thousand_island_options http_1_options http_2_options websocket_options)a
@http_1_keys ~w(enabled max_request_line_length max_header_length max_header_count max_requests gc_every_n_keepalive_requests log_unknown_messages log_protocol_errors compress deflate_options)a
@http_2_keys ~w(enabled max_header_key_length max_header_value_length max_header_count max_requests default_local_settings compress deflate_options)a
@http_2_keys ~w(enabled max_header_block_size max_requests default_local_settings compress deflate_options)a
@websocket_keys ~w(enabled max_frame_size validate_text_frames compress)a
@thousand_island_keys ThousandIsland.ServerConfig.__struct__()
|> Map.from_struct()
68 changes: 37 additions & 31 deletions lib/bandit/http2/README.md
@@ -1,7 +1,8 @@
# HTTP/2 Handler

Included in this folder is a complete `ThousandIsland.Handler` based implementation of HTTP/2 as
defined in [RFC 9113](https://datatracker.ietf.org/doc/rfc9113).
defined in [RFC 9110](https://datatracker.ietf.org/doc/rfc9110) & [RFC
9113](https://datatracker.ietf.org/doc/rfc9113)

## Process model

@@ -10,24 +11,31 @@ Within a Bandit server, an HTTP/2 connection is modeled as a set of processes:
* 1 process per connection, a `Bandit.HTTP2.Handler` module implementing the
`ThousandIsland.Handler` behaviour, and;
* 1 process per stream (i.e.: per HTTP request) within the connection, implemented as
a `Bandit.HTTP2.StreamTask` Task
a `Bandit.HTTP2.StreamProcess` process

Each of these processes models the majority of its state via a
`Bandit.HTTP2.Connection` or `Bandit.HTTP2.Stream` struct, respectively.

The lifetimes of these processes correspond to their role; a connection process lives for as long
as a client is connected, and a stream process lives only as long as is required to process
a single stream request within a connection.
a single stream request within a connection.

Connection processes are the 'root' of each connection's process group, and are supervised by
Thousand Island in the same manner that `ThousandIsland.Handler` processes are usually supervised
(see the [project README](https://github.com/mtrudel/thousand_island) for details).

Stream processes are not supervised by design. The connection process starts new stream processes as required, and does so
once a complete header block for a new stream has been received. It starts stream processes via
a standard `start_link` call, and manages the termination of the resultant linked stream processes
by handling `{:EXIT,...}` messages as described in the Elixir documentation. This approach is
aligned with the realities of the HTTP/2 model, insofar as if a connection process terminates
there is no reason to keep its constituent stream processes around, and if a stream process dies
the connection should be able to handle this without itself terminating. It also means that our
process model is very lightweight - there is no extra supervision overhead present because no such
Stream processes are not supervised by design. The connection process starts new
stream processes as required, via a standard `start_link`
call, and manages the termination of the resultant linked stream processes by
handling `{:EXIT,...}` messages as described in the Elixir documentation. Each
stream process stays alive long enough to fully model an HTTP/2 stream,
beginning its life in the `:init` state and ending it in the `:closed` state (or
else by a stream or connection error being raised). This approach is aligned
with the realities of the HTTP/2 model, insofar as if a connection process
terminates there is no reason to keep its constituent stream processes around,
and if a stream process dies the connection should be able to handle this
without itself terminating. It also means that our process model is very
lightweight - there is no extra supervision overhead present because no such
supervision is required for the system to function in the desired way.
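
The unsupervised, link-based approach described above can be sketched roughly as follows. This is a minimal illustration, not Bandit's actual code: `ConnectionSketch` and its functions are hypothetical names, and a real connection process does far more than track stream ids.

```elixir
defmodule ConnectionSketch do
  # Hypothetical stand-in for a connection process; not Bandit's actual module.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)

  # Starts a linked, unsupervised process for the given stream id.
  def start_stream(conn, stream_id, fun),
    do: GenServer.call(conn, {:start_stream, stream_id, fun})

  # Returns the ids of streams whose processes are still being tracked.
  def active_streams(conn), do: GenServer.call(conn, :active_streams)

  @impl true
  def init(_opts) do
    # Trap exits so that a dying stream process arrives as a message
    # instead of taking the connection process down with it.
    Process.flag(:trap_exit, true)
    {:ok, %{streams: %{}}}
  end

  @impl true
  def handle_call({:start_stream, stream_id, fun}, _from, state) do
    pid = spawn_link(fun)
    {:reply, {:ok, pid}, put_in(state.streams[pid], stream_id)}
  end

  def handle_call(:active_streams, _from, state) do
    {:reply, Map.values(state.streams), state}
  end

  @impl true
  def handle_info({:EXIT, pid, _reason}, state) do
    # Whether the exit was normal or not, the stream is finished:
    # forget it and keep serving the remaining streams.
    {:noreply, update_in(state.streams, &Map.delete(&1, pid))}
  end
end
```

Because the link is bidirectional, the stream processes are also torn down if the connection process exits abnormally, which is the other half of the behaviour the paragraph describes.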

## Reading client data
@@ -40,13 +48,15 @@ looks like the following:
2. Frames are parsed from these bytes by calling the `Bandit.HTTP2.Frame.deserialize/2`
function. If successful, the parsed frame(s) are returned. We retain any unparsed bytes in
a buffer in order to attempt parsing them upon receipt of subsequent data from the client
3. Parsed frames are passed into the `Bandit.HTTP2.Connection` module along with a struct of
same module. Frames are applied against this struct in a vaguely FSM-like manner, using pattern
matching within the `Bandit.HTTP2.Connection.handle_frame/3` function. Any side-effects of
received frames are applied in these functions, and an updated connection struct is returned to
represent the updated connection state. These side-effects can take the form of starting stream
tasks, conveying data to running stream tasks, responding to the client with various frames, or
any number of other actions
3. Parsed frames are passed into the `Bandit.HTTP2.Connection` module along with a struct of
the same module. Frames are processed via the `Bandit.HTTP2.Connection.handle_frame/3` function.
Connection-level frames are handled within the `Bandit.HTTP2.Connection`
struct, and stream-level frames are passed along to the corresponding stream
process, which is wholly responsible for managing all aspects of a stream's
state (which is tracked via the `Bandit.HTTP2.Stream` struct). The one
exception to this is the handling of frames sent to streams which have
already been closed (and whose corresponding processes have thus terminated).
Any such frames are discarded without effect.
4. This process is repeated every time we receive data from the client until the
`Bandit.HTTP2.Connection` module indicates that the connection should be closed, either
normally or due to error. Note that frame deserialization may end up returning a connection
@@ -58,19 +68,15 @@ looks like the following:

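The frame-parsing step in this section (parse what is complete, buffer the rest) can be illustrated with a small sketch. It relies only on the 9-byte frame header layout from RFC 9113 (24-bit length, 8-bit type, 8-bit flags, one reserved bit, 31-bit stream identifier). `FrameSketch` is a hypothetical module, not `Bandit.HTTP2.Frame`, and it skips the size checks and per-type decoding a real implementation needs.

```elixir
defmodule FrameSketch do
  # Hypothetical helper for illustration; Bandit's own logic lives in Bandit.HTTP2.Frame.

  # Splits a buffer into the complete frames it contains plus any leftover bytes.
  # Returns {frames, rest}, where each frame is {type, flags, stream_id, payload}.
  def parse_all(buffer, acc \\ []) do
    case buffer do
      <<length::24, type::8, flags::8, _reserved::1, stream_id::31,
        payload::binary-size(length), rest::binary>> ->
        parse_all(rest, [{type, flags, stream_id, payload} | acc])

      _incomplete ->
        # Not enough bytes for another full frame: keep them for the next read.
        {Enum.reverse(acc), buffer}
    end
  end
end
```

A caller would append newly received bytes to the returned remainder and call `FrameSketch.parse_all/1` again, which mirrors the buffering behaviour described in step 2.
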
## Processing requests

The details of a particular stream are contained within a `Bandit.HTTP2.Stream` struct
(as well as a `Bandit.HTTP2.StreamTask` process in the case of active streams). The
`Bandit.HTTP2.StreamCollection` module manages a collection of streams, allowing for the memory
efficient management of complete & yet unborn streams alongside active ones.

Once a complete header block has been read, a `Bandit.HTTP2.StreamTask` is started to manage the
actual calling of the configured `Plug` module for this server, using the `Bandit.HTTP2.Adapter`
module as the implementation of the `Plug.Conn.Adapter` behaviour. This adapter uses a simple
`receive` pattern to listen for messages sent to it from the connection process, a pattern chosen
because it allows for easy provision of the blocking-style API required by the `Plug.Conn.Adapter`
behaviour. Functions in the `Bandit.HTTP2.Adapter` behaviour which write data to the client use
`GenServer` calls to the `Bandit.HTTP2.Handler` module in order to pass data to the connection
process.
The state of a particular stream is contained within a `Bandit.HTTP2.Stream`
struct, maintained within a `Bandit.HTTP2.StreamProcess` process. As part of the
stream's lifecycle, the server's configured Plug is called, with an instance of
the `Bandit.HTTP2.Adapter` struct being used to interface with the Plug. There
is a separation of concerns between the aspect of HTTP semantics managed by
`Bandit.HTTP2.Adapter` (roughly, those concerns laid out in
[RFC9110](https://datatracker.ietf.org/doc/html/rfc9110)) and the more
transport-specific HTTP/2 concerns managed by `Bandit.HTTP2.Stream` (roughly the
concerns specified in [RFC9113](https://datatracker.ietf.org/doc/html/rfc9113)).

# Testing
