io-context overwhelmed by small 16KB TLS reads limits high-throughput scenarios #3062

@hofst

Description

We use Boost.Beast to read from S3 in high-throughput scenarios: files of hundreds of GB, on EC2 machines with 192 cores and 100 Gb/s network interfaces to S3.
The io-context is limited to ~500k operations per second and saturates at ~8 threads on our machines. Adding more threads just burns CPU cycles fighting for locks.
500k operations per second should be plenty to achieve 8+ GB/s throughput, but it isn't: every read_some operation reads only a single TLS fragment (max size 16 KB), even when all buffers (socket and user-space) are decently sized. This caps throughput at 3 GB/s while still being inefficient, because the io-context is kept busy with tiny read operations.
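To make the arithmetic behind these numbers explicit, here is a small back-of-the-envelope sketch. The function names are hypothetical, and it assumes every read delivers full 16 KiB records, which real traffic may not.

```cpp
#include <cassert>
#include <cstdint>

// Maximum TLS record payload (2^14 bytes).
constexpr std::uint64_t kMaxTlsRecordBytes = 16 * 1024;

// Upper bound on bytes/s when each completed read delivers
// `records_per_read` full-size TLS records. (Hypothetical helper.)
inline std::uint64_t throughput_ceiling(std::uint64_t reads_per_sec,
                                        std::uint64_t records_per_read) {
    assert(records_per_read > 0);
    return reads_per_sec * records_per_read * kMaxTlsRecordBytes;
}

// Reads per second needed to sustain `target_bytes_per_sec`, rounded up.
// (Hypothetical helper.)
inline std::uint64_t reads_needed(std::uint64_t target_bytes_per_sec,
                                  std::uint64_t records_per_read) {
    assert(records_per_read > 0);
    const std::uint64_t bytes_per_read = records_per_read * kMaxTlsRecordBytes;
    return (target_bytes_per_sec + bytes_per_read - 1) / bytes_per_read;
}
```

With one record per read, `throughput_ceiling(500000, 1)` is 8,192,000,000 bytes/s, which matches the "8+ GB/s" expectation. The observed 3 GB/s corresponds to `reads_needed(3000000000, 1)`, roughly 183k full-size reads per second. That gap would be consistent with each read_some costing more than one io-context operation, or with records smaller than the maximum; the numbers above do not say which.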
I tried a hacky Boost modification that reads more than one TLS fragment per read_some operation. It promptly increased throughput by 3x, which is close to the theoretical maximum of the machine, all while requiring fewer operations (and less CPU utilization) in the io-context.
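The idea of coalescing several fragments into one operation can be illustrated with a toy model. This is not the actual patch and does not use Boost's API: `MockRecordStream` and `read_coalesced` are hypothetical names, and the mock simply pretends that decrypted data is already available in 16 KiB records.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Maximum TLS record payload (2^14 bytes).
constexpr std::size_t kRecordSize = 16 * 1024;

// Hypothetical stand-in for a TLS stream: each call yields at most one record.
struct MockRecordStream {
    std::size_t pending_bytes;  // payload still available to read

    explicit MockRecordStream(std::size_t pending) : pending_bytes(pending) {}

    // Copies at most one record into `out`; returns bytes copied (0 when drained).
    std::size_t read_some(char* out, std::size_t capacity) {
        const std::size_t n = std::min({pending_bytes, kRecordSize, capacity});
        std::fill(out, out + n, 'x');  // placeholder payload
        pending_bytes -= n;
        return n;
    }
};

struct ReadStats {
    std::size_t bytes;  // total bytes delivered to the caller
    std::size_t calls;  // per-record reads folded into this one operation
};

// One logical operation: keep pulling records until the caller's buffer
// is full or the stream has nothing more to give.
inline ReadStats read_coalesced(MockRecordStream& stream, std::vector<char>& buf) {
    ReadStats stats{0, 0};
    while (stats.bytes < buf.size()) {
        const std::size_t n =
            stream.read_some(buf.data() + stats.bytes, buf.size() - stats.bytes);
        if (n == 0) break;  // drained
        stats.bytes += n;
        ++stats.calls;
    }
    return stats;
}
```

In this model a 256 KiB buffer is filled by one coalesced operation that folds in 16 record reads, so the scheduler would see one completion instead of 16. A real implementation would also have to stop when no further data is already buffered, rather than wait for the network inside a single operation.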

I don't see any way to control this without modifying Boost itself. Am I missing something, or is this scenario currently not covered?
