feat: add logging to streams #6924
Draft
LoggingStream is a wrapper for IMAP/SMTP/HTTP session streams. I originally started building it to debug #6477 and similar issues where the IMAP loop gets stuck, hoping that right before the loop gets stuck, or even continuously, there are socket errors such as read timeouts that are not handled correctly. Now I am also thinking about expanding it to measure network performance, e.g. throughput and latency.
We can estimate latency by measuring the time between the last full write() and the first successful read() after it. This should capture e.g. the time between writing "A0001 IDLE" and receiving "+", but not the time spent in a long read() while actually idling. However, this latency may stay low on throttled connections and so not reflect that message downloading is slow: if throttling works as a rate limiter and not as some hack like random packet dropping, latency will not increase.
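To make the latency idea concrete, here is a minimal sketch in Python (not the code in this PR). `LatencyTracker`, `on_write_complete` and `on_read` are hypothetical names; the assumption is that the stream wrapper calls them whenever a full write finishes and whenever a read returns.

```python
import time
from typing import Callable, List, Optional


class LatencyTracker:
    """Hypothetical helper: records the delay between a completed write
    and the first successful read that follows it."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # Time at which the last full write finished, if no read followed yet.
        self._last_write_end: Optional[float] = None
        self.samples: List[float] = []

    def on_write_complete(self) -> None:
        # A full write (e.g. "A0001 IDLE\r\n") just finished: start timing.
        self._last_write_end = self._clock()

    def on_read(self, nbytes: int) -> None:
        # Only the first non-empty read after a write produces a sample.
        # Later reads (e.g. a long read while idling) are ignored.
        if nbytes > 0 and self._last_write_end is not None:
            self.samples.append(self._clock() - self._last_write_end)
            self._last_write_end = None
```

The clock is injectable so the tracker can be tested without sleeping. Clearing `_last_write_end` after the first read is what keeps long idle reads out of the samples.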
For throughput, we can get an estimate by averaging over the intervals that start at the first read after a full write and end at the last read before the next write. We don't run multiple IMAP commands in parallel, so this should be a good estimate.
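One possible reading of the throughput intervals, as a Python sketch and not the PR's implementation. `ThroughputEstimator` and its methods are hypothetical names. It assumes an interval opens when the first read after a write returns, that bytes of that first read are not counted because they arrived before the interval began, and that "averaging" means total bytes divided by total interval time.

```python
import time
from typing import Callable, Optional


class ThroughputEstimator:
    """Hypothetical helper: bytes received per second of read intervals."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._start: Optional[float] = None  # first read after a write
        self._end: Optional[float] = None    # most recent read
        self._bytes = 0                      # bytes within the open interval
        self.total_bytes = 0
        self.total_seconds = 0.0

    def on_read(self, nbytes: int) -> None:
        if nbytes <= 0:
            return
        now = self._clock()
        if self._start is None:
            # First read after a write opens the interval; its bytes
            # arrived before the interval started, so they are not counted.
            self._start = now
        else:
            self._bytes += nbytes
        self._end = now

    def on_write(self) -> None:
        # A write closes the interval that ended at the last read.
        if self._start is not None and self._end is not None:
            if self._end > self._start:
                self.total_bytes += self._bytes
                self.total_seconds += self._end - self._start
        self._start = None
        self._end = None
        self._bytes = 0

    def bytes_per_second(self) -> Optional[float]:
        if self.total_seconds == 0:
            return None  # no usable interval yet
        return self.total_bytes / self.total_seconds
```

Responses that arrive in a single read contribute nothing under these assumptions, so short command replies do not skew the estimate; only multi-read transfers such as message downloads do.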
When connecting to servers, we currently sort IPs as follows: at most two IPs returned from DNS, then the DNS cache, then the rest of the DNS results. If we have throughput estimates, we could instead sort by expected throughput (sampled from the empirical distribution) and avoid connecting to IP addresses that are known to be slow or to time out frequently due to congestion or bad routing over low-rate or high-packet-loss connections.
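The sampling-based ordering could look roughly like the Python sketch below. Everything here is an assumption made for illustration: the function name, the `samples` mapping from address to observed throughput values, recording a timeout as a `0.0` sample, and trying addresses without any samples first so that they get measured at all.

```python
import random
from typing import Dict, List, Optional


def sort_by_sampled_throughput(
    addrs: List[str],
    samples: Dict[str, List[float]],
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Order addresses by one throughput value drawn per address from
    its own observed samples (highest draw first)."""
    rng = rng or random.Random()

    def draw(addr: str) -> float:
        observed = samples.get(addr)
        if not observed:
            # Assumption: addresses never measured are tried first.
            return float("inf")
        # Sampling from the empirical distribution: pick one past observation.
        return rng.choice(observed)

    # Draw exactly once per address so the sort key stays consistent.
    drawn = {addr: draw(addr) for addr in addrs}
    return sorted(addrs, key=lambda addr: drawn[addr], reverse=True)
```

Drawing a sample instead of using the mean means an address with mostly poor but occasionally good results still gets tried now and then, so a route that has recovered can be rediscovered.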