Description
Follow-up to #921, which we merged and which I was hoping to do some post-merge follow-up on. This issue will track some of that follow-up.
- docs: Various updates for --progress-fd #961
- integration tests: tests: Add some sanity checking of --json-fd #1017
- size concerns
size concerns
In the PR we kept going back and forth on whether the progress data should require the client to keep track of state or not. @antheas was calling this "normalization". There's clear simplicity in having the protocol be stateless.
However...looking at the layer progress (one of the most important phases) in a local run here during a download, we keep accumulating and emitting more data (completed subtasks) for each layer. In this run the final ProgressBytes I got was 12 KiB which...is a bit more than I was expecting.
Does that size actually matter? Well...obviously we're talking about local IPC so 12k isn't large exactly, but we are also trying to emit this data at a relatively high frequency. Note that a default Linux pipe buffer is 64k, so if the reader slows down a bit that's only a maximum of 5 messages queued before we fill the pipe.
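To make the pipe arithmetic concrete, here is a rough sketch. It assumes the default 64 KiB pipe capacity mentioned above and ignores any framing overhead; `messages_before_full` is a name made up for illustration.

```python
# Default Linux pipe capacity in bytes (64 KiB), as discussed above.
DEFAULT_PIPE_CAPACITY = 64 * 1024


def messages_before_full(message_size: int, capacity: int = DEFAULT_PIPE_CAPACITY) -> int:
    """Return how many whole messages of `message_size` bytes fit in the pipe."""
    return capacity // message_size


# ~12 KiB per message, as seen in the local run.
print(messages_before_full(12 * 1024))  # 5
# ~25 KiB per message, the projected size with 100 layers.
print(messages_before_full(25 * 1024))  # 2
```

So at the projected 25k size, a reader that stalls for well under a second at ~5 messages per second would already be blocking the writer.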
The size of this progress data will grow with the number of layers...let's say in the future we start using 100 layers (not at all an unreasonable thing!) that'd double to 25k of JSON etc.
This is data we're serializing and having the caller deserialize ~5 times a second, competing with actual real work (like decompression, sha256 etc.).
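One way to get a feel for the deserialization cost is a quick `timeit` run over a synthetic message. This is only a sketch: the payload shape and field names below are invented for the benchmark and are not the real progress schema, and the numbers will vary by machine.

```python
import json
import timeit


def make_payload(num_layers: int) -> str:
    """Build a synthetic progress message; all field names here are hypothetical."""
    subtasks = [
        {
            "subtask": "layer",
            "id": f"sha256:{i:064x}",
            "bytes": 50_000_000,
            "bytes_total": 50_000_000,
        }
        for i in range(num_layers)
    ]
    return json.dumps({"type": "ProgressBytes", "task": "pulling", "subtasks": subtasks})


payload = make_payload(50)
runs = 1_000
# Time only the parse step, which is what the reading side pays per message.
seconds = timeit.timeit(lambda: json.loads(payload), number=runs)
print(f"{len(payload)} bytes, {seconds / runs * 1e6:.1f} usec per parse")
```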
OK, well, benchmarking this: obviously CPUs are super fast, and Python parses it in 44.4 usec here...but...still...it wouldn't be a huge burden at all to have a stateful protocol either.
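For comparison, here is a minimal sketch of what the client side of a stateful variant could look like. Everything in it is an assumption for illustration: the `layer_start` and `layer_progress` message types, their fields, and the `ProgressState` class are not part of the current protocol.

```python
import json


class ProgressState:
    """Client-side state rebuilt from small delta messages (hypothetical schema)."""

    def __init__(self) -> None:
        self.layers: dict[str, dict] = {}

    def apply(self, line: str) -> None:
        """Update the state from one JSON message."""
        msg = json.loads(line)
        if msg["type"] == "layer_start":
            self.layers[msg["id"]] = {"bytes": 0, "bytes_total": msg["bytes_total"]}
        elif msg["type"] == "layer_progress":
            self.layers[msg["id"]]["bytes"] = msg["bytes"]

    def total_fetched(self) -> int:
        """Sum the bytes fetched so far across all known layers."""
        return sum(layer["bytes"] for layer in self.layers.values())


# Each message describes only what changed, so its size does not grow
# with the number of layers already completed.
messages = [
    json.dumps({"type": "layer_start", "id": "a", "bytes_total": 100}),
    json.dumps({"type": "layer_progress", "id": "a", "bytes": 40}),
    json.dumps({"type": "layer_start", "id": "b", "bytes_total": 200}),
    json.dumps({"type": "layer_progress", "id": "a", "bytes": 100}),
    json.dumps({"type": "layer_progress", "id": "b", "bytes": 50}),
]

state = ProgressState()
for line in messages:
    state.apply(line)

print(state.total_fetched())  # 150
```

The tradeoff is the one discussed in the PR: the client has to track state and cannot join mid-stream without some kind of snapshot message, but the per-message size stays roughly constant.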