Description
The stream log command writes the data and then closes the connection, but the receiver sometimes does not receive all of it.
It seems to be the cause of the CI failures in elastic/integrations#11075, elastic/integrations#11224 (this one for both tls and tcp), and elastic/integrations#10620.
Reproduce the bug
It can be reproduced as follows:
In one terminal, run this:
while true; do
    echo Running the server...
    echo
    openssl s_server -accept 4433 -naccept 1 \
        -cert ~/.elastic-package/profiles/default/certs/elastic-agent/cert.pem \
        -key ~/.elastic-package/profiles/default/certs/elastic-agent/key.pem \
        2>&1 | pv -L 100k | tee output.log
    echo
    if grep -q "ERROR" output.log; then
        echo "Error detected. Stopping."
        break
    fi
done
In another terminal, from the root of the integrations repository, repeatedly run the following until the loop in the first terminal stops with an error:
stream log --delay=1s --addr localhost:4433 -p=tls --insecure packages/cyberarkpas/_dev/deploy/docker/sample_logs/audit/*.log
This will trigger an error on my system within a few runs. If necessary, try lowering the limit set by pv to apply additional backpressure on the server.
When there is an error, the stream log output will show that all files were sent, but the server log file will show that less data was received, and it will end with an ERROR.
Possible fixes
Closing a connection will usually ensure that all written data is sent. For TLS connections, it may be necessary after writing to keep the read side open for longer, so that the peer can accept the sent data without error.
The more frequently observed failures and the reproduction were for TLS, but it may not be a TLS-only issue.