The LineWriter (used by stdout) searches for a newline in each .write() call. As compiled today, this search becomes a byte-by-byte loop. That makes it a very hot loop, and unnecessarily so if the data you output is not line-delimited. Using memchr or an equivalent would improve this.
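
To illustrate the difference, here is a rough sketch of the two approaches, using only the standard library. The function names `find_newline_naive` and `find_newline_wordwise` are made up for this example and are not taken from the LineWriter source. The word-at-a-time version is a simplified stand-in for memchr; a real memchr would typically be more heavily optimized.

```rust
/// Byte-by-byte search, roughly the shape of loop described above.
/// (Hypothetical helper, not the actual LineWriter code.)
fn find_newline_naive(buf: &[u8]) -> Option<usize> {
    buf.iter().position(|&b| b == b'\n')
}

/// Word-at-a-time search: tests 8 bytes per iteration and only falls
/// back to a byte scan inside a chunk that is known to contain b'\n'.
/// (Hypothetical helper, a simplified stand-in for memchr.)
fn find_newline_wordwise(buf: &[u8]) -> Option<usize> {
    const LO: u64 = 0x0101_0101_0101_0101;
    const HI: u64 = 0x8080_8080_8080_8080;
    // b'\n' repeated in every byte of the word (0x0A0A...0A).
    const NEEDLE: u64 = LO * (b'\n' as u64);

    let mut offset = 0;
    let mut chunks = buf.chunks_exact(8);
    for chunk in &mut chunks {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(chunk);
        // XOR turns every b'\n' byte into 0x00.
        let x = u64::from_ne_bytes(bytes) ^ NEEDLE;
        // Classic "does this word contain a zero byte" test.
        if x.wrapping_sub(LO) & !x & HI != 0 {
            // Locate the exact position within this 8-byte chunk.
            if let Some(i) = chunk.iter().position(|&b| b == b'\n') {
                return Some(offset + i);
            }
        }
        offset += 8;
    }
    // Fewer than 8 bytes left: scan them one at a time.
    chunks
        .remainder()
        .iter()
        .position(|&b| b == b'\n')
        .map(|i| offset + i)
}
```

Both functions return the index of the first newline, so one can be swapped for the other. The benefit of the second shows up mainly on long buffers that contain no newline at all, which is the case described above.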