Description
I'm trying to transfer a large amount of data (~50-100MB, in 1MB chunks) through a binary[1] websocket, but it's ridiculously slow: it can take 8-10 seconds to transfer 50MB of data on localhost, while both my client and server use 100% CPU. After a bit of profiling, it turned out that more than 99% of the total runtime is spent here:
protocol-websocket/lib/protocol/websocket/frame.rb
Lines 105 to 107 in 6280c19
Collecting the relevant parts into a standalone ruby file:
require 'securerandom'
mask = SecureRandom.bytes(4)
data = ' ' * (1024*1024*50)
@payload = String.new.b
for i in 0...data.bytesize do
  @payload << (data.getbyte(i) ^ mask.getbyte(i % 4))
end
Running this code takes about 8.6s on my machine (the initialization part accounts for about 0.09s; Ruby 3.1.2, amd64, Gentoo Linux).
Doing it in place and replacing the modulo with a bitwise AND is a bit faster, about 7.3s:
(0...data.bytesize).each do |i|
  data.setbyte(i, data.getbyte(i) ^ mask.getbyte(i & 3))
end
(Interestingly Range#each is faster than the for loop.) Is it possible to do something about this other than writing a C extension?
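One pure-Ruby direction that might help is to XOR four bytes at a time by unpacking the payload into 32-bit words, so the interpreter runs a quarter as many loop iterations. The following is only a sketch: `mask_words` is a hypothetical helper, not something protocol-websocket provides, and I haven't benchmarked it against the loops above. It also allocates an intermediate array of integers, so memory use goes up.

```ruby
require 'securerandom'

# Hypothetical helper (not part of protocol-websocket): XORs `data` with a
# 4-byte `mask`, processing one 32-bit word per iteration instead of one byte.
def mask_words(data, mask)
  raise ArgumentError, 'mask must be 4 bytes' unless mask.bytesize == 4

  key = mask.unpack1('N')               # the mask as one big-endian 32-bit integer
  tail_size = data.bytesize % 4         # 0..3 bytes left after the last full word
  head_size = data.bytesize - tail_size

  # XOR every full 32-bit word against the mask, then pack back into bytes.
  words = data.byteslice(0, head_size).unpack('N*')
  words.map! { |word| word ^ key }
  result = words.pack('N*')

  # head_size is a multiple of 4, so the tail starts at mask offset 0.
  tail = Array.new(tail_size) { |i| data.getbyte(head_size + i) ^ mask.getbyte(i) }
  result << tail.pack('C*')
end

mask = SecureRandom.bytes(4)
data = ' ' * (1024 * 1024)
masked = mask_words(data, mask)
# XOR masking is its own inverse, so applying it twice restores the input.
restored = mask_words(masked, mask)
```

Big-endian (`N`) is used for both the mask and the data so that byte 0 of each word lines up with byte 0 of the mask, which keeps the result identical to the byte-wise loop.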
For the time being, I've worked around the problem by changing mask: true to mask: false here: https://github.com/socketry/async-websocket/blob/a63151d0759edde5ca5cd9ffa0414d9f2e295ee8/lib/async/websocket/client.rb#L39 which works on my development setup, but AFAIK it's a violation of the WebSocket spec, so it could break with proxies and whatever.
[1]: I'm using a handler class like this, because transferring binary data in JSON is not fun:
class BinaryWebSocket < Async::WebSocket::Connection
  def parse x; x; end
  def dump x; x; end
end