client.createReadableStream() #44
Hmm, ok, this sucks. Can you provide a gist that's similar enough to your example? Then I'll use trace to see what's really going on.
Why does levelup turn buffers into objects? Are you using
And thanks a lot for bringing this issue here!
I have an idea. Right now the streams are just piped directly; however, when the client pauses the stream, the data may keep buffering on the server side. If this is correct, then the solution is to use a similar method to TCP. Hmm, mux-demux might need a change to implement this, but it would be quite simple. I wanted to have this when I first wrote mux-demux. But first we should confirm this is the problem at hand.
I'm seeing a similar issue with regard to point 2 of this issue. I have a client that opens several read streams over the same rpc channel concurrently. During high load, the server process CPU spikes and memory leaks: over the course of 8 days, my multilevel server process consumes all the RAM and is killed by the OOM killer. I had a theory that multiple read streams were buffering on the server due to the busy rpc stream connection, but based on @dominictarr's comment, it sounds like that might not be the case.
I'm seeing the same issue. A helper script that we use streams the entire database into a custom transform stream in order to tidy things up. If we run it as-is, the server runs at 100% CPU and consumes memory until it gets killed. Our solution is to batch things up, but the extra code isn't pretty and we're losing out on the elegance of node's streams. It does seem like the backpressure signal is not getting through.
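For reference, the batching workaround mentioned above usually looks something like the following. The key-range options (`gt`, `limit`) are modeled on levelup's `createReadStream()` options; the in-memory "db" and the `readBatch`/`scanAll` helpers are hypothetical stand-ins so the sketch runs on its own.

```javascript
// Stand-in for a level database: 25 sorted key/value entries.
const db = new Map(
  Array.from({ length: 25 }, (_, i) => [`key-${String(i).padStart(3, '0')}`, i])
);

// Return up to `limit` entries with key > `gt`, in key order.
function readBatch(db, gt, limit) {
  return [...db.entries()]
    .filter(([k]) => k > gt)
    .sort(([a], [b]) => (a < b ? -1 : 1))
    .slice(0, limit);
}

// Pull the whole keyspace in bounded batches instead of one huge stream,
// so at most `limit` records are ever in flight at once.
function scanAll(db, limit) {
  const out = [];
  let cursor = '';
  for (;;) {
    const batch = readBatch(db, cursor, limit);
    if (batch.length === 0) break;
    out.push(...batch);
    cursor = batch[batch.length - 1][0]; // resume after the last key seen
  }
  return out;
}

console.log(scanAll(db, 10).length); // → 25, fetched 10 at a time
```

This keeps memory bounded, but as the commenter says, it trades away the one-pipe elegance that streams are supposed to give you.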
Ok, can someone help the maintainers by creating a script / gist that shows exactly this problem?
Hello - thanks again for the nice work on multilevel. It works great for the most part and is making my life much easier. I recently ran into an issue with `createReadableStream()`.

My scenario: I have a 60 GB level database with about 1.5M keys. In one process (my master process) I am traversing the database, doing a 5-6 second operation on each item in batches of 10 or so, and updating those items: a `createReadableStream()` piped into a number of in-process 'workers', each of which calls `db.put()` when it's done. In another process I am using multilevel to get a readable stream from my master process's database and doing a different set of operations using another set of workers. The output of the client process goes into a separate system, so the multilevel client process calls nothing other than a single `multilevel.createReadableStream()` when it boots up.

I have run into a few issues specifically related to the readable stream that I'd like to bring up. I'm probably missing something in how I'm using it.
I apologize if my examples are kinda hard to understand. I spent the better part of Wednesday thrashing on this, fiddling with various things very unscientifically without taking extremely detailed notes. I eventually came to the conclusion that something in the multilevel "stack" is either buffering endlessly or being very inefficient with allocations. As a temporary solution I created a module to handle streaming large result sets from leveldb more efficiently. It only works with leveldb because it uses leveldown to avoid turning records from binary into objects and directly back into binary to go out over the network. I've dropped it in as an addition to multilevel, and CPU usage is down from 100% pegged when a client is connected to barely increasing at all; the client can read rows much more quickly, and when the client disconnects the server CPU usage returns to an idle state.
https://github.com/brianc/node-level-readable
I think the main differences are: instead of using rpc-stream it uses a custom, lightweight binary protocol (based on the PostgreSQL client/server protocol), avoids object allocations on the server, and uses node's streams2 machinery internally to respond to backpressure as best I can make it. I'd be happy to try to work this into a pull request instead of having it be a separate thing, but it is tied directly to leveldown (using `db.iterator()`) and doesn't conform to the rest of the library's use of rpc-stream, so I'm not sure how you feel about that.

It's not battle-hardened yet by any means, and lacks documentation because it's not quite ready, but some folks on the ##leveldb IRC channel suggested I drop by and talk about it a bit. It has proved to be faster and to use less memory and CPU. It's completely not general-purpose like multilevel, but I thought maybe it could be useful.
Thanks again and sorry for the ramblings!