
Documentation for server.Policy does not agree with implementation #194

Closed
zenhack opened this issue Dec 5, 2021 · 4 comments


zenhack commented Dec 5, 2021

The documentation for server.Policy indicates that when the limits are exceeded calls will immediately return exceptions. However, when pairing to debug #189, @lthibault and I discovered this is not actually what happens. Instead, the receive loop just blocks until an item in the queue clears.

However, I do not actually think what the documentation describes is what we want; this would mean lost messages if the sender is sending too fast, and it seems like it would be challenging to program around. We really want some kind of backpressure like that provided by the C++ implementation's streaming support.

I propose the following roadmap for dealing with memory/backpressure instead:

  1. Short term, fix the docs to describe what the code actually does.
  2. Slightly longer term, get rid of the bounded queue and replace it with an unbounded one.
  3. Remove the server.Policy type entirely.
  4. Implement something like the C++ implementation's per-connection flow limit, for some soft backpressure at the connection level, probably as a field on rpc.Options.
  5. Implement per-object flow control as is done in the C++ implementation for streaming specifically. I think we can do this for all clients irrespective of the streaming annotation, without changing any APIs; we just need to block the caller in appropriate places.
  6. In rpc.Options, add a field for a hard limit on memory consumed by outstanding rpc messages (of all types, including returns), after which the connection will just be dropped.
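To make the per-connection flow limit in step 4 concrete, here is a minimal sketch in Go. Everything in it (the `flowLimiter` type and its method names) is invented for illustration and is not the actual rpc package API; the point is only that a sender blocks under a byte budget instead of receiving exceptions or losing messages:

```go
package main

import (
	"fmt"
	"sync"
)

// flowLimiter is a hypothetical per-connection byte budget for
// in-flight calls. Callers block in acquire until earlier calls
// return, giving soft backpressure rather than errors.
type flowLimiter struct {
	mu       sync.Mutex
	cond     *sync.Cond
	inFlight int64 // bytes of calls currently outstanding
	limit    int64 // maximum bytes allowed in flight
}

func newFlowLimiter(limit int64) *flowLimiter {
	fl := &flowLimiter{limit: limit}
	fl.cond = sync.NewCond(&fl.mu)
	return fl
}

// acquire blocks until size bytes fit under the limit.
func (fl *flowLimiter) acquire(size int64) {
	fl.mu.Lock()
	defer fl.mu.Unlock()
	for fl.inFlight+size > fl.limit {
		fl.cond.Wait()
	}
	fl.inFlight += size
}

// release frees budget when a call's return arrives.
func (fl *flowLimiter) release(size int64) {
	fl.mu.Lock()
	defer fl.mu.Unlock()
	fl.inFlight -= size
	fl.cond.Broadcast()
}

func main() {
	fl := newFlowLimiter(1 << 16) // illustrative 64 KiB budget
	fl.acquire(40_000)
	done := make(chan struct{})
	go func() {
		fl.acquire(40_000) // blocks until the first call's budget is released
		close(done)
	}()
	fl.release(40_000)
	<-done
	fmt.Println("second call admitted after release")
}
```

Because `acquire` blocks instead of failing, a too-fast sender simply slows down, which is the backpressure behavior the roadmap is after.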

Thoughts?

@zombiezen, particularly since I'm proposing removing something you added for v3, I would appreciate your thoughts.

@zombiezen (Contributor) commented:

> The documentation for server.Policy indicates that when the limits are exceeded calls will immediately return exceptions. [...] Instead, the receive loop just blocks until an item in the queue clears.

Yeah, it looks like I introduced that in 27748fa without updating the docs. Sorry.

> However, I do not actually think what the documentation describes is what we want; this would mean lost messages if the sender is sending too fast, and it seems like it would be challenging to program around. We really want some kind of backpressure like that provided by the C++ implementation's streaming support.

I agree: server.Policy seems like a kludge, and it would be a better developer experience to have backpressure than to have a capability reject method calls. My memory of the details is hazy, but it seems like I might have realized the mistake and was trying to move more toward backpressure.


I did spill a lot of words at the time I was crunching through this that might help explain: https://groups.google.com/g/capnproto/c/A4LiXXd1t94/m/GkFH1AK-AgAJ

The fundamental insight I have is that Cap'n Proto RPC is a stream of work plans across all capabilities. You necessarily need to queue under the assumption that traffic will be bursty, but there are situations that can deadlock if not enough work resolves in time.

IIUC, if there's protocol-level support for applying backpressure to individual objects while preserving E-Order, that would be better than the state of the world when I was trying to hack in server.Policy.

An unbounded queue does sound potentially dangerous, but I think you're addressing that with the rpc.Options memory limit? That does match with my insight above.
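A bounded, ordered queue per object is one way to picture "backpressure while preserving E-Order": the sender blocks when the window is full, but delivery order is never disturbed. This is purely an illustration in Go (the window size and channel are invented for the example), not the rpc package's actual mechanism:

```go
package main

import "fmt"

func main() {
	const window = 2 // outstanding-call window for one capability
	calls := make(chan int, window)

	done := make(chan struct{})
	go func() {
		// Delivery order equals send order: E-Order is preserved.
		for id := range calls {
			fmt.Println("delivered call", id)
		}
		close(done)
	}()

	for id := 1; id <= 5; id++ {
		// A full window blocks the sender (backpressure) without
		// reordering or rejecting any call.
		calls <- id
	}
	close(calls)
	<-done
}
```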


zenhack commented Dec 13, 2021

> An unbounded queue does sound potentially dangerous, but I think you're addressing that with the rpc.Options memory limit? That does match with my insight above.

Right; the idea is that rather than relying on a queue bound for memory limiting, we'd track memory manually. But under normal circumstances we would expect never to reach the memory limit: it's not for backpressure, it's a stopgap against abuse, and we just drop the connection if it's breached. We rely on cooperative mechanisms for "normal" backpressure. See also #198, where I started working on per-object backpressure.
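A hypothetical sketch of that manual tracking, in Go. The `memoryAccountant` type, its field names, and the 1 MiB cap are all invented for illustration (the real limit would presumably live on rpc.Options); the point is that the limit is a hard abuse stopgap whose only response is dropping the connection:

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var errOverLimit = errors.New("rpc: outstanding message memory limit exceeded")

// memoryAccountant tracks bytes held by outstanding RPC messages
// of all types, including returns. It never blocks: breaching the
// limit means the peer is misbehaving, so the connection is dropped.
type memoryAccountant struct {
	used  atomic.Int64
	limit int64
}

// grow records size bytes for a newly received message; a non-nil
// error tells the caller to drop the connection.
func (a *memoryAccountant) grow(size int64) error {
	if a.used.Add(size) > a.limit {
		return errOverLimit
	}
	return nil
}

// shrink releases budget when a message is fully handled.
func (a *memoryAccountant) shrink(size int64) { a.used.Add(-size) }

func main() {
	a := &memoryAccountant{limit: 1 << 20} // illustrative 1 MiB hard cap
	fmt.Println(a.grow(1 << 19))           // prints <nil>: well under the cap
	if err := a.grow(1 << 20); err != nil {
		fmt.Println("dropping connection:", err)
	}
}
```

Unlike the cooperative flow-control mechanisms, this accountant returns an error rather than waiting: by design it should only ever trip on abusive or broken peers.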

I vaguely remember the mailing list thread. See also https://groups.google.com/g/capnproto/c/wbGvhHaBan4, which is from a couple of weeks ago.

@zombiezen (Contributor) commented:

Cool. Sounds like the right direction; let me know if there's anything else I can help with.


zenhack commented Aug 27, 2022

Everything on my roadmap above is done except for (1) the connection-level limits and (2) adaptive flow control (which to be fair the C++ implementation doesn't really have anyway; we already have a non-adaptive version). (2) is in progress as #294. I'm going to open a separate issue for the connection limits and close this one.
