Internalize reconnection mechanics and introduce connection pooling #492
Conversation
@@ -52,6 +52,9 @@
     @Param({"10000"})
     public int notificationCount;
+
+    @Param({"1", "4", "8"})
+    public int concurrentConnections;
For the curious, early benchmarks show things to be maybe slightly faster (it's pretty much in the noise) than they were before channel pooling when using a single connection. Unsurprisingly, a single client's throughput goes up significantly when using multiple threads/connections.
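For context, here's a minimal sketch of how a parameter like concurrentConnections might be wired into a JMH benchmark state class. Everything beyond the @Param fields shown in the diff above (the class name, the builder method, the import package, the omitted credential setup) is an assumption for illustration, not necessarily the code in this pull request.

```java
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

import com.relayrides.pushy.apns.ApnsClient;        // package name assumed
import com.relayrides.pushy.apns.ApnsClientBuilder; // package name assumed

// Hypothetical benchmark state class; only the @Param fields come from the diff above.
@State(Scope.Thread)
public class ApnsClientBenchmarkSketch {

    @Param({"10000"})
    public int notificationCount;

    @Param({"1", "4", "8"})
    public int concurrentConnections;

    private ApnsClient client;

    @Setup
    public void setUpClient() throws Exception {
        // Assumed builder method for sizing the client's channel pool; the real
        // name in this pull request may differ. Credential setup is omitted.
        this.client = new ApnsClientBuilder()
                .setConcurrentConnections(this.concurrentConnections)
                .build();
    }
}
```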
This one would be very useful.
Aside from the part where I think the Travis CI team is trolling me by constantly fixing and then un-fixing a weird hostname bug for OpenJDK 7, I think this is ready to roll. I'll sleep on it before I merge it, but anticipate a release some time tomorrow.
For those of you waiting patiently for this change (and release): given the significance of the changes, we want to observe them in production for a little while on our end before we unleash them upon the wider world. I'm hoping for a release at the end of next week.
Our work in #515 prompted some testing with very high levels of leak detection, which increased load and slowed everything down. That, in turn, revealed some sporadic test failures when sending many notifications, which hints at a race condition. We're tracking it down, but it's something we'll need to resolve before the next release.
Here's what I've figured out so far:
In all cases of failure, the client sends a …
The last part is really puzzling to me, and will be the focus of the investigation from here.
Found it. We're attaching response promises to streams asynchronously in the client handler, and in some cases we can get a reply from the server before that promise is in place, which causes a …
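To illustrate the failure mode in isolation, here's a hypothetical sketch; it is not Pushy's actual handler code, and the class, map, and executor names are made up for the example.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the race: the promise for a stream is registered
// asynchronously, so the "reply" for that stream can be handled first and
// find nothing to complete.
public class PromiseRaceSketch {

    private static final Map<Integer, CompletableFuture<String>> promisesByStreamId =
            new ConcurrentHashMap<>();

    public static void main(final String[] args) {
        final ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        final int streamId = 1;
        final CompletableFuture<String> responsePromise = new CompletableFuture<>();

        // Racy: the promise is attached on another thread, after the write "completes."
        eventLoop.execute(() -> promisesByStreamId.put(streamId, responsePromise));

        // Meanwhile, the reply for the same stream may be processed before the task above runs.
        final CompletableFuture<String> promise = promisesByStreamId.get(streamId);

        if (promise == null) {
            System.out.println("Reply arrived before the promise was attached; nothing to complete.");
        } else {
            promise.complete("reply");
        }

        eventLoop.shutdown();
    }
}
```

Attaching the promise synchronously, before the reply can possibly be handled, closes that window; I'm assuming that's roughly the shape of the fix in ed1d764 based on the description above.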
Everything is fixed in ed1d764.
After several false starts, I'm pleased to share this work-in-progress attempt at channel pooling. In the past, the model was that a single ApnsClient would have a single connection to the APNs server. In this pull request, each client instead has an ApnsChannelPool, which manages access to one or more connections. Connections are created on demand, which eliminates the notion of explicitly connecting (or reconnecting) an ApnsClient.

Because the model is changing significantly, this change will introduce lots of breaking API changes. I'm very interested in all of your feedback.
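To make the new model concrete, here's a hedged sketch of what using a pooled client might look like from a caller's perspective. The package name, builder methods, pool-sizing option, credential handling, and send/close method names are all assumptions based on this description, not necessarily the final API.

```java
import com.relayrides.pushy.apns.ApnsClient;                      // package name assumed
import com.relayrides.pushy.apns.ApnsClientBuilder;               // package name assumed
import com.relayrides.pushy.apns.util.SimpleApnsPushNotification; // package name assumed

import java.io.File;

public class PooledClientSketch {

    public static void main(final String[] args) throws Exception {
        // No explicit connect() call: connections are created on demand by the
        // client's internal channel pool.
        final ApnsClient client = new ApnsClientBuilder()
                .setClientCredentials(new File("/path/to/certificate.p12"), "password")
                .setConcurrentConnections(4) // assumed name for sizing the pool
                .build();

        final SimpleApnsPushNotification notification =
                new SimpleApnsPushNotification("device-token", "com.example.app", "{\"aps\":{}}");

        // Sending a notification acquires a channel from the pool (creating one
        // if necessary), so there is no separate reconnection step to manage.
        client.sendNotification(notification).await();

        client.close().await(); // assumed shutdown method
    }
}
```

The key point is the absence of any connect/reconnect step: acquiring a channel, and creating one when none is available, is the client's own responsibility.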
Again, I emphasize that this is a work in progress, but I do think it should be fairly representative of the shape of things to come. I'm also hopeful that this will resolve the host of reconnection issues folks have been reporting lately. In particular, I believe this:
TODO: