-
Notifications
You must be signed in to change notification settings - Fork 31
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution. I think you're running with an older Tornado. Please update and retest.
nb2kg/handlers.py
Outdated
def open(self, kernel_id, *args, **kwargs): | ||
"""Handle web socket connection open to notebook server and delegate to gateway web socket handler """ | ||
loop = IOLoop.current() | ||
|
||
self.ping_callback = PeriodicCallback(self.send_ping, WS_PING_INTERVAL_MS, io_loop=loop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the io_loop
parameter. It was removed in Tornado 5.0 and deprecated in 4.1. It currently produces the following repeatedly on systems running Tornado >= 5.0...
[E 07:53:09.523 NotebookApp] Uncaught exception GET /api/kernels/e26308d6-517d-40b3-a5ef-3da204c73923/channels?session_id=70fb168f96ef449a80de44728d1f906d (::1)
HTTPServerRequest(protocol='http', host='localhost:8888', method='GET', uri='/api/kernels/e26308d6-517d-40b3-a5ef-3da204c73923/channels?session_id=70fb168f96ef449a80de44728d1f906d', version='HTTP/1.1', remote_ip='::1')
Traceback (most recent call last):
File "/opt/anaconda3/envs/nb2kg-dev/lib/python3.7/site-packages/tornado/websocket.py", line 952, in _accept_connection
open_result = handler.open(*handler.open_args, **handler.open_kwargs)
File "/opt/anaconda3/envs/nb2kg-dev/lib/python3.7/site-packages/nb2kg/handlers.py", line 106, in open
self.ping_callback = PeriodicCallback(self.send_ping, WS_PING_INTERVAL_MS, io_loop=loop)
TypeError: __init__() got an unexpected keyword argument 'io_loop'
Once removed, I see expected behavior for startup (although I didn't test behind a proxy).
nb2kg/handlers.py
Outdated
@@ -86,8 +89,23 @@ def get(self, kernel_id, *args, **kwargs): | |||
self.kernel_id = cast_unicode(kernel_id, 'ascii') | |||
yield gen.maybe_future(super(WebSocketChannelsHandler, self).get(kernel_id=kernel_id, *args, **kwargs)) | |||
|
|||
def send_ping(self): | |||
if self.stream.closed() and self.ping_callback is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm seeing the following every 30 seconds. Since the callback is never stopped, this error gets logged every 30 seconds even after the kernel is shutdown...
[E 08:11:10.512 NotebookApp] Exception in callback <bound method WebSocketChannelsHandler.send_ping of <nb2kg.handlers.WebSocketChannelsHandler object at 0x1038c29b0>>
Traceback (most recent call last):
File "/opt/anaconda3/envs/nb2kg-dev/lib/python3.7/site-packages/tornado/ioloop.py", line 907, in _run
return self.callback()
File "/opt/anaconda3/envs/nb2kg-dev/lib/python3.7/site-packages/nb2kg/handlers.py", line 93, in send_ping
if self.stream.closed() and self.ping_callback is not None:
AttributeError: 'NoneType' object has no attribute 'closed'
It may be trickier than just testing self.stream
for None
since the connection code uses 'detach' and I'm not certain how that actually works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be an issue with the move to Tornado 6. See this update in notebook: jupyter/notebook@5828300#diff-808f61143410aa59f6ac185458495d66
However, I suspect there's a bug there in that self.ws_connection
is really not accessible to the WebSocketMixin
class or I don't understand how multiple inheritance works in Python. This member variable is defined in WebSocketHandler
, which is the sibling super-class to ZMQStreamHandler
.
That said, since NB2KG doesn't use a mixin here, you're probably fine using self.ws_connection
in the condition.
This brings up an interesting issue. How to dictate dependency versions? As a result, I think we should probably bump the minimal Notebook version to 5.7.8. |
@kevin-bates Oh.. sorry. I tested this only in my env. My fault..T^T. I deliberately changed tornado version dependency for compatibility with other server extensions using older version of tornado. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates. I had a couple more comments/questions.
nb2kg/handlers.py
Outdated
@@ -40,6 +40,9 @@ | |||
KG_CONNECT_TIMEOUT = float(os.getenv('KG_CONNECT_TIMEOUT', 20.0)) | |||
KG_REQUEST_TIMEOUT = float(os.getenv('KG_REQUEST_TIMEOUT', 20.0)) | |||
|
|||
# Keepalive ping interval (30 seconds) | |||
WS_PING_INTERVAL_MS = 30000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make this value a function of a seconds-based environment variable (e.g., KG_WS_PING_INTERVAL_SECS
) that defaults to 30? I'd like the option of being able to change this w/o building.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll fix this as you recommended. BTW do you have any plan to move those settings from Environment Variable to Configuration File such as ini, yaml, or even .py
dealing with traits
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update.
No plans on creating formal config values for these since NB2KG is embedded in Notebook (in master tip) where these have been promoted to formal config values (with env support). Once NB 6 is released, there won't be a need to deal with a separate NB2KG server extension (although that approach would still work provided once doesn't start NB with --gateway-url
).
nb2kg/handlers.py
Outdated
self.ping(b'') | ||
|
||
def on_pong(self, data): | ||
self.log.debug('Client has sent pong') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the Notebook code implements a timeout mechanism where on_pong
captures the last pong time. Is that not necessary here (between Notebook and Gateway)? If not, can we remove this method of comment out the debug statement and use pass
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need it because KG already have logic for culling idle kernel. It will cover culling websocket of no response.
When nb2kg server is behind the proxy closing idle stream connection such as nginx and envoy websocket is closed and reconnected. This behavior updates kernel `last_activity` so that it keeps `kernel_gateway` from culling kernels. This change is from `WebsocketMixin` in jupyter/notebook. Signed-off-by: Eunsoo Park <esevan.park@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good Eunsoo - thank you.
When nb2kg server is behind the proxy closing idle stream connection such as nginx and envoy websocket is closed and reconnected.
This behavior updates kernel
last_activity
so that it keepskernel_gateway
from culling kernels.This change is from
WebsocketMixin
in jupyter/notebook.Signed-off-by: Eunsoo Park esevan.park@gmail.com