Description
First, a fairy story: I have a very big system based on greenlet at work, with dozens of plugins and thousands of greenlets, all in one process. Occasionally, I notice that my system stalls with severe lags, where some plugin instances might wait minutes in line for their turn to run. It's quite obvious that one of the plugins contains a bug, likely doing something blocking, or maybe computing something for a bit too long. If only it were possible to know which plugin or function consumes the most CPU cycles!
As a side note: the usual profiling tools don't work correctly with greenlets. Others use hacks to detect greenlet switches, with workarounds for specific event libraries (e.g. gevent), and are not optimal. The most problematic case is when a switch is made from C code. Python's settrace only traces Python code, so any switch made in C code is invisible until tracing resurfaces in Python code, in some other greenlet, an unknown amount of time later.
I propose a new global function, set_switch_callback, that accepts a callback object as a parameter and returns the previous callback object (None initially). A callback value of None would mean that no callback is called. The callback would execute immediately before an actual C-level switch or throw, with two parameters: the current greenlet and the target greenlet. The callback is only called when those two differ. The callback is set only for the current thread, which makes it thread-safe and similar to settrace. Any exception raised by the callback would be raised by switch/throw in place of the intended action.
This way, tracing and profiling code would have definite information about greenlet switches (and thus would be able to time them), even when switches are made between greenlets in C.
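As an illustration of how a profiler could use such a callback to time greenlets, here is a rough sketch. The class name `SwitchTimer` and its injectable `clock` parameter are my own inventions for the example, and it measures wall-clock time between switches, not CPU time. It only assumes the callback is called with (current, target) right before each switch, as described above.

```python
import time
from collections import defaultdict


class SwitchTimer:
    """Accumulate the time each greenlet spent running between switches."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._last = clock()              # when the running greenlet started
        self.totals = defaultdict(float)  # greenlet -> seconds spent running

    def __call__(self, current, target):
        # Called right before switching away from `current`:
        # charge the elapsed time to the greenlet that was running.
        now = self._clock()
        self.totals[current] += now - self._last
        self._last = now


# Demo with a fake clock and strings standing in for greenlets.
ticks = iter([0.0, 1.5, 2.0])
timer = SwitchTimer(clock=lambda: next(ticks))
timer("plugin_a", "plugin_b")  # plugin_a ran from 0.0 to 1.5
timer("plugin_b", "plugin_a")  # plugin_b ran from 1.5 to 2.0
```

With the proposed API, an instance would be installed via set_switch_callback(timer), and timer.totals would then show which greenlet held the thread the longest.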
What do you think?