-
-
Notifications
You must be signed in to change notification settings - Fork 718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable mypy in CI 2/2 #5348
Enable mypy in CI 2/2 #5348
Conversation
services: "dict | None" = None, | ||
versions: "dict | None" = None, | ||
extra: "dict | None" = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the Cython code around the usage of these variables look like? In the past I've seen Cython handle these ""
types as just object
s. If that's still true, that would mean Cython does not use the faster C level APIs associated with these objects, which would lose some of the performance gains we got with Cythonizing the scheduler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, they degrade performance. Reverted.
Moving the conversation from #5328 |
Could you articulate how @cclass
class TaskGroup:
_last_worker: WorkerState
def __init__(self, name: str):
self._last_worker = None to this? @cclass
class TaskGroup:
_last_worker: WorkerState
def __init__(self, name: str):
self._last_worker = None # type: ignore note that I'm not advocating for changing |
Notes from today's maintainers meeting: At the moment, there's the problem that developers can easily introduce performance degradation in the Cythonized code every time they touch scheduler.py. Comparing side by side the HTML reports before/after the change is feasible only when one has a specific doubt, not as a general approach going forward.
|
Work in scheduler.py is done. All unit test failures are unrelated. Waiting for merge of #5328. |
_transition_counter: Py_ssize_t | ||
_plugins: dict # dict[str, SchedulerPlugin] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulled up from Scheduler since SchedulerState was invoking self.plugins on many occasions
unrunnable: set, | ||
validate: bint, | ||
plugins: "Iterable[SchedulerPlugin]" = (), | ||
**kwargs, # Passed verbatim to Server.__init__() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed a wealth of needless default values for init params
All test failures are unrelated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an upstream cython issue for why using stringy type annotations makes performance worse? Is it just because they haven't implemented it, or is there a deeper reason?
distributed/scheduler.py
Outdated
@@ -5461,7 +5480,7 @@ def add_plugin(self, plugin=None, idempotent=False, name=None, **kwargs): | |||
|
|||
""" | |||
if isinstance(plugin, type): | |||
plugin = plugin(self, **kwargs) | |||
plugin = plugin(self, **kwargs) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could probably be fixed by capturing scheduler plugin kwargs in a ParamSpec
and making a TypeVar
for the add_plugin
signature. I don't think we should actually do that here, though, since ParamSpec
is in typing_extensions~=3.10
, which is not universally supported yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mypy is right complaining here. This is genuinely wrong; most plugins DO NOT accept a scheduler positional arg.
See #5394 for proper fix
@jakirkham are you happy to merge this? |
Merging on Wednesday if there are no further comments |
@@ -2517,6 +2517,8 @@ async def gather_dep( | |||
|
|||
if not to_gather_keys: | |||
return | |||
assert cause |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverts #5160
|
||
from the root of the distributed repository. Now the code linters will be run each time | ||
you commit changes. You can skip these checks with ``git commit --no-verify`` or with | ||
the short version ``git commit -n``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Align to dask/dask
In it goes! |
Closes #2803.
Iterative follow-up to #5328.
scheduler.py should be reviewed thoroughly for performance regressions in the cythonised code.