Dashboard: Fine Performance Metrics #7725
Conversation
Force-pushed from 6f13c00 to 83045dc
Force-pushed from 83045dc to a3965b0
Unit Test Results: see the test report for an extended history of previous test failures; this is useful for diagnosing flaky tests. 26 files ±0, 26 suites ±0, 16h 1m 40s ⏱️ +7m 36s. For more details on these failures, see this check. Results for commit a4e1def. ± Comparison against base commit f4f4718. ♻️ This comment has been updated with latest results.
Force-pushed from 0d25a90 to b93fe0b
Ready for a roasting. :)
@crusaderky maybe you'd have some time to review. 🙏
very small comments
```python
idx = self.data["operation"].index(operation)
while idx >= len(self.data["operation_val"]):
    self.data["operation_val"].append(float("NaN"))
```
Might be better to model these two keys as a NamedTuple to protect against unequal lengths?
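A minimal sketch of that idea, purely for illustration: the type and field names below are hypothetical, not taken from the PR.

```python
from typing import NamedTuple


class OperationMetric(NamedTuple):
    # One record per operation keeps the label and its value together,
    # so the two can never drift to unequal lengths.
    operation: str
    value: float


# hypothetical usage replacing the two parallel lists
metrics: list[OperationMetric] = [OperationMetric("deserialize", 0.0)]
metrics.append(OperationMetric("compress", 1.5))
```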
```python
def init_root(self):
    def handle_toggle(_toggle):
        self.toggle.label = (
            "Bytes (Toggle for Values)"
```
These strings look like something that could get out of sync very easily, maybe use variables?
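For example, a minimal sketch of that suggestion; the constant names and the second label string are assumptions, not the PR's actual code.

```python
LABEL_BYTES = "Bytes (Toggle for Values)"
LABEL_VALUES = "Values (Toggle for Bytes)"  # assumed counterpart label


class UnitToggle:
    def __init__(self) -> None:
        self.label = LABEL_BYTES

    def handle_toggle(self, _toggle=None) -> None:
        # Flip between the two labels; with shared constants the initial
        # label and the click handler cannot drift out of sync.
        self.label = LABEL_VALUES if self.label == LABEL_BYTES else LABEL_BYTES
```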
First of all, I must say I'm very excited about this new view.
Must have
Bokeh 3.1 specific issues
The page is broken on Bokeh 3.1 (everything is fine on Bokeh 2):
- The plot looks off and I can't see the bar chart for execute.
- The plot doesn't seem to update itself in realtime. I have to reload the page every time.
- Selecting a prefix makes the plot go completely blank.
- Every few seconds, the plot goes blank. On my stderr I get:
bokeh.core.serialization.DeserializationError: can't resolve reference 'p278815'
2023-04-26 16:20:39,524 - bokeh.server.protocol_handler - ERROR - error handling message
message: Message 'PATCH-DOC' content: {'events': [{'kind': 'ModelChanged', 'model': {'id': 'p582495'}, 'attr': 'inner_height', 'new': 409}]}
error: DeserializationError("can't resolve reference 'p582495'")
There's also a stack trace, but it's completely internal to bokeh.
Other must have
- Please change unit selection to a dynamically populated dropdown
- Changing the unit only impacts the execute-by-prefix chart. It should impact all.
- I think it would make more sense for get_data to be a pie chart
- Can't find the plot in dask-labextension
- The tooltip for execute by prefix needs polishing:
- Please add unit tests. Typically (see other plots) these unit tests run the rendering in a couple of relevant use cases (e.g. no tasks / with tasks) and simply verify that they don't raise.
- The subtitles are incoherent/confusing. Please change to:
- Task execution, by prefix
- Task execution, by activity
- Send data, by activity (confusingly, get_data is what is called through RPC to get data for yourself; but when it runs locally it's sending the data to a peer)
Bonus
These critiques don't block this PR from merging and can be delivered at a later date.
- Please add a plot for gather_dep (title: "Receive data, by activity"). It's not quite redundant with get_data, as for example it can highlight deserialization-specific issues.
- The bytes plot, as it is, is not very useful. The reason is that memory-read is going to eclipse everything else. The purpose of that metric is to compare it to disk-read and calculate cache hit ratios; nothing else. It would become instantly more useful if there was a filter by activity so that it could be excluded (just like the filter by prefix).
- The stacked bar charts could use a vertical axis
- The pie charts could use an indication of percentage in the tooltip
I'd be happy to merge this PR with only the "Other must have" section done, plus a check that hides it when bokeh 3 is installed, and then have a follow-up issue to introduce bokeh 3 support. I don't know the effort involved in making it work for bokeh 3; the overhead may not be worth it.
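For illustration, such a version guard could look roughly like the sketch below; the helper name and how it would be wired into page registration are assumptions, not the PR's code.

```python
from packaging.version import parse

import bokeh

BOKEH_VERSION = parse(bokeh.__version__)


def fine_metrics_supported() -> bool:
    # Hypothetical guard: only expose the new dashboard page on bokeh 2.x
    # until the bokeh 3 rendering issues are resolved.
    return BOKEH_VERSION.major < 3


print("register fine metrics page:", fine_metrics_supported())
```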
Thanks @crusaderky for the thorough review!
Force-pushed from 5e7947c to 884262d
Maybe I misunderstand, but you're talking about the same toggle button, yes? If it was dynamic: those plots are explicitly collecting "bytes" or "seconds"; using things like
I don't know what this is, and grepping through the codebase I'm not sure how other plots are connected to this.
Force-pushed from d3caebd to f5ee251
It should react to any arbitrary units it finds. For example, somebody may want to track time that doesn't add up to the wall time:

```python
@context_meter.meter("some_stuff", unit="gpu-seconds", func=my_video_card.get_elapsed)
def runs_on_gpu():
    ...


client.submit(runs_on_gpu, key="foo").result()
```

The above will appear in

Another example: track the number of iterations needed to converge:

```python
def converging():
    n = 0
    while not converged():
        n += 1
        ...
    context_meter.digest_metric("blah", n, "iterations")


client.submit(converging, key="c-123").result()
```

The above will appear in

I would suggest you keep it simple: use ad-hoc formatting for
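One possible reading of that suggestion, sketched below: the helper and the fallback format are my own illustration, not code from the PR, though format_bytes and format_time do exist in dask.utils.

```python
from dask.utils import format_bytes, format_time


def format_unit(value: float, unit: str) -> str:
    # Special-case the units we know how to pretty-print; fall back to a
    # plain "value unit" string for arbitrary user-defined units such as
    # "gpu-seconds" or "iterations".
    if unit == "bytes":
        return format_bytes(int(value))
    if unit == "seconds":
        return format_time(value)
    return f"{value:g} {unit}"


print(format_unit(3.5e9, "bytes"))     # e.g. "3.26 GiB"
print(format_unit(0.0421, "seconds"))  # e.g. "42.10 ms"
print(format_unit(17, "iterations"))   # "17 iterations"
```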
Force-pushed from 604f902 to e994cfc
Okay, @crusaderky, another look when you have time: there is now a dynamically populated dropdown for unit selection. I don't know about the dask-labextension; it seems to work fine for me. Maybe you need to re-install the extension or something? Not that familiar with JupyterLab. 🤷♂️
@crusaderky you are asking for functionality that is not even used at the moment. I consider this entire feature still experimental and I explicitly stated already that this thing does not need to be perfect. This PR has been open for more than a month now. I do not want to see any more features being added to this but want this to be wrapped up asap. If anything is missing, please create a follow-up ticket for it.
See code review commit. Mostly renamed stuff to align it to metrics.py. Sizes are off: the top-left panel is the only one that resizes dynamically; the other two are fixed.
(screen recording: Peek.2023-05-11.00-33.mp4)
Force-pushed from 4ae732d to 290e5d3
Force-pushed from 290e5d3 to a4e1def
I'm still observing an artifact with the plots. If I am observing the dashboard while the computation is going on, the metrics are added to the chart as I'd expect. If I refresh the page, I get accurate results for a short time, but the same process of "drop a couple of tasks" happens again. Considering how long this has been open, I'm inclined to merge regardless. There is value in this dashboard and I'd like to get this out asap and prefer polishing it later.

```python
from dask.datasets import timeseries
from distributed import Client

client = Client()
ddf = timeseries("2000", "2020", partition_freq="1w")
ddf.groupby("id").x.mean().compute()
```
Haven't reviewed the core logic. Given that this is an optional and non-critical component, I'd be fine with merging. @hendrikmakait, you intended to review as well; I'll leave it up to you to merge.
If during review anything major comes up, please move it to a follow-up ticket.
```python
futures = c.map(slowinc, range(10), delay=0.001)
await wait(futures)
await asyncio.sleep(1)  # wait for metrics to arrive
```
nit (not worthy of holding off the merge): I'd like us to be very conservative with this kind of sleeping since it artificially slows down our test suite.
Ideally, there would be a hook we could listen to in order to know when we can check our assertions. This hook often doesn't exist, but we can still implement the test more forgivingly, e.g.:
```python
start = time()
while True:
    try:
        assert not cl.task_exec_data
        cl.update()
        assert cl.task_exec_data
        assert cl.task_exec_data["functions"] == ["slowinc"]
        break
    except AssertionError:
        if time() - start > 1:
            raise
        await asyncio.sleep(0.01)
```
Overall LGTM. I haven't reviewed the core logic, but focused on the requested changes. Another small issue I observed is that the charts don't scale if they have no data in them.
I'm fine with merging this and getting it out there. We can iron out the remaining issues in follow-up PRs. This dashboard is already very useful as-is.
🎉
To recap, these are the follow-ups:
I think at least the first one should be delivered before we can properly advertise this feature to the wider public.
Closes #7679
Passes pre-commit run --all-files
I'm not very satisfied with the implementation itself, open to suggestions. Maybe after we hammer out a data schema / dataclasses for the metrics the parsing will become a bit more ergonomic.
CURRENT STATE:
(I'll update these GIF(s) / comments as the PR progresses)
There is a function selector box to view/keep specific functions; otherwise, functions are removed from the chart after a set amount of time (currently 10s).
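For illustration, that eviction behaviour could look roughly like the sketch below; the names and the data layout are assumptions for the example, not the PR's implementation.

```python
import time

STALE_AFTER = 10  # seconds; functions idle longer than this drop off the chart


def prune_stale_functions(last_seen: dict[str, float], keep: set[str]) -> None:
    # Drop any function that hasn't reported metrics recently, unless the
    # user pinned it via the function selector box.
    now = time.time()
    for func in list(last_seen):
        if func in keep:
            continue
        if now - last_seen[func] > STALE_AFTER:
            del last_seen[func]
```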
So please let me have your input. Here is an example of other data which is to be added to this page later:
Example data
Current state: