
add progress bar to allocation preparation #53

Merged
merged 2 commits into JuliaPerf:master from ib/alloc_progress on Feb 8, 2022

Conversation

IanButterworth (Contributor)

I ran the allocation profiler with sample_rate=0.1, and preparing the pprof profile took more than 10 minutes with no output before I gave up.

This adds a progress meter to the preparation stage.

There's still a long silence before the warning message shows, but I guess that's a Profile issue.

julia> PProf.Allocs.pprof(from_c = false)
┌ Warning: The allocation profiler is not fully implemented, and may have missed some of the allocs. For more info see https://github.com/JuliaLang/julia/issues/43688
└ @ Profile.Allocs ~/Documents/GitHub/julia/usr/share/julia/stdlib/v1.8/Profile/src/Allocs.jl:134
Analyzing 9239305 allocation samples...  1%|▉                                                                                                                       |  ETA: 0:48:08

cc. @NHDaly @vilterp
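A minimal sketch of the progress reporting described above, assuming the ProgressMeter.jl package (the `samples` range here is a hypothetical stand-in for the real decoded allocation samples, not the PR's actual loop):

```julia
using ProgressMeter  # assumed dependency providing Progress / next!

# Stand-in for the real allocation samples; the actual code iterates
# over the results of Profile.Allocs.fetch().
samples = 1:10_000

prog = Progress(length(samples); desc="Analyzing $(length(samples)) allocation samples... ")
for s in samples
    # ... decode sample `s` into pprof form here ...
    next!(prog)  # advance the meter; prints percent done and an ETA estimate
end
```

Wrapping the loop body this way is what produces the `1%|▉ … ETA: 0:48:08` line shown in the console output above.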

@vilterp (Collaborator) left a comment


Seems legit :) Not sure of the idiomatic way to do this in Julia, but I think this could be sped up by turning this loop into a "parallel map", i.e. decoding the allocs in parallel on multiple cores. But that should be a separate PR…
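The "parallel map" idea could be sketched as follows. This is a hypothetical `parallel_map` helper, not part of any package: it partitions the input into roughly one chunk per thread and processes each chunk on its own task.

```julia
using Base.Threads: @spawn, nthreads

# Hypothetical parallel_map: split `xs` into chunks, map `f` over each
# chunk on its own task, then reassemble the results in order.
function parallel_map(f, xs::AbstractVector)
    isempty(xs) && return eltype(xs)[]
    nchunks = min(nthreads(), length(xs))
    chunks = Iterators.partition(xs, cld(length(xs), nchunks))
    tasks = [@spawn map(f, chunk) for chunk in chunks]
    # fetch preserves chunk order, so the result matches a serial map
    return reduce(vcat, fetch.(tasks))
end
```

Under this sketch, the decoding loop would become something like `parallel_map(decode_alloc, allocs)`, where `decode_alloc` is a placeholder for the per-sample decoding step.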

@codecov-commenter commented Feb 7, 2022

Codecov Report

Merging #53 (c52920e) into master (18d842d) will not change coverage.
The diff coverage is 0.00%.


@@           Coverage Diff           @@
##           master      #53   +/-   ##
=======================================
  Coverage   72.81%   72.81%           
=======================================
  Files           3        3           
  Lines         298      298           
=======================================
  Hits          217      217           
  Misses         81       81           
Impacted Files   Coverage Δ
src/Allocs.jl    0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 18d842d...c52920e. Read the comment docs.

@NHDaly (Member) left a comment


😊 thanks Ian, that's great. 👍 LGTM

@NHDaly merged commit d8c80ea into JuliaPerf:master Feb 8, 2022
@vilterp (Collaborator) commented Feb 8, 2022

Reading this again, I realized that a significant amount of time is spent in Profile.Allocs.fetch(), which won't be captured by this progress bar 🤔

I.e. out of two stages — decode and pprof — this is only capturing pprof. Kind of makes me want to bring back JuliaLang/julia#43889, which deferred stack decoding so basically all the time would be in pprof 🤔

@NHDaly (Member) commented Feb 8, 2022

Yeah, @vilterp: I think we should provide another API that exports only the raw data, together with something like an lidict that users can use to decode the data offline. That would also allow exporting the results to disk without needing PProf, which can be nice.

We talked about something like that in the past, I think.
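One possible shape for that offline workflow, sketched with the Serialization stdlib (the `raw` dictionary here is a hypothetical stand-in for the raw alloc results plus a line-info dict; it is not the actual Profile.Allocs data layout):

```julia
using Serialization  # stdlib: serialize / deserialize

# Hypothetical stand-in for the raw allocation results plus an lidict
# (line-info dictionary) that would allow decoding offline.
raw = Dict(:allocs => rand(UInt64, 1_000),
           :lidict => Dict(0x1 => "frame info"))

path = joinpath(mktempdir(), "allocs.jls")
serialize(path, raw)        # dump the raw data to disk, no PProf needed

loaded = deserialize(path)  # later / offline: load and decode at leisure
```

The point of the sketch is the split: the expensive decoding and pprof conversion can then happen in a separate session, against data loaded from disk.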

@IanButterworth deleted the ib/alloc_progress branch February 8, 2022 16:43
@IanButterworth (Contributor, Author) commented Feb 8, 2022

Note that Profile.fetch() was recently multithreaded (JuliaLang/julia#43816). Maybe something similar is possible here?

@vilterp (Collaborator) commented Feb 8, 2022

Yes, I think we should be able to do something similar :)

Also, for what it's worth, the strategy there of partitioning a collection and spawning a thread to process each partition seems like a good addition to the standard library (i.e. parallel_map or something), so both profilers (and other stuff) could use it — not sure where such a function would live 🤷‍♂️

@IanButterworth (Contributor, Author)

Exactly: JuliaLang/julia#43919
