Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Profile: Faster data dict lookup #43805

Merged

Conversation

IanButterworth
Copy link
Member

@IanButterworth IanButterworth commented Jan 14, 2022

The ip -> stackframe lookup that happens when viewing profiling data dominates TTFPP and TTSPP

Master

julia> @time ProfileView.view()
 20.764135 seconds (2.68 M allocations: 304.669 MiB, 0.48% gc time)

which is basically all spent in getdict! within retrieve

julia> @time Profile.retrieve();
 20.858307 seconds (1.87 M allocations: 172.837 MiB, 0.36% gc time)

julia> @time Profile.retrieve();
 20.861972 seconds (1.87 M allocations: 172.837 MiB, 0.83% gc time)

This PR

julia> @time Profile.retrieve();
  1.891498 seconds (3.50 M allocations: 314.141 MiB)

julia> @time Profile.retrieve();
  2.110156 seconds (3.50 M allocations: 314.143 MiB, 8.65% gc time)

so

julia> @time ProfileView.view()
  2.392356 seconds (4.41 M allocations: 450.986 MiB, 14.13% gc time, 1.23% compilation time)

And when julia only has one thread it takes the same amount of time as master.

# Base.update_stackframes_callback[]
stn = map(x->(x, 1), st)
try Base.invokelatest(Base.update_stackframes_callback[], stn) catch end
dict[UInt64(ip)] = map(first, stn)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always forget if Dict is threadsafe. I don't think it is?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was if it's preallocated like it is here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least experimentally they seem to be?

julia> dict = Dict{Int,Vector{Int}}();

julia> foreach(i -> dict[i] = Int[], 1:10_000)

julia> @sync for i in keys(dict)
           Threads.@spawn begin
               dict[i] = collect(1:i)
           end
       end

julia> for i in keys(dict)
           sum(dict[i]) == sum(1:i) || error()
       end

julia> 

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes as long as we are not reallocating...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is a very bad idea

@IanButterworth IanButterworth merged commit 14154fc into JuliaLang:master Jan 14, 2022
@IanButterworth IanButterworth deleted the ib/profile_faster_fetch branch January 14, 2022 19:21
vtjnash added a commit that referenced this pull request Jan 14, 2022
@IanButterworth IanButterworth restored the ib/profile_faster_fetch branch January 14, 2022 21:34
try Base.invokelatest(Base.update_stackframes_callback[], stn) catch end
dict[UInt64(ip)] = map(first, stn)
# we don't want metadata here as we're just looking up ips
unique_data_itr = Iterators.unique(has_meta(data) ? strip_meta(data) : data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since unique uses hashing which is much of what Dicts do, why is this a speed improvement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That just facilitates identifying the unique ips before going off to parallelize. It's the parallel part that helps most

N5N3 pushed a commit to N5N3/julia that referenced this pull request Jan 24, 2022
@IanButterworth IanButterworth deleted the ib/profile_faster_fetch branch January 24, 2022 20:46
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 4, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 4, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 5, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 5, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 6, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 9, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 9, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 9, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 9, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 9, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 10, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 11, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 11, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 11, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 11, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 12, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 12, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 13, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 13, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 13, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 13, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 14, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 14, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 15, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 15, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 15, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 16, 2022
N5N3 added a commit to N5N3/julia that referenced this pull request Feb 17, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants