inference: remove CachedMethodTable
#44240
Conversation
@nanosoldier
Is this that helpful anymore? For concrete types, we already cache those in a hash table at the next level.
I didn't confirm noticeable performance improvements when running the benchmarks.
But does this mean the cache might be useful when a signature is abstract?
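For context, here is a minimal sketch of the kind of memoizing method-table wrapper being discussed. The struct name mirrors `CachedMethodTable`, but the fields and the `find_methods` helper are illustrative assumptions, not the Base internals:

```julia
# Illustrative sketch (assumed names): memoize method lookups so that the
# expensive search runs once per (function, signature) key.
struct CachedMethodTable
    cache::IdDict{Any,Vector{Method}}
end
CachedMethodTable() = CachedMethodTable(IdDict{Any,Vector{Method}}())

function find_methods(mt::CachedMethodTable, @nospecialize(f), @nospecialize(types))
    return get!(mt.cache, (f, types)) do
        collect(Method, methods(f, types))  # the real (expensive) lookup
    end
end
```

With this, repeated queries such as `find_methods(mt, sin, Tuple{Real})` perform the lookup once and hit the dictionary afterwards.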
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Lol, the benchmark shows that this change makes it less efficient in both time and space...
yeah, we might want to do this and have only non-concrete types put here, otherwise we may end up doing the lookup twice in the cache-miss case.
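A sketch of that suggestion, building on the hypothetical wrapper above: concrete signatures bypass this cache entirely (they are already cached at the next level), so a miss here never adds a second lookup on that path.

```julia
# Sketch: only memoize abstract signatures; concrete ones go straight to
# the underlying lookup, which is cached at the next level already.
function find_methods_selective(mt::CachedMethodTable, @nospecialize(f), @nospecialize(types))
    isconcretetype(types) && return collect(Method, methods(f, types))
    return find_methods(mt, f, types)
end
```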
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Force-pushed from d0f9e26 to 38d8fa1.
@nanosoldier
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Force-pushed from 38d8fa1 to df07fea.
ok, I really couldn't confirm any performance improvement in `CachedMethodTable`.
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.
Since we couldn't confirm any performance benefit from `CachedMethodTable` in the current infrastructure (see the benchmark results in #44240), now I'd like to propose to eliminate that entirely and save a bit of space.
Force-pushed from 56eefec to ecbc263.
Apologies for the confusion! In <JuliaLang/julia#44240>, we figured out that `CachedMethodTable` doesn't give us any performance benefit (in either time or space), so we removed it from Julia Base. This commit updates the (newly updated) GPUCompiler overloads accordingly.
…InferenceState)` interface (#44389)

In #44240 we removed the `CachedMethodTable` support as it turned out to be ineffective under the current compiler infrastructure. Because of this, there is no strong reason to keep a method table per `InferenceState`. This commit simply removes the `method_table(::AbstractInterpreter, ::InferenceState)` interface and should make it clearer which interface should be overloaded to implement contextual dispatch.
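For packages overloading this interface, the single-argument form becomes the remaining overload point. A hedged sketch follows; `MyInterp` is a hypothetical, incomplete interpreter, and the `Core.Compiler` names are internal APIs that may differ across Julia versions:

```julia
const CC = Core.Compiler

# Hypothetical interpreter carrying its own method table for contextual
# dispatch. A real AbstractInterpreter must implement the rest of the
# interface (caches, world ages, ...); only the lookup overload is shown.
struct MyInterp <: CC.AbstractInterpreter
    world::UInt
    mt::Core.MethodTable
end

# With `method_table(::AbstractInterpreter, ::InferenceState)` gone, this
# one-argument method is where contextual dispatch hooks in.
CC.method_table(interp::MyInterp) = CC.OverlayMethodTable(interp.world, interp.mt)
```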
`CachedMethodTable` was removed in #44240 as we couldn't confirm any performance improvement then. However, it turns out the optimization was critical in some real-world cases (e.g. #46492), so this commit revives the mechanism with the following tweaks that should make it more effective:

- create the method table cache per inference (rather than per local inference on a function call, as in the previous implementation)
- only use the cache mechanism for abstract types (since we already cache the lookup result at the next level for concrete types)

As a result, the following snippet reported at #46492 recovers the compilation performance:

```julia
using ControlSystems
a_2 = [-5 -3; 2 -9]
C_212 = ss(a_2, [1; 2], [1 0; 0 1], [0; 0])
@time norm(C_212)
```

> on master

```
julia> @time norm(C_212)
364.489044 seconds (724.44 M allocations: 92.524 GiB, 6.01% gc time, 100.00% compilation time)
0.5345224838248489
```

> on this commit

```
julia> @time norm(C_212)
 26.539016 seconds (62.09 M allocations: 5.537 GiB, 5.55% gc time, 100.00% compilation time)
0.5345224838248489
```
Previously the method lookup result was created per frame, so the lookup cache wasn't used much. With this change the cache is created per inference, so the cached result is reused whenever we have already seen the same match within the same inference run, which may speed up lookup time a bit.
This commit also sets up a new `AbstractInterpreter` interface, `get_method_lookup_cache`, which specifies what method lookup cache is used by each `AbstractInterpreter`. `NativeInterpreter` creates a cache per inference, which is valid since lookup is done in the same world age within the same inference run. An external `AbstractInterpreter` doesn't opt into this cache by default, so its behavior won't change in any way.
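A minimal sketch of that interface as described; the `get_method_lookup_cache` name comes from the commit message, while the exact signatures and the `LookupCachingInterp` type are illustrative assumptions:

```julia
const CC = Core.Compiler

# Default: an external AbstractInterpreter opts out, so its behavior is
# unchanged.
get_method_lookup_cache(::CC.AbstractInterpreter) = nothing

# A NativeInterpreter-style interpreter holds one cache per inference run,
# which is sound because every lookup in that run shares one world age.
struct LookupCachingInterp <: CC.AbstractInterpreter
    lookup_cache::IdDict{Any,Any}
end
LookupCachingInterp() = LookupCachingInterp(IdDict{Any,Any}())
get_method_lookup_cache(interp::LookupCachingInterp) = interp.lookup_cache
```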