-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Final switch to IMASDD #32
Conversation
Attaching profview results for the two versions of solps2imas: solps2imas_profview_results.zip It has two files:
To view these files, open julia REPL: > using Pkg; Pkg.add("ProfileView");
> using ProfileView
> ProfileView.view(nothing, windowname="OMAS") # Use GUI to open solps2imas_omas_profview.jlprof
> ProfileView.view(nothing, windowname="IMASDD") # Use GUI to open solps2imas_imasdd_profview.jlprof
> # Keep REPL open Highlights
|
I was able to increase the speed by a factor of 2 by simply reducing the number of calls that would be made to get_property inside search_points and search_edges function. Speed result now: SOLPS2IMAS.jl % julia --project test/speedtest.jl
-----------------------------------------------------------------------------
solps2imas(b2gmtry, b2output; b2mn=b2mn, fort=fort) time with compilation: 139.745324 seconds (433.19 M allocations: 16.150 GiB, 4.25% gc time, 40.88% compilation time: <1% of which was recompilation)
solps2imas(b2gmtry, b2output; b2mn=b2mn, fort=fort) time (true runtime): 74.138648 seconds (267.65 M allocations: 4.938 GiB, 2.12% gc time, 0.02% compilation time) It is still not an acceptable speed. There might not be any more optimization possible in search_edges and search_nodes. These functions are operating on unsorted lists, some kdtree stuff can be maybe, but that would be too much complex. Hopefully, we can improve the speed from IMASDD side in get_property function. Maybe there is a better way to get the property of a structure which can reduce the time spent in get_property function. |
@bclyons12 if you have time, could you please take a look and see if there's anything obvious? |
@anchal-physics the GGD is highly nested and depth really stress tests the access to the data structure. For now I have resorted to simply using Of course doing so removes the IMASDD functionalities, but that's ok for the ggd, where I don't think we'll want to use expressions, lazy copies, and handling of time dependent data. I may consider defining the nodes under the ggd structures as a different abstract subtype of IDS, on which I can dispatch to get faster operations. |
I don't see anything at first glance. I'm assuming it's related to some of the allocation issues we've seen in IMASDD elsewhere, but I'd need to do some profiling myself to confirm that. I could look into it more in a week or two. |
Thanks @orso82. After your change, the execution time on my machine was about 8.5 seconds. I found that half of the time was spent in a deepcopy of grid_subset operation, so I made a faster version of deepcopy in GGDUtils that uses getfield and uses deepcopy for the julia base types only. This improved the speed further and now the run time is 3.5s only. There is still a fact to note that the compilation time with IMASDD is still a lot in comparison to OMAS (7 times more). Maybe that is just the way julia is. Here is the summary of speedtest now: Speed at 557b60eSOLPS2IMAS.jl % julia --project test/speedtest.jl
-----------------------------------------------------------------------------
solps2imas(b2gmtry, b2output; b2mn, fort) time with compilation: 93.010971 seconds (187.08 M allocations: 12.482 GiB, 8.50% gc time, 90.99% compilation time: <1% of which was recompilation)
solps2imas(b2gmtry, b2output; b2mn, fort) time (true runtime): 8.574597 seconds (21.54 M allocations: 1.271 GiB, 10.96% gc time, 0.25% compilation time) Speed at ProjectTorreyPines/IMASggd.jl@3ebbf8f and 11ac45aSOLPS2IMAS.jl % julia --project test/speedtest.jl
-----------------------------------------------------------------------------
solps2imas(b2gmtry, b2output; b2mn, fort) time with compilation: 55.555791 seconds (172.91 M allocations: 11.503 GiB, 8.39% gc time, 93.66% compilation time)
solps2imas(b2gmtry, b2output; b2mn, fort) time (true runtime): 3.405274 seconds (18.18 M allocations: 1.003 GiB, 12.79% gc time, 0.37% compilation time) I think it is acceptable speed now, except for the fact that first time use of solps2imas in any code will still be slow. I'll move forward to other repos now. Question: Is it possible to precompile a julia script and create a executable that runs at the fastest speed? |
@anchal-physics tihs is what we should do to speedup precompilation times on all of our packages: |
* All references to OMAS have been removed now. * All tests pass with IMASDD. * Minor changes in edge and cell initialization was required.
It was identified that majority of time is spent in get_property calls made due search_points and search_edges. Ithis commit reduces these calls by making a single clal per iteration of loop and storing the result in a temporary variable. This improved the speed by a factor of about 2. Before commit SOLPS2IMAS.jl % julia --project test/speedtest.jl ----------------------------------------------------------------------------- solps2imas(b2gmtry, b2output; b2mn=b2mn, fort=fort) time with compilation: 139.745324 seconds (433.19 M allocations: 16.150 GiB, 4.25% gc time, 40.88% compilation time: <1% of which was recompilation) solps2imas(b2gmtry, b2output; b2mn=b2mn, fort=fort) time (true runtime): 74.138648 seconds (267.65 M allocations: 4.938 GiB, 2.12% gc time, 0.02% compilation time) After commit SOLPS2IMAS.jl % julia --project test/speedtest.jl ----------------------------------------------------------------------------- solps2imas(b2gmtry, b2output; b2mn=b2mn, fort=fort) time with compilation: 139.745324 seconds (433.19 M allocations: 16.150 GiB, 4.25% gc time, 40.88% compilation time: <1% of which was recompilation) solps2imas(b2gmtry, b2output; b2mn=b2mn, fort=fort) time (true runtime): 74.138648 seconds (267.65 M allocations: 4.938 GiB, 2.12% gc time, 0.02% compilation time)
After last commits, it was found that deepcopy of the grid_subsets done when triangular mesh is read is taking up about half of the computation time of solps2imas(). This is because deepcopy has to go through the deep grid_subset strucutre calling get_property at every instance. This commit uses custom deepcopy_subset function which is aware of the fields inside grid_subset and uses getfield() for faster access to items for creating deepcopy at julia base type level. This function has been added in GGDutils.jl/subset_tools.jl at ProjectTorreyPines/IMASggd.jl@3ebbf8f With this change, the speed has improved by a factor of 2.5. deepcopy is not a major bottleneck anymore.
11ac45a
to
624f383
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, it passes, but it would be nice if it were faster:
julia> include("test/runtests.jl")
Test Summary: | Pass Total Time
Test index conversions | 1 1 0.1s
Test Summary: | Pass Total Time
Test file parsing in depth | 6 6 1.3s
Test Summary: | Pass Total Time
Test read_b2_output | 19 19 3.3s
solps2imas() time: 160.382650 seconds (154.93 M allocations: 10.483 GiB, 5.57% gc time, 96.87% compilation time)
Test Summary: | Pass Total Time
Test solps2imas() (overall workflow) | 4 4 2m42.0s
Test Summary: | Pass Total Time
Test triangular mesh generation from fort files | 23479 23479 11.7s
Test.DefaultTestSet("Test triangular mesh generation from fort files", Any[], 23479, false, false, true, 1.712016072012787e9, 1.712016083762666e9, false)
julia>
Yeah, as Orso suggested, we might have to use PrecompileTools to make our packages faster. Most time spent in those tests is compilation time. Those functions will run much faster the second time in the same julia instance. |
They sure do |
This PR would be merged simultaneously when all other repos also have the PR ready for the final switch from OMAS to IMASDD.
Speed test
I ran a speed test to check the difference between using OMAS and IMASDD. With IMASDD, solps2imas is about 100 times slower!. I believe we need to address this, or atleast confirm that this was expected before marking this PR ready above.
Following are the results:
Speed test with IMASDD.jl
Speed test with OMAS.jl