-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize compute_kda_ma #745
Optimize compute_kda_ma #745
Conversation
Thanks for opening this pull request! We have detected this is the first time you have contributed to NiMARE. Please check out our contributing guidelines.
Of course, if you want to opt out this time there is no problem at all with adding your name later. You will be always welcome to add it in the future whenever you feel it should be listed. |
Tried to fix the |
Codecov ReportBase: 88.56% // Head: 88.55% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #745 +/- ##
==========================================
- Coverage 88.56% 88.55% -0.01%
==========================================
Files 38 38
Lines 4371 4370 -1
==========================================
- Hits 3871 3870 -1
Misses 500 500
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution! These changes look good to me. I just have two small suggestions. I was able to reproduce the profiling results on different datasets.
I also found that parallelize this loop with joblib could speedup the process by another small percent.
That's great. We are currently profiling MKDAChi2()
to see how we better handle the memory while computing multiple ma_maps in parallel.
Thanks!
nimare/meta/utils.py
Outdated
if sum_overlap: | ||
nonzero_to_append = counts[sphere_idx_inside_mask] | ||
else: | ||
nonzero_to_append = np.ones((len(sphere_idx_inside_mask),)) * counts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nonzero_to_append = np.ones((len(sphere_idx_inside_mask),)) * counts | |
nonzero_to_append = np.ones((len(sphere_idx_inside_mask),)) * value |
With these changes, I think here we could use value
and remove counts = value
from line 125 so that counts will always be a list.
nimare/meta/utils.py
Outdated
if sum_overlap: | ||
counts = counts[idx] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be merged with (or removed in favor of) the if sum_overlap:
you have below.
Co-authored-by: Julio A. Peraza <52050407+JulioAPeraza@users.noreply.github.com>
Changes proposed in this pull request:
compute_kda_ma
I was trying to speedup
MKDAChi2()
and made some changes to the matrix operations incompute_kda_ma
, which could half the running time. Please feel free to make any revisions, as I could only test under my specific case.nonzero_idx
is not sorted after the change, but I guess it's not an issue.sum_overlap=True
, in whichcounts
could be a list.I also found that parallelize this loop with joblib could speedup the process by another small percent. So, I was wondering if there is a way to pre-calculate
ma_maps
for all the files in a dataset and reuse them. I saw that many functions support using saved MA maps internally throughself.inputs_["ma_maps"]
, but I did not find a way to patch into the input.Thank you.