Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not transpose 1d arrays during interpolation #5542

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Illviljan
Copy link
Contributor

Seems a waste of time to transpose 1d arrays.

  • Closes #xxxx
  • Tests added
  • Passes pre-commit run --all-files
  • User visible changes (including notable bug fixes) are documented in whats-new.rst
  • New functions/methods are listed in api.rst

@Illviljan
Copy link
Contributor Author

Illviljan commented Jun 27, 2021

It's just a simple copy/paste job at the moment. I welcome suggestions for a more elegant solution.

@github-actions
Copy link
Contributor

github-actions bot commented Jun 27, 2021

Unit Test Results

         6 files           6 suites   52m 1s ⏱️
16 204 tests 14 485 ✔️ 1 719 💤 0 ❌
90 420 runs  82 260 ✔️ 8 160 💤 0 ❌

Results for commit fb68d59.

♻️ This comment has been updated with latest results.

@max-sixty
Copy link
Collaborator

Sorry this didn't get @Illviljan .

It looks good, without me having that much context. Do you have any info on whether this has any performance impact?

I imagine this isn't as easy as it sounds, but do you have a view on whether we could apply this concept more broadly, and make transposing 1D arrays a no-op in the transpose method, rather than writing that logic for each method that calls .transpose?

@Illviljan
Copy link
Contributor Author

Yes this improves the interp performance a bit. The .copy in the transpose is rather slow so it seems better to just not do it. An alternative is removing the .copy in the transpose or make it an option?

@max-sixty
Copy link
Collaborator

max-sixty commented Jul 25, 2021

Yeah, I guess if someone does x = y.transpose() and then x[0] = 42, then y would be inconsistently updated. Don't think we're likely to get copy-on-write semantics soon!

So maybe this is the best we can hope for.

Edit: actually is this already implemented? https://github.com/pydata/xarray/blob/main/xarray/core/variable.py#L1441-L1444. Does interpolate not hit this code path?

Is it worth adding an ASV? I've found them fairly quick to set up a new one, though takes some lift to set up the environment etc. I think in general we should try and have them for performance work, so we can track if it regresses.

@Illviljan
Copy link
Contributor Author

Edit: actually is this already implemented? https://github.com/pydata/xarray/blob/main/xarray/core/variable.py#L1441-L1444. Does interpolate not hit this code path?

Yes, that's the copy that's slow and there's no real need to create a new copy in the 1D case. My (bad) idea was to just return the original array there instead, but as you noted that might not be fully intuitive.

A ASV for 1D- and ND-interpolation and having it running in the CI would be nice.
Though I'm not really interested in implementing that, It would be just another thing to document for me because I use other profilers anyway and as you said getting it setup requires some work that I don't enjoy.

@max-sixty
Copy link
Collaborator

It's curious that's slow — it's not a deep copy and so should be fast (in python terms!), since it's just copying the class instance.

Totally understand re ASV — and more generally you should choose the most meaningful work for you. I hope you continue to become more involved with the project, and there'll be plenty of time to expand into other areas.

(though down a level, we should have some way of justifying the merge — let me know if you have any profiles to hand, no rush)

Thanks as ever @Illviljan !

@Illviljan
Copy link
Contributor Author

It probably is decently fast but it can't compare to just not running the .copy() code if a copy isn't necessary. :) But this isn't a major bottleneck either:

bild
Before, the transpose is one of the largest bottlenecks in missing.interp where the .copy() is what takes time. missing.interp took 1.33s to run.

bild
After, the transpose obviously isn't a factor anymore since it isn't triggered. missing.interp took 1.24s to run.

@Illviljan Illviljan marked this pull request as ready for review August 14, 2021 07:42
@Illviljan Illviljan added the run-benchmark Run the ASV benchmark workflow label Dec 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants