Skip to content

Conversation

@rcomer
Copy link
Member

@rcomer rcomer commented May 5, 2023

🚀 Pull Request

Description

Attempt to resolve #5295, following my own suggestion at #5295 (comment). It turned out I didn't need to use da.broadcast_to directly - this is automatic due to dask's support for NEP13/18. We do still need to handle masked arrays separately because they don't (yet) support NEP13/18.

This is still not working for the case where the array is both lazy and masked (see failing new test) because the mask gets ignored by da.broadcast_to. We could apply a similar approach to that for the numpy masked arrays, but you can't tell whether a dask array is masked until you realise it. This seems like it must be a common problem. Anyone know a solution before I attempt to re-invent the wheel?

Performance Note

For a simple array I found this new implementation takes nearly twice as long as main. However we are only talking about ~4 microseconds, and it does not appear to depend on array size, so maybe it's OK?


Consult Iris pull request check list

@bouweandela
Copy link
Member

you can't tell whether a dask array is masked until you realise it. This seems like it must be a common problem. Anyone know a solution before I attempt to re-invent the wheel?

You can tell by using an undocumented dask.array function dask.array.utils.meta_from_array. I'm not sure why it's not documented though, maybe it's not supposed to be that public.

>>> import dask.array as da
>>> import numpy as np
>>> a = da.array(np.ma.array([1.], mask=[1]))
>>> isinstance(da.utils.meta_from_array(a), np.ma.MaskedArray)
True

@rcomer
Copy link
Member Author

rcomer commented May 12, 2023

Thanks @bouweandela that is golden knowledge. I just opened dask/dask#10278 to get more clarity on the status of that function.

@rcomer rcomer force-pushed the simpler-broadcast-to-shape branch from b4bb546 to e7e8a41 Compare May 22, 2023 11:19
@rcomer
Copy link
Member Author

rcomer commented May 22, 2023

I think this is now ready for review. I made a helper function to check whether an array is both lazy and masked type, as I figured it could come in useful elsewhere in Iris.

@rcomer rcomer marked this pull request as ready for review May 22, 2023 11:25
@codecov
Copy link

codecov bot commented May 22, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.01 🎉

Comparison is base (95cdea3) 89.35% compared to head (e7e8a41) 89.37%.

❗ Current head e7e8a41 differs from pull request most recent head 4b0d6b5. Consider uploading reports for the commit 4b0d6b5 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5307      +/-   ##
==========================================
+ Coverage   89.35%   89.37%   +0.01%     
==========================================
  Files          89       89              
  Lines       22417    22419       +2     
  Branches     5382     5380       -2     
==========================================
+ Hits        20030    20036       +6     
+ Misses       1639     1637       -2     
+ Partials      748      746       -2     
Impacted Files Coverage Δ
lib/iris/_lazy_data.py 100.00% <100.00%> (ø)
lib/iris/util.py 90.47% <100.00%> (+0.68%) ⬆️

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@ESadek-MO ESadek-MO self-requested a review May 22, 2023 14:32
Copy link
Contributor

@ESadek-MO ESadek-MO left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All looks good to me, thanks @rcomer!

@ESadek-MO ESadek-MO merged commit 48e3a86 into SciTools:main Jun 1, 2023
@rcomer
Copy link
Member Author

rcomer commented Jun 1, 2023

Thanks @ESadek-MO!

@rcomer rcomer deleted the simpler-broadcast-to-shape branch June 1, 2023 09:38
bjlittle pushed a commit to bjlittle/iris that referenced this pull request Jun 26, 2023
* working for all except masked lazy

* use moveaxis

* handle lazy masked case

* add tests for is_lazy_masked_data

* whatsnew

* check compute isn't called

* update docstring
trexfeathers pushed a commit that referenced this pull request Jun 26, 2023
* Simplify and lazify broadcast_to_shape (#5307)

* working for all except masked lazy

* use moveaxis

* handle lazy masked case

* add tests for is_lazy_masked_data

* whatsnew

* check compute isn't called

* update docstring

* add whatnew patch footnote

---------

Co-authored-by: Ruth Comer <10599679+rcomer@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

iris.util.broadcast_to_shape does not work with lazy data

3 participants