Add stats.is_constant #21

Merged (28 commits) on Feb 28, 2025
flying-sheep
Member

@flying-sheep flying-sheep commented Feb 18, 2025

Fixes #28

I don’t actually understand why the dask tests pass, but they do.

@ilan-gold @Intron7 can you help me understand why this works?

def test_is_constant_dask() -> None:
    """Tests if is_constant works if each chunk is individually constant."""
    if TYPE_CHECKING:
        import dask.array.core as da
    else:
        import dask.array as da
    x_np = np.repeat(np.repeat(np.arange(4).reshape(2, 2), 2, axis=0), 2, axis=1)
    x: da.Array = da.from_array(x_np, (2, 2))  # type: ignore[no-untyped-call]
    result = stats.is_constant(x, axis=None).compute()  # type: ignore[attr-defined]
    assert result is False

TODO:

  • benchmarks
  • enable doctests
  • sparse-in-dask
  • zarr/hdf5?

@flying-sheep flying-sheep changed the title WIP is-constant Add stats.is_constant Feb 18, 2025

codecov bot commented Feb 18, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.95%. Comparing base (ef6b847) to head (521fb63).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #21      +/-   ##
==========================================
+ Coverage   95.53%   96.95%   +1.41%     
==========================================
  Files           9       11       +2     
  Lines         112      164      +52     
==========================================
+ Hits          107      159      +52     
  Misses          5        5              



codspeed-hq bot commented Feb 18, 2025

CodSpeed Performance Report

Merging #21 will not alter performance

Comparing pa/is-constant (521fb63) with main (ef6b847)

Summary

✅ 34 untouched benchmarks
🆕 18 new benchmarks

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| 🆕 test_stats_benchmark[0-numpy.ndarray-float32-is_constant] | N/A | 230.1 µs | N/A |
| 🆕 test_stats_benchmark[0-numpy.ndarray-float64-is_constant] | N/A | 274.7 µs | N/A |
| 🆕 test_stats_benchmark[0-scipy.sparse.csc_array-float32-is_constant] | N/A | 93.5 µs | N/A |
| 🆕 test_stats_benchmark[0-scipy.sparse.csc_array-float64-is_constant] | N/A | 102.3 µs | N/A |
| 🆕 test_stats_benchmark[0-scipy.sparse.csr_array-float32-is_constant] | N/A | 567.4 µs | N/A |
| 🆕 test_stats_benchmark[0-scipy.sparse.csr_array-float64-is_constant] | N/A | 607.8 µs | N/A |
| 🆕 test_stats_benchmark[1-numpy.ndarray-float32-is_constant] | N/A | 228.6 µs | N/A |
| 🆕 test_stats_benchmark[1-numpy.ndarray-float64-is_constant] | N/A | 272 µs | N/A |
| 🆕 test_stats_benchmark[1-scipy.sparse.csc_array-float32-is_constant] | N/A | 566.6 µs | N/A |
| 🆕 test_stats_benchmark[1-scipy.sparse.csc_array-float64-is_constant] | N/A | 606.9 µs | N/A |
| 🆕 test_stats_benchmark[1-scipy.sparse.csr_array-float32-is_constant] | N/A | 488.1 µs | N/A |
| 🆕 test_stats_benchmark[1-scipy.sparse.csr_array-float64-is_constant] | N/A | 152.7 µs | N/A |
| 🆕 test_stats_benchmark[None-numpy.ndarray-float32-is_constant] | N/A | 109.4 µs | N/A |
| 🆕 test_stats_benchmark[None-numpy.ndarray-float64-is_constant] | N/A | 129.9 µs | N/A |
| 🆕 test_stats_benchmark[None-scipy.sparse.csc_array-float32-is_constant] | N/A | 111.4 µs | N/A |
| 🆕 test_stats_benchmark[None-scipy.sparse.csc_array-float64-is_constant] | N/A | 132.7 µs | N/A |
| 🆕 test_stats_benchmark[None-scipy.sparse.csr_array-float32-is_constant] | N/A | 111.1 µs | N/A |
| 🆕 test_stats_benchmark[None-scipy.sparse.csr_array-float64-is_constant] | N/A | 131.7 µs | N/A |

@flying-sheep flying-sheep marked this pull request as ready for review February 25, 2025 11:28
Contributor

@ilan-gold ilan-gold left a comment

Good TODO

@ilan-gold
Contributor

I can look into why the above works

@flying-sheep
Member Author

flying-sheep commented Feb 27, 2025

OK, using asymmetric overlap, this looks almost as efficient as it could be.

Since asymmetric overlap implies boundary="none", our example chunks become something like:

[[0, 0, 1],  | [[1, 1],
 [0, 0, 1],  |  [1, 1],
 [2, 2, 3]]  |  [3, 3]]
             |
-------------+---------
             |
[[2, 2, 3],  | [[3, 3],
 [2, 2, 3]]  |  [3, 3]]
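
The chunk layout above can be reproduced with a small sketch. This is plain Python with nested lists standing in for dask chunks, not the PR's implementation; the helpers `is_constant` and `chunks` are hypothetical names introduced only for illustration. It assumes that each chunk is extended by one row and one column past its bottom and right edges and clipped at the array border, which is how the diagram reads. It shows why checking the un-overlapped chunks is not enough, while checking the overlapped ones is.

```python
# Hypothetical pure-Python sketch; nested lists stand in for dask chunks.
Grid = list[list[int]]


def is_constant(block: Grid) -> bool:
    """Return True if every element of the block equals its first element."""
    first = block[0][0]
    return all(value == first for row in block for value in row)


def chunks(x: Grid, size: int, overlap: int) -> list[Grid]:
    """Split x into size-by-size chunks in row-major order.

    Each chunk is extended by `overlap` rows/columns past its bottom and
    right edges. Slices are clipped at the array border, so nothing is
    padded (assumed here to correspond to boundary="none").
    """
    return [
        [row[c : c + size + overlap] for row in x[r : r + size + overlap]]
        for r in range(0, len(x), size)
        for c in range(0, len(x[0]), size)
    ]


# Same 4x4 array as in the test: four constant 2x2 blocks holding 0, 1, 2, 3.
x = [[2 * (r // 2) + (c // 2) for c in range(4)] for r in range(4)]

plain = chunks(x, size=2, overlap=0)
overlapped = chunks(x, size=2, overlap=1)

# Every plain chunk is constant on its own, so this check alone is misleading.
each_chunk_constant = all(is_constant(block) for block in plain)

# With the one-element overlap, neighbouring chunks share a row or column,
# so "all overlapped chunks are constant" holds only for a constant array.
whole_array_constant = all(is_constant(block) for block in overlapped)
```

Here `each_chunk_constant` is `True` while `whole_array_constant` is `False`, and `overlapped` holds the four chunks drawn in the diagram.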

Comment on lines 122 to 134
@pytest.mark.array_type(Flags.Dask)
def test_is_constant_dask(
    dask_viz: Callable[[object], None], array_type: ArrayType[types.DaskArray, Any]
) -> None:
    """Tests if is_constant works if each chunk is individually constant."""
    x_np = np.repeat(np.repeat(np.arange(4).reshape(2, 2), 2, axis=0), 2, axis=1)
    x = array_type(x_np)
    assert x.blocks.shape == (2, 2)
    assert all(stats.is_constant(block).compute() for block in x.blocks.ravel())

    result = stats.is_constant(x, axis=None)
    dask_viz(result)
    assert result.compute() is False  # type: ignore[no-untyped-call]
Contributor

Should this test be renamed to something maybe more informative (test_dask_constant_blocks?), and then the other test be updated to use dask? Am I missing something or does the test_is_constant one skip dask?

Member Author

It doesn’t; this one just tests a specific possible failure in dask, so I should definitely rename it.

@flying-sheep flying-sheep merged commit d459cca into main Feb 28, 2025
12 checks passed
@flying-sheep flying-sheep deleted the pa/is-constant branch February 28, 2025 08:19
Development

Successfully merging this pull request may close these issues.

New function: is_constant
2 participants