@christiangnrd
Member

Depends on #728 and on a way to select runners with at least 16 GB of memory.
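
Until runner selection is available, a test-side guard is one possible stopgap. The sketch below is not part of this PR or of Metal.jl: `MIN_MEMORY_BYTES` and `has_enough_memory` are hypothetical names, and treating "16 GB" as 16 GiB is an assumption. It relies only on `Sys.total_memory()` from Julia's standard library.

```julia
# Hypothetical helper, not part of Metal.jl: decide whether this machine has
# enough RAM to run the large-buffer tests.
const MIN_MEMORY_BYTES = 16 * 2^30  # assumption: "16 GB" read as 16 GiB

# `total` defaults to the machine's physical memory in bytes.
has_enough_memory(total = Sys.total_memory()) = total >= MIN_MEMORY_BYTES

if has_enough_memory()
    println("Running large broadcast tests")
else
    println("Skipping large broadcast tests: less than 16 GiB of memory")
end
```

A guard like this only skips the tests on small machines; it does not replace selecting a suitable runner, since a skipped test gives no coverage.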

@christiangnrd marked this pull request as ready for review on January 29, 2026 01:29
@github-actions
Contributor

github-actions bot commented Jan 29, 2026

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic main) to apply these changes.

Click here to view the suggested changes.
diff --git a/test/largebroadcast.jl b/test/largebroadcast.jl
index be0bf22b..b037458b 100644
--- a/test/largebroadcast.jl
+++ b/test/largebroadcast.jl
@@ -1,13 +1,15 @@
 const N = Int(typemax(UInt32)) + 1
 const T = Int8
 
-@testset "len = $n" for n in ((N÷2) - 4, (N÷2), (N÷2) + 4, N - 1024, N - 3, N - 1, N, N + 4)
+@testset "len = $n" for n in ((N ÷ 2) - 4, (N ÷ 2), (N ÷ 2) + 4, N - 1024, N - 3, N - 1, N, N + 4)
     A = MtlArray{T}(undef, n)
     # Known working method to zero out array
     Metal.unsafe_fill!(device(A), pointer(A), T(0), n * sizeof(T); async = false)
 
-    _dims = [(n,), (n, 1), (1, n), (n, 1, 1), (1, n, 1), (1, 1, n),
-             (n, 1, 1, 1), (1, n, 1, 1), (1, 1, n, 1), (1, 1, 1, n),]
+    _dims = [
+        (n,), (n, 1), (1, n), (n, 1, 1), (1, n, 1), (1, 1, n),
+        (n, 1, 1, 1), (1, n, 1, 1), (1, 1, n, 1), (1, 1, 1, n),
+    ]
     if n == 2^32
         push!(_dims, (2^16, 2^16))
         push!(_dims, (2^16, 2^8, 2^8))
@@ -17,7 +19,7 @@ const T = Int8
         # These must be run first to ensure we test
         # the unspecialized broadcast kernels
         Metal._broadcast_shapes[CartesianIndices(dims)] = Metal.BROADCAST_SPECIALIZATION_THRESHOLD - 1
-        unspec_val = T((i-1) * 2 + 1)
+        unspec_val = T((i - 1) * 2 + 1)
         arr = reshape(A, dims)
         arr .= unspec_val
         @test all(==(unspec_val), arr)
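
For orientation, the lengths exercised by this test sit just below and just above 2^31 and 2^32 elements. The sketch below reuses the constants `N` and `T` from the diff and runs on the CPU only, with no Metal calls. The `lengths` name and the printed columns are illustrative, and reading the choice of lengths as a probe of 32-bit index limits is an inference, not something the PR states.

```julia
# Same constants as test/largebroadcast.jl; CPU only, no Metal needed.
const N = Int(typemax(UInt32)) + 1   # 2^32
const T = Int8                       # one byte per element

lengths = ((N ÷ 2) - 4, (N ÷ 2), (N ÷ 2) + 4, N - 1024, N - 3, N - 1, N, N + 4)

for n in lengths
    fits_int32 = n <= typemax(Int32)
    fits_uint32 = n <= typemax(UInt32)
    println("n = ", n, "  bytes = ", n * sizeof(T),
            "  fits Int32: ", fits_int32, "  fits UInt32: ", fits_uint32)
end

# The extra shapes pushed when n == 2^32 contain exactly N elements.
@assert prod((2^16, 2^16)) == N
@assert prod((2^16, 2^8, 2^8)) == N
```

With `T = Int8`, each length is also the buffer size in bytes, so the largest case allocates slightly more than 4 GiB.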

@christiangnrd force-pushed the bigbuffertests branch 2 times, most recently from c5152f9 to 0104cca on January 29, 2026 01:40
@codecov

codecov bot commented Jan 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.28%. Comparing base (f7b0829) to head (f2a7fa1).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #735      +/-   ##
==========================================
- Coverage   82.59%   82.28%   -0.32%     
==========================================
  Files          62       62              
  Lines        2862     2873      +11     
==========================================
  Hits         2364     2364              
- Misses        498      509      +11     
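
The percentages in the table follow from hits divided by lines. The sketch below recomputes them from the counts above; the variable and function names are illustrative. The displayed base figure (82.59%) matches truncation to two decimals rather than rounding, which is an inference from these numbers and not documented Codecov behavior.

```julia
# Counts copied from the coverage diff above.
hits_base, lines_base = 2364, 2862
hits_head, lines_head = 2364, 2873

coverage(hits, lines) = 100 * hits / lines

base = coverage(hits_base, lines_base)   # about 82.5996
head = coverage(hits_head, lines_head)   # about 82.2833

println("base  = ", floor(base; digits = 2), "%")
println("head  = ", floor(head; digits = 2), "%")
println("delta = ", round(head - base; digits = 2), "%")
```

Hits are unchanged while the line count grows by 11, so the drop comes entirely from the larger denominator.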

Contributor

@github-actions github-actions bot left a comment

Metal Benchmarks

| Benchmark suite | Current: f2a7fa1 | Previous: f7b0829 | Ratio |
|---|---|---|---|
| latency/precompile | 24994097291 ns | 25035410125 ns | 1.00 |
| latency/ttfp | 2272859917 ns | 2272092167 ns | 1.00 |
| latency/import | 1445333042 ns | 1444131708 ns | 1.00 |
| integration/metaldevrt | 864354.5 ns | 870500 ns | 0.99 |
| integration/byval/slices=1 | 1591292 ns | 1578834 ns | 1.01 |
| integration/byval/slices=3 | 11237083 ns | 10274687 ns | 1.09 |
| integration/byval/reference | 1567667 ns | 1566167 ns | 1.00 |
| integration/byval/slices=2 | 2643125 ns | 2622959 ns | 1.01 |
| kernel/indexing | 602292 ns | 625542 ns | 0.96 |
| kernel/indexing_checked | 627458 ns | 615520.5 ns | 1.02 |
| kernel/launch | 11625 ns | 11417 ns | 1.02 |
| kernel/rand | 568041 ns | 566500 ns | 1.00 |
| array/construct | 6500 ns | 6084 ns | 1.07 |
| array/broadcast | 598167 ns | 602917 ns | 0.99 |
| array/random/randn/Float32 | 968666.5 ns | 1001708 ns | 0.97 |
| array/random/randn!/Float32 | 743125 ns | 750583 ns | 0.99 |
| array/random/rand!/Int64 | 549833.5 ns | 551375 ns | 1.00 |
| array/random/rand!/Float32 | 586208 ns | 587041 ns | 1.00 |
| array/random/rand/Int64 | 787917 ns | 777958 ns | 1.01 |
| array/random/rand/Float32 | 642333 ns | 590000 ns | 1.09 |
| array/accumulate/Int64/1d | 1280229 ns | 1254458 ns | 1.02 |
| array/accumulate/Int64/dims=1 | 1841125 ns | 1827292 ns | 1.01 |
| array/accumulate/Int64/dims=2 | 2189750 ns | 2165896 ns | 1.01 |
| array/accumulate/Int64/dims=1L | 11558895.5 ns | 11587604 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 9795208 ns | 9813792 ns | 1.00 |
| array/accumulate/Float32/1d | 1125500 ns | 1121542 ns | 1.00 |
| array/accumulate/Float32/dims=1 | 1566375 ns | 1398896.5 ns | 1.12 |
| array/accumulate/Float32/dims=2 | 1890042 ns | 1890312.5 ns | 1.00 |
| array/accumulate/Float32/dims=1L | 9798959 ns | 9780770.5 ns | 1.00 |
| array/accumulate/Float32/dims=2L | 7258792 ns | 7265249.5 ns | 1.00 |
| array/reductions/reduce/Int64/1d | 1519437.5 ns | 1357812.5 ns | 1.12 |
| array/reductions/reduce/Int64/dims=1 | 1103250 ns | 1116917 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2 | 1153458.5 ns | 1188313 ns | 0.97 |
| array/reductions/reduce/Int64/dims=1L | 1984917 ns | 1986958.5 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 4718354 ns | 4191792 ns | 1.13 |
| array/reductions/reduce/Float32/1d | 929542 ns | 1028770.5 ns | 0.90 |
| array/reductions/reduce/Float32/dims=1 | 831875 ns | 830375 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 925833.5 ns | 861042 ns | 1.08 |
| array/reductions/reduce/Float32/dims=1L | 1558208.5 ns | 1318208 ns | 1.18 |
| array/reductions/reduce/Float32/dims=2L | 1820875 ns | 1794083 ns | 1.01 |
| array/reductions/mapreduce/Int64/1d | 1519479.5 ns | 1551042 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=1 | 1105292 ns | 1125167 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=2 | 1187312.5 ns | 1206208 ns | 0.98 |
| array/reductions/mapreduce/Int64/dims=1L | 2022833 ns | 1995875 ns | 1.01 |
| array/reductions/mapreduce/Int64/dims=2L | 3579875 ns | 3621563 ns | 0.99 |
| array/reductions/mapreduce/Float32/1d | 943917 ns | 1027916 ns | 0.92 |
| array/reductions/mapreduce/Float32/dims=1 | 832833 ns | 830833 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 862708 ns | 862209 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=1L | 1327354.5 ns | 1332666 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 1803459 ns | 1842000 ns | 0.98 |
| array/private/copyto!/gpu_to_gpu | 629709 ns | 632500 ns | 1.00 |
| array/private/copyto!/cpu_to_gpu | 789708 ns | 780667 ns | 1.01 |
| array/private/copyto!/gpu_to_cpu | 804625 ns | 787750 ns | 1.02 |
| array/private/iteration/findall/int | 1556291 ns | 1564000 ns | 1.00 |
| array/private/iteration/findall/bool | 1412146 ns | 1393667 ns | 1.01 |
| array/private/iteration/findfirst/int | 2062333 ns | 2101875 ns | 0.98 |
| array/private/iteration/findfirst/bool | 2041229.5 ns | 2049749.5 ns | 1.00 |
| array/private/iteration/scalar | 4025000 ns | 4317166 ns | 0.93 |
| array/private/iteration/logical | 2632459 ns | 2622584 ns | 1.00 |
| array/private/iteration/findmin/1d | 2504666 ns | 2507959 ns | 1.00 |
| array/private/iteration/findmin/2d | 1792750 ns | 1788250 ns | 1.00 |
| array/private/copy | 568792 ns | 556437.5 ns | 1.02 |
| array/shared/copyto!/gpu_to_gpu | 82916 ns | 83375 ns | 0.99 |
| array/shared/copyto!/cpu_to_gpu | 82375 ns | 87084 ns | 0.95 |
| array/shared/copyto!/gpu_to_cpu | 81792 ns | 82625 ns | 0.99 |
| array/shared/iteration/findall/int | 1559042 ns | 1552250 ns | 1.00 |
| array/shared/iteration/findall/bool | 1432083 ns | 1424958 ns | 1.01 |
| array/shared/iteration/findfirst/int | 1665458 ns | 1688500 ns | 0.99 |
| array/shared/iteration/findfirst/bool | 1641209 ns | 1642438 ns | 1.00 |
| array/shared/iteration/scalar | 207167 ns | 205792 ns | 1.01 |
| array/shared/iteration/logical | 2246584 ns | 2261000 ns | 0.99 |
| array/shared/iteration/findmin/1d | 2123958 ns | 2133500 ns | 1.00 |
| array/shared/iteration/findmin/2d | 1792083 ns | 1797312.5 ns | 1.00 |
| array/shared/copy | 235583 ns | 240542 ns | 0.98 |
| array/permutedims/4d | 2399833.5 ns | 2390688 ns | 1.00 |
| array/permutedims/2d | 1179208 ns | 1170542 ns | 1.01 |
| array/permutedims/3d | 1686792 ns | 1675145.5 ns | 1.01 |
| metal/synchronization/stream | 19416 ns | 17291 ns | 1.12 |
| metal/synchronization/context | 20291 ns | 17667 ns | 1.15 |
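
The Ratio column is consistent with current time divided by previous time, rounded to two decimals, so values above 1 mean the current commit was slower. The sketch below recomputes the ratio for a few rows copied from the table and flags the slower ones. `rows`, `ratio`, and `THRESHOLD` are illustrative names, and the 1.10 cutoff is a hypothetical value, not the workflow's configured alert threshold.

```julia
# A few rows copied from the benchmark table (times in ns).
rows = [
    ("integration/byval/slices=3", 11237083.0, 10274687.0),
    ("array/reductions/reduce/Float32/1d", 929542.0, 1028770.5),
    ("array/reductions/reduce/Float32/dims=1L", 1558208.5, 1318208.0),
    ("metal/synchronization/context", 20291.0, 17667.0),
]

# Ratio = current / previous; values above 1 mean the current commit is slower.
ratio(current, previous) = round(current / previous; digits = 2)

# Hypothetical cutoff for illustration only; not the workflow's setting.
const THRESHOLD = 1.10

for (name, current, previous) in rows
    r = ratio(current, previous)
    flag = r > THRESHOLD ? "  <- slower than threshold" : ""
    println(rpad(name, 42), r, flag)
end
```

These are single measurements per commit, so small deviations from 1.00 may be noise rather than a real change.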

This comment was automatically generated by workflow using github-action-benchmark.
