
Conversation

antimora
Collaborator

@antimora antimora commented Aug 24, 2025

This PR contains tests to recreate #3600

NOTE: this PR is based off the #3599 PR, which is why you see Yolo11x-related tests.

The deep-layer test highlights the discrepancy in outputs between tch and metal, especially for ResNet-style models.

DEEP

cd crates/burn-import/model-checks/metal-bug
cargo run --bin deep-network

See the output below.

SHALLOW

cd crates/burn-import/model-checks/metal-bug
cargo run --bin single-layer

See the output below.
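For context, the number both binaries print per layer is the maximum absolute elementwise difference between two backends' outputs for the same layer and input. A minimal, self-contained sketch of that comparison (the `max_abs_diff` helper name and the stand-in slices are illustrative, not the crate's actual code):

```rust
// Illustrative sketch: compare the same layer's output from two backends
// (stand-in slices here) via the maximum absolute elementwise difference.
fn max_abs_diff(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).abs())
        .fold(0.0_f32, f32::max)
}

fn main() {
    let tch_out = [1.0_f32, 0.5, -0.25];
    let metal_out = [1.0_f32, 0.500_06, -0.25];
    println!("max_diff = {:.8}", max_abs_diff(&tch_out, &metal_out));
}
```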

antimora and others added 23 commits August 17, 2025 22:16
Implemented broadcasting for add, sub, mul, and div operations between tensors of different ranks in BinaryNode. Added new ONNX test models and Rust tests to verify correct broadcasting behavior for various tensor rank combinations.
Enhances the YOLO11x model check by loading test input/output from a PyTorch file, running inference, and comparing the model output to reference data with detailed statistics. Updates dependencies and features in Cargo.toml, improves build script to track test data, and refines the Python script for model and test data preparation.
Updated slice range generation to treat i64::MAX as an open-ended range (..), and added checks to prevent slice end indices from exceeding i32::MAX. Also replaced alloc::vec::Vec with Vec for clarity and consistency.
Introduce a new crate 'burn-metal-bug' with two binaries: 'single-layer' and 'deep-network'. These binaries provide comprehensive tests comparing Metal, Tch, and Ndarray backends for various layers and deep network architectures, focusing on error accumulation and identifying backend-specific numerical issues, especially in Conv2d.

codecov bot commented Aug 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.02%. Comparing base (47b5fe8) to head (6e6a83f).

❌ Your project check has failed because the head coverage (64.02%) is below the target coverage (80.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3609   +/-   ##
=======================================
  Coverage   64.02%   64.02%           
=======================================
  Files        1084     1084           
  Lines      126894   126894           
=======================================
  Hits        81238    81238           
  Misses      45656    45656           

@antimora
Collaborator Author

@louisfd @wingertge This bug still persists.

@nathanielsimard
Member

@ArthurBrussee Could it be the same subgroup bug?

@ArthurBrussee
Contributor

No, the subgroup stuff just crashes as the MSL kernel doesn't compile

@antimora
Collaborator Author


[metal-bug]% cargo run --bin deep-network


     Running `target/debug/deep-network`
========================================
Deep Network Error Accumulation Test
========================================

Testing how errors accumulate through deep networks

==================================================
TEST 1: Deep CNN (20 layers)
==================================================

🏗️ Building 20-layer deep CNN for tch vs metal
  Input: [1, 3, 128, 128]
  Layer  1 (Conv2d 3->6): max_diff = 0.00000001
  Layer  2 (MaxPool2d): max_diff = 0.00000001, size now 64x64
  Layer  3 (Conv2d 6->12): max_diff = 0.00000954
  Layer  4 (InstanceNorm): max_diff = 0.00055087
  Layer  5 (Conv2d 12->24): max_diff = 0.00078678
  Layer  6 (MaxPool2d): max_diff = 0.00078678, size now 32x32
  Layer  7 (Conv2d 24->48): max_diff = 0.00072074
  Layer  8 (InstanceNorm): max_diff = 0.00082397
  Layer  9 (Conv2d 48->96): max_diff = 0.00202739
  Layer 10 (MaxPool2d): max_diff = 0.00202739, size now 16x16
  Layer 11 (Conv2d 96->192): max_diff = 0.00806236
  Layer 12 (InstanceNorm): max_diff = 0.00085898
  Layer 13 (Conv2d 192->384): max_diff = 0.00719213
  Layer 14 (MaxPool2d): max_diff = 0.00710869, size now 8x8
  Layer 15 (Conv2d 384->768): max_diff = 0.05778503
  Layer 16 (InstanceNorm): max_diff = 0.00021422
  Layer 17 (Conv2d 384->768): max_diff = 0.00925446
  Layer 19 (Conv2d 384->768): max_diff = 0.55859375
  Layer 20 (InstanceNorm): max_diff = 0.00023580

🏗️ Building 20-layer deep CNN for tch vs ndarray
  Input: [1, 3, 128, 128]
  Layer  1 (Conv2d 3->6): max_diff = 0.00000001
  Layer  2 (MaxPool2d): max_diff = 0.00000001, size now 64x64
  Layer  3 (Conv2d 6->12): max_diff = 0.00000003
  Layer  4 (InstanceNorm): max_diff = 0.00000516
  Layer  5 (Conv2d 12->24): max_diff = 0.00000459
  Layer  6 (MaxPool2d): max_diff = 0.00000453, size now 32x32
  Layer  7 (Conv2d 24->48): max_diff = 0.00000763
  Layer  8 (InstanceNorm): max_diff = 0.00001431
  Layer  9 (Conv2d 48->96): max_diff = 0.00004864
  Layer 10 (MaxPool2d): max_diff = 0.00003529, size now 16x16
  Layer 11 (Conv2d 96->192): max_diff = 0.00009918
  Layer 12 (InstanceNorm): max_diff = 0.00001144
  Layer 13 (Conv2d 192->384): max_diff = 0.00014496
  Layer 14 (MaxPool2d): max_diff = 0.00012207, size now 8x8
  Layer 15 (Conv2d 384->768): max_diff = 0.00291443
  Layer 16 (InstanceNorm): max_diff = 0.00001146
  Layer 17 (Conv2d 384->768): max_diff = 0.00043488
  Layer 19 (Conv2d 384->768): max_diff = 0.00668335
  Layer 20 (InstanceNorm): max_diff = 0.00000364

📈 Error Growth Analysis for CNN (Metal):
  Initial error: 0.00000001
  Final error:   0.00023580
  Total growth:  16878.93x
  Max growth:    682.67x at layer 3
  ⚠️ WARNING: Exponential error growth detected! (avg: 1.67x per layer)

📈 Error Growth Analysis for CNN (Ndarray):
  Initial error: 0.00000001
  Final error:   0.00000364
  Total growth:  300.31x
  Max growth:    153.78x at layer 4
  ⚠️ WARNING: Exponential error growth detected! (avg: 1.35x per layer)
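Reading the analysis blocks: the reported "avg per layer" figure is consistent with the geometric mean of the layer-to-layer growth, i.e. total_growth^(1/steps) with 19 steps for the 20-layer CNN. A small sketch of that reading (my inference from the printed numbers, not necessarily the crate's actual code):

```rust
// Geometric-mean growth factor per step, inferred from the printed stats:
// 16878.93x total growth over 19 layer-to-layer steps ≈ 1.67x per layer.
fn avg_growth_per_step(total_growth: f64, steps: u32) -> f64 {
    total_growth.powf(1.0 / steps as f64)
}

fn main() {
    println!("CNN (Metal):   {:.2}x", avg_growth_per_step(16878.93, 19)); // ≈ 1.67x
    println!("CNN (Ndarray): {:.2}x", avg_growth_per_step(300.31, 19));   // ≈ 1.35x
}
```

Both results match the "avg per layer" lines above, which supports this reading of the analysis output.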

==================================================
TEST 2: ResNet-like (10 residual blocks = ~20 layers)
==================================================

🏗️ Building 10-block ResNet-like network for tch vs metal
  Block  1: max_diff = 0.00000030
  Block  2: max_diff = 0.00002766
  Block  3: max_diff = 0.00126648
  Block  4: max_diff = 0.06152344
  Block  5: max_diff = 1.56250000
  Block  6: max_diff = 56.50000000
  Block  7: max_diff = 2160.00000000
  Block  8: max_diff = 100864.00000000
  Block  9: max_diff = 2998272.00000000
  Block 10: max_diff = 101711872.00000000

🏗️ Building 10-block ResNet-like network for tch vs ndarray
  Block  1: max_diff = 0.00000000
  Block  2: max_diff = 0.00000000
  Block  3: max_diff = 0.00000000
  Block  4: max_diff = 0.00000000
  Block  5: max_diff = 0.00000000
  Block  6: max_diff = 0.00000000
  Block  7: max_diff = 0.00000000
  Block  8: max_diff = 0.00000000
  Block  9: max_diff = 0.00000000
  Block 10: max_diff = 0.00000000

📈 Error Growth Analysis for ResNet (Metal):
  Initial error: 0.00000030
  Final error:   101711872.00000000
  Total growth:  341288402550784.00x
  Max growth:    92.80x at layer 2
  ⚠️ WARNING: Exponential error growth detected! (avg: 28.40x per layer)

📈 Error Growth Analysis for ResNet (Ndarray):
  Initial error: 0.00000000
  Final error:   0.00000000
  Total growth:  0.00x
  Max growth:    0.00x at layer 1
  ✅ Error is stable or decreasing

==================================================
TEST 3: Deep MLP (30 layers)
==================================================

🏗️ Building 30-layer deep MLP for tch vs metal
  Layer  1 (Linear + ReLU): max_diff = 0.00000000
  Layer  2 (Linear + GELU): max_diff = 0.00000000
  Layer  3 (Linear + ReLU): max_diff = 0.00000000
  Layer  4 (Linear + GELU): max_diff = 0.00000000
  Layer  5 (Linear + ReLU): max_diff = 0.00000000
  Layer  6 (Linear + GELU): max_diff = 0.00000000
  Layer  7 (Linear + ReLU): max_diff = 0.00000000
  Layer  8 (Linear + GELU): max_diff = 0.00000000
  Layer  9 (Linear + ReLU): max_diff = 0.00000000
  Layer 10 (Linear + GELU): max_diff = 0.00000000
  Layer 11 (Linear + ReLU): max_diff = 0.00000000
  Layer 12 (Linear + GELU): max_diff = 0.00000000
  Layer 13 (Linear + ReLU): max_diff = 0.00000000
  Layer 14 (Linear + GELU): max_diff = 0.00000000
  Layer 15 (Linear + ReLU): max_diff = 0.00000000
  Layer 16 (Linear + GELU): max_diff = 0.00000000
  Layer 17 (Linear + ReLU): max_diff = 0.00000000
  Layer 18 (Linear + GELU): max_diff = 0.00000000
  Layer 19 (Linear + ReLU): max_diff = 0.00000000
  Layer 20 (Linear + GELU): max_diff = 0.00000000
  Layer 21 (Linear + ReLU): max_diff = 0.00000000
  Layer 22 (Linear + GELU): max_diff = 0.00000000
  Layer 23 (Linear + ReLU): max_diff = 0.00000000
  Layer 24 (Linear + GELU): max_diff = 0.00000000
  Layer 25 (Linear + ReLU): max_diff = 0.00000000
  Layer 26 (Linear + GELU): max_diff = 0.00000000
  Layer 27 (Linear + ReLU): max_diff = 0.00000000
  Layer 28 (Linear + GELU): max_diff = 0.00000000
  Layer 29 (Linear + ReLU): max_diff = 0.00000000
  Layer 30 (Linear + GELU): max_diff = 0.00000000

🏗️ Building 30-layer deep MLP for tch vs ndarray
  Layer  1 (Linear + ReLU): max_diff = 0.00000000
  Layer  2 (Linear + GELU): max_diff = 0.00000000
  Layer  3 (Linear + ReLU): max_diff = 0.00000000
  Layer  4 (Linear + GELU): max_diff = 0.00000000
  Layer  5 (Linear + ReLU): max_diff = 0.00000001
  Layer  6 (Linear + GELU): max_diff = 0.00000000
  Layer  7 (Linear + ReLU): max_diff = 0.00000000
  Layer  8 (Linear + GELU): max_diff = 0.00000000
  Layer  9 (Linear + ReLU): max_diff = 0.00000002
  Layer 10 (Linear + GELU): max_diff = 0.00000001
  Layer 11 (Linear + ReLU): max_diff = 0.00000002
  Layer 12 (Linear + GELU): max_diff = 0.00000001
  Layer 13 (Linear + ReLU): max_diff = 0.00000001
  Layer 14 (Linear + GELU): max_diff = 0.00000001
  Layer 15 (Linear + ReLU): max_diff = 0.00000001
  Layer 16 (Linear + GELU): max_diff = 0.00000001
  Layer 17 (Linear + ReLU): max_diff = 0.00000001
  Layer 18 (Linear + GELU): max_diff = 0.00000002
  Layer 19 (Linear + ReLU): max_diff = 0.00000003
  Layer 20 (Linear + GELU): max_diff = 0.00000001
  Layer 21 (Linear + ReLU): max_diff = 0.00000000
  Layer 22 (Linear + GELU): max_diff = 0.00000000
  Layer 23 (Linear + ReLU): max_diff = 0.00000001
  Layer 24 (Linear + GELU): max_diff = 0.00000001
  Layer 25 (Linear + ReLU): max_diff = 0.00000001
  Layer 26 (Linear + GELU): max_diff = 0.00000000
  Layer 27 (Linear + ReLU): max_diff = 0.00000001
  Layer 28 (Linear + GELU): max_diff = 0.00000000
  Layer 29 (Linear + ReLU): max_diff = 0.00000000
  Layer 30 (Linear + GELU): max_diff = 0.00000000

📈 Error Growth Analysis for MLP (Metal):
  Initial error: 0.00000000
  Final error:   0.00000000
  Total growth:  0.00x
  Max growth:    0.00x at layer 1
  ✅ Error is stable or decreasing

📈 Error Growth Analysis for MLP (Ndarray):
  Initial error: 0.00000000
  Final error:   0.00000000
  Total growth:  0.00x
  Max growth:    48.00x at layer 5
  ✅ Error is stable or decreasing

==================================================
TEST 4: Attention Stack (12 layers - like BERT)
==================================================

🏗️ Building 12-layer attention stack for tch vs metal

thread 'main' panicked at /Users/dilshod/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/wgpu-26.0.1/src/backend/wgpu_core.rs:1055:30:
wgpu error: Validation Error

Caused by:
  In Device::create_shader_module_passthrough, label = 'reduce_kernel'
    Failed to generate the backend-specific code


note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

thread 'main' panicked at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/execution/ordering.rs:51:13:
Ordering is bigger than operations
stack backtrace:
   0:        0x104defbd8 - std::backtrace_rs::backtrace::libunwind::trace::h72f4b72e0962905d
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/../../backtrace/src/backtrace/libunwind.rs:117:9
   1:        0x104defbd8 - std::backtrace_rs::backtrace::trace_unsynchronized::hff394536698b6b10
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/../../backtrace/src/backtrace/mod.rs:66:14
   2:        0x104defbd8 - std::sys::backtrace::_print_fmt::h64d1e3035850353e
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/sys/backtrace.rs:66:9
   3:        0x104defbd8 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::hf35f9734f9a29483
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/sys/backtrace.rs:39:26
   4:        0x104e0a3b4 - core::fmt::rt::Argument::fmt::hedf6f2a66f855f69
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/core/src/fmt/rt.rs:173:76
   5:        0x104e0a3b4 - core::fmt::write::h60ec6633daab7b35
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/core/src/fmt/mod.rs:1468:25
   6:        0x104ded3dc - std::io::default_write_fmt::h0e30d7b1295222cb
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/io/mod.rs:639:11
   7:        0x104ded3dc - std::io::Write::write_fmt::hc29709fdab2e34e2
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/io/mod.rs:1954:13
   8:        0x104defa8c - std::sys::backtrace::BacktraceLock::print::hca95bffd78053951
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/sys/backtrace.rs:42:9
   9:        0x104df0dd0 - std::panicking::default_hook::{{closure}}::h357ed4fbef22679d
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:300:27
  10:        0x104df0c28 - std::panicking::default_hook::h0a4e133b151d5758
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:327:9
  11:        0x104df1870 - std::panicking::rust_panic_with_hook::h557a23724a5de839
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:833:13
  12:        0x104df1464 - std::panicking::begin_panic_handler::{{closure}}::h269cace6208fef05
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:699:13
  13:        0x104df0088 - std::sys::backtrace::__rust_end_short_backtrace::h5be0da278f3aaec7
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/sys/backtrace.rs:174:18
  14:        0x104df1168 - __rustc[de2ca18b4c54d5b8]::rust_begin_unwind
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:697:5
  15:        0x104ea6b08 - core::panicking::panic_fmt::h477ff48eff31ffa4
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/core/src/panicking.rs:75:14
  16:        0x1029cc858 - burn_fusion::stream::execution::ordering::OrderedExecution<R>::execute_optimization::hce75d34fc5568c7e
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/execution/ordering.rs:51:13
  17:        0x103018628 - burn_fusion::stream::queue::execution::QueueExecution<R>::execute_strategy::h64571638861d5865
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/queue/execution.rs:130:31
  18:        0x10301898c - burn_fusion::stream::queue::execution::QueueExecution<R>::run::h3f2a6e27f7d58a98
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/queue/execution.rs:112:25
  19:        0x1033d5900 - burn_fusion::stream::queue::execution::<impl burn_fusion::stream::queue::base::OperationQueue<R>>::execute_block_optimization::h9d37f971b9532a46
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/queue/execution.rs:36:13
  20:        0x1033d5a3c - burn_fusion::stream::queue::execution::<impl burn_fusion::stream::queue::base::OperationQueue<R>>::execute::h4aea3e0d77c4c8ce
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/queue/execution.rs:25:14
  21:        0x1035a5590 - <burn_fusion::stream::multi::Segment<R> as burn_fusion::stream::execution::processor::StreamSegment<<R as burn_fusion::backend::FusionRuntime>::Optimization>>::execute::h630f741803a0ec1f
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/multi.rs:405:20
  22:        0x10387a588 - burn_fusion::stream::execution::processor::Processor<O>::explore::h66b59bc8db016590
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/execution/processor.rs:108:22
  23:        0x10387a76c - burn_fusion::stream::execution::processor::Processor<O>::process::h0039dc42ddf0d69c
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/execution/processor.rs:55:26
  24:        0x10359f404 - burn_fusion::stream::multi::MultiStream<R>::enqueue_operation::h031efee10608a9f3
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/multi.rs:155:26
  25:        0x1035a0ae8 - burn_fusion::stream::multi::MultiStream<R>::register::h903a6256c7260c72
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/stream/multi.rs:79:33
  26:        0x102ff79fc - burn_fusion::server::FusionServer<R>::register::hc416f0ae34a8f24f
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/server.rs:33:14
  27:        0x10388ab20 - <burn_fusion::client::mutex::MutexFusionClient<R> as burn_fusion::client::base::FusionClient<R>>::register::he8c19a2f79cc206d
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/client/mutex.rs:46:14
  28:        0x102dda060 - <burn_fusion::tensor::FusionTensor<R> as core::ops::drop::Drop>::drop::h7322d0b699dd401c
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/tensor.rs:205:22
  29:        0x102d88088 - core::ptr::drop_in_place<burn_fusion::tensor::FusionTensor<burn_cubecl::fusion::FusionCubeRuntime<cubecl_wgpu::runtime::WgpuRuntime,u8>>>::h96ceb1f8108ddc1b
                               at /Users/dilshod/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:804:1
  30:        0x103bb4760 - burn_fusion::ops::float::<impl burn_tensor::tensor::ops::tensor::FloatTensorOps<burn_fusion::backend::Fusion<B>> for burn_fusion::backend::Fusion<B>>::float_sum_dim::hccdeec8d92d96467
                               at /Users/dilshod/Projects/burn/crates/burn-fusion/src/ops/float.rs:1316:5
  31:        0x10255fe08 - <burn_tensor::tensor::api::kind::Float as burn_tensor::tensor::api::numeric::Numeric<B>>::sum_dim::hba37a7d0e5e99074
                               at /Users/dilshod/Projects/burn/crates/burn-tensor/src/tensor/api/numeric.rs:3642:70
  32:        0x103ca9afc - burn_tensor::tensor::api::numeric::<impl burn_tensor::tensor::api::base::Tensor<B,_,K>>::sum_dim::h5f7d0831fd083db0
                               at /Users/dilshod/Projects/burn/crates/burn-tensor/src/tensor/api/numeric.rs:516:19
  33:        0x102d4aeec - burn_tensor::tensor::activation::base::softmax::h4f91825bfe743931
                               at /Users/dilshod/Projects/burn/crates/burn-tensor/src/tensor/activation/base.rs:117:37
  34:        0x102933050 - deep_network::test_deep_attention::hcf1a3654883c0455
                               at /Users/dilshod/Projects/burn/crates/burn-import/model-checks/metal-bug/src/deep_network.rs:325:21
  35:        0x102959c54 - deep_network::main::h59ce4d632a8c8a88
                               at /Users/dilshod/Projects/burn/crates/burn-import/model-checks/metal-bug/src/deep_network.rs:521:28
  36:        0x102d71c78 - core::ops::function::FnOnce::call_once::h645af7609c652ed9
                               at /Users/dilshod/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:253:5
  37:        0x103b2ae40 - std::sys::backtrace::__rust_begin_short_backtrace::h124cd71f5e6c4165
                               at /Users/dilshod/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/sys/backtrace.rs:158:18
  38:        0x1033c0c60 - std::rt::lang_start::{{closure}}::hc763b7eff506a3f5
                               at /Users/dilshod/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/rt.rs:206:18
  39:        0x104de8064 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::hbb2eb0e6976088d9
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/core/src/ops/function.rs:290:21
  40:        0x104de8064 - std::panicking::catch_unwind::do_call::h93858ce5ba09f3d9
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:589:40
  41:        0x104de8064 - std::panicking::catch_unwind::h129a241a010f1b76
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:552:19
  42:        0x104de8064 - std::panic::catch_unwind::h5ca6b885cfe10586
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panic.rs:359:14
  43:        0x104de8064 - std::rt::lang_start_internal::{{closure}}::hed6353a412388a00
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/rt.rs:175:24
  44:        0x104de8064 - std::panicking::catch_unwind::do_call::h6579b7caa3691f01
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:589:40
  45:        0x104de8064 - std::panicking::catch_unwind::h4557f88752b89087
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panicking.rs:552:19
  46:        0x104de8064 - std::panic::catch_unwind::h82809ba82b8374af
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/panic.rs:359:14
  47:        0x104de8064 - std::rt::lang_start_internal::hdb28e94b6865fa11
                               at /rustc/1159e78c4747b02ef996e55082b704c09b970588/library/std/src/rt.rs:171:5
  48:        0x1033c0c38 - std::rt::lang_start::he19c4f25e5d88b50
                               at /Users/dilshod/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/std/src/rt.rs:205:5
  49:        0x10295aad8 - _main

thread 'main' panicked at library/core/src/panicking.rs:233:5:
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.
zsh: abort      cargo run --bin deep-network
[metal-bug]%
[metal-bug]%

@antimora
Collaborator Author

[metal-bug]% cargo run --bin single-layer


warning: burn-tch@0.19.0: clang++: warning: -Wl,-rpath=/Users/dilshod/Projects/burn/crates/burn-import/model-checks/metal-bug/target/debug/build/torch-sys-f35f3d135078293c/out/libtorch/libtorch/lib: 'linker' input unused [-Wunused-command-line-argument]
   Compiling burn-metal-bug v0.1.0 (/Users/dilshod/Projects/burn/crates/burn-import/model-checks/metal-bug)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.56s
     Running `target/debug/single-layer`
========================================
Critical Layer Backend Testing
========================================

Testing layers with uniform weights (triggers Conv2d bug):


📊 Linear Layer:
----------------------------------------
Linear tch vs metal: max_diff = 0.00000000
Linear tch vs ndarray: max_diff = 0.00000000

📊 Conv1d Layer:
----------------------------------------
Conv1d tch vs metal: max_diff = 0.00000000
Conv1d tch vs ndarray: max_diff = 0.00000000

📊 Conv2d Layer (YOLO config):
----------------------------------------
Conv2d tch vs metal: max_diff = 0.00005767
Conv2d tch vs ndarray: max_diff = 0.00000000

📊 ConvTranspose2d Layer:
----------------------------------------
ConvTranspose2d tch vs metal: max_diff = 0.00000000
ConvTranspose2d tch vs ndarray: max_diff = 0.00000072

📊 MaxPool2d Layer:
----------------------------------------
MaxPool2d tch vs metal: max_diff = 0.00000000
MaxPool2d tch vs ndarray: max_diff = 0.00000000

📊 AvgPool2d Layer:
----------------------------------------
AvgPool2d tch vs metal: max_diff = 0.00000000
AvgPool2d tch vs ndarray: max_diff = 0.00000000

📊 Interpolate Layer:
----------------------------------------
Interpolate(Bilinear) tch vs metal: max_diff = 0.00000638
Interpolate(Bilinear) tch vs ndarray: max_diff = 0.00000206
Interpolate(Nearest) tch vs metal: max_diff = 0.00000000
Interpolate(Nearest) tch vs ndarray: max_diff = 0.00000000

📊 Activation Functions:
----------------------------------------

Testing activations tch vs metal:
  ReLU: 0.00000000
  Sigmoid: 0.00000018
  Tanh: 0.00000012
  GELU: 0.00000024
  SiLU: 0.00000072

Testing activations tch vs ndarray:
  ReLU: 0.00000000
  Sigmoid: 0.00000012
  Tanh: 0.00000000
  GELU: 0.00000048
  SiLU: 0.00000048

========================================
SUMMARY
========================================

Layer               | Metal vs Tch    | Ndarray vs Tch  | Status
--------------------|-----------------|-----------------|--------
Linear              | 0.00000000 | 0.00000000 | ✅
Conv1d              | 0.00000000 | 0.00000000 | ✅
Conv2d              | 0.00005767 | 0.00000000 | ✅
ConvTranspose2d     | 0.00000000 | 0.00000072 | ✅
MaxPool2d           | 0.00000000 | 0.00000000 | ✅
AvgPool2d           | 0.00000000 | 0.00000000 | ✅
Interpolate(Bilin)  | 0.00000638 | 0.00000206 | ✅
Interpolate(Near)   | 0.00000000 | 0.00000000 | ✅

⚠️ Threshold for failure: 0.0001
[metal-bug]%
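The ✅/❌ status column is presumably a straight comparison against the 0.0001 failure threshold, which is why even the Conv2d Metal discrepancy (0.00005767) still shows ✅ here despite being an order of magnitude above the other layers. A hedged sketch of that check (the `status` helper is hypothetical):

```rust
// Hypothetical sketch of the summary's pass/fail rule: a layer passes
// when its max_diff stays under the 0.0001 failure threshold.
const THRESHOLD: f32 = 1e-4;

fn status(max_diff: f32) -> &'static str {
    if max_diff < THRESHOLD { "pass" } else { "fail" }
}

fn main() {
    // Conv2d (Metal) from the summary table: under threshold, so it passes
    // even though it is the clear outlier among the single-layer results.
    println!("Conv2d metal: {}", status(0.000_057_67));
}
```

This suggests the single-layer harness alone would not have flagged the Conv2d issue; the deep-network test is what exposes the accumulated error.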
