[Setup] Better version check in smoke_test.py #1303

Merged: 3 commits merged on Apr 29, 2025

Conversation

vmoens (Collaborator) commented on Apr 29, 2025

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Apr 29, 2025
ghstack-source-id: 50a8246
Pull-Request-resolved: #1303
facebook-github-bot added the CLA Signed label on Apr 29, 2025
vmoens added the setup label on Apr 29, 2025
[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Apr 29, 2025
ghstack-source-id: 56624ba
Pull-Request-resolved: #1303
[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Apr 29, 2025
ghstack-source-id: b5f1d8b
Pull-Request-resolved: #1303
vmoens merged commit 2bc4256 into gh/vmoens/52/base on Apr 29, 2025
58 of 60 checks passed
vmoens pushed a commit that referenced this pull request Apr 29, 2025
ghstack-source-id: b5f1d8b
Pull-Request-resolved: #1303
vmoens deleted the gh/vmoens/52/head branch on April 29, 2025 14:09

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 233. Improved: $\large\color{#35bf28}38$. Worsened: $\large\color{#d91a1a}2$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 29.6200μs 11.2517μs 88.8756 KOps/s 88.7129 KOps/s $\color{#35bf28}+0.18\%$
test_plain_set_stack_nested 34.3100μs 11.3109μs 88.4101 KOps/s 87.7472 KOps/s $\color{#35bf28}+0.76\%$
test_plain_set_nested_inplace 37.9000μs 12.3059μs 81.2620 KOps/s 80.7373 KOps/s $\color{#35bf28}+0.65\%$
test_plain_set_stack_nested_inplace 35.9710μs 12.2464μs 81.6567 KOps/s 81.8178 KOps/s $\color{#d91a1a}-0.20\%$
test_items 22.3710μs 2.8631μs 349.2749 KOps/s 347.5625 KOps/s $\color{#35bf28}+0.49\%$
test_items_nested 0.4278ms 0.3612ms 2.7689 KOps/s 2.7953 KOps/s $\color{#d91a1a}-0.94\%$
test_items_nested_locked 0.4200ms 0.3659ms 2.7332 KOps/s 2.7857 KOps/s $\color{#d91a1a}-1.88\%$
test_items_nested_leaf 90.7610μs 59.9309μs 16.6859 KOps/s 16.7502 KOps/s $\color{#d91a1a}-0.38\%$
test_items_stack_nested 0.4186ms 0.3647ms 2.7421 KOps/s 2.8073 KOps/s $\color{#d91a1a}-2.32\%$
test_items_stack_nested_leaf 0.1311ms 60.0116μs 16.6634 KOps/s 16.6440 KOps/s $\color{#35bf28}+0.12\%$
test_items_stack_nested_locked 0.4092ms 0.3620ms 2.7625 KOps/s 2.8075 KOps/s $\color{#d91a1a}-1.61\%$
test_keys 30.0310μs 3.4235μs 292.0953 KOps/s 292.7337 KOps/s $\color{#d91a1a}-0.22\%$
test_keys_nested 0.1206ms 88.1685μs 11.3419 KOps/s 11.4665 KOps/s $\color{#d91a1a}-1.09\%$
test_keys_nested_locked 2.1866ms 93.7476μs 10.6669 KOps/s 10.6912 KOps/s $\color{#d91a1a}-0.23\%$
test_keys_nested_leaf 0.1247ms 78.9336μs 12.6689 KOps/s 12.6692 KOps/s $-0.00\%$
test_keys_stack_nested 0.1331ms 87.6535μs 11.4086 KOps/s 11.4243 KOps/s $\color{#d91a1a}-0.14\%$
test_keys_stack_nested_leaf 0.1111ms 78.5345μs 12.7333 KOps/s 12.7491 KOps/s $\color{#d91a1a}-0.12\%$
test_keys_stack_nested_locked 0.1234ms 93.2592μs 10.7228 KOps/s 10.7155 KOps/s $\color{#35bf28}+0.07\%$
test_values 12.1902μs 0.8598μs 1.1631 MOps/s 1.1694 MOps/s $\color{#d91a1a}-0.54\%$
test_values_nested 61.9510μs 37.3771μs 26.7543 KOps/s 26.6916 KOps/s $\color{#35bf28}+0.23\%$
test_values_nested_locked 62.1310μs 39.3620μs 25.4052 KOps/s 25.2944 KOps/s $\color{#35bf28}+0.44\%$
test_values_nested_leaf 75.3010μs 42.0409μs 23.7864 KOps/s 23.6211 KOps/s $\color{#35bf28}+0.70\%$
test_values_stack_nested 61.8610μs 37.5098μs 26.6597 KOps/s 26.5974 KOps/s $\color{#35bf28}+0.23\%$
test_values_stack_nested_leaf 0.1007ms 42.2448μs 23.6715 KOps/s 23.3646 KOps/s $\color{#35bf28}+1.31\%$
test_values_stack_nested_locked 68.8810μs 39.5699μs 25.2718 KOps/s 25.2850 KOps/s $\color{#d91a1a}-0.05\%$
test_membership 2.5466μs 0.4992μs 2.0032 MOps/s 1.9480 MOps/s $\color{#35bf28}+2.84\%$
test_membership_nested 13.5855μs 1.9939μs 501.5334 KOps/s 507.8610 KOps/s $\color{#d91a1a}-1.25\%$
test_membership_nested_leaf 41.6810μs 1.9905μs 502.3940 KOps/s 501.1078 KOps/s $\color{#35bf28}+0.26\%$
test_membership_stacked_nested 26.1410μs 2.0522μs 487.2760 KOps/s 490.0988 KOps/s $\color{#d91a1a}-0.58\%$
test_membership_stacked_nested_leaf 30.4200μs 2.0389μs 490.4712 KOps/s 487.1530 KOps/s $\color{#35bf28}+0.68\%$
test_membership_nested_last 35.3110μs 3.0359μs 329.3945 KOps/s 330.1382 KOps/s $\color{#d91a1a}-0.23\%$
test_membership_nested_leaf_last 30.4110μs 3.0146μs 331.7203 KOps/s 326.3206 KOps/s $\color{#35bf28}+1.65\%$
test_membership_stacked_nested_last 26.3800μs 3.0751μs 325.1907 KOps/s 333.6684 KOps/s $\color{#d91a1a}-2.54\%$
test_membership_stacked_nested_leaf_last 42.2910μs 3.0161μs 331.5537 KOps/s 329.9557 KOps/s $\color{#35bf28}+0.48\%$
test_nested_getleaf 37.9400μs 13.0336μs 76.7247 KOps/s 76.6666 KOps/s $\color{#35bf28}+0.08\%$
test_nested_get 35.1100μs 12.4006μs 80.6416 KOps/s 80.8831 KOps/s $\color{#d91a1a}-0.30\%$
test_stacked_getleaf 46.4410μs 13.0888μs 76.4010 KOps/s 76.7453 KOps/s $\color{#d91a1a}-0.45\%$
test_stacked_get 31.8210μs 12.3677μs 80.8557 KOps/s 81.0666 KOps/s $\color{#d91a1a}-0.26\%$
test_nested_getitemleaf 40.9500μs 13.4850μs 74.1566 KOps/s 74.3213 KOps/s $\color{#d91a1a}-0.22\%$
test_nested_getitem 48.4000μs 12.8351μs 77.9114 KOps/s 78.5953 KOps/s $\color{#d91a1a}-0.87\%$
test_stacked_getitemleaf 34.9710μs 13.5415μs 73.8473 KOps/s 74.4980 KOps/s $\color{#d91a1a}-0.87\%$
test_stacked_getitem 88.2110μs 12.7375μs 78.5085 KOps/s 79.3497 KOps/s $\color{#d91a1a}-1.06\%$
test_lock_nested 1.8862ms 0.3537ms 2.8274 KOps/s 2.7431 KOps/s $\color{#35bf28}+3.07\%$
test_lock_stack_nested 0.4066ms 0.3420ms 2.9242 KOps/s 2.8265 KOps/s $\color{#35bf28}+3.46\%$
test_unlock_nested 0.4998ms 0.2962ms 3.3766 KOps/s 3.3284 KOps/s $\color{#35bf28}+1.45\%$
test_unlock_stack_nested 0.3317ms 0.2823ms 3.5418 KOps/s 3.4295 KOps/s $\color{#35bf28}+3.27\%$
test_flatten_speed 0.1087ms 76.7018μs 13.0375 KOps/s 13.1653 KOps/s $\color{#d91a1a}-0.97\%$
test_unflatten_speed 0.4544ms 0.3964ms 2.5225 KOps/s 2.5248 KOps/s $\color{#d91a1a}-0.09\%$
test_common_ops 0.9195ms 0.6394ms 1.5639 KOps/s 1.5556 KOps/s $\color{#35bf28}+0.54\%$
test_creation 76.9610μs 1.7507μs 571.2128 KOps/s 570.6401 KOps/s $\color{#35bf28}+0.10\%$
test_creation_empty 0.6993ms 7.0859μs 141.1249 KOps/s 139.3304 KOps/s $\color{#35bf28}+1.29\%$
test_creation_nested_1 98.5720μs 10.0446μs 99.5559 KOps/s 99.3546 KOps/s $\color{#35bf28}+0.20\%$
test_creation_nested_2 0.1056ms 12.9030μs 77.5013 KOps/s 77.5131 KOps/s $\color{#d91a1a}-0.02\%$
test_clone 45.1910μs 10.6478μs 93.9163 KOps/s 89.0931 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_getitem[int] 0.1609ms 10.4561μs 95.6377 KOps/s 63.6525 KOps/s $\textbf{\color{#35bf28}+50.25\%}$
test_getitem[slice_int] 0.1176ms 20.9223μs 47.7959 KOps/s 46.1491 KOps/s $\color{#35bf28}+3.57\%$
test_getitem[range] 0.1327ms 38.3330μs 26.0872 KOps/s 25.2126 KOps/s $\color{#35bf28}+3.47\%$
test_getitem[tuple] 0.1116ms 17.6092μs 56.7884 KOps/s 54.3088 KOps/s $\color{#35bf28}+4.57\%$
test_getitem[list] 0.1240ms 33.1645μs 30.1527 KOps/s 28.8334 KOps/s $\color{#35bf28}+4.58\%$
test_setitem_dim[int] 38.9100μs 19.2485μs 51.9521 KOps/s 48.2903 KOps/s $\textbf{\color{#35bf28}+7.58\%}$
test_setitem_dim[slice_int] 58.9200μs 38.6493μs 25.8737 KOps/s 25.3032 KOps/s $\color{#35bf28}+2.25\%$
test_setitem_dim[range] 76.1220μs 53.0489μs 18.8505 KOps/s 18.5300 KOps/s $\color{#35bf28}+1.73\%$
test_setitem_dim[tuple] 65.5910μs 31.6325μs 31.6131 KOps/s 30.9549 KOps/s $\color{#35bf28}+2.13\%$
test_setitem 0.2269ms 15.5259μs 64.4087 KOps/s 62.7481 KOps/s $\color{#35bf28}+2.65\%$
test_set 0.2373ms 14.9574μs 66.8566 KOps/s 64.4842 KOps/s $\color{#35bf28}+3.68\%$
test_set_shared 0.5227ms 0.1604ms 6.2338 KOps/s 6.1434 KOps/s $\color{#35bf28}+1.47\%$
test_update 0.2353ms 18.5222μs 53.9893 KOps/s 51.8553 KOps/s $\color{#35bf28}+4.12\%$
test_update_nested 0.1224ms 28.1556μs 35.5169 KOps/s 34.4357 KOps/s $\color{#35bf28}+3.14\%$
test_update__nested 0.1100ms 25.3757μs 39.4078 KOps/s 37.9894 KOps/s $\color{#35bf28}+3.73\%$
test_set_nested 0.1414ms 16.0461μs 62.3205 KOps/s 59.5627 KOps/s $\color{#35bf28}+4.63\%$
test_set_nested_new 0.1115ms 19.9703μs 50.0743 KOps/s 51.2205 KOps/s $\color{#d91a1a}-2.24\%$
test_select 0.1296ms 30.3597μs 32.9385 KOps/s 32.4016 KOps/s $\color{#35bf28}+1.66\%$
test_select_nested 74.8010μs 44.1419μs 22.6542 KOps/s 22.9907 KOps/s $\color{#d91a1a}-1.46\%$
test_exclude_nested 97.2420μs 62.0787μs 16.1086 KOps/s 16.2740 KOps/s $\color{#d91a1a}-1.02\%$
test_empty[True] 0.3399ms 0.2887ms 3.4633 KOps/s 3.4778 KOps/s $\color{#d91a1a}-0.42\%$
test_empty[False] 2.5081μs 0.8253μs 1.2116 MOps/s 1.2142 MOps/s $\color{#d91a1a}-0.21\%$
test_to 88.6010μs 58.1021μs 17.2111 KOps/s 16.7917 KOps/s $\color{#35bf28}+2.50\%$
test_to_nonblocking 95.2320μs 52.7808μs 18.9463 KOps/s 19.7328 KOps/s $\color{#d91a1a}-3.99\%$
test_unbind_speed 0.2768ms 0.2388ms 4.1873 KOps/s 4.0172 KOps/s $\color{#35bf28}+4.23\%$
test_unbind_speed_stack0 0.3308ms 0.2347ms 4.2611 KOps/s 4.0693 KOps/s $\color{#35bf28}+4.71\%$
test_unbind_speed_stack1 92.3632ms 0.7408ms 1.3499 KOps/s 1.4514 KOps/s $\textbf{\color{#d91a1a}-6.99\%}$
test_split 93.4082ms 1.5906ms 628.7055 Ops/s 608.6045 Ops/s $\color{#35bf28}+3.30\%$
test_chunk 95.6080ms 1.5906ms 628.7068 Ops/s 609.0793 Ops/s $\color{#35bf28}+3.22\%$
test_consolidate[False-None] 95.5372ms 3.0879ms 323.8476 Ops/s 319.2898 Ops/s $\color{#35bf28}+1.43\%$
test_consolidate[default-None] 1.8645ms 1.7081ms 585.4334 Ops/s 556.8979 Ops/s $\textbf{\color{#35bf28}+5.12\%}$
test_consolidate[reduce-overhead-None] 1.8387ms 1.7358ms 576.1167 Ops/s 550.1754 Ops/s $\color{#35bf28}+4.72\%$
test_consolidate_njt[False-None] 6.9212ms 6.5506ms 152.6589 Ops/s 143.0149 Ops/s $\textbf{\color{#35bf28}+6.74\%}$
test_to[False-False-None] 1.8670ms 1.8246ms 548.0528 Ops/s 552.1201 Ops/s $\color{#d91a1a}-0.74\%$
test_to[True-False-None] 1.8897ms 1.4160ms 706.1919 Ops/s 678.5537 Ops/s $\color{#35bf28}+4.07\%$
test_to[within-False-None] 4.4514ms 4.2973ms 232.7067 Ops/s 224.6861 Ops/s $\color{#35bf28}+3.57\%$
test_to[True-default-None] 5.4509ms 5.1525ms 194.0810 Ops/s 186.6610 Ops/s $\color{#35bf28}+3.98\%$
test_to_njt[False-False-None] 7.1085ms 6.9814ms 143.2375 Ops/s 139.5972 Ops/s $\color{#35bf28}+2.61\%$
test_to_njt[True-False-None] 6.1347ms 5.4218ms 184.4394 Ops/s 179.3348 Ops/s $\color{#35bf28}+2.85\%$
test_to_njt[within-False-None] 12.2843ms 12.1835ms 82.0783 Ops/s 78.9108 Ops/s $\color{#35bf28}+4.01\%$
test_creation[device0] 0.4516ms 79.2217μs 12.6228 KOps/s 11.9644 KOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_creation_from_tensor 0.4936ms 82.6648μs 12.0971 KOps/s 11.8373 KOps/s $\color{#35bf28}+2.19\%$
test_add_one[memmap_tensor0] 0.2926ms 6.6940μs 149.3877 KOps/s 139.4386 KOps/s $\textbf{\color{#35bf28}+7.14\%}$
test_contiguous[memmap_tensor0] 2.1280μs 0.4157μs 2.4058 MOps/s 2.3852 MOps/s $\color{#35bf28}+0.86\%$
test_stack[memmap_tensor0] 29.1810μs 4.4098μs 226.7672 KOps/s 200.5253 KOps/s $\textbf{\color{#35bf28}+13.09\%}$
test_memmaptd_index 1.8445ms 0.2471ms 4.0477 KOps/s 3.9981 KOps/s $\color{#35bf28}+1.24\%$
test_memmaptd_index_astensor 0.4526ms 0.3071ms 3.2565 KOps/s 3.2066 KOps/s $\color{#35bf28}+1.56\%$
test_memmaptd_index_op 0.9704ms 0.5531ms 1.8081 KOps/s 1.7335 KOps/s $\color{#35bf28}+4.30\%$
test_serialize_model 0.1327s 0.1319s 7.5835 Ops/s 7.5769 Ops/s $\color{#35bf28}+0.09\%$
test_serialize_model_pickle 1.3691s 1.2226s 0.8179 Ops/s 0.8208 Ops/s $\color{#d91a1a}-0.35\%$
test_serialize_weights 0.1318s 0.1311s 7.6279 Ops/s 7.6248 Ops/s $\color{#35bf28}+0.04\%$
test_serialize_weights_returnearly 0.3181s 52.4767ms 19.0561 Ops/s 14.6141 Ops/s $\textbf{\color{#35bf28}+30.39\%}$
test_serialize_weights_pickle 1.3788s 1.2228s 0.8178 Ops/s 0.8207 Ops/s $\color{#d91a1a}-0.36\%$
test_reshape_pytree 53.4300μs 22.0584μs 45.3341 KOps/s 43.8571 KOps/s $\color{#35bf28}+3.37\%$
test_reshape_td 51.0400μs 26.5095μs 37.7224 KOps/s 32.9409 KOps/s $\textbf{\color{#35bf28}+14.52\%}$
test_view_pytree 57.1110μs 21.9352μs 45.5889 KOps/s 45.1021 KOps/s $\color{#35bf28}+1.08\%$
test_view_td 61.3710μs 30.7436μs 32.5271 KOps/s 30.2945 KOps/s $\textbf{\color{#35bf28}+7.37\%}$
test_unbind_pytree 60.4010μs 27.8126μs 35.9550 KOps/s 34.5425 KOps/s $\color{#35bf28}+4.09\%$
test_unbind_td 0.6396ms 36.3622μs 27.5011 KOps/s 26.1538 KOps/s $\textbf{\color{#35bf28}+5.15\%}$
test_split_pytree 58.1810μs 29.0915μs 34.3743 KOps/s 33.0376 KOps/s $\color{#35bf28}+4.05\%$
test_split_td 0.8364ms 39.2795μs 25.4586 KOps/s 24.3238 KOps/s $\color{#35bf28}+4.67\%$
test_add_pytree 61.0610μs 34.1721μs 29.2636 KOps/s 27.8268 KOps/s $\textbf{\color{#35bf28}+5.16\%}$
test_add_td 0.2911ms 50.5021μs 19.8012 KOps/s 19.5906 KOps/s $\color{#35bf28}+1.07\%$
test_compile_add_one_nested[tensordict-compile] 0.1765ms 0.1253ms 7.9821 KOps/s 7.6755 KOps/s $\color{#35bf28}+3.99\%$
test_compile_add_one_nested[tensordict-eager] 0.2354ms 0.1412ms 7.0805 KOps/s 6.9555 KOps/s $\color{#35bf28}+1.80\%$
test_compile_add_one_nested[pytree-compile] 0.1413ms 96.7248μs 10.3386 KOps/s 10.0200 KOps/s $\color{#35bf28}+3.18\%$
test_compile_add_one_nested[pytree-eager] 1.6020ms 0.1560ms 6.4105 KOps/s 6.3290 KOps/s $\color{#35bf28}+1.29\%$
test_compile_copy_nested[tensordict-compile] 56.8010μs 24.0279μs 41.6183 KOps/s 40.8085 KOps/s $\color{#35bf28}+1.98\%$
test_compile_copy_nested[tensordict-eager] 70.9210μs 34.7848μs 28.7482 KOps/s 28.4908 KOps/s $\color{#35bf28}+0.90\%$
test_compile_copy_nested[pytree-compile] 0.4007ms 63.7070μs 15.6969 KOps/s 15.3842 KOps/s $\color{#35bf28}+2.03\%$
test_compile_copy_nested[pytree-eager] 94.1720μs 48.4868μs 20.6242 KOps/s 20.3743 KOps/s $\color{#35bf28}+1.23\%$
test_compile_add_one_flat[tensordict-compile] 0.2259ms 0.1429ms 6.9963 KOps/s 6.8070 KOps/s $\color{#35bf28}+2.78\%$
test_compile_add_one_flat[tensordict-eager] 0.3444ms 0.2190ms 4.5662 KOps/s 4.5498 KOps/s $\color{#35bf28}+0.36\%$
test_compile_add_one_flat[tensorclass-compile] 0.1345ms 98.3579μs 10.1670 KOps/s 10.0985 KOps/s $\color{#35bf28}+0.68\%$
test_compile_add_one_flat[tensorclass-eager] 0.4630ms 59.4160μs 16.8305 KOps/s 17.1199 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_add_one_flat[pytree-compile] 0.1887ms 0.1366ms 7.3233 KOps/s 7.1913 KOps/s $\color{#35bf28}+1.84\%$
test_compile_add_one_flat[pytree-eager] 0.9212ms 0.5147ms 1.9429 KOps/s 1.9335 KOps/s $\color{#35bf28}+0.48\%$
test_compile_add_self_flat[tensordict-eager] 0.6757ms 0.2649ms 3.7743 KOps/s 3.7197 KOps/s $\color{#35bf28}+1.47\%$
test_compile_add_self_flat[tensordict-compile] 0.2233ms 0.1454ms 6.8776 KOps/s 6.8620 KOps/s $\color{#35bf28}+0.23\%$
test_compile_add_self_flat[tensorclass-eager] 0.4655ms 70.4718μs 14.1901 KOps/s 14.1611 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_self_flat[tensorclass-compile] 0.4965ms 98.6588μs 10.1359 KOps/s 10.0737 KOps/s $\color{#35bf28}+0.62\%$
test_compile_add_self_flat[pytree-eager] 0.8334ms 0.4355ms 2.2962 KOps/s 2.3179 KOps/s $\color{#d91a1a}-0.94\%$
test_compile_add_self_flat[pytree-compile] 0.5488ms 0.1363ms 7.3369 KOps/s 7.3069 KOps/s $\color{#35bf28}+0.41\%$
test_compile_copy_flat[tensordict-compile] 0.4060ms 18.8402μs 53.0780 KOps/s 53.1823 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_copy_flat[tensordict-eager] 0.4242ms 32.4194μs 30.8458 KOps/s 31.5723 KOps/s $\color{#d91a1a}-2.30\%$
test_compile_copy_flat[pytree-compile] 0.4585ms 69.8271μs 14.3211 KOps/s 14.2278 KOps/s $\color{#35bf28}+0.66\%$
test_compile_copy_flat[pytree-eager] 0.4280ms 52.2281μs 19.1468 KOps/s 19.2702 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_assign_and_add[tensordict-compile] 1.6872ms 0.4037ms 2.4773 KOps/s 2.1902 KOps/s $\textbf{\color{#35bf28}+13.11\%}$
test_compile_assign_and_add[tensordict-eager] 3.0267ms 2.8336ms 352.9126 Ops/s 354.9334 Ops/s $\color{#d91a1a}-0.57\%$
test_compile_assign_and_add[pytree-compile] 1.5873ms 0.4317ms 2.3167 KOps/s 2.2490 KOps/s $\color{#35bf28}+3.01\%$
test_compile_assign_and_add[pytree-eager] 2.9841ms 2.7520ms 363.3689 Ops/s 357.4454 Ops/s $\color{#35bf28}+1.66\%$
test_compile_indexing[tensor-tensordict-compile] 0.2244ms 0.1159ms 8.6310 KOps/s 8.6988 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_indexing[tensor-tensordict-eager] 0.5497ms 84.7561μs 11.7986 KOps/s 10.9494 KOps/s $\textbf{\color{#35bf28}+7.76\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.1594ms 0.1093ms 9.1500 KOps/s 8.7384 KOps/s $\color{#35bf28}+4.71\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1170ms 70.8673μs 14.1109 KOps/s 13.6309 KOps/s $\color{#35bf28}+3.52\%$
test_compile_indexing[tensor-pytree-compile] 0.1549ms 0.1164ms 8.5913 KOps/s 8.9016 KOps/s $\color{#d91a1a}-3.49\%$
test_compile_indexing[tensor-pytree-eager] 0.1164ms 74.1691μs 13.4827 KOps/s 13.5082 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_indexing[slice-tensordict-compile] 0.1440ms 0.1000ms 9.9972 KOps/s 9.5567 KOps/s $\color{#35bf28}+4.61\%$
test_compile_indexing[slice-tensordict-eager] 0.1458ms 18.8053μs 53.1766 KOps/s 47.7043 KOps/s $\textbf{\color{#35bf28}+11.47\%}$
test_compile_indexing[slice-tensorclass-compile] 0.2171ms 98.8559μs 10.1157 KOps/s 10.1360 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[slice-tensorclass-eager] 51.9900μs 15.4621μs 64.6743 KOps/s 61.6932 KOps/s $\color{#35bf28}+4.83\%$
test_compile_indexing[slice-pytree-compile] 0.1435ms 97.2555μs 10.2822 KOps/s 9.5798 KOps/s $\textbf{\color{#35bf28}+7.33\%}$
test_compile_indexing[slice-pytree-eager] 49.2000μs 15.4176μs 64.8610 KOps/s 61.5041 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_compile_indexing[int-tensordict-compile] 0.2195ms 0.1060ms 9.4370 KOps/s 9.6086 KOps/s $\color{#d91a1a}-1.79\%$
test_compile_indexing[int-tensordict-eager] 0.6137ms 18.3699μs 54.4368 KOps/s 48.2289 KOps/s $\textbf{\color{#35bf28}+12.87\%}$
test_compile_indexing[int-tensorclass-compile] 0.1580ms 0.1019ms 9.8154 KOps/s 10.1287 KOps/s $\color{#d91a1a}-3.09\%$
test_compile_indexing[int-tensorclass-eager] 47.8200μs 15.5657μs 64.2437 KOps/s 61.2530 KOps/s $\color{#35bf28}+4.88\%$
test_compile_indexing[int-pytree-compile] 0.1436ms 96.7212μs 10.3390 KOps/s 9.6218 KOps/s $\textbf{\color{#35bf28}+7.45\%}$
test_compile_indexing[int-pytree-eager] 49.9310μs 15.4482μs 64.7324 KOps/s 49.7934 KOps/s $\textbf{\color{#35bf28}+30.00\%}$
test_mod_add[eager] 78.9110μs 38.1452μs 26.2156 KOps/s 25.1414 KOps/s $\color{#35bf28}+4.27\%$
test_mod_add[compile] 0.1310ms 81.5397μs 12.2640 KOps/s 11.7916 KOps/s $\color{#35bf28}+4.01\%$
test_mod_add[compile-overhead] 0.3571ms 0.1750ms 5.7127 KOps/s 5.4657 KOps/s $\color{#35bf28}+4.52\%$
test_mod_wrap[eager] 0.3296ms 0.2505ms 3.9914 KOps/s 3.7995 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_mod_wrap[compile] 0.4037ms 0.2887ms 3.4635 KOps/s 3.2212 KOps/s $\textbf{\color{#35bf28}+7.52\%}$
test_mod_wrap[compile-overhead] 6.7376ms 3.6639ms 272.9315 Ops/s 257.5377 Ops/s $\textbf{\color{#35bf28}+5.98\%}$
test_mod_wrap_and_backward[eager] 1.8067ms 1.3960ms 716.3257 Ops/s 674.0374 Ops/s $\textbf{\color{#35bf28}+6.27\%}$
test_mod_wrap_and_backward[compile] 1.4031ms 1.2841ms 778.7458 Ops/s 698.4282 Ops/s $\textbf{\color{#35bf28}+11.50\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4140ms 0.9345ms 1.0701 KOps/s 946.8949 Ops/s $\textbf{\color{#35bf28}+13.01\%}$
test_seq_add[eager] 0.3411ms 0.1259ms 7.9436 KOps/s 7.5317 KOps/s $\textbf{\color{#35bf28}+5.47\%}$
test_seq_add[compile] 0.1473ms 88.4184μs 11.3099 KOps/s 10.7102 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_seq_add[compile-overhead] 0.5384ms 0.1317ms 7.5946 KOps/s 7.4707 KOps/s $\color{#35bf28}+1.66\%$
test_seq_wrap[eager] 1.2787ms 0.4307ms 2.3219 KOps/s 2.2479 KOps/s $\color{#35bf28}+3.30\%$
test_seq_wrap[compile] 1.1615ms 0.3068ms 3.2597 KOps/s 3.0192 KOps/s $\textbf{\color{#35bf28}+7.96\%}$
test_seq_wrap[compile-overhead] 0.6431ms 0.2280ms 4.3867 KOps/s 4.2535 KOps/s $\color{#35bf28}+3.13\%$
test_func_call_runtime[False-eager] 1.1311ms 0.7320ms 1.3660 KOps/s 1.3055 KOps/s $\color{#35bf28}+4.64\%$
test_func_call_runtime[False-compile] 1.1514ms 0.7457ms 1.3410 KOps/s 1.2802 KOps/s $\color{#35bf28}+4.75\%$
test_func_call_runtime[False-compile-overhead] 0.4235ms 0.3642ms 2.7460 KOps/s 2.6586 KOps/s $\color{#35bf28}+3.29\%$
test_func_call_runtime[True-eager] 1.2945ms 0.8972ms 1.1146 KOps/s 1.0653 KOps/s $\color{#35bf28}+4.63\%$
test_func_call_runtime[True-compile] 0.9677ms 0.7856ms 1.2729 KOps/s 1.2498 KOps/s $\color{#35bf28}+1.85\%$
test_func_call_runtime[True-compile-overhead] 0.4490ms 0.3903ms 2.5619 KOps/s 2.5359 KOps/s $\color{#35bf28}+1.02\%$
test_func_call_cm_runtime[False-eager] 0.8089ms 0.7307ms 1.3686 KOps/s 1.2491 KOps/s $\textbf{\color{#35bf28}+9.56\%}$
test_func_call_cm_runtime[False-compile] 0.8292ms 0.7579ms 1.3194 KOps/s 1.2984 KOps/s $\color{#35bf28}+1.62\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4400ms 0.3695ms 2.7066 KOps/s 2.6717 KOps/s $\color{#35bf28}+1.31\%$
test_func_call_cm_runtime[True-eager] 1.2840ms 1.0166ms 983.6331 Ops/s 956.4370 Ops/s $\color{#35bf28}+2.84\%$
test_func_call_cm_runtime[True-compile] 1.1560ms 0.9953ms 1.0047 KOps/s 965.2589 Ops/s $\color{#35bf28}+4.09\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0690ms 1.0026ms 997.3785 Ops/s 967.0912 Ops/s $\color{#35bf28}+3.13\%$
test_vmap_func_call_cm_runtime[eager] 2.7886ms 2.1360ms 468.1540 Ops/s 463.9128 Ops/s $\color{#35bf28}+0.91\%$
test_vmap_func_call_cm_runtime[compile] 0.8681ms 0.8205ms 1.2188 KOps/s 1.1663 KOps/s $\color{#35bf28}+4.50\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4877ms 0.4198ms 2.3823 KOps/s 2.3326 KOps/s $\color{#35bf28}+2.13\%$
test_distributed 3.0044ms 0.2296ms 4.3554 KOps/s 8.6291 KOps/s $\textbf{\color{#d91a1a}-49.53\%}$
test_tdmodule 29.1700μs 20.0602μs 49.8499 KOps/s 45.9549 KOps/s $\textbf{\color{#35bf28}+8.48\%}$
test_tdmodule_dispatch 59.7910μs 37.8259μs 26.4369 KOps/s 25.6058 KOps/s $\color{#35bf28}+3.25\%$
test_tdseq 39.3100μs 20.2498μs 49.3831 KOps/s 49.1529 KOps/s $\color{#35bf28}+0.47\%$
test_tdseq_dispatch 60.1510μs 39.4901μs 25.3228 KOps/s 24.6674 KOps/s $\color{#35bf28}+2.66\%$
test_instantiation_functorch 1.6343ms 1.5279ms 654.5097 Ops/s 634.7379 Ops/s $\color{#35bf28}+3.11\%$
test_exec_functorch 0.3143ms 0.1456ms 6.8665 KOps/s 6.7794 KOps/s $\color{#35bf28}+1.29\%$
test_exec_functional_call 0.2252ms 0.1381ms 7.2425 KOps/s 6.8957 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_exec_td_decorator 0.3806ms 0.1881ms 5.3164 KOps/s 5.1535 KOps/s $\color{#35bf28}+3.16\%$
test_vmap_mlp_speed_decorator[True-True] 0.9081ms 0.6934ms 1.4422 KOps/s 1.4147 KOps/s $\color{#35bf28}+1.94\%$
test_vmap_mlp_speed_decorator[True-False] 0.8860ms 0.6925ms 1.4441 KOps/s 1.4200 KOps/s $\color{#35bf28}+1.70\%$
test_vmap_mlp_speed_decorator[False-True] 0.7639ms 0.6056ms 1.6513 KOps/s 1.6418 KOps/s $\color{#35bf28}+0.58\%$
test_vmap_mlp_speed_decorator[False-False] 0.7654ms 0.5997ms 1.6674 KOps/s 1.5740 KOps/s $\textbf{\color{#35bf28}+5.93\%}$
test_vmap_transformer_speed_decorator[True-True] 20.0506ms 19.3666ms 51.6352 Ops/s 50.6284 Ops/s $\color{#35bf28}+1.99\%$
test_vmap_transformer_speed_decorator[True-False] 20.0455ms 19.3405ms 51.7050 Ops/s 50.8870 Ops/s $\color{#35bf28}+1.61\%$
test_vmap_transformer_speed_decorator[False-True] 19.8239ms 19.2454ms 51.9604 Ops/s 51.3260 Ops/s $\color{#35bf28}+1.24\%$
test_vmap_transformer_speed_decorator[False-False] 19.3724ms 19.2842ms 51.8558 Ops/s 51.1606 Ops/s $\color{#35bf28}+1.36\%$
test_to_module_speed[True] 1.3120ms 0.9623ms 1.0392 KOps/s 1.0378 KOps/s $\color{#35bf28}+0.13\%$
test_to_module_speed[False] 1.3988ms 0.9600ms 1.0417 KOps/s 1.0480 KOps/s $\color{#d91a1a}-0.60\%$
test_tc_init 0.1422ms 34.6754μs 28.8389 KOps/s 28.9461 KOps/s $\color{#d91a1a}-0.37\%$
test_tc_init_tensor_only 0.1044ms 10.7100μs 93.3710 KOps/s 93.8602 KOps/s $\color{#d91a1a}-0.52\%$
test_tc_init_nested 0.1752ms 68.5711μs 14.5834 KOps/s 15.0598 KOps/s $\color{#d91a1a}-3.16\%$
test_tc_first_layer_tensor 5.7368μs 0.8121μs 1.2313 MOps/s 1.1024 MOps/s $\textbf{\color{#35bf28}+11.69\%}$
test_tc_first_layer_tensor_only 2.1355μs 0.4338μs 2.3053 MOps/s 2.4159 MOps/s $\color{#d91a1a}-4.58\%$
test_tc_first_layer_tensor_set 31.9610μs 2.8961μs 345.2871 KOps/s 344.5037 KOps/s $\color{#35bf28}+0.23\%$
test_tc_first_layer_tensor_only_set 11.1767μs 1.7939μs 557.4418 KOps/s 564.0769 KOps/s $\color{#d91a1a}-1.18\%$
test_tc_first_layer_nontensor 21.8100μs 2.3440μs 426.6241 KOps/s 429.7601 KOps/s $\color{#d91a1a}-0.73\%$
test_tc_second_layer_tensor 21.9700μs 1.7508μs 571.1562 KOps/s 576.2883 KOps/s $\color{#d91a1a}-0.89\%$
test_tc_second_layer_nontensor 46.1710μs 3.2106μs 311.4664 KOps/s 317.5092 KOps/s $\color{#d91a1a}-1.90\%$
test_unbind 0.2305s 10.7537ms 92.9911 Ops/s 77.4195 Ops/s $\textbf{\color{#35bf28}+20.11\%}$
test_full_like 7.7238ms 4.3483ms 229.9745 Ops/s 134.2820 Ops/s $\textbf{\color{#35bf28}+71.26\%}$
test_zeros_like 4.8212ms 4.3189ms 231.5382 Ops/s 230.8120 Ops/s $\color{#35bf28}+0.31\%$
test_ones_like 11.5435ms 4.3781ms 228.4092 Ops/s 137.4456 Ops/s $\textbf{\color{#35bf28}+66.18\%}$
test_clone 6.4904ms 6.3837ms 156.6483 Ops/s 155.3555 Ops/s $\color{#35bf28}+0.83\%$
test_squeeze 87.5310μs 10.3760μs 96.3758 KOps/s 97.0328 KOps/s $\color{#d91a1a}-0.68\%$
test_unsqueeze 0.1345ms 75.8766μs 13.1793 KOps/s 13.3075 KOps/s $\color{#d91a1a}-0.96\%$
test_split 0.4390ms 0.1655ms 6.0423 KOps/s 6.1457 KOps/s $\color{#d91a1a}-1.68\%$
test_permute 0.2508ms 0.1896ms 5.2746 KOps/s 5.3762 KOps/s $\color{#d91a1a}-1.89\%$
test_stack 51.1073ms 50.4997ms 19.8021 Ops/s 19.7504 Ops/s $\color{#35bf28}+0.26\%$
test_cat 50.8560ms 50.5039ms 19.8005 Ops/s 19.7981 Ops/s $\color{#35bf28}+0.01\%$
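
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops and Ops on Repo HEAD columns, and the bold entries match the Improved/Worsened counts in the summary if a cutoff of about 5% is assumed. The sketch below reproduces a few rows under those assumptions. The function names and the 5% threshold are illustrative and inferred from the numbers above, not taken from the benchmark tooling itself.

```python
# Sketch: recompute the "Change" column from the two throughput columns.
# Assumption: change = (Ops / Ops on Repo HEAD - 1) * 100, inferred from the table.
# Assumption: a row counts as improved/worsened when |change| >= 5% (hypothetical cutoff).

UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops(value: float, unit: str) -> float:
    """Normalize a throughput value to plain operations per second."""
    return value * UNIT_SCALE[unit]


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr / ops_head - 1.0) * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row using the assumed 5% significance cutoff."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


# (Ops, unit) for the PR and for repo HEAD, copied from the table above.
rows = {
    "test_getitem[int]": ((95.6377, "KOps/s"), (63.6525, "KOps/s")),
    "test_distributed": ((4.3554, "KOps/s"), (8.6291, "KOps/s")),
    "test_mod_wrap_and_backward[compile-overhead]": ((1.0701, "KOps/s"), (946.8949, "Ops/s")),
    "test_keys_nested_leaf": ((12.6689, "KOps/s"), (12.6692, "KOps/s")),
}

for name, (pr, head) in rows.items():
    change = relative_change(to_ops(*pr), to_ops(*head))
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Note that some rows mix units (KOps/s against Ops/s), so values need to be normalized before dividing. Under the assumed cutoff, the only rows that classify as worsened are test_unbind_speed_stack1 and test_distributed, which is consistent with "Worsened: 2" in the summary.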

Labels: CLA Signed, setup
2 participants