Releases · flashinfer-ai/flashinfer

09 Aug 04:51

cyx-6

v0.2.11

fc88829

v0.2.11 Pre-release

Pre-release

What's Changed

Fix flag order by @nandor in #1392
Add flags to trim down AoT builds by @nandor in #1393
Force upgrade cuDNN to latest by @paul841029 in #1401
Adding FP8 benchmark on attention and matmul testing by @bkryu in #1390
feature: enable cublas for fp4 gemm when cudnn == 9.11.1 or >= 9.13 by @ttyio in #1405
Relax the clear_cuda_cache by @yongwww in #1406
Update autotune results for the nvfp4 cutlass moe backends for v0.2.9 by @kaixih in #1361
fix shared memory alignment conflict in sampling.cuh by @842974287 in #1402
Fix trtllm moe launcher local_num_experts by @wenscarl in #1398
[bugfix] Fix compilation failure when compiling csrc/trtllm_moe_allreduce_fusion.cu by @nvpohanh in #1410
install: remove nvidia-cudnn-12 from package dependency by @yzh119 in #1409
Add mypy to pre-commit by @cyx-6 in #1179
feat(aot): add nvshmem module for aot compilation by @EmilienM in #1261
Add ruff to pre-commit by @cyx-6 in #1201
install: remove nvidia-nvshmem-cu12 from package dependency by @EmilienM in #1426
Fix redundant kernels in moe by @fzyzcjy in #1428
ci: add arm64 to release-ci-docker.yml by @yzh119 in #1429
Fix crash when pos_encoding_mode is passed as int by @kaixih in #1413
Fix trtllm_ar failure by @nvpohanh in #1423
Use self hosted runner for arm image build by @yongwww in #1433
Remote const qualifier to avoid compilation error by @842974287 in #1421
Add multi-arch Docker image for x86-64 and arm64 by @yongwww in #1431
Add NOTICE with copyrights by @sricketts in #1432
Fix FusedMoeRunner does not exist error by @nvpohanh in #1424
Putting back cudnn_batch_prefill_with_kv_cache that was deleted by ruff by @bkryu in #1438
Decouple cutlass config version from flashinfer version by @kaixih in #1441
feat: Fused rope fp8 quantize kernel for MLA by @yzh119 in #1339

New Contributors

@paul841029 made their first contribution in #1401
@842974287 made their first contribution in #1402
@fzyzcjy made their first contribution in #1428
@sricketts made their first contribution in #1432

Full Changelog: v0.2.10...v0.2.11

Contributors

ttyio, nandor, and 12 other contributors

Assets 2

05 Aug 17:45

yzh119

v0.2.10

7c79b41

v0.2.10 Latest

Latest

What's Changed

GPT-OSS Support: Add Blackwell MoE mxfp4 implementation from TRTLLM and Attention Sink by @joker-eph in #1389
release: bump version to v0.2.10 by @yzh119 in #1391

Full Changelog: v0.2.9...v0.2.10

Contributors

joker-eph and yzh119

Assets 2

05 Aug 00:37

yzh119

v0.2.9

9158fef

v0.2.9

What's Changed

Reduce the JIT compilation time of gen_gemm_sm100_module by @jinyangyuan-nvidia in #1251
fix: correctly pass k_scale and v_scale to run() in forward_return_lse (#1023) by @vlev02 in #1254
Made AR output optional + esthetic changes by @nvmbreughe in #1265
init add gemm fp8 using cudnn backend by @ttyio in #1264
Feature/sm100 low latency nvfp4 kernels by @azhurkevich in #1214
CI: install nvidia-nvshmem-cu12 by @EmilienM in #1262
feat: enable trtllm-gen mla MTP by @yyihuang in #1258
Add trtllm-gen attention mha kernel with FP8 Q/K/V and FP8 output by @weireweire in #1242
add trtllm-gen context attention by @IwakuraRein in #1239
feat: add masked deepgemm support and benchmarking by @cyx-6 in #1266
Add missing import in comm/init,py by @joker-eph in #1275
hotfix: fix deepgemm artifactory hash by @cyx-6 in #1278
Unify groupwise fp8 GEMM test by @cyx-6 in #1281
fix: update trtllm-gen fmha benchmark by @yyihuang in #1280
fix multiCtasKvScratchPtr misalignment issue (new one) by @nvpohanh in #1286
Fix install folder regression, and JIT-vs-AOT differences by @directhex in #1279
Add shuffle matrix flag by @aleozlx in #1272
Convert scale_factor from scalar to Tensor in trt_allreduce_fusion by @ilmarkov in #1284
patch error handling by @aleozlx in #1293
Bug fix: guard fp8 e8m0 and e2m1 compile by @Edenzzzz in #1287
refactor: Improved metainfo for trtllm-gen fmha by @cyx-6 in #1292
add mm_fp4 use cudnn backend by @ttyio in #1288
fix: minor errors in cubin loader by @yyihuang in #1295
perfix: use lightweight API to query device property by @azhurkevich in #1298
refactor: refactor trtllm-gen attention kernel integration code by @yzh119 in #1289
Remove FAST_BUILD FLAG for MOE by @wenscarl in #1291
bugfix: ensure graph is captured and executed on the same stream to avoid rep… by @elfiegg in #1303
minor: some fix and cleanup for trtllm-gen mha by @yyihuang in #1302
[Feature] SM level profiler by @Edenzzzz in #1305
Heuristics + testing unification + CUDA Graphs by @azhurkevich in #1306
Update cutlass fp4 moe kernels by @wenscarl in #1294
Fix the bug of the kernel-selection heuristic in trtllm-gen by @PerkzZheng in #1307
test qkvo quantization not equal to 1. by @weireweire in #1314
[fix] fix integer overflow in FA2 customized_mask & add buffer overflow warning. by @happierpig in #1290
Addition of flashinfer_benchmark.py for benchmarking routines by @bkryu in #1323
minor: update devcontainer by @yyihuang in #1329
Fix redundant argument in TrtllmGenDecodeModule by @IwakuraRein in #1326
Optimizations for TRTLLM MNNVL Allreduce by @timlee0212 in #1321
add torch float4_e2m1fn_x2 check for cudnn fp4 backend by @ttyio in #1333
only add cudnn dependency for x86 platform by @ttyio in #1332
Make Fp8 MoE routing_bias optional by @aleozlx in #1319
feat: Add weight layout option for trtllm-gen fused moe by @aleozlx in #1297
[Fix] remove torch 2.8 requirement for FP4 GEMM by @elfiegg in #1334
Bug fix: fix duplicate launch in POD by @Edenzzzz in #1267
Add blockwise-scaled FP8 GEMM via TRTLLM-Gen. by @sergachev in #1320
feat: support output nvfp4 in trtllm-gen function call. by @weireweire in #1318
Fix bench deepgemm setting by @cyx-6 in #1344
fix: fix trtllm-gen mla error on new interface by @yyihuang in #1348
[Bugfix] Change max_size for LRU by @elfiegg in #1349
Support loading autotuned results from json for cutlass fp4 moe backends by @kaixih in #1310
Refactor scripts in benchmarks to use flasinfer.testing.bench_gpu_time by @bkryu in #1337
bugfix: Change default index in routingTopKExperts by @amirkl94 in #1347
Support passing kv_data_type to MultiLevelCascadeAttentionWrapper.plan() by @sarckk in #1350
Add trtllm-gen prefill test. Fix related wrapper issue. by @weireweire in #1346
feat: Support logits_soft_cap for Persistent attn; fix kv split limit by @Edenzzzz in #1324
chore: remove cpp benchmarks, tests, cmake path, as they are deprecated by @hypdeb in #1345
minor: add trtllm_gen_mla benchmark by @yyihuang in #1316
cleanup: retire aot-build-utils by @yzh119 in #1354
minor: more informative error message for buffer overflow by @Edenzzzz in #1357
gen_trtllm_comm_module: fix device capability detection by @dtrifiro in #1356
Refactor Fused Moe Module by @wenscarl in #1309
Add native cudnn_decode for improved cudnn decode performance by @Anerudhan in #1283
Update CI docker container to use latest cudnn by @yzh119 in #1362
feature: add fp4 mm using trtllm backend by @ttyio in #1355
support trtllm-gen prefill fp4 output by @weireweire in #1360
Allow cudnn prefill kernels to be called natively by @Anerudhan in #1317
bugfix: fix ci for aot-compile by @yzh119 in #1364
feat: auto deduce use_oneshot from token_num in all-reduce by @yyihuang in #1365
add cutlass backend for mm_fp4 by @ttyio in #1296
Support scale factor start index for fp4 mha prefill/decode by @weireweire in #1363
test: add cuda graph to comm test by @yyihuang in #1366
ci: add requests to ci docker container by @yzh119 in #1370
Artifact downloading and single sourced artifact path by @cyx-6 in #1369
[fix] remove (view) transpose to keep consistent with majorness MN requirement. by @elfiegg in #1358
hotfix: update mxfp4 groupwise-scaled gemm unittests by @yzh119 in #1359
bugfix: fixed cutlass fused moe usage of FP4QuantizationSFLayout::SWIZZLED by @yzh119 in #1371
ci: add blackwell unittest scripts by @yzh119 in #1372
Update documentation index by @cyx-6 in #1374
bugfix: do cudnn related error check only when cudnn backend is enabled. by @ttyio in #1377
bugfix: Add guard for fp4/fp8 related include headers by @yzh119 in #1376
refactor: download trtllm gemm metadata from server by @ttyio in #1378
Fix sphinx error by @cyx-6 in #1380
release: bump version to v0.2.9 by @yzh119 in #1381

New Contributors

...

Contributors

directhex, ttyio, and 28 other contributors

Assets 2

27 Jul 05:18

yzh119

v0.2.9rc2

cf39366

v0.2.9rc2 Pre-release

Pre-release

What's Changed

Reduce the JIT compilation time of gen_gemm_sm100_module by @jinyangyuan-nvidia in #1251
fix: correctly pass k_scale and v_scale to run() in forward_return_lse (#1023) by @vlev02 in #1254
Made AR output optional + esthetic changes by @nvmbreughe in #1265
init add gemm fp8 using cudnn backend by @ttyio in #1264
Feature/sm100 low latency nvfp4 kernels by @azhurkevich in #1214
CI: install nvidia-nvshmem-cu12 by @EmilienM in #1262
feat: enable trtllm-gen mla MTP by @yyihuang in #1258
Add trtllm-gen attention mha kernel with FP8 Q/K/V and FP8 output by @weireweire in #1242
add trtllm-gen context attention by @IwakuraRein in #1239
feat: add masked deepgemm support and benchmarking by @cyx-6 in #1266
Add missing import in comm/init,py by @joker-eph in #1275
hotfix: fix deepgemm artifactory hash by @cyx-6 in #1278
Unify groupwise fp8 GEMM test by @cyx-6 in #1281
fix: update trtllm-gen fmha benchmark by @yyihuang in #1280
fix multiCtasKvScratchPtr misalignment issue (new one) by @nvpohanh in #1286
Fix install folder regression, and JIT-vs-AOT differences by @directhex in #1279
Add shuffle matrix flag by @aleozlx in #1272
Convert scale_factor from scalar to Tensor in trt_allreduce_fusion by @ilmarkov in #1284
patch error handling by @aleozlx in #1293
Bug fix: guard fp8 e8m0 and e2m1 compile by @Edenzzzz in #1287
refactor: Improved metainfo for trtllm-gen fmha by @cyx-6 in #1292
add mm_fp4 use cudnn backend by @ttyio in #1288
fix: minor errors in cubin loader by @yyihuang in #1295
perfix: use lightweight API to query device property by @azhurkevich in #1298
refactor: refactor trtllm-gen attention kernel integration code by @yzh119 in #1289
Remove FAST_BUILD FLAG for MOE by @wenscarl in #1291
bugfix: ensure graph is captured and executed on the same stream to avoid rep… by @elfiegg in #1303
minor: some fix and cleanup for trtllm-gen mha by @yyihuang in #1302
[Feature] SM level profiler by @Edenzzzz in #1305
Heuristics + testing unification + CUDA Graphs by @azhurkevich in #1306
Update cutlass fp4 moe kernels by @wenscarl in #1294
Fix the bug of the kernel-selection heuristic in trtllm-gen by @PerkzZheng in #1307
test qkvo quantization not equal to 1. by @weireweire in #1314
[fix] fix integer overflow in FA2 customized_mask & add buffer overflow warning. by @happierpig in #1290
Addition of flashinfer_benchmark.py for benchmarking routines by @bkryu in #1323
minor: update devcontainer by @yyihuang in #1329
Fix redundant argument in TrtllmGenDecodeModule by @IwakuraRein in #1326
Optimizations for TRTLLM MNNVL Allreduce by @timlee0212 in #1321
add torch float4_e2m1fn_x2 check for cudnn fp4 backend by @ttyio in #1333
only add cudnn dependency for x86 platform by @ttyio in #1332
Make Fp8 MoE routing_bias optional by @aleozlx in #1319
feat: Add weight layout option for trtllm-gen fused moe by @aleozlx in #1297
[Fix] remove torch 2.8 requirement for FP4 GEMM by @elfiegg in #1334
Bug fix: fix duplicate launch in POD by @Edenzzzz in #1267

New Contributors

@vlev02 made their first contribution in #1254
@ttyio made their first contribution in #1264
@azhurkevich made their first contribution in #1214
@weireweire made their first contribution in #1242
@IwakuraRein made their first contribution in #1239
@nvpohanh made their first contribution in #1286
@directhex made their first contribution in #1279
@ilmarkov made their first contribution in #1284
@elfiegg made their first contribution in #1303
@PerkzZheng made their first contribution in #1307
@bkryu made their first contribution in #1323
@timlee0212 made their first contribution in #1321

Full Changelog: v0.2.8...v0.2.9rc2

Contributors

directhex, ttyio, and 21 other contributors

Assets 2

23 Jul 08:01

yzh119

v0.2.9rc1

d3d76b7

v0.2.9rc1 Pre-release

Pre-release

What's Changed

Reduce the JIT compilation time of gen_gemm_sm100_module by @jinyangyuan-nvidia in #1251
fix: correctly pass k_scale and v_scale to run() in forward_return_lse (#1023) by @vlev02 in #1254
Made AR output optional + esthetic changes by @nvmbreughe in #1265
init add gemm fp8 using cudnn backend by @ttyio in #1264
Feature/sm100 low latency nvfp4 kernels by @azhurkevich in #1214
CI: install nvidia-nvshmem-cu12 by @EmilienM in #1262
feat: enable trtllm-gen mla MTP by @yyihuang in #1258
Add trtllm-gen attention mha kernel with FP8 Q/K/V and FP8 output by @weireweire in #1242
add trtllm-gen context attention by @IwakuraRein in #1239
feat: add masked deepgemm support and benchmarking by @cyx-6 in #1266
Add missing import in comm/init,py by @joker-eph in #1275
hotfix: fix deepgemm artifactory hash by @cyx-6 in #1278
Unify groupwise fp8 GEMM test by @cyx-6 in #1281
fix: update trtllm-gen fmha benchmark by @yyihuang in #1280
fix multiCtasKvScratchPtr misalignment issue (new one) by @nvpohanh in #1286
Fix install folder regression, and JIT-vs-AOT differences by @directhex in #1279
Add shuffle matrix flag by @aleozlx in #1272
Convert scale_factor from scalar to Tensor in trt_allreduce_fusion by @ilmarkov in #1284
patch error handling by @aleozlx in #1293
Bug fix: guard fp8 e8m0 and e2m1 compile by @Edenzzzz in #1287
refactor: Improved metainfo for trtllm-gen fmha by @cyx-6 in #1292
add mm_fp4 use cudnn backend by @ttyio in #1288
fix: minor errors in cubin loader by @yyihuang in #1295
perfix: use lightweight API to query device property by @azhurkevich in #1298
refactor: refactor trtllm-gen attention kernel integration code by @yzh119 in #1289
Remove FAST_BUILD FLAG for MOE by @wenscarl in #1291
bugfix: ensure graph is captured and executed on the same stream to avoid rep… by @elfiegg in #1303
minor: some fix and cleanup for trtllm-gen mha by @yyihuang in #1302
[Feature] SM level profiler by @Edenzzzz in #1305
Heuristics + testing unification + CUDA Graphs by @azhurkevich in #1306
Update cutlass fp4 moe kernels by @wenscarl in #1294

New Contributors

@vlev02 made their first contribution in #1254
@ttyio made their first contribution in #1264
@azhurkevich made their first contribution in #1214
@weireweire made their first contribution in #1242
@IwakuraRein made their first contribution in #1239
@nvpohanh made their first contribution in #1286
@directhex made their first contribution in #1279
@ilmarkov made their first contribution in #1284
@elfiegg made their first contribution in #1303

Full Changelog: v0.2.8...v0.2.9rc1

Contributors

directhex, ttyio, and 17 other contributors

Assets 2

15 Jul 06:52

yzh119

v0.2.8

3f8317c

v0.2.8

What's Changed

[fix] fix BatchAttention CTA_TILE_KV mask issue by @happierpig in #1206
feat: enable and update all-reduce fused quantization by @yyihuang in #1164
Fix the issue with auxillary kernel launch and grid dim calculation by @Anerudhan in #1208
Fix test_groupwise_scaled_gemm_fp8.py by @jinyangyuan-nvidia in #1211
[TVM] Remove enable_pdl from TVM binding interface by @MasterJH5574 in #1217
misc: minor adds in readme by @yyihuang in #1218
bugfix: fix blackwell fmha hanging issue for empty kv_len by @yzh119 in #1198
update trtllm-gen decode attention kernel launcher by @wenscarl in #1189
Handle allocation cutlass fused MoE output to caller by @wenscarl in #1225
Fix missing hash in the cudnn cubin path by @Anerudhan in #1227
bugfix: add logits processor to pyproject.toml by @yzh119 in #1224
fix: add trtllm-allreduce-fusion api notes and fix memory error by @yyihuang in #1229
feat: Add non-causal cudnn prefill kernels by @Anerudhan in #1230
minor: update oneshot handling, add params notes by @yyihuang in #1232
Enable cudnn decode and add tests for the cudnn decode kernel by @Anerudhan in #1221
docker: add cuda-python to CI docker image by @yzh119 in #1233
bugfix: Fix building without get_requires*() invocation by @mgorny in #1226
bugfix: support uint8_t for vec_t class template by @chenyang78 in #1234
feat: trtllm-gen fp8 moe kernels by @aleozlx in #1212
Patch fp8 cubin availability by @aleozlx in #1240
[comm] TRT-LLM's Multi-Node NVLink All-Reduce Kernel by @nvmbreughe in #1213
feat: Support MXFP8 x MXFP4 CUTLASS grouped GEMM by @jinyangyuan-nvidia in #1241
feat: add trtllm-gen mla cubin by @yyihuang in #1222
Add DeepGEMM kernels by @cyx-6 in #1209
Remove sm100+ requirment for trtllm allreduce kernels by @yzh119 in #1249
Defer mpi import for comm module by @yzh119 in #1250
feat: support environment variable overrides for NVSHMEM paths and linker flags by @EmilienM in #1253
release: bump version to v0.2.8 by @yzh119 in #1257
TRT-LLM's Multi-Node NVLink AR + fused RMSNorm kernel by @nvmbreughe in #1255

New Contributors

@jinyangyuan-nvidia made their first contribution in #1211
@mgorny made their first contribution in #1226
@chenyang78 made their first contribution in #1234
@aleozlx made their first contribution in #1212
@nvmbreughe made their first contribution in #1213
@EmilienM made their first contribution in #1253

Full Changelog: v0.2.7.post1...v0.2.8

Contributors

mgorny, chenyang78, and 11 other contributors

Assets 2

08 Jul 18:30

yzh119

v0.2.8rc1

728e8bb

v0.2.8rc1 Pre-release

Pre-release

What's Changed

[fix] fix BatchAttention CTA_TILE_KV mask issue by @happierpig in #1206
feat: enable and update all-reduce fused quantization by @yyihuang in #1164
Fix the issue with auxillary kernel launch and grid dim calculation by @Anerudhan in #1208
Fix test_groupwise_scaled_gemm_fp8.py by @jinyangyuan-nvidia in #1211
[TVM] Remove enable_pdl from TVM binding interface by @MasterJH5574 in #1217
misc: minor adds in readme by @yyihuang in #1218
bugfix: fix blackwell fmha hanging issue for empty kv_len by @yzh119 in #1198
update trtllm-gen decode attention kernel launcher by @wenscarl in #1189
Handle allocation cutlass fused MoE output to caller by @wenscarl in #1225
Fix missing hash in the cudnn cubin path by @Anerudhan in #1227
bugfix: add logits processor to pyproject.toml by @yzh119 in #1224
fix: add trtllm-allreduce-fusion api notes and fix memory error by @yyihuang in #1229
feat: Add non-causal cudnn prefill kernels by @Anerudhan in #1230
minor: update oneshot handling, add params notes by @yyihuang in #1232
Enable cudnn decode and add tests for the cudnn decode kernel by @Anerudhan in #1221
docker: add cuda-python to CI docker image by @yzh119 in #1233
bugfix: Fix building without get_requires*() invocation by @mgorny in #1226
bugfix: support uint8_t for vec_t class template by @chenyang78 in #1234

New Contributors

@jinyangyuan-nvidia made their first contribution in #1211
@mgorny made their first contribution in #1226
@chenyang78 made their first contribution in #1234

Full Changelog: v0.2.7.post1...v0.2.8rc1

Contributors

mgorny, chenyang78, and 7 other contributors

Assets 2

01 Jul 18:14

yzh119

v0.2.7.post1

3fb73b3

v0.2.7.post1

What's Changed

[feat] optimize persistent batch attention perf. by @happierpig in #1200
Feature/cudnn dynamic cubin by @Anerudhan in #1187
Fix flashinfer.comm module missing by @BBuf in #1203
chore: bump flashinfer v0.2.7.post1 by @zhyncs in #1205

New Contributors

@Anerudhan made their first contribution in #1187
@BBuf made their first contribution in #1203

Full Changelog: v0.2.7...v0.2.7.post1

Contributors

Anerudhan, BBuf, and 2 other contributors

Assets 2

30 Jun 19:39

yzh119

v0.2.7

4d3fb6d

v0.2.7

What's Changed

ci: Update images for self-hosted ARM64 runner by @yongwww in #1128
Fix pointer dtype bug in rope by @Edenzzzz in #1129
feat: update and test create_ipc_buffer by @yyihuang in #1130
misc: update runllm widget by @yzh119 in #1132
misc: correct runllm widget (again) by @MasterJH5574 in #1133
[Feature] Support PDL for batch Prefill and Decode by @Edenzzzz in #1117
fix: negative zero by type trait --> binary value by @yyihuang in #1136
fix: sync after create_workspace by @yyihuang in #1138
refactor: use functools.cache instead of global dict for caching modules by @yzh119 in #1135
[feat] add unified batch attention w/ correctness tests. by @happierpig in #1137
Fix FA2 and FA3 multi-item scoring and cuda illegal memory access error by @arde171 in #1140
feat: Add support for FLASHINFER_EXTRA_LDFLAGS environment variable by @jennifgcrl in #1144
misc: remove sync between persistent runners and use packed_causal_kv_end for SM90Plan by @Edenzzzz in #1146
[fix] fix precision errors when applying causal mask on Qwen-2.5 series models by @happierpig in #1148
ci: Install mpi4py by @yongwww in #1149
feat: add trtllm moe_allreduce_fusion by @yyihuang in #1108
feat: add trtllm all-reduce fusion by @yyihuang in #1131
Add more logging to TRTLLM-GEN debug trace (NFC) by @joker-eph in #1158
feat: update non-fused moe by @yyihuang in #1161
Add fp4 quantization swizzling tests by @wenscarl in #1157
refactor: communication module by @yyihuang in #1162
feat: add finalize_moe_allreduce from trtllm by @yyihuang in #1159
feat: experimental support of green ctx by @yzh119 in #1163
feat: Fused temperature online softmax kernel by @xslingcn in #1153
MNNVL MoE All-to-All Support by @cyx-6 in #1134
feat: nvshmem python bindings by @yzh119 in #1160
Fix missing symbols in trtllm_utils.so by @tiran in #1168
feat: logits processor fustion rule for temperature softmax by @xslingcn in #1170
Expose fp4 blockscale swizzling kernel by @wenscarl in #1176
add nvshmem sum_reduce for mnnvl allreduce by @Amir-19 in #1152
bugfix: softmax NaN results caused by large -inf masks by @xslingcn in #1178
[CI] Update is_last_build by @yongwww in #1183
[feat] support block sparse attention w/ variable block sizes and head-wise sparse patterns by @happierpig in #1177
bugfix: fix invalid blackwell fmha unittests by @yzh119 in #1181
feat: support green ctx creation by a list of SM counts by @Conless in #1190
fix: trtllm_comm module aot arch issues by @yyihuang in #1196
bugfix: fix broken docs build by adding missing dependencies by @Conless in #1197
chore: bump v0.2.7 by @zhyncs in #1199

New Contributors

@jennifgcrl made their first contribution in #1144
@tiran made their first contribution in #1168
@Amir-19 made their first contribution in #1152
@Conless made their first contribution in #1190

Full Changelog: v0.2.6.post1...v0.2.7

Contributors

tiran, joker-eph, and 14 other contributors

Assets 2

07 Jun 03:24

yzh119

v0.2.6.post1

bc50f1a

v0.2.6.post1

What's Changed

[CI] Add x86_64 tag for x86 self-hosted runner by @yongwww in #1126
hotfix: fix installation script behavior by @yzh119 in #1125

Full Changelog: v0.2.6...v0.2.6.post1

Contributors

yongwww and yzh119

Assets 2

Releases: flashinfer-ai/flashinfer

v0.2.11

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.10

What's Changed

Contributors

Uh oh!

v0.2.9

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.9rc2

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.9rc1

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.8

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.8rc1

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.7.post1

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.7

What's Changed

New Contributors

Contributors

Uh oh!

v0.2.6.post1

What's Changed

Contributors

Uh oh!