Pull requests: flashinfer-ai/flashinfer
Pull requests list
benchmark: Enable speculative decode microbenchmarking for paged decode
#2628
opened Feb 24, 2026 by
bkryu
5 tasks done
fix: trtllm_mxint4_block_scale_moe unit test to index output list
#2627
opened Feb 24, 2026 by
jimmyzho
5 tasks
Protect against null clusterUuid in mnnvl.py
#2626
opened Feb 23, 2026 by
akshaver
4 of 5 tasks
feat: Fuse shared experts into trtllm_gen moe (fp8)
run-ci
#2625
opened Feb 23, 2026 by
nv-yunzheq
5 tasks
feat: add pool+indices support to gated_delta_rule_decode_pretranspose (bf16 path)
#2619
opened Feb 22, 2026 by
kaixih
perf(gdn): optimize MTP kernel with ILP rows and SMEM v caching
#2618
opened Feb 22, 2026 by
ameynaik-hub
5 tasks done
fix: Add tests for the AutoTuner and fix bug in _find_nearest_profile
run-ci
v0.6.5
#2617
opened Feb 22, 2026 by
danisereb
4 of 5 tasks
feat: port act_and_mul activation kernels to CuTe-DSL
#2616
opened Feb 22, 2026 by
bledden
3 tasks
docs: Fix incorrect column-major scale layout in FP8 GEMM docstrings
#2614
opened Feb 21, 2026 by
bledden
fix: Support non-power-of-2 dimensions in act_and_mul kernels
run-ci
#2613
opened Feb 21, 2026 by
bledden
fix: use type-specific FP8 max value for clamping in RMSNorm quantization kernels
run-ci
#2612
opened Feb 21, 2026 by
Bias92
support qk_nope_head_dim for 192 check for GLM-5
run-ci
#2607
opened Feb 21, 2026 by
rainj-me
5 tasks done
[bugfix] Fix FilteredTopK overflow correctness
run-ci
#2605
opened Feb 20, 2026 by
jiangyinzuo
6 tasks done
feat: add CuTe DSL flash attention backend for SM120 GPUs
#2598
opened Feb 20, 2026 by
blake-snc
feat: Add comprehensive performance optimization ecosystem
#2593
opened Feb 19, 2026 by
divyanshu-iitian
5 tasks
[BugFix] guard against uint32 underflow in multi-CTA TopK chunk calculation
#2592
opened Feb 19, 2026 by
LopezCastroRoberto
Perf: Optimize GDN decode pretranspose kernel for all batch sizes
#2588
opened Feb 19, 2026 by
ameynaik-hub
5 tasks
feat: trtllm tinygemm2 in flashinfer as bf16 routergemm
#2587
opened Feb 19, 2026 by
jimmyzho
5 tasks
FlashInfer Jetson Orin series compatibility (sm87)
#2580
opened Feb 18, 2026 by
EricEttes
5 tasks done