Pull requests: microsoft/DeepSpeed
Training ops kernels: Speeding up the Llama-based MoE architectures
#6734
opened Nov 8, 2024 by
RezaYazdaniAminabadi
Draft
support autoTP with weight only quantization in DS inference path
#4750
opened Nov 29, 2023 by
ftian1
Add FALCON-40B Inference-Kernel Support
#3656
opened Jun 1, 2023 by
RezaYazdaniAminabadi
1 task done
Optimizer state loading fix for bitsandbytes 8-bit optimizers.
#1582
opened Nov 22, 2021 by
TimDettmers
Add 4-bit quantized inference to run BLOOM-176B on 2 A100 GPUs
#2526
opened Nov 18, 2022 by
RezaYazdaniAminabadi
pre/post forward calls to engine + generate method
#2832
opened Feb 14, 2023 by
jeffra
2 of 3 tasks