Tags: zeliu98/DeepSpeed
Prevent creation of local temp directory (deepspeedai#1494) Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
[zero_to_fp32] adapt to 4-bytes alignment in z2 (deepspeedai#1372) Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
DeepSpeed MoE (deepspeedai#1310) Co-authored-by: Alex Muzio <Alex.Muzio@microsoft.com> Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com> Co-authored-by: Conglong Li <conglong.li@gmail.com> Co-authored-by: Felipe Cruz Salinas <Andres.Cruz@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: Reza Yazdani <reyazda@microsoft.com> Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com> Co-authored-by: Shaden Smith <shaden.smith@microsoft.com> Co-authored-by: Young Jin Kim <youki@microsoft.com> Co-authored-by: bapatra <bapatra@microsoft.com>
Use correct input size for splits (deepspeedai#1284)
* Use correct input size for splits
* Use smarter partitioning
[Doc] round_robin_gradients (deepspeedai#1261)
* Fix docstring
* Make screenshots clickable for easier viewing
* Navigation menu in alphabetical order; More clickable screenshots
* Rename 1Cycle doc
* Tweak naming
* Remove no longer used flag
* ZeRO3 Offload release
* Single GPU results
* Rearrange figures
* Single GPU text
* tweak intro
* zero3-offload section
* Add asynchronous i/o docs
* Fix print_per_steps doc
* Document round_robin_gradients
* Tweak description
* Trigger CI
revert part of deepspeedai#1220 (deepspeedai#1221) deepspeedai#1220 fixed the leak, but led to another problem. Reverting that part so that we can do the release; we will work on it after the release. @jeffra
clean up logging (deepspeedai#1190) Co-authored-by: Jeff Rasley <jerasley@microsoft.com>