deepspeed4science chinese blog (#4366)
* deepspeed4science chinese blog

* deepspeed4science chinese blog
conglongli authored Sep 20, 2023
1 parent dcd3ae1 commit 3592a22
Showing 17 changed files with 169 additions and 3 deletions.
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -58,7 +58,7 @@ repos:
# Do not check files that are automatically generated
'--skip=docs/Gemfile.lock,tests/unit/gpt2-merges.txt,tests/unit/gpt2-vocab.json',
'--ignore-regex=\\n', # Do not count the 'n' in an escaped newline as part of a word
-'--ignore-words-list=youn,unsupport', # Word used in error messages that need rewording
+'--ignore-words-list=youn,unsupport,noe', # Word used in error messages that need rewording
--check-filenames,
--check-hidden
]
7 changes: 7 additions & 0 deletions blogs/deepspeed4science/README.md
@@ -0,0 +1,7 @@
<div align="center">

# Announcing the DeepSpeed4Science Initiative: Enabling large-scale scientific discovery through sophisticated AI system technologies

</div>

[https://www.microsoft.com/en-us/research/blog/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies/](https://www.microsoft.com/en-us/research/blog/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies/)
145 changes: 145 additions & 0 deletions blogs/deepspeed4science/chinese/README.md

Large diffs are not rendered by default.

Binary file added blogs/deepspeed4science/media/Figure1.png
Binary file added blogs/deepspeed4science/media/Figure2-1.jpg
Binary file added blogs/deepspeed4science/media/Figure2-2.gif
Binary file added blogs/deepspeed4science/media/Figure3.png
Binary file added blogs/deepspeed4science/media/Figure4.gif
Binary file added blogs/deepspeed4science/media/Figure5.gif
Binary file added blogs/deepspeed4science/media/Figure6-1.png
Binary file added blogs/deepspeed4science/media/Figure6-2.gif
Binary file added blogs/deepspeed4science/media/Figure7.jpg
Binary file added blogs/deepspeed4science/media/Figure8.gif
Binary file added blogs/deepspeed4science/media/Figure9.png
4 changes: 2 additions & 2 deletions docs/_pages/deepspeed4science.md
@@ -13,11 +13,11 @@ In line with Microsoft's mission to solve humanity's most pressing challenges, t

## New Megatron-DeepSpeed for Large-Scale AI4Science Model Training

-We are proud to introduce [new Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed), which is an updated framework for large-scale model training. We rebased and enabled DeepSpeed with the newest Megatron-LM for long sequence support and many other capabilities. With the new Megatron-DeepSpeed, users can now train their large AI4Science models like GenSLMS with much longer sequences via a synergetic combination of ZeRO-style data parallelism, tensor parallelism, sequence parallelism, pipeline parallelism, model state offloading, and several newly added memory optimization techniques such as attention mask offloading and position embedding partitoining.
+We are proud to introduce [new Megatron-DeepSpeed](https://github.com/microsoft/Megatron-DeepSpeed), which is an updated framework for large-scale model training. We rebased and enabled DeepSpeed with the newest Megatron-LM for long sequence support and many other capabilities. With the new Megatron-DeepSpeed, users can now train their large AI4Science models like GenSLMs with much longer sequences via a synergetic combination of ZeRO-style data parallelism, tensor parallelism, sequence parallelism, pipeline parallelism, model state offloading, and several newly added memory optimization techniques such as attention mask offloading and position embedding partitioning.

![new Megatron-DeepSpeed](/assets/images/new-megatron-ds.png){: .align-center}
<p align="center">
-<em>The figure depicts system capability in terms of enabling long sequence lengths for training a 33B parameter GPT-like model using our new Megatron-DeepSpeed framework. The results show that the new Megatron-DeepSpeed enables 9x onger sequence lengths than NVIDIA's Megatron-LM without triggering out-of-memory error. </em>
+<em>The figure depicts system capability in terms of enabling long sequence lengths for training a 33B parameter GPT-like model using our new Megatron-DeepSpeed framework. The results show that the new Megatron-DeepSpeed enables 9x longer sequence lengths than NVIDIA's Megatron-LM without triggering out-of-memory error. </em>
</p>

To see how the new Megatron-DeepSpeed helps enabling new system capabilities, such as training models with massive sequences length, please read our [tutorial](https://github.com/microsoft/Megatron-DeepSpeed/tree/main/examples_deepspeed/deepspeed4science/megatron_long_seq_support).
7 changes: 7 additions & 0 deletions docs/_posts/2023-09-19-deepspeed4science-chinese.md
@@ -0,0 +1,7 @@
---
title: "DeepSpeed4Science:利用先进的AI系统优化技术实现科学发现"
excerpt: ""
link: https://github.com/microsoft/DeepSpeed/blob/master/blogs/deepspeed4science/chinese/README.md
date: 2023-09-19 00:00:00
tags: training inference science Chinese
---
7 changes: 7 additions & 0 deletions docs/_posts/2023-09-19-deepspeed4science.md
@@ -0,0 +1,7 @@
---
title: "Announcing the DeepSpeed4Science Initiative: Enabling large-scale scientific discovery through sophisticated AI system technologies"
excerpt: ""
link: https://www.microsoft.com/en-us/research/blog/announcing-the-deepspeed4science-initiative-enabling-large-scale-scientific-discovery-through-sophisticated-ai-system-technologies/
date: 2023-09-19 00:00:00
tags: training inference science English
---
