diff --git a/README.md b/README.md
index 7e762ce96954..5bfea3cb2394 100755
--- a/README.md
+++ b/README.md
@@ -82,6 +82,8 @@ DeepSpeed has been used to train many different large-scale models, below is a l
 * [BLOOM (176B)](https://huggingface.co/blog/bloom-megatron-deepspeed)
 * [YaLM (100B)](https://github.com/yandex/YaLM-100B)
 * [GPT-NeoX (20B)](https://github.com/EleutherAI/gpt-neox)
+ * [AlexaTM (20B)](https://www.amazon.science/blog/20b-parameter-alexa-model-sets-new-marks-in-few-shot-learning)
+ * [METRO-LM (5.4B)](https://arxiv.org/pdf/2204.06644.pdf)

 DeepSpeed has been integrated with several different popular open-source DL frameworks such as:

diff --git a/docs/index.md b/docs/index.md
index e5a512d414c3..7303e7c41611 100755
--- a/docs/index.md
+++ b/docs/index.md
@@ -65,6 +65,8 @@ DeepSpeed has been used to train many different large-scale models, below is a l
 * [BLOOM (176B)](https://huggingface.co/blog/bloom-megatron-deepspeed)
 * [YaLM (100B)](https://github.com/yandex/YaLM-100B)
 * [GPT-NeoX (20B)](https://github.com/EleutherAI/gpt-neox)
+ * [AlexaTM (20B)](https://www.amazon.science/blog/20b-parameter-alexa-model-sets-new-marks-in-few-shot-learning)
+ * [METRO-LM (5.4B)](https://arxiv.org/pdf/2204.06644.pdf)

 DeepSpeed has been integrated with several different popular open-source DL frameworks such as: