model_summary.md - Restore link to Harvard's Annotated Transformer. #29702

gamepad-coder · 2024-03-17T20:12:14Z

What does this PR do?

Why:

Action item from conversation here:
Refactor model summary #21408 (comment)

What

Single parenthetical added to the first sentence of the intro paragraph under the "The Transformer model family" header (just after the link to the original Transformer paper).
Restores a link to Harvard's iconic "The Annotated Transformer"
http://nlp.seas.harvard.edu/2018/04/03/attention.html
(See before and after mockup @ bottom of this PR description)

Page updated

https://huggingface.co/docs/transformers/model_summary

Before submitting

- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

=== Mockup using DevTools in Chrome ===

Before PR (current main on left)
After PR (this branch on right)

https://huggingface.co/docs/transformers/model_summary

stevhliu

LGTM!

stevhliu · 2024-03-18T16:59:42Z

docs/source/en/model_summary.md

@@ -16,7 +16,7 @@ rendered properly in your Markdown viewer.

 # The Transformer model family

-Since its introduction in 2017, the [original Transformer](https://arxiv.org/abs/1706.03762) model has inspired many new and exciting models that extend beyond natural language processing (NLP) tasks. There are models for [predicting the folded structure of proteins](https://huggingface.co/blog/deep-learning-with-proteins), [training a cheetah to run](https://huggingface.co/blog/train-decision-transformers), and [time series forecasting](https://huggingface.co/blog/time-series-transformers). With so many Transformer variants available, it can be easy to miss the bigger picture. What all these models have in common is they're based on the original Transformer architecture. Some models only use the encoder or decoder, while others use both. This provides a useful taxonomy to categorize and examine the high-level differences within models in the Transformer family, and it'll help you understand Transformers you haven't encountered before.
+Since its introduction in 2017, the [original Transformer](https://arxiv.org/abs/1706.03762) model has inspired many new and exciting models that extend beyond natural language processing (NLP) tasks. There are models for [predicting the folded structure of proteins](https://huggingface.co/blog/deep-learning-with-proteins), [training a cheetah to run](https://huggingface.co/blog/train-decision-transformers), and [time series forecasting](https://huggingface.co/blog/time-series-transformers). With so many Transformer variants available, it can be easy to miss the bigger picture. What all these models have in common is they're based on the original Transformer architecture. Some models only use the encoder or decoder, while others use both. This provides a useful taxonomy to categorize and examine the high-level differences within models in the Transformer family, and it'll help you understand Transformers you haven't encountered before. For a gentle introduction see the [Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html).


Suggested change (current text first, then the proposed replacement)

Since its introduction in 2017, the [original Transformer](https://arxiv.org/abs/1706.03762) model has inspired many new and exciting models that extend beyond natural language processing (NLP) tasks. There are models for [predicting the folded structure of proteins](https://huggingface.co/blog/deep-learning-with-proteins), [training a cheetah to run](https://huggingface.co/blog/train-decision-transformers), and [time series forecasting](https://huggingface.co/blog/time-series-transformers). With so many Transformer variants available, it can be easy to miss the bigger picture. What all these models have in common is they're based on the original Transformer architecture. Some models only use the encoder or decoder, while others use both. This provides a useful taxonomy to categorize and examine the high-level differences within models in the Transformer family, and it'll help you understand Transformers you haven't encountered before. For a gentle introduction see the [Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html).

Since its introduction in 2017, the [original Transformer](https://arxiv.org/abs/1706.03762) model (see the [Annotated Transformer](http://nlp.seas.harvard.edu/2018/04/03/attention.html)) blog post for a gentle technical introduction) has inspired many new and exciting models that extend beyond natural language processing (NLP) tasks. There are models for [predicting the folded structure of proteins](https://huggingface.co/blog/deep-learning-with-proteins), [training a cheetah to run](https://huggingface.co/blog/train-decision-transformers), and [time series forecasting](https://huggingface.co/blog/time-series-transformers). With so many Transformer variants available, it can be easy to miss the bigger picture. What all these models have in common is they're based on the original Transformer architecture. Some models only use the encoder or decoder, while others use both. This provides a useful taxonomy to categorize and examine the high-level differences within models in the Transformer family, and it'll help you understand Transformers you haven't encountered before.

=== ✅ Suggestion Implemented ===

I actually do like that a lot more, thanks @stevhliu !

Updated my PR in two commits to match your suggestion.
(Minor FYI: I couldn't commit it directly because the suggestion's parenthetical had an extra `)`.)

3d0ec4d

matches your suggestion's wording + placement of link

but typo: commit accidentally removed "has" when I pasted the suggestion

9378f4d

restores "has" after the parenthetical

=== VS Code: Diff Screenshots ===

GitHub struggles with single-line diffs, so here's my local diff in VS Code.
Just OCD, but feel free to verify for yourself too :)

diff for 3d0ec4d -- note the oops, missing "has"

diff for 9378f4d -- "has" restored

And finally: current PR's overall diff with main from last week

If there are no more change requests, feel free to merge it whenever.
And if there are any typos I missed, feel free to directly update the PR.

Sidenote:
I do like the Annotated Transformer link much better next to the original Transformer link, but as a newcomer contributor I was hesitant to add too much noise to the intro sentence. So thanks for that suggestion. 👍

PS:
I really appreciate the responsiveness and culture in this project, thanks for the help and the positive experience. :)

HuggingFaceDocBuilderDev · 2024-03-18T17:18:58Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

gamepad-coder added 2 commits March 17, 2024 15:10

model_summary.md - Add link to Harvard's Annotated Transformer.

e06ca70

model_summary.md - slight wording change + capitalize name of the paper

c9e9090

stevhliu approved these changes Mar 18, 2024

gamepad-coder added 2 commits March 23, 2024 16:06

model_summary.md - moves the Annotated Transformer link in a praenthesis next to the link to the original paper (great idea, stevhliu!)

3d0ec4d

model_summary.md - moves the Annotated Transformer link in a praenthesis next to the link to the original paper (commit pt. 2, accidentally removed "has" in pt. 1)

9378f4d

stevhliu merged commit 76a33a1 into huggingface:main Mar 24, 2024
8 checks passed
