Description
Hey friends! 👋
We are currently in the process of improving the Transformers model cards by making them more directly useful for everyone. The main goals are to:
- Standardize all model cards with a consistent format so users know what to expect when moving between different model cards or trying to learn how to use a new model.
- Include a brief description of the model (what makes it unique/different) written in a way that's accessible to everyone.
- Provide ready-to-use code examples featuring [`Pipeline`], [`AutoModel`], and `transformers-cli` with available optimizations included. For large models, provide a quantization example so it's easier for everyone to run the model.
- Include an attention mask visualizer for currently supported models to help users visualize what a model is seeing (refer to Add attention visualization tool #36630 for more details).
Compare the before and after model cards below:
With so many models in Transformers, we could really use a hand with standardizing the existing model cards. If you're interested in making a contribution, pick a model from the list below and get started!
Steps
Each model card should follow the format below. You can copy the text exactly as it is!
# add appropriate badges
<div style="float: right;">
<div class="flex flex-wrap space-x-1">
<img alt="" src="" >
</div>
</div>
# Model name
[Model name](https://huggingface.co/papers/...) ...
A brief description of the model and what makes it unique/different. Try to write this like you're talking to a friend.
You can find all the original [Model name] checkpoints under the [Model name](link) collection.
> [!TIP]
> Click on the [Model name] models in the right sidebar for more examples of how to apply [Model name] to different [insert task types here] tasks.
The example below demonstrates how to generate text based on an image with [`Pipeline`] or the [`AutoModel`] class.
<hfoptions id="usage">
<hfoption id="Pipeline">
insert pipeline code here
</hfoption>
<hfoption id="AutoModel">
add AutoModel code here
</hfoption>
<hfoption id="transformers-cli">
add transformers-cli usage here if applicable/supported, otherwise close the hfoption block
</hfoption>
</hfoptions>
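For reference, a filled-in `Pipeline` snippet could look something like the sketch below. The task and checkpoint here are purely illustrative (a small text-generation model); swap in whatever task and checkpoint your model actually supports.

```python
import torch
from transformers import pipeline

# Build a pipeline for the model's task; "text-generation" and "distilgpt2"
# are illustrative placeholders, not part of the template.
generator = pipeline(
    task="text-generation",
    model="distilgpt2",
    torch_dtype=torch.float32,
)

# Run a quick example prompt and print the generated continuation.
result = generator("Plants create energy through a process known as", max_new_tokens=10)
print(result[0]["generated_text"])
```

Keeping the snippet short and copy-pasteable like this is the goal; add `device_map="auto"` or a lower precision dtype when the model card's checkpoint benefits from it.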
Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the [Quantization](../quantization/overview) overview for more available quantization backends.
The example below uses [insert quantization method here](link to quantization method) to only quantize the weights to __.
# add if this is supported for your model
Use the [AttentionMaskVisualizer](https://github.com/huggingface/transformers/blob/beb9b5b02246b9b7ee81ddf938f93f44cfeaad19/src/transformers/utils/attention_visualizer.py#L139) to better understand what tokens the model can and cannot attend to.
```py
from transformers.utils.attention_visualizer import AttentionMaskVisualizer

visualizer = AttentionMaskVisualizer("google/gemma-3-4b-it")
visualizer("<img>What is shown in this image?")
```
# upload image to https://huggingface.co/datasets/huggingface/documentation-images/tree/main/transformers/model_doc and ping me to merge
<div class="flex justify-center">
<img src=""/>
</div>
## Notes
- Any other model-specific notes should go here.
```py
<insert relevant code snippet here related to the note if it's available>
```
For examples, take a look at #36469 or the BERT, Llama, Llama 2, Gemma 3, PaliGemma, ViT, and Whisper model cards on the main
version of the docs.
Once you're done or if you have any questions, feel free to ping @stevhliu to review. Don't add `fix` to your PR to avoid closing this issue.
I'll also be right there working alongside you and opening PRs to convert the model cards so we can complete this faster together! 🤗
Models
- albert - Updated Albert model Card #37753
- align - Updated the Model docs - for the ALIGN model #38072
- altclip - Update altCLIP model card #38306
- aria - Updated Aria model card #38472
- audio_spectrogram_transformer - assigned to @KishanPipariya
- autoformer - Update model-card for Autofomer #37231
- aya_vision - Updated aya_vision.md #38749
- bamba - Update bamba model card #38853
- bark
- bart - New bart model card #37858
- barthez
- bartpho
- beit
- bert
- bert_generation
- bert_japanese - assigned to @KeshavSingh29
- bertweet - Updated BERTweet model card. #37981
- big_bird - Updated BigBird Model card as per #36979. #37959
- bigbird_pegasus
- biogpt - Update BioGPT model card #38214
- bit
- blenderbot
- blenderbot_small
- blip - Update blip model card #38513
- blip_2 - assigned to @olccihyeon
- bloom
- bridgetower
- bros
- byt5 - Standardize ByT5 model card format #38699
- camembert
- canine - New canine model card #38631
- chameleon
- chinese_clip
- clap
- clip - Updated the model card for CLIP #37040
- clipseg
- clvp
- code_llama - chore: Update model doc for code_llama #37115
- codegen
- cohere - Update model card for Cohere #37056
- cohere2
- colpali - Refactor ColPali model documentation #37309
- conditional_detr
- convbert - Add detailed ConvBERT model card with usage, architecture, and refere… #38470
- convnext - assigned to @aleksmaksimovic
- convnextv2 - assigned to @aleksmaksimovic
- cpm
- cpmant
- ctrl - assigned to @Ishubhammohole
- cvt - Update CvT documentation with improved usage examples and additional … #38731
- dab_detr
- dac
- data2vec
- dbrx
- deberta - chore: standardize DeBERTa model card #37409
- deberta_v2
- decision_transformer
- deformable_detr
- deit
- deprecated
- depth_anything - Update model card for Depth Anything #37065
- depth_pro
- detr
- dialogpt
- diffllama
- dinat
- dinov2 - Update model-card for DINOv2 #37104
- dinov2_with_registers
- distilbert - Updated model card for distilbert #37157
- dit - [Docs] New DiT model card #38721
- donut - Updated Model-card for donut #37290
- dpr
- dpt
- efficientnet - assigned to @Sudhesh-Rajan27
- electra - Update model card for electra #37063
- emu3
- encodec
- encoder_decoder
- ernie
- esm
- falcon - Update falcon model card #37184
- falcon_mamba - Update falcon mamba card #37253
- fastspeech2_conformer - Update fastspeech2 model card #37377
- flaubert
- flava
- fnet
- focalnet
- fsmt
- funnel
- fuyu
- gemma - Update model card for Gemma #37674
- gemma2 - Improvements in Gemma2 model card #37076
- gemma3
- git
- glm
- glpn
- got_ocr2
- gpt2 - Update Model card for GPT2 #37101
- gpt_bigcode
- gpt_neo - New gpt neo model card #38505
- gpt_neox - Improve GPTNeoX model card following standardization guidelines #38550
- gpt_neox_japanese
- gpt_sw3
- gptj
- granite - Update granite.md #37791
- granitemoe
- granitemoeshared
- grounding_dino
- groupvit
- helium
- herbert
- hiera
- hubert
- ibert
- idefics
- idefics2
- idefics3
- ijepa
- imagegpt
- informer
- instructblip
- instructblipvideo
- jamba - Update Model Card for Jamba #37152
- jetmoe
- kosmos2
- layoutlm
- layoutlmv2
- layoutlmv3 - docs: Update LayoutLMv3 model card with standardized format and impro… #37155
- layoutxlm
- led
- levit
- lilt
- llama
- llama2
- llama3 - assigned to @capnmav77
- llava
- llava_next
- llava_next_video
- llava_onevision
- longformer - Update longformer.md #37622
- longt5
- luke
- lxmert
- m2m_100
- mamba - Update Model Card for Mamba #37863
- mamba2 - Update Model Card for Mamba-2 #37951
- marian
- markuplm
- mask2former
- maskformer
- mbart - Updated model card for mbart and mbart50 #37619
- mbart50 - Updated model card for mbart and mbart50 #37619
- megatron_bert
- megatron_gpt2
- mgp_str
- mimi
- mistral - updated model card for Mistral #37156
- mistral3 - assigned to @cassiasamp
- mixtral - assigned to @darmasrmez
- mllama - added mllama doc #37647
- mluke
- mobilebert - mobilebert model card update #37256
- mobilenet_v1 - Model card for mobilenet v1 and v2 #37948
- mobilenet_v2 - Model card for mobilenet v1 and v2 #37948
- mobilevit
- mobilevitv2
- modernbert - Update Model Card for ModernBERT #37052
- moonshine - Updated moonshine modelcard #38711
- moshi
- mpnet - assigned to @SanjayDevarajan03
- mpt
- mra
- mt5
- musicgen
- musicgen_melody
- mvp
- myt5
- nemotron
- nllb
- nllb_moe
- nougat
- nystromformer
- olmo
- olmo2 - Updated model card for OLMo2 #38394
- olmoe
- omdet_turbo
- oneformer
- openai - Update OpenAI GPT model card #37255
- opt
- owlv2
- owlvit
- paligemma
- patchtsmixer
- patchtst
- pegasus - Update pegasus model card #38675
- pegasus_x
- perceiver
- persimmon
- phi - Refactor phi doc #37583
- phi3 - assigned to @arpitsinghgautam
- phi4_multimodal - Update phi4_multimodal.md #38830
- phimoe
- phobert
- pix2struct
- pixtral - assigned to @BryanBradfo
- plbart
- poolformer
- pop2piano
- prompt_depth_anything
- prophetnet
- pvt
- pvt_v2
- qwen2 - Updated model card for Qwen2 #37192
- qwen2_5_vl - feat: updated model card for qwen_2.5_vl #37099
- qwen2_audio
- qwen2_moe - Add Qwen2 MoE model card #38649
- qwen2_vl - assigned to @SaiSanthosh1508
- rag
- recurrent_gemma
- reformer
- regnet
- rembert
- resnet - assigned to @BettyChen0616
- roberta - [docs] updated roberta model card #38777
- roberta_prelayernorm
- roc_bert - Update roc bert docs #38835
- roformer - [docs]: update roformer.md model card #37946
- rt_detr
- rt_detr_v2
- rwkv
- sam
- seamless_m4t
- seamless_m4t_v2
- segformer - assigned to @GSNCodes
- seggpt
- sew
- sew_d
- shieldgemma2 - assigned to @BryanBradfo
- siglip - chore: update model card for SigLIP #37585
- siglip2 - chore: update SigLIP2 model card #37624
- smolvlm - assigned to @udapy
- speech_encoder_decoder
- speech_to_text
- speecht5
- splinter
- squeezebert
- stablelm
- starcoder2
- superglue
- superpoint
- swiftformer
- swin - assigned to @BryanBradfo
- swin2sr
- swinv2 - docs(swinv2): Update SwinV2 model card to new standard format #37942
- switch_transformers
- t5 - Updated T5 model card with standardized format #37261
- table_transformer
- tapas
- textnet
- time_series_transformer
- timesformer
- timm_backbone
- timm_wrapper
- trocr
- tvp
- udop
- umt5
- unispeech
- unispeech_sat
- univnet
- upernet
- video_llava
- videomae - assigned to @mreraser
- vilt
- vipllava
- vision_encoder_decoder - assigned to @Bhavay-2001
- vision_text_dual_encoder
- visual_bert
- vit
- vit_mae - Updated the model card for ViTMAE #38302
- vit_msn
- vitdet
- vitmatte
- vitpose - [docs] ViTPose #38630
- vitpose_backbone
- vits - Update VITS model card #37335
- vivit
- wav2vec2 - assigned to @AshAnand34
- wav2vec2_bert - assigned to @AshAnand34
- wav2vec2_conformer - assigned to @AshAnand34
- wav2vec2_phoneme - assigned to @AshAnand34
- wav2vec2_with_lm - assigned to @AshAnand34
- wavlm
- whisper
- x_clip
- xglm
- xlm - Created model card for XLM model #38595
- xlm_roberta - Update XLM-RoBERTa model documentation with enhanced usage examples and improved layout #38596
- xlm_roberta_xl - Created model card for xlm-roberta-xl #38597
- xlnet
- xmod
- yolos
- yoso
- zamba
- zamba2
- zoedepth - Updated Zoedepth model card #37898