support paligemma
hiyouga committed May 20, 2024
1 parent e55c85a commit 2a67457
Showing 3 changed files with 34 additions and 10 deletions.
11 changes: 6 additions & 5 deletions README.md
@@ -69,12 +69,12 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/

## Changelog

[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that the PaliGemma models are pre-trained models; you need to fine-tune them with the `gemma` template for chat completion.

[24/05/18] We supported **[KTO](https://arxiv.org/abs/2402.01306)** algorithm for preference learning. See [examples](examples/README.md) for usage.

[24/05/14] We supported training and inference on the Ascend NPU devices. Check [installation](#installation) section for details.

[24/05/13] We supported fine-tuning the **Yi-1.5** series models.

<details><summary>Full Changelog</summary>

[24/04/26] We supported fine-tuning the **LLaVA-1.5** multimodal LLMs. See [examples](examples/README.md) for usage.
@@ -160,6 +160,7 @@ Compared to ChatGLM's [P-Tuning](https://github.com/THUDM/ChatGLM2-6B/tree/main/
| [LLaVA-1.5](https://huggingface.co/llava-hf) | 7B/13B | q_proj,v_proj | vicuna |
| [Mistral/Mixtral](https://huggingface.co/mistralai) | 7B/8x7B/8x22B | q_proj,v_proj | mistral |
| [OLMo](https://huggingface.co/allenai) | 1B/7B | q_proj,v_proj | - |
| [PaliGemma](https://huggingface.co/google) | 3B | q_proj,v_proj | gemma |
| [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | q_proj,v_proj | - |
| [Phi-3](https://huggingface.co/microsoft) | 3.8B | qkv_proj | phi |
| [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | c_attn | qwen |
@@ -284,10 +285,10 @@ huggingface-cli login
| ------------ | ------- | --------- |
| python | 3.8 | 3.10 |
| torch | 1.13.1 | 2.2.0 |
-| transformers | 4.37.2 | 4.40.1 |
+| transformers | 4.37.2 | 4.41.0 |
| datasets | 2.14.3 | 2.19.1 |
-| accelerate | 0.27.2 | 0.30.0 |
-| peft | 0.9.0 | 0.10.0 |
+| accelerate | 0.27.2 | 0.30.1 |
+| peft | 0.9.0 | 0.11.1 |
| trl | 0.8.1 | 0.8.6 |

| Optional | Minimum | Recommend |
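To check a local environment against the requirements table in this hunk, something like the following minimal sketch could be used. It is not part of the commit: the helper names `parse_version`, `meets_minimum`, `find_outdated`, and `installed_versions` are hypothetical, the minimums are copied from the "Minimum" column of the table, and the parser is a simplification that only understands plain dotted numeric versions (a pre-release such as `4.41.0rc1` is treated like `4.41.0`).

```python
import re
from importlib import metadata
from typing import Dict, Tuple

# Minimums copied from the "Minimum" column of the requirements table.
MINIMUM_VERSIONS: Dict[str, str] = {
    "torch": "1.13.1",
    "transformers": "4.37.2",
    "datasets": "2.14.3",
    "accelerate": "0.27.2",
    "peft": "0.9.0",
    "trl": "0.8.1",
}


def parse_version(version: str, width: int = 3) -> Tuple[int, ...]:
    """Parse '4.41.0' or '2.2.0+cu121' into a zero-padded tuple of ints."""
    core = version.split("+", 1)[0]  # drop a local suffix such as '+cu121'
    numbers = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:  # stop at non-numeric parts such as 'dev0'
            break
        numbers.append(int(match.group()))
    numbers += [0] * (width - len(numbers))  # pad so '4.41' equals '4.41.0'
    return tuple(numbers)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def find_outdated(installed: Dict[str, str]) -> Dict[str, str]:
    """Map each missing or too-old package to the minimum it should meet."""
    outdated = {}
    for name, minimum in MINIMUM_VERSIONS.items():
        current = installed.get(name)
        if current is None or not meets_minimum(current, minimum):
            outdated[name] = minimum
    return outdated


def installed_versions() -> Dict[str, str]:
    """Look up the installed version of each listed package, skipping absent ones."""
    found = {}
    for name in MINIMUM_VERSIONS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    for name, minimum in find_outdated(installed_versions()).items():
        print(f"{name}: need >= {minimum}")
```

Comparing integer tuples instead of raw strings avoids the classic pitfall where `"0.10.0" < "0.9.0"` under string ordering. For anything beyond a quick check, a full version parser would be the safer choice.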
11 changes: 6 additions & 5 deletions README_zh.md
@@ -69,12 +69,12 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd

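## Changelog

[24/05/20] We supported fine-tuning the **PaliGemma** series models. Note that PaliGemma is a pre-trained model; you need to fine-tune it with the `gemma` template to give it chat capability.

[24/05/18] We supported the **[KTO](https://arxiv.org/abs/2402.01306)** preference alignment algorithm. See [examples](examples/README_zh.md) for detailed usage.

[24/05/14] We supported training and inference on Ascend NPU devices. See the [installation](#安装-llama-factory) section for details.

[24/05/13] We supported fine-tuning the Yi-1.5 series models.

<details><summary>Full Changelog</summary>

[24/04/26] We supported fine-tuning the multimodal model **LLaVA-1.5**. See [examples](examples/README_zh.md) for detailed usage.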
@@ -160,6 +160,7 @@ https://github.com/hiyouga/LLaMA-Factory/assets/16256802/ec36a9dd-37f4-4f72-81bd
| [LLaVA-1.5](https://huggingface.co/llava-hf) | 7B/13B | q_proj,v_proj | vicuna |
| [Mistral/Mixtral](https://huggingface.co/mistralai) | 7B/8x7B/8x22B | q_proj,v_proj | mistral |
| [OLMo](https://huggingface.co/allenai) | 1B/7B | q_proj,v_proj | - |
| [PaliGemma](https://huggingface.co/google) | 3B | q_proj,v_proj | gemma |
| [Phi-1.5/2](https://huggingface.co/microsoft) | 1.3B/2.7B | q_proj,v_proj | - |
| [Phi-3](https://huggingface.co/microsoft) | 3.8B | qkv_proj | phi |
| [Qwen](https://huggingface.co/Qwen) | 1.8B/7B/14B/72B | c_attn | qwen |
@@ -284,10 +285,10 @@ huggingface-cli login
| ------------ | ------- | --------- |
| python | 3.8 | 3.10 |
| torch | 1.13.1 | 2.2.0 |
-| transformers | 4.37.2 | 4.40.1 |
+| transformers | 4.37.2 | 4.41.0 |
| datasets | 2.14.3 | 2.19.1 |
-| accelerate | 0.27.2 | 0.30.0 |
-| peft | 0.9.0 | 0.10.0 |
+| accelerate | 0.27.2 | 0.30.1 |
+| peft | 0.9.0 | 0.11.1 |
| trl | 0.8.1 | 0.8.6 |

| Optional | Minimum | Recommend |
22 changes: 22 additions & 0 deletions src/llamafactory/extras/constants.py
@@ -716,6 +716,28 @@ def register_model_group(
)


register_model_group(
    models={
        "PaliGemma-3B-pt-224": {
            DownloadSource.DEFAULT: "google/paligemma-3b-pt-224",
        },
        "PaliGemma-3B-pt-448": {
            DownloadSource.DEFAULT: "google/paligemma-3b-pt-448",
        },
        "PaliGemma-3B-pt-896": {
            DownloadSource.DEFAULT: "google/paligemma-3b-pt-896",
        },
        "PaliGemma-3B-mix-224": {
            DownloadSource.DEFAULT: "google/paligemma-3b-mix-224",
        },
        "PaliGemma-3B-mix-448": {
            DownloadSource.DEFAULT: "google/paligemma-3b-mix-448",
        },
    },
    vision=True,
)


register_model_group(
    models={
        "Phi-1.5-1.3B": {
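The body of `register_model_group` is not shown in this diff, so the following is only an illustrative stand-in for how such a registry might behave, not the project's actual implementation. Only the call shape (`models=...`, `vision=True`), the `DownloadSource.DEFAULT` key, and the five PaliGemma entries come from the hunk above. The enum value, the `SUPPORTED_MODELS` and `VISION_MODELS` containers, and the `resolve_repo` helper are assumptions made for the sketch.

```python
from enum import Enum, unique
from typing import Dict, Set


@unique
class DownloadSource(str, Enum):
    # Placeholder value; the real enum's values are not visible in the diff.
    DEFAULT = "default"


# Hypothetical module-level registries filled by register_model_group.
SUPPORTED_MODELS: Dict[str, Dict[DownloadSource, str]] = {}
VISION_MODELS: Set[str] = set()


def register_model_group(
    models: Dict[str, Dict[DownloadSource, str]],
    vision: bool = False,
) -> None:
    """Record each model's download locations and, optionally, mark it as a vision model."""
    for name, sources in models.items():
        SUPPORTED_MODELS[name] = dict(sources)
        if vision:
            VISION_MODELS.add(name)


def resolve_repo(name: str, source: DownloadSource = DownloadSource.DEFAULT) -> str:
    """Hypothetical helper: return the repository ID for a registered model."""
    return SUPPORTED_MODELS[name][source]


# The same entries that this commit adds to constants.py.
register_model_group(
    models={
        "PaliGemma-3B-pt-224": {DownloadSource.DEFAULT: "google/paligemma-3b-pt-224"},
        "PaliGemma-3B-pt-448": {DownloadSource.DEFAULT: "google/paligemma-3b-pt-448"},
        "PaliGemma-3B-pt-896": {DownloadSource.DEFAULT: "google/paligemma-3b-pt-896"},
        "PaliGemma-3B-mix-224": {DownloadSource.DEFAULT: "google/paligemma-3b-mix-224"},
        "PaliGemma-3B-mix-448": {DownloadSource.DEFAULT: "google/paligemma-3b-mix-448"},
    },
    vision=True,
)

if __name__ == "__main__":
    print(resolve_repo("PaliGemma-3B-pt-224"))
    print(sorted(VISION_MODELS))
```

Keying each model by display name and then by download source means a second source could be added per model without changing any call site that looks up the default one.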
