
MindSpore Transformers (MindFormers)


1. Introduction

MindSpore Transformers is a full-process development suite for large model pre-training, fine-tuning, evaluation, inference, and deployment. It provides mainstream Transformer-based Large Language Models (LLMs) and Multimodal Models (MMs), and aims to help users carry out the entire large model development workflow with ease.

Based on MindSpore's built-in parallel technology and component-based design, MindSpore Transformers offers the following features:

  • One-click launch of single-card or multi-card pre-training, fine-tuning, evaluation, inference, and deployment for large models;
  • Rich multi-dimensional hybrid parallelism that can be flexibly configured to individual needs;
  • System-level deep optimization of large model training and inference, with native support for efficient training and inference on ultra-large clusters and rapid fault recovery;
  • Configurable development of task components: any module, including the model network, optimizer, and learning rate policy, can be enabled through a unified configuration (see the sketch after this list);
  • Real-time visualization of training accuracy and performance monitoring metrics.
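
As an illustration of the unified-configuration approach, the sketch below loads a task YAML and overrides individual fields programmatically. It is a minimal sketch: the MindFormerConfig helper is taken from the project documentation, while the YAML path and field names shown are hypothetical placeholders that vary by model and version.

# Minimal sketch of config-driven development (hypothetical YAML path and field names).
# Every task component (model, optimizer, learning rate policy, parallelism) lives in
# one YAML file; fields can be overridden in code before a task is launched.
from mindformers import MindFormerConfig

config = MindFormerConfig("configs/llama3_1/predict_llama3_1_8b.yaml")  # hypothetical path
config.model.model_config.seq_length = 4096  # override a single field
config.runner_config.batch_size = 1
print(config.model.model_config)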

For MindSpore Transformers tutorials and API documentation, see the MindSpore Transformers Documentation, which provides quick links to the key content.

If you have any suggestions for MindSpore Transformers, please contact us through an issue, and we will address them promptly.

Model List

The following table lists models supported by MindSpore Transformers.

| Model | Specifications | Model Type | Latest Version |
|---|---|---|---|
| DeepSeek-V3 | 671B | Sparse LLM | In-development version, 1.5.0 |
| GLM4 | 9B | Dense LLM | In-development version, 1.5.0 |
| Llama3.1 | 8B/70B | Dense LLM | In-development version, 1.5.0 |
| Qwen2.5 | 0.5B/1.5B/7B/14B/32B/72B | Dense LLM | In-development version, 1.5.0 |
| TeleChat2 | 7B/35B/115B | Dense LLM | In-development version, 1.5.0 |
| CodeLlama | 34B | Dense LLM | 1.5.0 |
| CogVLM2-Image | 19B | MM | 1.5.0 |
| CogVLM2-Video | 13B | MM | 1.5.0 |
| DeepSeek-V2 | 236B | Sparse LLM | 1.5.0 |
| DeepSeek-Coder-V1.5 | 7B | Dense LLM | 1.5.0 |
| DeepSeek-Coder | 33B | Dense LLM | 1.5.0 |
| GLM3-32K | 6B | Dense LLM | 1.5.0 |
| GLM3 | 6B | Dense LLM | 1.5.0 |
| InternLM2 | 7B/20B | Dense LLM | 1.5.0 |
| Llama3.2 | 3B | Dense LLM | 1.5.0 |
| Llama3.2-Vision | 11B | MM | 1.5.0 |
| Llama3 | 8B/70B | Dense LLM | 1.5.0 |
| Llama2 | 7B/13B/70B | Dense LLM | 1.5.0 |
| Mixtral | 8x7B | Sparse LLM | 1.5.0 |
| Qwen2 | 0.5B/1.5B/7B/57B/57B-A14B/72B | Dense/Sparse LLM | 1.5.0 |
| Qwen1.5 | 7B/14B/72B | Dense LLM | 1.5.0 |
| Qwen-VL | 9.6B | MM | 1.5.0 |
| TeleChat | 7B/12B/52B | Dense LLM | 1.5.0 |
| Whisper | 1.5B | MM | 1.5.0 |
| Yi | 6B/34B | Dense LLM | 1.5.0 |
| YiZhao | 12B | Dense LLM | 1.5.0 |
| Baichuan2 | 7B/13B | Dense LLM | 1.3.2 |
| GLM2 | 6B | Dense LLM | 1.3.2 |
| GPT2 | 124M/13B | Dense LLM | 1.3.2 |
| InternLM | 7B/20B | Dense LLM | 1.3.2 |
| Qwen | 7B/14B | Dense LLM | 1.3.2 |
| CodeGeex2 | 6B | Dense LLM | 1.1.0 |
| WizardCoder | 15B | Dense LLM | 1.1.0 |
| Baichuan | 7B/13B | Dense LLM | 1.0 |
| Blip2 | 8.1B | MM | 1.0 |
| Bloom | 560M/7.1B/65B/176B | Dense LLM | 1.0 |
| Clip | 149M/428M | MM | 1.0 |
| CodeGeex | 13B | Dense LLM | 1.0 |
| GLM | 6B | Dense LLM | 1.0 |
| iFlytekSpark | 13B | Dense LLM | 1.0 |
| Llama | 7B/13B | Dense LLM | 1.0 |
| MAE | 86M | MM | 1.0 |
| Mengzi3 | 13B | Dense LLM | 1.0 |
| PanguAlpha | 2.6B/13B | Dense LLM | 1.0 |
| SAM | 91M/308M/636M | MM | 1.0 |
| Skywork | 13B | Dense LLM | 1.0 |
| Swin | 88M | MM | 1.0 |
| T5 | 14M/60M | Dense LLM | 1.0 |
| VisualGLM | 6B | MM | 1.0 |
| Ziya | 13B | Dense LLM | 1.0 |
| Bert | 4M/110M | Dense LLM | 0.8 |

Each model is maintained according to the Life Cycle and Version Matching Strategy of its latest supported version.

2. Installation

Version Mapping

Currently, the Atlas 800T A2 training server is supported.

Python 3.11.4 is recommended for the current suite.

| MindSpore Transformers | MindSpore | CANN | Driver/Firmware | Image Link |
|---|---|---|---|---|
| In-development version | In-development version | In-development version | In-development version | Not involved |

Historical version compatibility:

| MindSpore Transformers | MindSpore | CANN | Driver/Firmware | Image Link |
|---|---|---|---|---|
| 1.5.0 | 2.6.0-rc1 | 8.1.RC1 | 25.0.RC1 | Link |
| 1.3.2 | 2.4.10 | 8.0.0 | 24.1.0 | Link |
| 1.3.0 | 2.4.0 | 8.0.RC3 | 24.1.RC3 | Link |
| 1.2.0 | 2.3.0 | 8.0.RC2 | 24.1.RC2 | Link |

Installation Using the Source Code

Currently, MindSpore Transformers can be compiled and installed from source. Run the following commands:

git clone -b dev https://gitee.com/mindspore/mindformers.git
cd mindformers
bash build.sh
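
After the build completes, a quick import check confirms that the package is visible to Python. This is a generic sanity check rather than an official installation step, and it assumes the package exposes a __version__ attribute, as most Python packages do.

# Post-install sanity check (not an official step): import the package and print its version.
import mindformers

print(mindformers.__version__)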

3. User Guide

MindSpore Transformers supports one-click distributed pre-training, fine-tuning, and inference for large models. Click the link for each model in the Model List to view its documentation, or refer to Start Tasks to learn how to launch these tasks. A minimal inference sketch follows.
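
The sketch below runs text generation through the high-level pipeline API. It is a minimal sketch based on the project documentation: the model identifier and the generation arguments are assumptions and may differ across models and versions.

# Minimal text-generation sketch; the model id "glm4_9b" and the generation
# arguments are assumptions drawn from the documentation and may vary by version.
from mindformers import pipeline

text_generator = pipeline("text_generation", model="glm4_9b")
outputs = text_generator("An increasing sequence: one,", max_length=64, do_sample=False)
print(outputs)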

For more information about the functions of MindSpore Transformers, please refer to MindSpore Transformers Documentation.

4. Life Cycle And Version Matching Strategy

Each MindSpore Transformers version goes through the following five maintenance phases:

| Status | Duration | Description |
|---|---|---|
| Plan | 1-3 months | Plan features. |
| Develop | 3 months | Build features. |
| Preserve | 6 months | Incorporate all resolved issues and release new versions. |
| No Preserve | 0-3 months | Incorporate all resolved issues; there is no dedicated maintenance team and no plan to release a new version. |
| End of Life (EOL) | N/A | The branch is closed and no longer accepts any modifications. |

Maintenance policy for released MindSpore Transformers versions:

| MindSpore Transformers Version | Corresponding Label | Current Status | Release Date | Subsequent Status | EOL Date |
|---|---|---|---|---|---|
| 1.5.0 | v1.5.0 | Preserve | 2025/04/29 | No Preserve expected from 2025/10/29 | 2026/01/29 |
| 1.3.2 | v1.3.2 | Preserve | 2024/12/20 | No Preserve expected from 2025/06/20 | 2025/09/20 |
| 1.2.0 | v1.2.0 | End of Life | 2024/07/12 | - | 2025/04/12 |
| 1.1.0 | v1.1.0 | End of Life | 2024/04/15 | - | 2025/01/15 |

5. Disclaimer

  1. The contents of the scripts/examples directory are provided as references only and do not form part of the commercially released products. If you need to use them, you are responsible for transforming them into products suitable for commercial use and for ensuring security protection. MindSpore assumes no responsibility for security problems that arise as a result.
  2. Regarding datasets, MindSpore Transformers only suggests datasets that can be used for training; it does not provide any datasets. If you use these datasets for training, please comply with their licenses; MindSpore Transformers is not responsible for any infringement disputes arising from their use.
  3. If you do not want your dataset to be mentioned in MindSpore Transformers, or if you want the description of your dataset updated, please submit an issue on Gitee, and we will remove or update the description according to your request. We sincerely appreciate your understanding of and contribution to MindSpore Transformers.

6. Contribution

We welcome contributions to the community. For details, see MindSpore Transformers Contribution Guidelines.

7. License

Apache 2.0 License
