
Commit b08765a

Merge pull request #282 from FlagAI-Open/master
update master into gpm_dev
2 parents 1aef572 + 590d178 commit b08765a

69 files changed (+734 / -432 lines)


.gitignore

Lines changed: 2 additions & 1 deletion
@@ -27,8 +27,9 @@ datasets
 qqp
 glm_large_qqp_pytorch
 wandb
+clip_benchmark_datasets
 examples/AltCLIP/clip_benchmark_datasets
 examples/glm_pretrain/data.lazy
 examples/glm_pretrain/examples/glm_pretrain/data.lazy
 examples/vit_cifar100/cifar100
-examples/vit_cifar100/data
+examples/vit_cifar100/data

README.md

Lines changed: 3 additions & 2 deletions
@@ -15,11 +15,12 @@ FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use and extensibl
 
 * These models can be applied to (Chinese/English) Text, for tasks like text classification, information extraction, question answering, summarization, and text generation.
 
-* FlagAI is backed by the three most popular data/model parallel libraries — [PyTorch](https://pytorch.org/)/[Deepspeed](https://www.deepspeed.ai/)/[Megatron-LM](https://github.com/NVIDIA/Megatron-LM)/[BMTrain](https://github.com/OpenBMB/BMTrain) — with seamless integration between them. Users can parallel their training/testing process with less than ten lines of code.
+* FlagAI is backed by the four most popular data/model parallel libraries — [PyTorch](https://pytorch.org/)/[Deepspeed](https://www.deepspeed.ai/)/[Megatron-LM](https://github.com/NVIDIA/Megatron-LM)/[BMTrain](https://github.com/OpenBMB/BMTrain) — with seamless integration between them. Users can parallel their training/testing process with less than ten lines of code.
 
 The code is partially based on [GLM](https://github.com/THUDM/GLM), [Transformers](https://github.com/huggingface/transformers)[timm](https://github.com/rwightman/pytorch-image-models) and [DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/Megatron-LM).
 
 ## News
+- [2 Mar 2023] release v1.6.1, Support Galactica model [#234](https://github.com/FlagAI-Open/FlagAI/pull/234); BMInf, a low-resource inference package [#238](https://github.com/FlagAI-Open/FlagAI/pull/238), and examples for p-tuning [#227](https://github.com/FlagAI-Open/FlagAI/pull/238)
 - [12 Jan 2023] release v1.6.0, support a new parallel lib called [**BMTrain**](https://github.com/OpenBMB/BMTrain) and integate [**Flash Attention**](https://github.com/HazyResearch/flash-attention) to speedup training of Bert and Vit models, examples in [FlashAttentionBERT](https://github.com/FlagAI-Open/FlagAI/blob/master/examples/bert_title_generation_english/train_flash_atten.py) and [FlashAttentionViT](https://github.com/FlagAI-Open/FlagAI/blob/master/examples/vit_cifar100/train_single_gpu_flash_atten.py). Also add the contrastive search based text generation method [**SimCTG**](https://github.com/yxuansu/SimCTG) and DreamBooth finetuning based on AltDiffusion, examples in [AltDiffusionNaruto](https://github.com/FlagAI-Open/FlagAI/blob/master/examples/AltDiffusion/dreambooth.py).
 - [28 Nov 2022] release v1.5.0, support 1.1B [**EVA-CLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/EVA_CLIP) and [ALM: A large Arabic Language Model based on GLM], examples in [**ALM**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/ALM)
 - [10 Nov 2022] release v1.4.0, support [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679v1), examples in [**AltCLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltCLIP) and [**AltDiffusion**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltDiffusion)
@@ -259,6 +260,6 @@ The majority of FlagAI is licensed under the [Apache 2.0 license](LICENSE), howe
 ### ↳ Star History
 <div align="center">
 
-[![Star History Chart](https://api.star-history.com/svg?repos=FlagAI-Open/FlagAI&type=Date)](https://star-history.com/#baaivision/EVA&Date)
+![Star History Chart](https://api.star-history.com/svg?repos=FlagAI-Open/FlagAI&type=Date)]
 
 </div>

README_zh.md

Lines changed: 2 additions & 1 deletion
@@ -15,12 +15,13 @@
 
 * These models can be applied to text (especially Chinese) for tasks such as text classification, information extraction, question answering, summarization, and text generation.
 
-* FlagAI is backed by the three most popular data/model parallel libraries ([PyTorch](https://pytorch.org/)/[Deepspeed](https://www.deepspeed.ai/)/[Megatron-LM](https://github.com/NVIDIA/Megatron-LM)/[BMTrain](https://github.com/OpenBMB/BMTrain)) with seamless integration between them. You can parallelize your training/testing process with less than ten lines of code.
+* FlagAI is backed by the four most popular data/model parallel libraries ([PyTorch](https://pytorch.org/)/[Deepspeed](https://www.deepspeed.ai/)/[Megatron-LM](https://github.com/NVIDIA/Megatron-LM)/[BMTrain](https://github.com/OpenBMB/BMTrain)) with seamless integration between them. You can parallelize your training/testing process with less than ten lines of code.
 
 
 Part of this project's code is based on [GLM](https://github.com/THUDM/GLM), [Transformers](https://github.com/huggingface/transformers), [timm](https://github.com/rwightman/pytorch-image-models) and [DeepSpeedExamples](https://github.com/microsoft/DeepSpeedExamples/tree/master/Megatron-LM).
 
 ## News
+- [2 Mar 2023] Release v1.6.1: add the Galactica model [#234](https://github.com/FlagAI-Open/FlagAI/pull/234), BMInf, a low-resource toolkit for large-model inference [#238](https://github.com/FlagAI-Open/FlagAI/pull/238), and P-tuning examples [#227](https://github.com/FlagAI-Open/FlagAI/pull/238)
 - [12 Jan 2023] Release v1.6.0: add support for the parallel training library [**BMTrain**](https://github.com/OpenBMB/BMTrain) and integrate [**Flash Attention**](https://github.com/HazyResearch/flash-attention) into the Bert and Vit models to speed up end-to-end training, see [FlashAttentionBERT](https://github.com/FlagAI-Open/FlagAI/blob/master/examples/bert_title_generation_english/train_flash_atten.py) and [FlashAttentionViT](https://github.com/FlagAI-Open/FlagAI/blob/master/examples/vit_cifar100/train_single_gpu_flash_atten.py). Also add the contrastive-search text generation method [**SimCTG**](https://github.com/yxuansu/SimCTG) and DreamBooth personalized fine-tuning based on AltDiffusion, see [AltDiffusionNaruto](https://github.com/FlagAI-Open/FlagAI/blob/master/examples/AltDiffusion/dreambooth.py).
 - [28 Nov 2022] Release v1.5.0: support the 1.1B-parameter [**EVA-CLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/EVA_CLIP) and [ALM: a large Arabic language model based on GLM], see [**ALM**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/ALM)
 - [10 Nov 2022] Release v1.4.0: support [AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities](https://arxiv.org/abs/2211.06679v1), see [**AltCLIP**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltCLIP) and [**AltDiffusion**](https://github.com/FlagAI-Open/FlagAI/tree/master/examples/AltDiffusion)

doc_zh/TUTORIAL_15_BERT_EXAMPLE_TITLE_GENERATION.md

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@
 ### 1. Data loading
 The sample data is located in /examples/bert_title_generation/data/
 
-The data reading procedure needs to be defined in ```trianer.py```, for example:
+The data reading procedure needs to be defined in ```trainer.py```, for example:
 ```python
 def read_file():
     src = []

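For orientation, a minimal sketch of what such a `read_file` helper could look like follows; the file path and the tab-separated source/target format are assumptions for illustration only, not the actual layout of the repository's sample data:

```python
def read_file():
    # Hypothetical sketch: collect parallel source/target lists for title
    # generation from a tab-separated file (path and format are assumed).
    src, tgt = [], []
    with open("data/train.tsv", encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) == 2:
                src.append(parts[0])
                tgt.append(parts[1])
    return src, tgt
```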
doc_zh/TUTORIAL_21_OPTIMIZER.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+# How to use the optimizer
+
+## What is an optimizer?
+In the context of machine learning and deep learning,
+an optimizer is an algorithm or method used to update the parameters of a model in order to minimize the error between the predicted output and the actual output.
+
+The goal of an optimizer is to find the optimal set of parameters that achieves the best performance on a given task.
+This process is typically performed during the training phase of a machine learning model.
+
+An optimizer computes the gradients of the loss function with respect to the model parameters and uses this information to update the parameters so as to reduce the loss.
+There are many optimization algorithms available, such as stochastic gradient descent (SGD), Adagrad, Adam, RMSprop and more, each with its own advantages and disadvantages.
+
+The choice of optimizer depends on the specific problem, the size of the dataset, the complexity of the model, and other factors.
+A good optimizer can significantly improve the training speed and accuracy of a model.
+
+
+
+
+## Loading the optimizer
+
+### Dependencies
+#### adan
+```
+python3 -m pip install git+https://github.com/sail-sg/Adan.git
+```
+#### lion
+```
+$ pip install lion-pytorch
+```
+#### lamb
+```
+$ pip install torch_optimizer
+```
+#### Example
+```python
+>>> # currently FlagAI support adam, adamw, lion, adan, adafactor and lamb, which can be defined by setting optimizer_type when defining Trainer
+>>> trainer = Trainer(env_type='pytorch',
+>>>                   epochs=1,
+>>>                   batch_size=2,
+>>>                   eval_interval=100,
+>>>                   log_interval=10,
+>>>                   experiment_name='glm_large_bmtrain',
+>>>                   pytorch_device='cuda',
+>>>                   load_dir=None,
+>>>                   lr=1e-4,
+>>>                   num_gpus = 1,
+>>>                   weight_decay=1e-2,
+>>>                   save_interval=1000,
+>>>                   hostfile='./hostfile',
+>>>                   training_script=__file__,
+>>>                   deepspeed_config='./deepspeed.json',
+>>>                   optimizer_type='lion') #load optimizer
+```
+

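The description in the tutorial above (compute the gradients of the loss with respect to the parameters, then update the parameters to reduce the loss) can be made concrete with a few lines of plain PyTorch. This sketch is purely illustrative and independent of FlagAI's Trainer:

```python
import torch

# Toy setup: fit y = 2x with a single linear layer.
model = torch.nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.tensor([[1.0], [2.0], [3.0]])
y = 2 * x

for _ in range(100):
    optimizer.zero_grad()                              # clear old gradients
    loss = torch.nn.functional.mse_loss(model(x), y)   # measure prediction error
    loss.backward()                                    # gradients of the loss w.r.t. parameters
    optimizer.step()                                   # update parameters to reduce the loss
```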
doc_zh/TUTORIAL_3_MODEL.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@
 ## From_pretrain
 
 The `From_pretrain` function is used to load models. Models sharing the same architecture can be loaded with the same class; for example, the `BERT-base-ch` and `Roberta-base-ch` models can both be loaded with the `BertModel` class. `From_pretrain` is optimized for loading models under data/model parallelism, avoiding the wasted resources of repeated downloads.
-Load a model by calling `ClassName.from_pretrian()`.
+Load a model by calling `ClassName.from_pretrain()`.
 ### Loading from the modelhub
 We now support downloading [commonly used models](#所有支持模型) from the modelhub; `from_pretrain` can directly download the model configuration file `config.json`, the model weights `pytorch_model.bin`, and the dictionary file `vocab.txt`. Example:
 ```python

docs/TUTORIAL_21_OPTIMIZER.md

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+# How to use Optimizer
+
+## What is Optimizer?
+In the context of machine learning and deep learning,
+an optimizer is an algorithm or method used to update the parameters of a model in order to minimize the error between the predicted output and the actual output.
+
+The goal of an optimizer is to find the optimal set of parameters that can achieve the best performance on a given task.
+This process is typically performed during the training phase of a machine learning model.
+
+Optimizers work by computing the gradients of the loss function with respect to the model parameters,
+and using this information to update the parameters in the direction that reduces the loss.
+There are various optimization algorithms available,
+such as stochastic gradient descent (SGD), Adagrad, Adam, RMSprop, and more, each with their own advantages and disadvantages.
+
+The choice of optimizer depends on the specific problem, the size of the dataset,
+the complexity of the model, and other factors.
+A good optimizer can significantly improve the training speed and accuracy of a model.
+
+
+
+
+## Loading optimizer
+
+### dependencies
+#### adan
+```
+python3 -m pip install git+https://github.com/sail-sg/Adan.git
+```
+#### lion
+```
+$ pip install lion-pytorch
+```
+#### lamb
+```
+$ pip install torch_optimizer
+```
+#### example
+```python
+>>> # currently FlagAI support adam, adamw, lion, adan, adafactor and lamb, which can be defined by setting optimizer_type when defining Trainer
+>>> trainer = Trainer(env_type='pytorch',
+>>>                   epochs=1,
+>>>                   batch_size=2,
+>>>                   eval_interval=100,
+>>>                   log_interval=10,
+>>>                   experiment_name='glm_large_bmtrain',
+>>>                   pytorch_device='cuda',
+>>>                   load_dir=None,
+>>>                   lr=1e-4,
+>>>                   num_gpus = 1,
+>>>                   weight_decay=1e-2,
+>>>                   save_interval=1000,
+>>>                   hostfile='./hostfile',
+>>>                   training_script=__file__,
+>>>                   deepspeed_config='./deepspeed.json',
+>>>                   optimizer_type='lion') #load optimizer
+```
+

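As a side note, the packages installed above can also be used directly in a plain PyTorch training loop, outside FlagAI's Trainer. A minimal sketch, assuming lion-pytorch and torch_optimizer expose the Lion and Lamb classes as their documentation describes; the model and hyperparameters are placeholders:

```python
import torch
from lion_pytorch import Lion     # installed via `pip install lion-pytorch`
import torch_optimizer            # installed via `pip install torch_optimizer`

model = torch.nn.Linear(16, 4)    # stand-in for a real model

# Lion optimizer from the lion-pytorch package
lion_opt = Lion(model.parameters(), lr=1e-4, weight_decay=1e-2)

# LAMB optimizer from the torch-optimizer package
lamb_opt = torch_optimizer.Lamb(model.parameters(), lr=1e-3)
```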
docs/TUTORIAL_3_MODEL.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ All supported models now support the three most common model types [encoder, dec
 
 ### load model from modelhub
 
-By calling `ClassName.from_pretrian()` to load following [supported models](#all-supported-models), it will automatically download the model configuration file `config.json`, model weights `pytorch_model.bin`, and dictionary files `vocab .txt`.
+By calling `ClassName.from_pretrain()` to load following [supported models](#all-supported-models), it will automatically download the model configuration file `config.json`, model weights `pytorch_model.bin`, and dictionary files `vocab .txt`.
 
 ```python
 >>> # Downloading GLM-large-ch from modelhub

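As a rough illustration of the call described above, here is a sketch of loading GLM-large-ch with `from_pretrain`; the class and argument names follow FlagAI's examples but may differ between versions, so treat them as assumptions:

```python
# Assumed usage: from_pretrain downloads config.json, pytorch_model.bin and
# vocab.txt into download_path on first use, then loads from the local copy.
from flagai.model.glm_model import GLMModel

model = GLMModel.from_pretrain(model_name="GLM-large-ch",
                               download_path="./checkpoints/")
```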
examples/bert_title_generation_english/generate.py

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@
 
 device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 
-model_dir = "../state_dict/"
+model_dir = "./checkpoints/"
 
 # Note "./checkpoints_seq2seq/{}/mp_rank_00_model_states.pt", {} is a directory in the checkpoints_seq2seq.
 model_save_path = "./checkpoints_seq2seq/7079/mp_rank_00_model_states.pt"

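For context, loading the checkpoint referenced above usually looks like the following sketch; the 'module' key reflects the DeepSpeed-style layout implied by mp_rank_00_model_states.pt, and the model object is assumed to have been constructed earlier in the script:

```python
# Hypothetical continuation of the script above (model built elsewhere).
checkpoint = torch.load(model_save_path, map_location=device)
state_dict = checkpoint.get("module", checkpoint)   # unwrap DeepSpeed-style dicts
model.load_state_dict(state_dict, strict=False)
model.to(device).eval()
```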
examples/bert_title_generation_english/train.py

Lines changed: 0 additions & 1 deletion
@@ -1,7 +1,6 @@
 # Copyright © 2022 BAAI. All rights reserved.
 #
 # Licensed under the Apache License, Version 2.0 (the "License")
-import sys
 import os
 import torch
 from torch.utils.data import Dataset
