Support 5.2 bloom #7846
Conversation
Thanks for your contribution!
Codecov Report

Additional details and impacted files:

@@            Coverage Diff             @@
##           develop    #7846      +/-   ##
===========================================
- Coverage    56.95%   56.90%    -0.05%
===========================================
  Files          587      587
  Lines        88628    88724       +96
===========================================
+ Hits         50480    50492       +12
- Misses       38148    38232       +84

☔ View full report in Codecov by Sentry.
Force-pushed from 99d897b to b60ea16.
The PR quality is very high. Apart from one small comment, I have just two points:
- Add the unit tests, following the llama inference 5.2 cases in tests/llm/test_predictor.py.
- Consider whether the scripts above should also be added to llm/docs/inference.md.
llm/predictor.py (outdated)

    )
    if predictor_args.block_attn:
        from paddlenlp.experimental.transformers import (
            BlommForCausalBlockLMInferenceModel as Model,
Could you just follow the llama block_attn style here and write it directly as `as BloomInferenceModel`?
Done, thanks for the review.
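For reference, a minimal sketch of what the renamed import might look like, following the llama block_attn pattern the reviewer points to. This is a fragment of predictor.py, not standalone code, and the final class name is an assumption inferred from the outdated diff above; the merged code may differ.

    )
    if predictor_args.block_attn:
        # Assumed final class name, aliased as BloomInferenceModel per the
        # review suggestion; sketch only, may differ from the merged code.
        from paddlenlp.experimental.transformers import (
            BloomForCausalLMBlockInferenceModel as BloomInferenceModel,
        )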
llm/predictor.py (outdated)

        config.max_seq_len = predictor_args.total_max_length
    else:
        from paddlenlp.experimental.transformers import (
            BloomForCausalLMInferenceModel as Model,
Same as above.
Done, thanks for the review.
The unit tests have been added; a sketch of the shape they take is below. I think this command does not need to go into the docs, since it is similar to the commands already there. Thanks for the review.
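A hedged sketch of what the added test might look like, modeled on the llama inference 5.2 cases in tests/llm/test_predictor.py. The class name, model id, and test body here are assumptions for illustration, not the actual test merged in this PR.

    # Hypothetical sketch only, not the test added in this PR. Modeled on the
    # llama inference 5.2 cases in tests/llm/test_predictor.py; the model id
    # and class name are assumptions.
    import unittest


    class BloomBlockAttnInferenceTest(unittest.TestCase):
        model_name_or_path = "bigscience/bloom-560m"  # assumed small checkpoint

        @unittest.skip("sketch only")
        def test_block_attn_matches_baseline(self):
            # Run the predictor twice, with and without --block_attn, and check
            # that the generated texts match: block attention should change the
            # KV-cache layout, not the outputs.
            pass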
LGTM
PR types
New features: support 5.2 inference for Bloom-family models.
PR changes
Description
# Dynamic-graph inference with block attention
python3.8 predictor.py --model_name_or_path /root/.paddlenlp/models/bigscience/bloom-7b1/ --dtype float16 --src_length 102 --max_length 1024 --block_attn --batch_size 2 --inference_model > dynamic_2.txt

# Dynamic-graph inference with weight-only int8 quantization
python3.8 predictor.py --model_name_or_path /root/.paddlenlp/models/bigscience/bloom-7b1/ --dtype float16 --src_length 102 --max_length 1024 --block_attn --batch_size 2 --inference_model --quant_type weight_only_int8

# Export a static-graph model, then run static-graph inference
python3.8 export_model.py --model_name_or_path /root/.paddlenlp/models/bigscience/bloom-7b1/ --inference_model --output_path ./inference --dtype float16 --block_attn --quant_type weight_only_int8
python3.8 predictor.py --model_name_or_path ./inference --inference_model --dtype "float16" --mode "static" --batch_size 2 --block_attn

# Export and run a weight-only int8 static-graph model
python3.8 export_model.py --model_name_or_path /root/.paddlenlp/models/bigscience/bloom-7b1/ --inference_model --output_path ./inference_wint8 --dtype float16 --block_attn --quant_type weight_only_int8
python3.8 predictor.py --model_name_or_path ./inference_wint8 --inference_model --dtype "float16" --batch_size 2 --mode "static" --quant_type weight_only_int8 --block_attn