What are the memory requirements for the SER+RE task in PPStructure? #8602

tanjh · 2022-12-12T06:17:19Z

Please provide the following information to quickly locate the problem

System Environment: Linux debian 3.10.0-1127.el7.x86_64 #1
Version: Paddle: 2.4.0, PaddleOCR: Release 2.6
Related components: ppstructure
Command Code:

python3 predict_system.py \
  --kie_algorithm=LayoutXLM \
  --re_model_dir=./inference/re_vi_layoutxlm_xfund_infer \
  --ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer \
  --image_dir=../pg/HJ0332.jpg \
  --ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt \
  --vis_font_path=../doc/fonts/simfang.ttf \
  --ocr_order_method="tb-yx" \
  --mode=kie

Complete Error Message:
root@a1793fdcfb9f:/test_ppocr/PaddleOCR/ppstructure# python3 predict_system.py \
  --kie_algorithm=LayoutXLM \
  --re_model_dir=./inference/re_vi_layoutxlm_xfund_infer \
  --ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer \
  --image_dir=../pg/HJ0332.jpg \
  --ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt \
  --vis_font_path=../doc/fonts/simfang.ttf \
  --ocr_order_method="tb-yx" \
  --mode=kie

[2022-12-12 01:45:13,606] [ INFO] - Already cached /root/.paddlenlp/models/layoutxlm-base-uncased/sentencepiece.bpe.model
[2022-12-12 01:45:14,234] [ INFO] - tokenizer config file saved in /root/.paddlenlp/models/layoutxlm-base-uncased/tokenizer_config.json
[2022-12-12 01:45:14,234] [ INFO] - Special tokens file saved in /root/.paddlenlp/models/layoutxlm-base-uncased/special_tokens_map.json
E1212 01:45:14.372604 768 analysis_config.cc:96] Please compile with gpu to EnableGpu()
E1212 01:45:21.777576 768 analysis_config.cc:96] Please compile with gpu to EnableGpu()
[2022/12/12 01:45:36] ppocr INFO: [0/1] ../pg/HJ0332.jpg

Socket error Event: 32 Error: 10053.
Connection closing...Socket close.

Connection closed by foreign host.

Disconnected from remote host(192.168.1.220) at 10:10:21.

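Background: I am trying to use the SER+RE pipeline in PaddleOCR's PPStructure to recognize text in an image and extract key information.
Starting from the python 3.7-slim image, I set up a paddlehub and PaddleOCR environment, downloaded the ser_vi_layoutxlm_xfund_infer.tar and re_vi_layoutxlm_xfund_infer.tar models, and built an image from that. I ran the image on a 4-core, 8 GB server (4.71 GB of memory available). After running the command above, memory usage spiked and the server froze; the only way to recover was to reboot it.
HJ0332.jpg is 397571 bytes.
Symptom: memory usage spikes and the server freezes; even ssh becomes unusable and input gets no response.

Question: what are the memory requirements of ppstructure's SER+RE when it is used for key information extraction?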
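One way to keep a run like this from taking the whole server down is to start it with a cap on its address space, so that an oversized allocation fails inside the process instead of exhausting host memory. The following is a minimal sketch using only the Python standard library on Linux. The helper name `run_with_memory_cap` and the 4 GiB figure are illustrative assumptions, not anything PaddleOCR provides, and `RLIMIT_AS` limits virtual address space, which can be noticeably larger than resident memory for a deep-learning runtime, so the cap may need tuning.

```python
import resource
import subprocess
import sys

# The KIE command from this thread, as an argument list (not executed here).
KIE_CMD = [
    sys.executable, "predict_system.py",
    "--kie_algorithm=LayoutXLM",
    "--re_model_dir=./inference/re_vi_layoutxlm_xfund_infer",
    "--ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer",
    "--image_dir=../pg/HJ0332.jpg",
    "--ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt",
    "--vis_font_path=../doc/fonts/simfang.ttf",
    "--ocr_order_method=tb-yx",
    "--mode=kie",
]


def run_with_memory_cap(cmd, max_bytes):
    """Run cmd with a soft cap on its address space; return its exit code.

    Hypothetical helper for illustration only.
    """
    def limit_address_space():
        # Runs in the child just before exec. Keep the existing hard limit
        # and only lower the soft limit, never exceeding the hard one.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        if hard == resource.RLIM_INFINITY:
            soft = max_bytes
        else:
            soft = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(cmd, preexec_fn=limit_address_space).returncode


# Example (assumed cap of 4 GiB, run from the ppstructure directory):
# exit_code = run_with_memory_cap(KIE_CMD, 4 * 1024**3)
```

If the process hits the cap it should exit with an error rather than freeze the machine, which at least leaves ssh usable for debugging.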


tanjh · 2022-12-12T06:18:22Z

tanjh · 2022-12-12T06:27:47Z

I have now run the SER+RE command from https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/docs/inference.md directly:
python3 predict_system.py \
  --kie_algorithm=LayoutXLM \
  --re_model_dir=./inference/re_vi_layoutxlm_xfund_infer \
  --ser_model_dir=./inference/ser_vi_layoutxlm_xfund_infer \
  --image_dir=./docs/kie/input/zh_val_42.jpg \
  --ser_dict_path=../ppocr/utils/dict/kie_dict/xfund_class_list.txt \
  --vis_font_path=../doc/fonts/simfang.ttf \
  --ocr_order_method="tb-yx" \
  --mode=kie
On the server, both memory and CPU usage for this task are very high.

LDOUBLEV · 2022-12-13T01:04:08Z

I haven't paid much attention to the upper bound on memory. Memory usage depends on the input image: smaller images use less memory.
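If memory scales with the input image, one thing to try is shrinking the image before passing it to `predict_system.py`. The sketch below only computes the target dimensions that keep the aspect ratio while bounding the longest side. The function name `fit_within` and the 2000-pixel bound are assumptions chosen for illustration, and the actual resize would still have to be done with an image library such as OpenCV or Pillow.

```python
def fit_within(width, height, max_side):
    """Return (new_width, new_height) so the longest side is <= max_side.

    The aspect ratio is preserved and images that already fit are left
    unchanged. Hypothetical helper for illustration only.
    """
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    # Round to whole pixels and never return a zero-sized dimension.
    return max(1, round(width * scale)), max(1, round(height * scale))


# A hypothetical 4000x3000 scan bounded to 2000 pixels on its longest side.
print(fit_within(4000, 3000, 2000))
# An image that already fits is returned as-is.
print(fit_within(800, 600, 2000))
```

Downscaling too aggressively can hurt recognition of small text, so the bound is a trade-off between memory and accuracy.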

jingsongliujing · 2022-12-13T03:22:41Z

For GPU I'd suggest a V100; for CPU, 16 cores is enough.

tanjh · 2022-12-13T05:06:04Z

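I haven't paid much attention to the upper bound on memory. Memory usage depends on the input image: smaller images use less memory.

The images in this experiment were 1.4 MB and 3.9 MB. Do those count as large or small?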
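A note on "large" versus "small": the size of a JPEG file on disk says little about how much memory the decoded image needs, because that depends on pixel dimensions rather than compressed size. The sketch below shows the arithmetic for an uncompressed image. The 4000x3000 dimensions are hypothetical, since the thread does not give the pixel sizes of the test images, and the figures cover only the raw image buffer, not the model's intermediate tensors.

```python
def decoded_size_bytes(width, height, channels=3, bytes_per_value=1):
    """Bytes needed to hold an uncompressed image in memory."""
    return width * height * channels * bytes_per_value


# Hypothetical dimensions; the real images' sizes are not stated in the thread.
width, height = 4000, 3000

as_uint8 = decoded_size_bytes(width, height)                       # 8-bit RGB
as_float32 = decoded_size_bytes(width, height, bytes_per_value=4)  # float32 RGB

print(f"uint8 RGB:   {as_uint8 / 1024**2:.1f} MiB")
print(f"float32 RGB: {as_float32 / 1024**2:.1f} MiB")
```

So a file of a few megabytes can expand to tens or hundreds of megabytes once decoded, before any model activations are counted.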

tanjh · 2022-12-13T05:07:01Z

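For GPU I'd suggest a V100; for CPU, 16 cores is enough.

Leaving GPU out and considering only the CPU version, are there hard hardware requirements? If so, what are they?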

jingsongliujing · 2022-12-13T05:34:47Z

Just try it with the configuration of an everyday office PC.

tanjh · 2022-12-13T06:32:20Z

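Today I followed section 4.2 of https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/ppstructure/kie/README_ch.md and tested images of 389 KB and 1.8 MB. On CPU, the run ended up needing about 5 GB of memory before it produced the SER+RE result.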
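For anyone who wants to reproduce a figure like the 5 GB above, peak memory of a command can be measured from Python with the standard library. This is a minimal sketch assuming Linux, where `ru_maxrss` is reported in kilobytes. The helper name `peak_child_rss_mib` is hypothetical, and the value returned is the maximum over all child processes this interpreter has waited for so far, so it is best run from a fresh interpreter.

```python
import resource
import subprocess
import sys


def peak_child_rss_mib(cmd):
    """Run cmd to completion; return (exit_code, peak resident memory in MiB).

    Hypothetical helper for illustration only.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # ru_maxrss is in kilobytes on Linux and in bytes on macOS.
    unit = 1 if sys.platform == "darwin" else 1024
    return proc.returncode, usage.ru_maxrss * unit / 1024**2


# Example (run from the ppstructure directory, with the arguments used above):
# code, peak = peak_child_rss_mib([sys.executable, "predict_system.py", "--mode=kie", ...])
# print(f"exit={code} peak={peak:.0f} MiB")
```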

github-actions · 2023-07-09T02:16:56Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

paddle-bot bot assigned LDOUBLEV Dec 12, 2022

github-actions bot added the stale label Jul 9, 2023

github-actions bot closed this as completed Jul 16, 2023

paddle-bot bot added the status/close label Jul 16, 2023
