[Bug]: [Urgent bug, awaiting fix] doc_parser.parse() returns an empty array when converting the model with label_studio #6724

Open
1 task done
Viserion-nlper opened this issue Aug 15, 2023 · 1 comment
Assignees
Labels
bug Something isn't working triage

Comments

@Viserion-nlper

Software environment

- paddlepaddle: x
- paddlepaddle-gpu: 2.5.1
- paddlenlp: 2.5.2
- paddleocr: 2.6.1.3

Duplicate issues

  • I have searched the existing issues

Error description

Same problem as #6598.

When running label_studio.py for uie-x data preprocessing, nothing is returned after paddleocr finishes parsing, and train.txt and test.txt have no content.

Steps to reproduce & code

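This reproduces consistently.
At line 378 of paddlenlp.utils.tools.DataConverter:
https://github.com/PaddlePaddle/PaddleNLP/blob/bc8df6ef875dab07862282c9d0ad22996c71a9e9/paddlenlp/utils/tools.py#L378C9-L378C93
the layout in the value returned by DocParser.parse() is empty:
parsed_doc["layout"] has no content and is an empty array.
As a result, nothing is output in the later steps. Where is the problem?
Did paddleocr fail to parse the document?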
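
To make the symptom above easier to confirm, here is a minimal sketch of a check on the parsed result. It assumes only what is stated in this report, namely that the result is a dict with a "layout" entry; the helper name `diagnose_parsed_doc` and the sample dicts are hypothetical and are not part of PaddleNLP.

```python
from typing import Any, Dict, List


def diagnose_parsed_doc(parsed_doc: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a DocParser-style result dict.

    Hypothetical helper: it only inspects the "layout" entry described
    in this issue and makes no other assumption about the result.
    """
    problems: List[str] = []
    layout = parsed_doc.get("layout")
    if layout is None:
        problems.append("parsed_doc has no 'layout' key")
    elif len(layout) == 0:
        problems.append("parsed_doc['layout'] is empty")
    return problems


# Hypothetical stand-in for the value observed at tools.py line 378.
empty_result = {"layout": []}
for problem in diagnose_parsed_doc(empty_result):
    print("WARNING:", problem)
```

Calling this on the value returned at line 378 for each image would show whether every document comes back with an empty layout or only some of them, which helps narrow down whether OCR itself is producing no segments.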

@Viserion-nlper Viserion-nlper added the bug Something isn't working label Aug 15, 2023
@Viserion-nlper
Author

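Following the documentation and using the provided dataset, the data processing result is 0 when running layout analysis.
Please fix this bug as soon as possible. Thanks.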
