Problems recognizing formulas and the text near them #170

Open
Pekary opened this issue Nov 5, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@Pekary
Pekary commented Nov 5, 2024

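Description of the bug

  1. Some multi-line formulas cannot be parsed; see the "V(2, 2, 2)=..." part of test_1.pdf.
    test_1.pdf

  2. Some of the text following a formula is not parsed; see the text beginning with "where" just before "Optimal Value Function" in test_2.pdf.
    test_2.pdf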

How to reproduce the bug

magic-pdf -p test_1.pdf -o output/
magic-pdf -p test_2.pdf -o output/

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.9.0

Device mode

cuda

@Pekary Pekary added the bug Something isn't working label Nov 5, 2024
@myhloli myhloli transferred this issue from opendatalab/MinerU Nov 5, 2024
@myhloli
Collaborator

myhloli commented Nov 5, 2024

1. Formula recognition result:

\begin{array}{l}{{V(2,2,2)=\mathrm{min}\left(V(2,3,2)+(p_{22}+s_{22})w_{22}-s_{22}w_{22},\right.}}\\ {{\ }}\\ {{V(2,3,1)+(p_{22}+s_{21})w_{22}-s_{21}w_{22}\right)}}\\ {{\ }}\\ {{=\mathrm{min}\ \left(0+(1+0)1-0\times1,\ \ 0+(1+2)1-2\times1\right)\ =\ 1}}\end{array}

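The output contains one \right too many (the \right) that ends the V(2,3,1) row has no matching \left), which is a LaTeX syntax error.
2. The MFD (formula detection) model did not detect this formula.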
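
For anyone who wants to catch this kind of failure in recognized output, the extra \right described in point 1 can be found with a small delimiter check. The sketch below is only an illustration and is not part of magic-pdf or its formula recognition model; the function name `unmatched_delimiters` is hypothetical. It pairs `\left` and `\right` tokens in order and ignores brace groups and array cell boundaries, so it is a rough check rather than a full LaTeX parser.

```python
import re

# Match \left or \right as whole control words (not \leftarrow, \rightarrow).
DELIM = re.compile(r"\\(left|right)(?![A-Za-z])")


def unmatched_delimiters(latex: str) -> list[tuple[str, int]]:
    r"""Return (kind, offset) for each \left or \right without a partner."""
    open_lefts = []  # offsets of \left tokens still waiting for a \right
    unmatched = []
    for m in DELIM.finditer(latex):
        if m.group(1) == "left":
            open_lefts.append(m.start())
        elif open_lefts:
            open_lefts.pop()  # this \right closes the most recent \left
        else:
            unmatched.append(("right", m.start()))  # \right with nothing open
    unmatched.extend(("left", pos) for pos in open_lefts)
    return sorted(unmatched, key=lambda item: item[1])


if __name__ == "__main__":
    # Shortened stand-in for the recognized formula: the second \right is extra.
    sample = r"\left( a + b \right. \\ c \right)"
    for kind, offset in unmatched_delimiters(sample):
        print(f"unmatched \\{kind} at offset {offset}")
```

Running the same function on the full recognized string above reports a single unmatched \right, the one that ends the V(2,3,1) row.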
