Description
🔎 Search before asking
- I have searched the PaddleOCR Docs and found no similar bug report.
- I have searched the PaddleOCR Issues and found no similar bug report.
- I have searched the PaddleOCR Discussions and found no similar bug report.
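🐛 Bug (Problem description)
Table recognition on this PDF is very poor. The text in the PDF is quite small and only becomes legible after zooming in.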
🏃‍♂️ Environment (Runtime environment)
docker-compose
services:
  paddleocr-vl-api:
    image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-vl:${API_IMAGE_TAG_SUFFIX}
    container_name: paddleocr-vl-api
    volumes:
      - ./pipeline_config_${VLM_BACKEND}.yaml:/home/paddleocr/pipeline_config_${VLM_BACKEND}.yaml
    ports:
      - 8097:8080
    depends_on:
      paddleocr-vlm-server:
        condition: service_healthy
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]
    # TODO: Allow using a regular user
    user: root
    restart: unless-stopped
    environment:
      - VLM_BACKEND=${VLM_BACKEND:-vllm}
    command: /bin/bash -c "paddlex --serve --pipeline /home/paddleocr/pipeline_config_${VLM_BACKEND}.yaml"
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
  paddleocr-vlm-server:
    image: ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/paddleocr-genai-${VLM_BACKEND}-server:${VLM_IMAGE_TAG_SUFFIX}
    container_name: paddleocr-vlm-server
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]
              capabilities: [gpu]
    # TODO: Allow using a regular user
    user: root
    restart: unless-stopped
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:8080/health || exit 1"]
      start_period: 300s
🌰 Minimal Reproducible Example (minimal demo that reproduces the problem)
http://10.157.189.148:8097/layout-parsing
{
  "file": "https://s3webform.efoxconn.com:8080/aiimg/ectp/file/20260108026.pdf",
  "fileType": 0,
  "minPixels": 12400000
}
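For anyone who wants to replay the request above from a script, here is a minimal sketch using only the Python standard library. The endpoint URL and the payload keys (`file`, `fileType`, `minPixels`) are copied from the report. The HTTP method (POST), the `Content-Type` header, the assumption that the response body is JSON, and the helper names `build_request` and `send` are my own assumptions and are not confirmed anywhere in this issue.

```python
import json
import urllib.request

# Endpoint copied from the report; replace with your own host and port.
ENDPOINT = "http://10.157.189.148:8097/layout-parsing"


def build_request(file_url, min_pixels=None, endpoint=ENDPOINT):
    """Build the layout-parsing request (hypothetical helper, not part of PaddleOCR)."""
    # Payload keys and the value fileType=0 are taken verbatim from the report.
    payload = {"file": file_url, "fileType": 0}
    if min_pixels is not None:
        payload["minPixels"] = min_pixels
    body = json.dumps(payload).encode("utf-8")
    # Assumption: the service expects a JSON body sent via POST.
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=600):
    """Send the request and decode the response, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send(build_request("https://s3webform.efoxconn.com:8080/aiimg/ectp/file/20260108026.pdf", min_pixels=12400000))` should reproduce the call from the report, provided the server is reachable from where the script runs.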
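The output is especially poor. My guess is that each page of the PDF is very large: when I open the file myself I have to zoom in many times before the text is legible, and the recognition step does not seem to zoom in before processing the page.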
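To illustrate the hypothesis in the remark above, here is a rough arithmetic sketch. It assumes a page is rendered to a fixed total number of pixels, and is not a statement about how PaddleOCR-VL actually rasterizes PDFs or what `minPixels` controls. Under that assumption, glyph height in pixels shrinks in proportion to the page's linear size. The function name and every number in the example are hypothetical.

```python
import math


def glyph_height_px(page_w_pt, page_h_pt, font_pt, pixel_budget):
    """Approximate glyph height in pixels for a page rendered to a fixed pixel budget.

    Hypothetical helper for illustration only.
    """
    # Uniform scale (pixels per point) such that rendered width * height == pixel_budget.
    scale = math.sqrt(pixel_budget / (page_w_pt * page_h_pt))
    return font_pt * scale


# Hypothetical comparison: an A4-sized page versus a page 4x larger in each dimension,
# both carrying 10 pt text and both rendered to the same pixel budget.
a4 = glyph_height_px(595, 842, font_pt=10, pixel_budget=4_000_000)
large = glyph_height_px(595 * 4, 842 * 4, font_pt=10, pixel_budget=4_000_000)
print(f"A4: {a4:.1f} px, large page: {large:.1f} px")
```

With the same budget, text on the larger page ends up one quarter the pixel height, which would be consistent with small text becoming hard to recognize on oversized pages.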