pdf to image rendering ignore optional content offs #3806

Open
PasaOpasen opened this issue Aug 26, 2024 · 6 comments
Labels
upstream bug bug outside this package

Comments

@PasaOpasen

Description of the bug

For some documents whose optional content layers are already disabled, the rendered pages still contain the content of those layers.

example link: https://dropmefiles.com/zTbp4

How to reproduce the bug

import fitz  # PyMuPDF
from PIL import Image

f = 'path/oc-for-ocr.pdf'
dpi = 150

doc = fitz.open(f)
print(doc.layer_ui_configs())  # shows that almost all layers except "Text" are off

# Render the first page at the requested DPI (PDF user space is 72 units per inch).
pix = doc[0].get_pixmap(matrix=fitz.Matrix(dpi / 72, dpi / 72))
img = Image.frombytes('RGB', (pix.width, pix.height), pix.samples)

img.show()  # displays an image that still contains content from the disabled layers
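
To make the report easier to verify, the layer state can also be listed explicitly before rendering. The following is only a minimal sketch: `off_layer_names` and `report_hidden_layers` are hypothetical helper names, and it assumes that the entries returned by `layer_ui_configs()` carry `text` and `on` keys and that `get_ocgs()` maps each OCG xref to a dictionary with `name` and `on`.

```python
def off_layer_names(ui_configs):
    """Return the display names of layer UI entries that are switched off.

    `ui_configs` is assumed to look like the list returned by
    Document.layer_ui_configs(): dicts with at least "text" and "on" keys.
    """
    return [cfg["text"] for cfg in ui_configs if not cfg["on"]]


def report_hidden_layers(path):
    """Print which layers the PDF itself marks as off (requires PyMuPDF)."""
    import fitz  # imported lazily so off_layer_names() works without PyMuPDF

    doc = fitz.open(path)
    print("off in layer UI:", off_layer_names(doc.layer_ui_configs()))
    # get_ocgs() is keyed by the xref of each optional content group.
    for xref, ocg in doc.get_ocgs().items():
        print(xref, ocg["name"], "on" if ocg["on"] else "off")
```

Calling `report_hidden_layers('path/oc-for-ocr.pdf')` prints the layers that should be absent from the rendered image, which can then be compared with what `img.show()` displays.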

PyMuPDF version

1.24.9

Operating system

Windows

Python version

3.8

@JorjMcKie
Collaborator

Cannot download from the supplied link. Please provide a working one.

@PasaOpasen
Author

@JorjMcKie I sent it to your email.

@JorjMcKie
Collaborator

This is an upstream error. Opening a MuPDF report. Is this a confidential file, or can I simply attach it here?

@JorjMcKie added the "upstream bug" label (bug outside this package) Aug 26, 2024
@PasaOpasen
Author

@JorjMcKie Sorry, it is confidential.

Is there any way to hotfix the problem, such as removing the content of the hidden layers?

@PasaOpasen
Author

pypdfium2 runs into the same problem on this document, but pdf2image and python-poppler render it correctly.
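
Until the upstream issue is resolved, rendering through pdf2image is one possible fallback, since it is reported above to handle this document. The sketch below is not a fix for PyMuPDF itself; `render_first_page_with_poppler` is a hypothetical helper name, and it assumes pdf2image is installed and can locate a Poppler installation (on Windows this may need extra setup).

```python
def render_first_page_with_poppler(path, dpi=150):
    """Render page 1 via pdf2image (a Poppler wrapper) and return a PIL image."""
    from pdf2image import convert_from_path  # requires Poppler to be installed

    # Restrict the conversion to the first page to match the PyMuPDF snippet.
    pages = convert_from_path(path, dpi=dpi, first_page=1, last_page=1)
    return pages[0]
```

With the file from the report, `render_first_page_with_poppler('path/oc-for-ocr.pdf').show()` would display the Poppler rendering for comparison with the PyMuPDF output.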

@JorjMcKie
Collaborator

Browsers give similarly mixed results: some show only the Text layer, while others also show other content.


2 participants