Rotation angle for non-native pages is wrong #4308
Description of the bug: I have a use case where I upload a PDF to a pipeline. If it is a native PDF, the rotation angle is detected correctly by PyMuPDF. But for non-native PDFs containing a rotated image, I get page.rotation as 0. Is this expected behaviour? If page.rotation were reported correctly, I could probably use set_rotation to set it to 0, but for a rotated non-native PDF I get page.rotation as 0. cc @JorjMcKie
How to reproduce the bug: doc = fitz.open()
PyMuPDF version: 1.25.1
Operating system: Linux
Python version: 3.11
This is not an issue but a typical "Discussions" item ... transferring.
There is no such differentiation as between "native" and "non-native" PDFs. You could look at what the page knows about the image(s) it displays. Depending on the information delivered to it, the page might be aware of some transformation that has taken place to create its display, like so:
page.get_images()
[(4, 0, 1704, 2200, 1, 'DeviceGray', '', 'Im1', 'CCITTFaxDecode')]
page.get_image_info()
[{'number': 0, 'bbox': (0.0, -6.103515625e-05, 613.4400024414062, 792.0), 'transform': (613.4400024414062, 0.0, -0.0, 792.0000610351562, 0.0, -6.103515625e-05), 'width': 1704, 'height': 2200, 'colorspace': 1, 'cs-name': 'DeviceGray', 'xres': 96, 'yres': 96, 'bpc': 1, 'size': 35461}]
Here, the transformation matrix ('transform') only scales the image and applies a negligible translation: its second and third entries are zero. So the page is not aware of any rotation for that image.
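The check described above can be sketched in a few lines. This is a minimal illustration, not part of PyMuPDF: the helper names `rotation_from_transform` and `image_rotations` are hypothetical, and the sketch assumes the 'transform' entry is the usual 6-tuple (a, b, c, d, e, f), so that the angle can be read from atan2(b, a). Whether a positive angle means clockwise or counter-clockwise depends on the coordinate convention, so treat the sign as an assumption to verify on your own files.

```python
import math


def rotation_from_transform(transform):
    """Return the rotation (degrees, 0 <= angle < 360) encoded in a
    6-tuple matrix (a, b, c, d, e, f). Hypothetical helper."""
    a, b = transform[0], transform[1]
    # The first matrix row (a, b) is the image of the x unit vector;
    # its direction gives the rotation component of the matrix.
    angle = math.degrees(math.atan2(b, a))
    return round(angle, 3) % 360


def image_rotations(page):
    """List (image number, rotation) pairs for a PyMuPDF page.
    Hypothetical helper; relies only on page.get_image_info()."""
    return [
        (info["number"], rotation_from_transform(info["transform"]))
        for info in page.get_image_info()
    ]


# The matrix from the output shown in this thread: pure scaling.
sample = (613.4400024414062, 0.0, -0.0, 792.0000610351562, 0.0, -6.103515625e-05)
print(rotation_from_transform(sample))  # 0.0
```

A result of 0 only says that the PDF itself records no rotation for the image. If the scanned picture was stored sideways, as in this thread, the pixels are rotated but the matrix is not, and this check cannot detect that.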
On your other side remark: as I wrote before, this example already has rotation 0. The problem was caused by the person operating the scanner. Admittedly, not all scanners support advanced options.
Your example simply is a normal PDF showing a full-page image - obviously created by a scanner. The person operating the scanner didn't bother about how to place the original on the scanner's glass and/or about telling the scanner how to interpret the page orientation.
That's what you have.