Inconsistent results when cropping an already cropped page #245

Closed
@samkit-jain

Describe the bug

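When an already cropped page is cropped again, the second crop returns a different region of the page than the one requested, so the objects in the requested region are not preserved.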

Code to reproduce the problem

import pdfplumber

# Make sure the file is downloaded at file.pdf
pdf = pdfplumber.open("file.pdf")
page = pdf.pages[0]

# Crop the page, keeping only the bottom 20%, and save it.
bottom = page.crop((0, 0.8 * float(page.height), page.width, page.height))
im = bottom.to_image(resolution=150)
im.save("bottom.png", format="PNG")

# Now, crop and save the left half of the cropped page.
bottom_left = bottom.crop((0, 0, 0.5 * float(bottom.width), bottom.height))
im = bottom_left.to_image(resolution=150)
im.save("bottom_left.png", format="PNG")

# Now, crop and save the right half of the cropped page.
bottom_right = bottom.crop((0.5 * float(bottom.width), 0, bottom.width, bottom.height))
im = bottom_right.to_image(resolution=150)
im.save("bottom_right.png", format="PNG")

PDF file

examples/pdfs/ag-energy-round-up-2017-02-24.pdf

Expected behavior

  • bottom.png - The bottom portion of the page is saved.
  • bottom_left.png - The left half of the bottom portion of the page is saved.
  • bottom_right.png - The right half of the bottom portion of the page is saved.

Actual behavior

  • bottom.png - The bottom portion of the page is saved.
  • bottom_left.png - The left half of the top portion of the page is saved.
  • bottom_right.png - The right half of the top portion of the page is saved.
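
The actual behavior is consistent with the second bounding box being read in the coordinate system of the original page rather than relative to the cropped region. The sketch below illustrates that arithmetic in plain Python. The helper `to_absolute_bbox` and the 600 x 800 page size are hypothetical, chosen only for illustration; they are not part of pdfplumber.

```python
from typing import Tuple

BBox = Tuple[float, float, float, float]  # (x0, top, x1, bottom)


def to_absolute_bbox(parent_bbox: BBox, relative_bbox: BBox) -> BBox:
    """Shift a box measured from the parent crop's top-left corner
    into the coordinate system of the original page.

    Hypothetical helper for illustration; not a pdfplumber function.
    """
    parent_x0, parent_top, _, _ = parent_bbox
    x0, top, x1, bottom = relative_bbox
    return (parent_x0 + x0, parent_top + top, parent_x0 + x1, parent_top + bottom)


# Assumed page size of 600 x 800; the first crop keeps the bottom 20%.
bottom_bbox = (0.0, 640.0, 600.0, 800.0)
bottom_width = bottom_bbox[2] - bottom_bbox[0]   # 600.0
bottom_height = bottom_bbox[3] - bottom_bbox[1]  # 160.0

# The box passed to the second crop in the report.
requested_left = (0.0, 0.0, 0.5 * bottom_width, bottom_height)

# Read as absolute page coordinates, requested_left spans y = 0..160,
# which is the top of the page. Shifted by the parent's offset, it
# spans y = 640..800, which is the left half of the bottom strip.
absolute_left = to_absolute_bbox(bottom_bbox, requested_left)
print(requested_left)
print(absolute_left)
```

Assuming the cropped page exposes its absolute bounding box as `bottom.bbox`, passing `to_absolute_bbox(bottom.bbox, ...)` to `bottom.crop` should select the intended region. Later pdfplumber releases document a `relative` parameter on `.crop` for the same purpose; whether it is available depends on the installed version.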

Screenshots

[Screenshot: bottom.png]

[Screenshot: bottom_left.png]

[Screenshot: bottom_right.png]

Environment

  • pdfplumber version: 0.5.22
  • Python version: 3.8.2
  • OS: Ubuntu 18.04 LTS

Additional context

The issue was found when working on #244
