Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent extra EPS header validations #8144

Merged
merged 2 commits into from
Jun 25, 2024

Conversation

Yay295
Copy link
Contributor

@Yay295 Yay295 commented Jun 16, 2024

This is the only return statement in this function, and it appears to be unnecessary since breaking from the loop at this location accomplishes the same thing, and matches what the rest of this loop does.

The only change is that check_required_header_comments() is called at the end of the function. This was being skipped before due to the early return, but it will be called now that it's just breaking from the loop.

Before 8f75cc2 this return was inside a nested loop, so breaking wouldn't have worked, but now it does. Also, check_required_header_comments() didn't exist at that time, so the early return wasn't skipping anything.

@radarhere
Copy link
Member

The only change is that check_required_header_comments() is called at the end of the function. This was being skipped before due to the early return, but it will be called now that it's just breaking from the loop.

It's already been called by this point.

reading_header_comments starts out as true.

reading_header_comments = True

The image data part of the code can't be reached until reading_header_comments is false

if reading_header_comments:

elif bytes_mv[:11] == b"%ImageData:":

But it is only set to false after check_required_header_comments() is called.

check_required_header_comments()
reading_header_comments = False

check_required_header_comments()
reading_header_comments = False

If anyone is looking at this and thinks 'What about the size check?'

if not self._size:
msg = "cannot determine EPS bounding box"
raise OSError(msg)

That has to be set by the end of the image data part of the code.

self._size = columns, rows

So the change suggested here doesn't make any functional difference.

@Yay295
Copy link
Contributor Author

Yay295 commented Jun 17, 2024

So actually, if check_required_header_comments() was added at line 273, I think it could be removed from the end.

@radarhere
Copy link
Member

Ok, you're describing

diff --git a/src/PIL/EpsImagePlugin.py b/src/PIL/EpsImagePlugin.py
index 380b1cf0e..20ef0e5e2 100644
--- a/src/PIL/EpsImagePlugin.py
+++ b/src/PIL/EpsImagePlugin.py
@@ -270,6 +270,7 @@ class EpsImageFile(ImageFile.ImageFile):
             if byte == b"":
                 # if we didn't read a byte we must be at the end of the file
                 if bytes_read == 0:
+                    check_required_header_comments()
                     break
             elif byte in b"\r\n":
                 # if we read a line ending character, ignore it and parse what
@@ -365,8 +366,6 @@ class EpsImageFile(ImageFile.ImageFile):
                 trailer_reached = True
             bytes_read = 0
 
-        check_required_header_comments()
-
         if not self._size:
             msg = "cannot determine EPS bounding box"
             raise OSError(msg)

Sure, that looks correct, if you like it as a fractional performance increase.

@radarhere
Copy link
Member

I think your first commit could be removed? As I was saying, it doesn't make any functional difference. It just unnecessarily runs the check(s) at the end.

@Yay295
Copy link
Contributor Author

Yay295 commented Jun 17, 2024

I think it makes the function easier to read by only having continues and breaks, and not having one early return.

@radarhere
Copy link
Member

I don't personally agree, but we can see what others think.

@wiredfool
Copy link
Member

I'm generally a fan of early exit rather than introducing conditional checks to allow a method to run to completion when there shouldn't be code executed anyway.

@Yay295
Copy link
Contributor Author

Yay295 commented Jun 21, 2024

I would normally agree, but in this case this is the only return in the method, and it's inside a loop, so it's easy to miss it, and this change also stops the check from happening twice in some situations.

@radarhere
Copy link
Member

this change also stops the check from happening twice in some situations.

As per my earlier comment, the return change doesn't have any functional impact. Your second commit on the other hand, does stop check_required_header_comments() from potentially being called twice.

@Yay295 Yay295 force-pushed the eps_plugin_return_break branch from bf014af to a2c0a90 Compare June 22, 2024 14:33
@Yay295
Copy link
Contributor Author

Yay295 commented Jun 22, 2024

I still think it's bad form to have that return there, but I've removed the first commit and rebased on main.

@Yay295 Yay295 changed the title Break from loop instead of returning Prevent extra header validations Jun 22, 2024
@radarhere radarhere changed the title Prevent extra header validations Prevent extra EPS header validations Jun 23, 2024
Co-authored-by: Andrew Murray <3112309+radarhere@users.noreply.github.com>
@Yay295 Yay295 force-pushed the eps_plugin_return_break branch from a8b6858 to 065aeae Compare June 24, 2024 13:54
@hugovk hugovk merged commit 1661343 into python-pillow:main Jun 25, 2024
56 checks passed
@Yay295 Yay295 deleted the eps_plugin_return_break branch June 25, 2024 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants