-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix image addon not being able to render some jpeg images in IIP #5139
Conversation
@lusingander I have no well informed opinion about the JPEG format, so I dont really know, whats right or wrong here. Back when I implemented the dimension handling I already was under the impression, that JPEG is the "weakest" of those formats regarding finderprinting caps. On a sidenote - because of the "static" dimension extraction WebP is also not supported atm. I wonder if we could reshape the handler to let the browser do all the nasty dimension extraction as well. By doing that we would inherit all browser supported formats automatically. Back then I was not able to find a single pass way to do it, only a 2-pass way which lowers throughput a lot. |
libmagic and other file type detection libraries (that I know of) also seem to be able to determine the type by detecting Perhaps checking for EOI (the segment that indicates the end of the file) or the presence of the SOF segment (which contains the image size, etc.) that we already check could improve the accuracy a little, but I'm not sure how much it would help. (I don't know much about this either so I apologize if I'm saying something incorrect...) |
Oh well, so the 3. byte is always guaranteed to be 0xFF in any JPEG configuration? Sounds good to me as a minimal effort compromise. Gives at least 50% more confidence. I think we can go with that then. And yeah, imho digging further down into JPEG shenanigans and trying to cover every possibility of header configs is not worth it, in the end the browser has the last say, whether it can decode it or not. |
30e9d30
to
b307adb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thx for finding this one.
Yes, this seems fine since every segment starts with
In the end it still won't be able to render the broken files, and problematic files (such as very large files or files with aggressive byte sequences) won't be detected anyway, so in practice this seems like a good enough 😀 |
By the way, does |
2-pass there means, that a browser API would have to be asked twice, first for initial decoding --> grab dimensions & apply resizing policy, 2nd time for resizing. Currently it only needs one browser API call for the final bitmap creation with already calc'ed target dimensions. Thats only possible with dimension extraction as we do atm. But thats not so easy for WebP. Plz feel free to look into it, I haven't yet (beside a basic check in the docs). Edit: Oh and yes, it is similar to JPEG with WebP, it also has a concept of sub entities (dunno how it is called in their spec, was it subframes?), which can have different subformat and encode dimension differently. And, similar to JPEG, it can reside anywhere in the byte blob (well an indexed position, if I remember right) - still most the bytes have to be there just like with JPEG. So the headers of both formats are not very streaming/chunking friendly, as a streaming decoder might have to read deep into the bytes before it can start layouting its pixel array. (I really dont get it, why they shaped it this way, PNG got that right from the beginning and still supports subsections). Link to the WebP spec (if in doubt you gonna have to check their macros in the ref impl) |
In the image addon, when using the Inline image protocol (IIP), some jpeg images were not rendered.
For example, the image added to the test case (created by
exiftool -all= w3c_home.jpg
) does not have an application segment.In this case, IIPMetrics determines that it is not a jpeg image, so the image is not rendered.
(Running in xterm.js demo)
This is a valid jpeg image, so it can be opened in a general viewer or displayed on other terminals that implement IIP.
(Running in iTerm2)
(Running in WezTerm)
JPEGs that do not have an application segment immediately after the SOI or that do not include an application segment are valid, so it would be better if they could be rendered as well.
Specifically, I think it is sufficient to determine whether it is a jpeg if you can confirm that the first two bytes are SOI (
FF D8
). From the perspective of rendering processing, there seems to be no reason to have an application segment.