Fix image addon not being able to render some jpeg images in IIP #5139

lusingander · 2024-08-28T07:50:19Z

In the image addon, when using the Inline image protocol (IIP), some jpeg images were not rendered.

For example, the image added to the test case (created by exiftool -all= w3c_home.jpg) does not have an application segment.

In this case, IIPMetrics determines that it is not a jpeg image, so the image is not rendered.

(Running in xterm.js demo)

This is a valid jpeg image, so it can be opened in a general viewer or displayed on other terminals that implement IIP.

(Running in iTerm2)

(Running in WezTerm)

JPEGs that do not have an application segment immediately after the SOI or that do not include an application segment are valid, so it would be better if they could be rendered as well.

Specifically, I think it is sufficient to determine whether it is a jpeg if you can confirm that the first two bytes are SOI (FF D8). From the perspective of rendering processing, there seems to be no reason to have an application segment.

jerch · 2024-08-28T11:21:33Z

@lusingander I have no well informed opinion about the JPEG format, so I dont really know, whats right or wrong here. Back when I implemented the dimension handling I already was under the impression, that JPEG is the "weakest" of those formats regarding finderprinting caps.
What bothers me though - the identification is now reduced to a 2-byte lookup, any idea to get it more foolproof? Would a lookup in the libmagic database give us more clues what should be considered JPEG?

On a sidenote - because of the "static" dimension extraction WebP is also not supported atm. I wonder if we could reshape the handler to let the browser do all the nasty dimension extraction as well. By doing that we would inherit all browser supported formats automatically. Back then I was not able to find a single pass way to do it, only a 2-pass way which lowers throughput a lot.

lusingander · 2024-08-28T12:39:51Z

I have no well informed opinion about the JPEG format, so I dont really know, whats right or wrong here. Back when I implemented the dimension handling I already was under the impression, that JPEG is the "weakest" of those formats regarding finderprinting caps.
What bothers me though - the identification is now reduced to a 2-byte lookup, any idea to get it more foolproof? Would a lookup in the libmagic database give us more clues what should be considered JPEG?

libmagic and other file type detection libraries (that I know of) also seem to be able to determine the type by detecting FF D8 or the next FF(The FF in the third byte is the first byte of any segment).

Perhaps checking for EOI (the segment that indicates the end of the file) or the presence of the SOF segment (which contains the image size, etc.) that we already check could improve the accuracy a little, but I'm not sure how much it would help.

(I don't know much about this either so I apologize if I'm saying something incorrect...)

jerch · 2024-08-28T14:05:41Z

... or the next FF(The FF in the third byte is the first byte of any segment).

Oh well, so the 3. byte is always guaranteed to be 0xFF in any JPEG configuration? Sounds good to me as a minimal effort compromise. Gives at least 50% more confidence. I think we can go with that then.

And yeah, imho digging further down into JPEG shenanigans and trying to cover every possibility of header configs is not worth it, in the end the browser has the last say, whether it can decode it or not.

jerch

LGTM, thx for finding this one.

lusingander · 2024-08-28T14:26:06Z

Oh well, so the 3. byte is always guaranteed to be 0xFF in any JPEG configuration? Sounds good to me as a minimal effort compromise. Gives at least 50% more confidence. I think we can go with that then.

Yes, this seems fine since every segment starts with FF.

And yeah, imho digging further down into JPEG shenanigans and trying to cover every possibility of header configs is not worth it, in the end the browser has the last say, whether it can decode it or not.

In the end it still won't be able to render the broken files, and problematic files (such as very large files or files with aggressive byte sequences) won't be detected anyway, so in practice this seems like a good enough 😀

lusingander · 2024-08-28T14:38:56Z

On a sidenote - because of the "static" dimension extraction WebP is also not supported atm. I wonder if we could reshape the handler to let the browser do all the nasty dimension extraction as well. By doing that we would inherit all browser supported formats automatically. Back then I was not able to find a single pass way to do it, only a 2-pass way which lowers throughput a lot.

By the way, does 2-pass way mean that you have to look at the whole file like with jpg? Or does it require some other processing?
Either way, it sounds good to have WebP support 🙂

jerch · 2024-08-28T15:08:12Z

2-pass there means, that a browser API would have to be asked twice, first for initial decoding --> grab dimensions & apply resizing policy, 2nd time for resizing. Currently it only needs one browser API call for the final bitmap creation with already calc'ed target dimensions. Thats only possible with dimension extraction as we do atm. But thats not so easy for WebP. Plz feel free to look into it, I haven't yet (beside a basic check in the docs).

Edit: Oh and yes, it is similar to JPEG with WebP, it also has a concept of sub entities (dunno how it is called in their spec, was it subframes?), which can have different subformat and encode dimension differently. And, similar to JPEG, it can reside anywhere in the byte blob (well an indexed position, if I remember right) - still most the bytes have to be there just like with JPEG. So the headers of both formats are not very streaming/chunking friendly, as a streaming decoder might have to read deep into the bytes before it can start layouting its pixel array. (I really dont get it, why they shaped it this way, PNG got that right from the beginning and still supports subsections).

Link to the WebP spec (if in doubt you gonna have to check their macros in the ref impl)

Add test for jpg without APP segment to IIPMetrics

a07d97a

Fix IIPMetrics jpg detection

b307adb

lusingander force-pushed the image-addon-iip-jpeg branch from 30e9d30 to b307adb Compare August 28, 2024 14:16

jerch approved these changes Aug 28, 2024

View reviewed changes

jerch merged commit 1b48cd0 into xtermjs:master Aug 28, 2024
12 checks passed

jerch added this to the 6.0.0 milestone Aug 28, 2024

lusingander deleted the image-addon-iip-jpeg branch August 28, 2024 14:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix image addon not being able to render some jpeg images in IIP #5139

Fix image addon not being able to render some jpeg images in IIP #5139

lusingander commented Aug 28, 2024

jerch commented Aug 28, 2024

lusingander commented Aug 28, 2024 •

edited

Loading

jerch commented Aug 28, 2024 •

edited

Loading

jerch left a comment

lusingander commented Aug 28, 2024

lusingander commented Aug 28, 2024

jerch commented Aug 28, 2024 •

edited

Loading

Fix image addon not being able to render some jpeg images in IIP #5139

Fix image addon not being able to render some jpeg images in IIP #5139

Conversation

lusingander commented Aug 28, 2024

jerch commented Aug 28, 2024

lusingander commented Aug 28, 2024 • edited Loading

jerch commented Aug 28, 2024 • edited Loading

jerch left a comment

Choose a reason for hiding this comment

lusingander commented Aug 28, 2024

lusingander commented Aug 28, 2024

jerch commented Aug 28, 2024 • edited Loading

lusingander commented Aug 28, 2024 •

edited

Loading

jerch commented Aug 28, 2024 •

edited

Loading

jerch commented Aug 28, 2024 •

edited

Loading