Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for JPEGXL encoded images #690

Open
wants to merge 14 commits into
base: work/jpegxl
Choose a base branch
from

Conversation

shtrophic
Copy link

@shtrophic shtrophic commented Oct 20, 2024

Hey there. This is a proof-of-concept for JPEGXL decoding of the image/jxl mimetype.
I figured that if there is WEBP support, there should be JPEGXL as well.
After all, indie web requires indie codecs, amirite?

Right now, this adds an (optional) dependency on libjxl. That is a BSD-3-clause license, so it should be compatible?
If you think this is something you'd merge, I can have a look about android (are there build instructions available already?). Can't do iOS though since I don't have apple hardware.

@skyjake
Copy link
Owner

skyjake commented Oct 21, 2024

Thanks. This looks reasonable, although I recommend running Clang-Format on your additions using the included .clang-format file in the repository. I can merge this once you're happy with it.

When it comes to mobile, I've been building the image decoder libraries from source and using those in the build, so in theory little is needed to use this on Android and iOS except for some fiddling with the build environment. Of course, it would be nicer if the platform's native APIs were used, since they might be more efficient, but I don't think it's worth the implementation cost. Images are not Gemini's core use case.

@shtrophic
Copy link
Author

shtrophic commented Oct 23, 2024

One more thing that might be interesting: since you track isPartial on media files, it would make sense to pass that down to makeTexture_GmImage and then to loadJxl_ since jxl supports progressive decoding. Though, then it would probably be necessary to keep the decoder state somewhere, for it to continue on data that comes in later. Not sure if that would be a sensible thing to do, but the possibility exists. Maybe for another PR.

When it comes to mobile, I've been building the image decoder libraries from source and using those in the build, so in theory little is needed to use this on Android and iOS except for some fiddling with the build environment.

So, will you figure this out then? Available Android build instructions would be nice though!

Of course, it would be nicer if the platform's native APIs were used.

I'm not sure if any android native implementations for jxl even exist though.

@shtrophic shtrophic marked this pull request as ready for review October 23, 2024 21:23
@shtrophic
Copy link
Author

shtrophic commented Oct 25, 2024

Hey there again. I had some time on my hands and gave progressive decoding a shot. You can see it in action when receiving a jxl over a slow internet connection or decoding a rather large image.

If you think that this is too much for now / needs further testing, feel free to tell me;
My suggestion for this PR would then be to revert faec6f6 and create another PR for that.

If not, there are a couple of concerns I have. There is now a global map to keep track of the iStatefulJxlDecoders, which is not thread-safe. I wasn't sure if your drawing code is multi threaded? If yes, this probably needs a mutex as well I think. Other than that, I chose the iGmLinkId to be the hashmap's keys. This assumes, that they stay consistent... Maybe there needs to be a more robust one?

For some jxl images, you can try gemini://sir-photch.xyz/jxltest.gmi.

@shtrophic shtrophic force-pushed the dev branch 2 times, most recently from 300276f to 205c5c3 Compare October 25, 2024 21:41
@skyjake
Copy link
Owner

skyjake commented Oct 28, 2024

Nice! Progressive decoding sounds useful, although I'll have to review your code when I have some more time available. It's certainly not mandatory to have, though, if the complexity is too high.

I wasn't sure if your drawing code is multi threaded?

Anything related to drawing is done in the main thread. Other threads are used for network requests, subscription updates, and searching for navbar lookup terms, not for anything that touches the UI widgets.

@shtrophic
Copy link
Author

shtrophic commented Oct 28, 2024

It's certainly not mandatory to have, though, if the complexity is too high.

Well as I see it there are two ways to do it:

  • Create-destroy decoders with every chunk of the image we receive. This is less complex, but also less efficient, since a JxlDecoder can keep it's progress when it has output an intermediate image, to then continue on any chunks that conclude the original stream
  • Create encoders upon first request, then keep them somewhere until we have fed them with all data, step by step, while we consume intermediate images to then receive the final image and free the decoder

I opted with the latter since it seemed like the proper way to me. The actual usage of the libjxl API does not get more complicated. And for larger images, keeping progress makes sense.

But as I have looked at the_Foundation now, I have probably misinterpreted on how to create a subclass of iMapNode. I'll see what you suggest in your review though before I continue hacking on this :)

@skyjake
Copy link
Owner

skyjake commented Dec 14, 2024

@Sir-Photch I reviewed the code and made some corrections. Please see the notes in 1c6a9b0 about the changes. If you keep working on this, I recommend using the "work/jpegxl" branch as the basis.

The major issue that remains fixing here is that the Map containing the decoders is global while link IDs are specific to one GmDocument. I recommend moving the map to be owned by Media (which is owned by GmDocument).

When it comes to including this in prebuilt Lagrange binaries, I am worried about the size of the JXL library. It looks like the binaries would grow by several megabytes, which seems a lot for something that is needed very seldom. I may opt to not include this in the prebuilt binaries. People may of course enable JXL when building from source.

@shtrophic shtrophic changed the base branch from dev to work/jpegxl December 15, 2024 20:41
@shtrophic
Copy link
Author

I recommend moving the map to be owned by Media (which is owned by GmDocument).

Done, though I am not sure on how to use the_Foundation correctly... deinit_,delete_,collect_,clear_? There isn't much documentation on the usage of the Map API there as well.

It's probably not ready to ship yet because there is still some kind of race-condition or bug, but I am not sure whether or not this comes from my changes. You can reproduce by quickly clicking on two inline jxls to trigger two decodings at the same time, collapse them again, and so on. Best case, the images load, mediocre case, it displays Failed to load Image while still decoding in the background (but not displaying intermediate frames), worst case: SIGSEGV.

When it comes to including this in prebuilt Lagrange binaries, I am worried about the size of the JXL library. It looks like the binaries would grow by several megabytes, which seems a lot for something that is needed very seldom.

Makes sense, though, also when statically linking and stripping?

@skyjake
Copy link
Owner

skyjake commented Dec 21, 2024

Done, though I am not sure on how to use the_Foundation correctly... deinit_,delete_,collect_,clear_?

Some of the common conventions are explained here, in the repository README: https://git.skyjake.fi/skyjake/the_Foundation/

To elaborate on the ones you mentioned:

  • deinit methods free any resources owned by an object. The object's memory is NOT freed. (cf. C++ destructor method)
  • delete calls deinit and frees the object's memory, too. (cf. C++ delete)
  • collect puts a pointer to the object into the garbage collection queue, to be deleted later.
  • clear empties all elements/contents; the container remains usable for putting more elements in.

The first three methods are automatically generated by macros for a given type, so they are generally available for every declared type.

There isn't much documentation on the usage of the Map API there as well.

True. I would've used Hash here myself. (Hash is also used in many places in the app, while Map is not.) Map exists primarily for sorted keys, and that is not a requirement here. Switching to a Hash shouldn't be too difficult.

The memory model for Hash and Map is the same: the user is responsible for the element ownership, while Hash/Map just maintains a lookup mapping.

some kind of race-condition or bug

Hmm, but you are using separate decoders per link, and also in that test case of two images, isn't all the encoded data already in memory? I'll try to take a look in the debugger. Perhaps it is trying to show the image before decoding has completed, without the higher code levels realizing this.

when statically linking and stripping?

Remains to be seen... I tried compiling libjxl statically but for some reason the build failed (tried on macOS 15.2), lots of library not found for -lcrt0.o errors. There might be system libraries/APIs provided for decoding JPEG XL on macOS nowadays, but I didn't have time to look closer. I would prefer not to add much, if any, platform-dependent code for this, though.

@skyjake
Copy link
Owner

skyjake commented Dec 21, 2024

An important note about async decoding: all the previous decoders in the app are synchronous, so once all the data is available, the image is decoded and drawn on screen. The UI thread is blocked during the decoding (!).

The implicit assumption is that when all the data is available, the image can be immediately displayed on screen.

To properly handle an async decode, you'd have to draw the UI/document with some placeholder while decoding is still happening, and then when decoding is finished, you'd trigger a document relayout so it'll update all the inline images and refresh the UI. One thread-safe way to trigger the layout is postCommand_App("document.layout.changed redo:1"); that posts an SDL event that is then handled in the main thread as soon as possible.

I would be fine with synchronous jxl decoding, if the above starts feeling too cumbersome.

@shtrophic
Copy link
Author

So I think I have the stability issues ironed out.

The remaining issue is caused by the fact that the drawing code expects the decoding of every image to be finished once any image is finished downloading; In other words, if you click on two inline image links and one downloads and decodes faster than the other, the slower one gets displayed with a caption indicating that loading the image failed. I worked around this in 0c2c014, such that the download progress of any unfinished image is being kept displayed.

It is a little crooked however, since I wasn't sure where & how the drawing offsets are calculated. This is how it currently looks like:

recording.mp4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants