Skip to content

Use libjpeg-turbo for improved jpg compatibility and speed #104347

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 2, 2025

Conversation

DanielKinsman
Copy link
Contributor

@DanielKinsman DanielKinsman commented Mar 19, 2025

Alternative to #104298 using libjpeg-turbo instead. In my testing it runs in 85% of the time of libjpeg, with further improvement possible if we enable SIMD.


  • implement compression and saving
  • more error checking
  • test on linux
  • test on android
  • test on windows
  • tested on Mac / iOS via unit tests in build (don't have a mac)
  • license and copyright information
  • remove manually created jconfig.h, jversion.h, and jconfigint.h, move to patches
  • pair down added third-party files to the essentials
  • have thorvg use this library instead of jpeg-compressor
  • test basis_universal in editor after removing jpeg-compressor
  • replace jpg module with this one and keep the jpg module name
  • remove bmp file support (or use it instead of the bmp module)
  • remove ppm file support

@akien-mga
Copy link
Member

Did the same test as #104298 for impact on size, libjpeg-turbo (in the current implementation, without SIMD, etc.) seems to add 352 KiB, so that starts being significant (but can be justified if the gains are worth it).

master (fc827bb) libjpeg (rebased on fc827bb) libjpeg-turbo (rebased on fc827bb)
Compilation time 2m39.430s / 2m39.581s 2m40.609s / 2m39.223s 2m40.166s
Size 66.346 MiB 66.495 MiB 66.698 MiB

I'm a bit wary seeing the 138k lines of code vs 25k for libjpeg. I believe some of them might not be needed and we can likely reduce this a bit.

jpeg decoding and encoding is not the main workflow of Godot, where we mostly rely on webp/png for sprites, and VRAM compression for textures used in 3D, so I'm not sure how far we should go in using the most optimized paths for jpeg specifically.

I expect enabling SIMD should bring more gains than the current 85% decoding time measured here, but from experience it also means significant added complexity on the buildsystem side (and potential edge cases for third-party platforms to handle where the arch assumptions may not match the ones we have for first-party platforms).

@fire
Copy link
Member

fire commented Mar 19, 2025

what happens if you remove the jpeg module and only use jpg turbo

@akien-mga
Copy link
Member

akien-mga commented Mar 19, 2025

what happens if you remove the jpeg module and only use jpg turbo

Weirdly enough, compiling libjpeg-turbo (rebased on fc827bb) with module_jpg_enabled=no gives me 66.702 MiB, i.e. an increase of 4 KiB compared to libjpeg-turbo + jpgd module.

It's tiny and within margin of error for LTO optimization possibilities, but basically this means that the jpgd module weighs practically nothing. That's why we were using such a lightweight library in the first place, little complexity and binary size impact.

@DanielKinsman
Copy link
Contributor Author

I haven't made any effort yet to remove the unused code, will do that tomorrow.

The SIMD stuff is all via nasm, rather than spending time trying to get that working in godot i'll just provide some raw benchmark results from the command line version of libjpeg-turbo with/without SIMD, hopefully that will be enough information to make a decision.

That last result is very strange given that disabling the jpg modules also disables the dependent thorvg and basis-universal modules too... I imagine if you removed the dependency check they would hit linker errors.

@DanielKinsman
Copy link
Contributor Author

Oh and for extra context, I am working on a game which pulls images from the internet at run time, much like the museum and many things does, so can't be in control of the process.

@akien-mga
Copy link
Member

That last result is very strange given that disabling the jpg modules also disables the dependent thorvg and basis-universal modules too... I imagine if you removed the dependency check they would hit linker errors.

Yeah I was surprised too at first but the basis_universal module only depends on jpgd for editor builds, and I'm testing with a release export template (since it's the one where size matters most).

thorvg seems to be using its own version of jpgd currently, not our copy in thirdparty/jpeg-compressor/.

@DanielKinsman
Copy link
Contributor Author

DanielKinsman commented Mar 20, 2025

Comparison time

  master (97241ff) libjpeg diff w master libjpeg-turbo diff w master Libjpeg-turbo SIMD
git diff –stat 0 31654   52352   ?
linux export template compilation time 06:15.70 06:20.86 101.37% 06:16.72 100.27% ?
linux export template size (bytes) 78289120 78415968 100.16% 78588384 100.38% ?
1000 jpg decompressions wall clock time 34.1 34.1 99.96% 27.1 79.49% ?

The lines of code count might be slightly off as I have added the copyright stuff for libjpeg but not libjpeg-turbo yet. I only did a single run of the compile times so there might be some variance there.

As for SIMD, I haven't put that in godot but here is the difference it makes using the command line libjpeg-turbo tools.

Libjpeg-turbo tjbench command no SIMD SIMD diff
compression 8bit 4:4:4 Q70 megapixels/s 65.3 325.0 497.92%
decompression 8bit 4:4:4 Q70 megapixels/s 169.7 386.9 227.97%

Thoughts? Personally I don't think SIMD is worth the effort and complexity, despite the performance.

@akien-mga
Copy link
Member

akien-mga commented Mar 20, 2025

1000 jpg decompressions wall clock time

I'm surprised that there's virtually no performance difference between the reference libjpeg implementation and the minimal single-header jpeg-compressor. I guess Rich did a good job there to get on par with libjpeg.

Thoughts? Personally I don't think SIMD is worth the effort and complexity, despite the performance.

Yeah I agree. Not including SIMD reduces the LOCs quite a bit and the complexity significantly. We can always add it later if/when we evaluate that it's worth the trouble.

I\ve played around quickly with using libjpeg-turbo for thorvg, and removing jpeg-compressor:

https://github.com/akien-mga/godot/commits/libjpeg_turbo/

If we go this route, I would suggest we fully remove modules/jpg, and maybe reuse that name for the libjpeg-turbo based module. We tend to name modules after their purpose (svg, jpg, fbx, ogg, websocket, etc.) and not after their underlying library implementation (though there are exceptions).

My patch should also fix the MSVC build but more work is likely needed to make jconfigint.h properly cross-platform without us having to run a config step for it on all platforms (we can just hardcode the #ifdefs instead of relying on CMake to inject platform-specific code).


With the current version of this PR (62ee999) and my patch removing jpeg-compressor, I get (libturbo-jpeg-v2 binary):

-rwxr-xr-x.  1 root root 69724792 Mar 18 16:45 godot.linuxbsd.template_release.x86_64.libjpeg
-rwxr-xr-x.  1 root root 69942232 Mar 20 09:58 godot.linuxbsd.template_release.x86_64.libjpeg-turbo
-rwxr-xr-x.  1 root root 69757912 Mar 20 10:03 godot.linuxbsd.template_release.x86_64.libjpeg-turbo-v2
-rwxr-xr-x.  1 root root 69663448 Mar 18 17:48 godot.linuxbsd.template_release.x86_64.master

So the diff between master and libjpeg-turbo + no jpeg-compressor module seems to be 92.25 KiB. That's more than reasonable.

Note: The libjpeg-turbo binary in the above test is the one from #104347 (comment), i.e. commit 940bcdc, not 62ee999.

@DanielKinsman
Copy link
Contributor Author

That all sounds good. I'll cherry pick your additions and carry on with the rest of the work 👍

@@ -5,3 +5,4 @@ deadlock:tests/core/templates/test_command_queue.h
deadlock:modules/text_server_adv/text_server_adv.cpp
deadlock:modules/text_server_fb/text_server_fb.cpp
race:modules/navigation/nav_map_3d.cpp
race:thirdparty/thorvg/src/loaders/external_jpg/tvgJpgLoader.cpp
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume it is ok to suppress errors in third party code? Or does that still leave us vulnerable to crashes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The best I can do is try to file a bug report with @hermet at Thorvg

@DanielKinsman DanielKinsman marked this pull request as ready for review March 24, 2025 04:36
@DanielKinsman DanielKinsman requested review from a team as code owners March 24, 2025 04:36
@DanielKinsman DanielKinsman changed the title [DRAFT] use libjpeg-turbo for improved jpg compatibility and speed use libjpeg-turbo for improved jpg compatibility and speed Mar 24, 2025
Copy link
Member

@Calinou Calinou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested locally, it works as expected.

This has a positive impact on MovieWriter speed when using AVI (MJPEG) output. Example with godot-reflection's benchmark mode with JPEG quality set to 100. The time represented is the time it takes to record a movie of the full benchmark run (1 minute at 60 FPS):

Before After
2m43s (36% of real-time speed) 2m18s (43% of real-time speed)

On the other hand, binary size grows a fair bit for stripped Linux x86_64 release export templates (production=yes lto=full):

Before After
72,679,096 bytes 73,150,360 bytes (+ 471 KB)

I'm not sure why the size increase is so large compared to what @akien-mga reported.

@akien-mga akien-mga self-requested a review March 24, 2025 19:14
@akien-mga akien-mga changed the title use libjpeg-turbo for improved jpg compatibility and speed Use libjpeg-turbo for improved jpg compatibility and speed Mar 28, 2025
@DanielKinsman DanielKinsman force-pushed the libjpeg_turbo branch 2 times, most recently from 10e5dbb to fb7f3a6 Compare April 2, 2025 22:46
@DanielKinsman DanielKinsman force-pushed the libjpeg_turbo branch 2 times, most recently from b0270e3 to 3263a32 Compare May 1, 2025 02:10
@DanielKinsman
Copy link
Contributor Author

This PR is now back in a working state and ready for review. If any of you want me to keep it updated with master after today, add a comment, or perhaps let me know which meeting to attend.

Copy link
Member

@akien-mga akien-mga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a final review pass and some light testing, everything looks good to me.

Just pointed out a few minor adjustments to copyright, documentation and buildsystem stuff.

Final size impact comparison on a Linux release export template:

$ lsmb bin
-rwxr-xr-x. 1 akien akien   67.70MB May 1 11:17 godot.linuxbsd.template_release.x86_64.libjpeg-turbo
-rwxr-xr-x. 1 akien akien   67.36MB May 1 11:14 godot.linuxbsd.template_release.x86_64.master

So it adds 340 KiB.

@DanielKinsman DanielKinsman force-pushed the libjpeg_turbo branch 2 times, most recently from 23791f2 to 0132b24 Compare May 2, 2025 02:32
@DanielKinsman
Copy link
Contributor Author

I've made all the requested changes in a new commit to make re-reviewing easier, will squash them together before merge.

Copy link
Member

@akien-mga akien-mga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@akien-mga
Copy link
Member

I got a linking issue when building with the jpg module first, and then with the module disabled:

mold: error: undefined symbol: JpgLoader::JpgLoader()
>>> referenced by tvgLoader.cpp
>>>               bin/obj/modules/libmodule_svg.linuxbsd.editor.dev.x86_64.a(tvgLoader.linuxbsd.editor.dev.x86_64.o):(_find(FileType))

I'll look into it, SCons should normally properly recompile the dependent files if the config changed, so maybe we missed something.

@akien-mga
Copy link
Member

akien-mga commented May 2, 2025

The problem comes from my use of our scons_cache option, I don't think it's caused by this PR, but it's a bug in our cache handling.

Building first with:

scons dev_build=yes dev_mode=yes linker=mold scu_build=all cache_path=/home/akien/.scons_cache/godot/ module_jpg_enabled=yes

and then:

scons dev_build=yes dev_mode=yes linker=mold scu_build=all cache_path=/home/akien/.scons_cache/godot/ module_jpg_enabled=no

I see:

Retrieved `bin/obj/thirdparty/thorvg/src/renderer/tvgLoader.linuxbsd.editor.dev.x86_64.o' from cache

while the env_svg should definitely have changed with the removal of the THORVG_JPG_LOADER_SUPPORT which should cause the tvgLoader.cpp file to be recompiled (together all other thorvg files).

Without scons_cache, all files properly get recompiled whenever the value of module_jpg_enabled changes.

CC @Repiteo @AThousandShips

This shouldn't block merging this PR, it's an orthogonal issue.

@akien-mga
Copy link
Member

akien-mga commented May 2, 2025

On the other hand there's this one change that should be added too, even though currently it has no impact. It registers the svg module's two optional dependencies as such:

diff --git a/modules/svg/config.py b/modules/svg/config.py
index d22f9454ed..d37f02188f 100644
--- a/modules/svg/config.py
+++ b/modules/svg/config.py
@@ -1,4 +1,5 @@
 def can_build(env, platform):
+    env.module_add_dependencies("svg", ["jpg", "webp"], True)
     return True
 
 

(True makes it optional. If not optional, then disabling one of those prevents building with that module enabled. The optional flag makes it do nothing, but we still track those, so in the future we might use it to provide useful context e.g. in the --help output, or print info messages that some features are disabled.)

@akien-mga
Copy link
Member

I linked a few more issues I could find which this fixes.

Should be ready to merge after CI passes.

Co-authored-by: Rémi Verschelde <rverschelde@gmail.com>
@akien-mga akien-mga merged commit a8ece29 into godotengine:master May 2, 2025
20 checks passed
@akien-mga
Copy link
Member

Thanks! Amazing work 🎉

@DanielKinsman
Copy link
Contributor Author

I couldn't have done it without you, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants