Description
I hope I'm in the right place and this isn't directly related to GraalVM, so please excuse me if I'm wasting your time.
You can find all the code I'm talking about right here: https://github.com/Schaka/janitorr/tree/bazarr-support
The image is built using Spring Boot's bootBuildImage task via Gradle, and I'm passing these environment variables:
"BPE_DEFAULT_LANG" to "en_US.UTF-8",
"BPE_LANG" to "en_US.UTF-8",
"BPE_LC_ALL" to "en_US.UTF-8",
"JAVA_TOOL_OPTIONS" to """
    -Dsun.jnu.encoding=UTF-8
    -Dfile.encoding=UTF-8
""".trimIndent(),
"BP_NATIVE_IMAGE_BUILD_ARGUMENTS" to """
    -march=compatibility
    -H:+AddAllCharsets
    -Dsun.jnu.encoding=UTF-8
    -Dfile.encoding=UTF-8
""".trimIndent()
My host (Debian 12) has LANG set correctly and LC_ALL not set at all.
According to the docs, console output (docker logs) would not print correctly without the locale variables, although it definitely seems to. Granted, I use Logback rather than direct prints, so there is a chance that fixes things magically. As the docs suggest, I also pass these variables to Docker via compose.yml:
services:
  janitorr:
    container_name: janitorr
    image: ghcr.io/schaka/janitorr:native-amd64-80-merge
    user: 1000:1000
    ports:
      - 8978:8978 # Technically, we don't publish any endpoints, so this isn't strictly required
    volumes:
      - /appdata/janitorr/config/application.yml:/workspace/application.yml
      - /share_media:/data
    environment:
      - LC_ALL=en_US.UTF-8
      - LANG=en_US.UTF-8
Yet, the second I use Path.of("a path with an ümläüt"), I run into the following exception:
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /data/media/anime-movies/Nausicaä of the Valley of the Wind (1984) [imdbid-tt0087544]/Nausicaä of the Valley of the Wind (1984) [imdbid-tt0087544] - [Bluray-1080p][FLAC 2.0][x264].mkv
at java.base@23/sun.nio.fs.UnixPath.encode(UnixPath.java:131) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
at java.base@23/sun.nio.fs.UnixPath.<init>(UnixPath.java:77) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
at java.base@23/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:312) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
at java.base@23/java.nio.file.Path.of(Path.java:148) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
at com.github.schaka.janitorr.mediaserver.AbstractMediaServerService.pathStructure$janitorr(AbstractMediaServerService.kt:71) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
at com.github.schaka.janitorr.mediaserver.AbstractMediaServerService.createLinks(AbstractMediaServerService.kt:99) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
Is there something I'm missing here, or could this be a bug in GraalVM somehow?
Looking at the code, UnixFileSystem definitely reads sun.jnu.encoding. The file path is received as a valid UTF-8 string via REST.
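To illustrate why the value of sun.jnu.encoding matters here, below is a minimal sketch of the kind of strict encoding check that I assume happens when a path string is converted to bytes. It is not the JDK's actual UnixPath code; the class name JnuEncodingSketch and the helper encodable are made up for this example, and the path is a shortened version of the one from the exception. ANSI_X3.4-1968 is an alias of US-ASCII, which has no mapping for "ä".

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK or of janitorr.
public class JnuEncodingSketch {

    // Returns true if every character of `path` can be encoded in `charset`.
    // REPORT makes the encoder throw instead of silently substituting '?'.
    static boolean encodable(String path, Charset charset) {
        try {
            charset.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(path));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Shortened, illustrative path containing U+00E4 (ä).
        String path = "/data/media/anime-movies/Nausica\u00e4 of the Valley of the Wind (1984)";

        // The charset name reported for sun.jnu.encoding inside the image.
        Charset reported = Charset.forName("ANSI_X3.4-1968");

        System.out.println(reported + " can encode path: " + encodable(path, reported));
        System.out.println(StandardCharsets.UTF_8 + " can encode path: "
                + encodable(path, StandardCharsets.UTF_8));
    }
}
```

If the native image really resolves sun.jnu.encoding to ANSI_X3.4-1968 at runtime, a check like this would reject the path even though the incoming string itself is valid.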
Logging from within the image provides:
2024-10-14T09:56:37.360Z INFO 1 --- [ main] c.g.s.j.config.RuntimeEnvironment : Default charset UTF-8
2024-10-14T09:56:37.360Z INFO 1 --- [ main] c.g.s.j.config.RuntimeEnvironment : sun.jnu.encoding ANSI_X3.4-1968
2024-10-14T09:56:37.360Z INFO 1 --- [ main] c.g.s.j.config.RuntimeEnvironment : sun.stdout.encoding null
2024-10-14T09:56:37.360Z INFO 1 --- [ main] c.g.s.j.config.RuntimeEnvironment : sun.stderr.encoding null
2024-10-14T09:56:37.360Z INFO 1 --- [ main] c.g.s.j.config.RuntimeEnvironment : ENV JAVA_TOOL_OPTIONS null
2024-10-14T09:56:37.360Z INFO 1 --- [ main] c.g.s.j.config.RuntimeEnvironment : ENV LANG en_US.UTF-8
2024-10-14T09:56:37.360Z INFO 1 --- [ main] c.g.s.j.config.RuntimeEnvironment : ENV LANGUAGE null
2024-10-14T09:56:37.360Z INFO 1 --- [ main] c.g.s.j.config.RuntimeEnvironment : ENV LC_ALL en_US.UTF-8
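For anyone who wants to collect the same values in their own image, here is a rough sketch of how such a report could be gathered. It is not the RuntimeEnvironment class from the repository; the class name EncodingReport and the method collect are assumptions for this example, and it prints to stdout instead of going through Logback.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic helper, not the project's RuntimeEnvironment class.
public class EncodingReport {

    // Gathers the charset-related system properties and environment variables
    // shown in the log above. Missing values are recorded as the string "null".
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("Default charset", Charset.defaultCharset().name());

        String[] properties = {"sun.jnu.encoding", "sun.stdout.encoding", "sun.stderr.encoding"};
        for (String property : properties) {
            report.put(property, String.valueOf(System.getProperty(property)));
        }

        String[] variables = {"JAVA_TOOL_OPTIONS", "LANG", "LANGUAGE", "LC_ALL"};
        for (String variable : variables) {
            report.put("ENV " + variable, String.valueOf(System.getenv(variable)));
        }
        return report;
    }

    public static void main(String[] args) {
        // One line per entry, in insertion order.
        collect().forEach((key, value) -> System.out.println(key + " " + value));
    }
}
```

Running this both on a regular JVM and inside the native image should make it easy to compare which of the values differ between the two.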