Ignoring locale settings and sun.jnu.encoding #344

Open
@Schaka

I hope I'm in the right place and this isn't directly related to GraalVM. Please excuse me if I'm wasting your time.
You can find all the code I'm talking about right here: https://github.com/Schaka/janitorr/tree/bazarr-support

The image is built with the Spring Boot bootBuildImage task via Gradle, and I'm passing these environment variables:

"BPE_DEFAULT_LANG" to "en_US.UTF-8",
"BPE_LANG" to "en_US.UTF-8",
"BPE_LC_ALL" to "en_US.UTF-8",
"JAVA_TOOL_OPTIONS" to """
    -Dsun.jnu.encoding=UTF-8
    -Dfile.encoding=UTF-8
""".trimIndent(),
"BP_NATIVE_IMAGE_BUILD_ARGUMENTS" to """
    -march=compatibility
    -H:+AddAllCharsets
    -Dsun.jnu.encoding=UTF-8
    -Dfile.encoding=UTF-8
""".trimIndent()

My host (Debian 12) has LANG set correctly and LC_ALL not set at all.
Following the docs, I also passed these variables to Docker via compose.yml (shown below).

According to the docs, output to the console (docker logs) would not print correctly without them, but it definitely seems to. Granted, I use logback rather than direct prints, so there is a chance that fixes things magically.

services:
  janitorr:
    container_name: janitorr
    image: ghcr.io/schaka/janitorr:native-amd64-80-merge
    user: 1000:1000
    ports:
      - 8978:8978 # Technically, we don't publish any endpoints, so this isn't strictly required
    volumes:
      - /appdata/janitorr/config/application.yml:/workspace/application.yml
      - /share_media:/data
    environment:
      - LC_ALL=en_US.UTF-8
      - LANG=en_US.UTF-8

Yet, the second I use Path.of("a path with an ümläüt"), I run into the following exception:

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /data/media/anime-movies/Nausicaä of the Valley of the Wind (1984) [imdbid-tt0087544]/Nausicaä of the Valley of the Wind (1984) [imdbid-tt0087544] - [Bluray-1080p][FLAC 2.0][x264].mkv
	at java.base@23/sun.nio.fs.UnixPath.encode(UnixPath.java:131) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/sun.nio.fs.UnixPath.<init>(UnixPath.java:77) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:312) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/java.nio.file.Path.of(Path.java:148) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at com.github.schaka.janitorr.mediaserver.AbstractMediaServerService.pathStructure$janitorr(AbstractMediaServerService.kt:71) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at com.github.schaka.janitorr.mediaserver.AbstractMediaServerService.createLinks(AbstractMediaServerService.kt:99) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]

Is there something I'm missing here, or could this be a bug in GraalVM somehow?
Looking at the code, UnixFileSystem definitely reads sun.jnu.encoding. The filepath is received as a valid UTF-8 string via REST.
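To illustrate what I think is happening: ANSI_X3.4-1968 is a registered name for US-ASCII, and a strict ASCII encoder cannot represent "ä". The sketch below is not the JDK's actual code; `PathEncodingCheck` and `encodable` are hypothetical names, and it only assumes that the path is encoded with the sun.jnu.encoding charset and that unmappable characters are reported rather than replaced.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class PathEncodingCheck {

    // Returns true if every character of the input can be encoded in the given charset.
    // REPORT makes the encoder throw instead of silently substituting '?'.
    static boolean encodable(String input, Charset charset) {
        try {
            charset.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .encode(CharBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u00e4 is "ä"; the escape keeps the source file itself pure ASCII.
        String name = "Nausica\u00e4 of the Valley of the Wind (1984)";
        System.out.println("US-ASCII: " + encodable(name, StandardCharsets.US_ASCII));
        System.out.println("UTF-8:    " + encodable(name, StandardCharsets.UTF_8));
    }
}
```

If that assumption holds, the string itself is fine and the failure comes purely from the charset the file system layer picked up at startup.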

Logging from within the image provides:

2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : Default charset UTF-8
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : sun.jnu.encoding ANSI_X3.4-1968
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : sun.stdout.encoding null
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : sun.stderr.encoding null
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : ENV JAVA_TOOL_OPTIONS null
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : ENV LANG en_US.UTF-8
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : ENV LANGUAGE null
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : ENV LC_ALL en_US.UTF-8
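For anyone who wants to reproduce these values in their own image, here is a rough Java sketch of such a diagnostic. It is not the actual RuntimeEnvironment class from the repository; `EncodingDiagnostics` and `collect` are hypothetical names, and it prints to stdout instead of going through logback.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingDiagnostics {

    // Collects the charset-related values in a stable order; missing entries stay null.
    static Map<String, String> collect() {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("Default charset", Charset.defaultCharset().name());
        for (String key : new String[] {"sun.jnu.encoding", "sun.stdout.encoding", "sun.stderr.encoding"}) {
            values.put(key, System.getProperty(key));
        }
        for (String key : new String[] {"JAVA_TOOL_OPTIONS", "LANG", "LANGUAGE", "LC_ALL"}) {
            values.put("ENV " + key, System.getenv(key));
        }
        return values;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " " + value));
    }
}
```

The point of the output above is the mismatch: the default charset is UTF-8 and LANG/LC_ALL are set in the container, yet sun.jnu.encoding still reports ANSI_X3.4-1968.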
