Commit 2d2253f

kblees authored and dscho committed
gettext: always use UTF-8 on native Windows
On native Windows, Git exclusively uses UTF-8 for console output (both with MinTTY and native Win32 Console). Gettext uses `setlocale()` to determine the output encoding for translated text; however, MSVCRT's `setlocale()` does not support UTF-8. As a result, translated text is encoded in the system encoding (as per `GetACP()`), and non-ASCII chars are mangled in console output.

Side note: There is actually a code page for UTF-8: 65001. In practice, it does not work as expected at least on Windows 7, though, so we cannot use it in Git. Besides, if we overrode the code page, any process spawned from Git would inherit that code page (as opposed to the code page configured for the current user), which would quite possibly break e.g. diff or merge helpers. So we really cannot override the code page.

In `init_gettext_charset()`, Git calls gettext's `bind_textdomain_codeset()` with the character set obtained via `locale_charset()`. Let's override that latter function to force the encoding to UTF-8 on native Windows.

In Git for Windows' SDK, there is a `libcharset.h`, and therefore we define `HAVE_LIBCHARSET_H` in the MINGW-specific section in `config.mak.uname`. Consequently, we need to add the override before that conditionally-compiled code block.

Rather than simply defining `locale_charset()` to return the string `"UTF-8"`, though, we are careful not to break `LC_ALL=C`: the `ab/no-kwset` patch series, for example, needs to have a way to prevent Git from expecting UTF-8-encoded input.

Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
1 parent aa25c82 commit 2d2253f

1 file changed

+19
-1
lines changed

gettext.c

Lines changed: 19 additions & 1 deletion
@@ -12,7 +12,25 @@
 #ifndef NO_GETTEXT
 # include <locale.h>
 # include <libintl.h>
-# ifdef HAVE_LIBCHARSET_H
+# ifdef GIT_WINDOWS_NATIVE
+
+static const char *locale_charset(void)
+{
+	const char *env = getenv("LC_ALL"), *dot;
+
+	if (!env || !*env)
+		env = getenv("LC_CTYPE");
+	if (!env || !*env)
+		env = getenv("LANG");
+
+	if (!env)
+		return "UTF-8";
+
+	dot = strchr(env, '.');
+	return !dot ? env : dot + 1;
+}
+
+# elif defined HAVE_LIBCHARSET_H
 # include <libcharset.h>
 # else
 # include <langinfo.h>
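
To make the precedence rules of the override easier to check, here is a minimal sketch of the same logic as a pure function. It is not part of the commit: the name `charset_from_env` and its three parameters are hypothetical and stand in for the values the patch reads via `getenv("LC_ALL")`, `getenv("LC_CTYPE")` and `getenv("LANG")`, with `NULL` meaning "not set".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, testable variant of the locale_charset() override:
 * the three values that the patch reads with getenv() are passed in
 * as parameters (NULL means "not set").
 */
static const char *charset_from_env(const char *lc_all,
				    const char *lc_ctype,
				    const char *lang)
{
	const char *env = lc_all, *dot;

	/* Same precedence as in the patch: LC_ALL, then LC_CTYPE, then LANG. */
	if (!env || !*env)
		env = lc_ctype;
	if (!env || !*env)
		env = lang;

	/* None of the variables is set: default to UTF-8. */
	if (!env)
		return "UTF-8";

	/* "de_DE.UTF-8" yields "UTF-8"; a value without a dot is returned as-is. */
	dot = strchr(env, '.');
	return !dot ? env : dot + 1;
}
```

Under these assumptions, `charset_from_env("C", NULL, "de_DE.UTF-8")` returns `"C"`, which is how `LC_ALL=C` keeps working, while `charset_from_env(NULL, NULL, NULL)` falls back to `"UTF-8"`.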
