Closed
Explain the problem.
Converting an RTF file that contains Chinese, Japanese, and Korean (CJK) characters to Markdown garbles the CJK characters.
Hello! English and CJK.rtf (input)
{\rtf1\ansi\ansicpg932\deff0\nouicompat\deflang1033\deflangfe1041{\fonttbl{\f0\fnil\fcharset128 Arial Unicode MS;}{\f1\fnil\fcharset129 Arial Unicode MS;}}
{\*\generator Riched20 10.0.19041}\viewkind4\uc1
\pard\sa200\sl276\slmult1\f0\fs22\lang17 Hello! English and CJK\par
\u20320?\'8d\'44\'81\'49\par
\lang1041\'82\'b1\'82\'f1\'82\'c9\'82\'bf\'82\'cd\'81\'49\par
\f1\'be\'c8\'b3\'e7\'c7\'cf\'bc\'bc\'bf\'e4\f0\lang1033 !\lang17\par
}
Open this input with Windows WordPad (write.exe "Hello! English and CJK.rtf").
Command:
pandoc -o "Hello! English and CJK.md" "Hello! English and CJK.rtf"
Hello! English and CJK.md (output)
Actual
Hello! English and CJK
你D�I
‚±‚ñ‚É‚¿‚Í�I
¾È³çÇϼ¼¿ä!
Expected
Hello! English and CJK
你好!
こんにちは!
안녕하세요!
Pandoc version?
Pandoc is 3.1.13, installed with pandoc-3.1.13-windows-x86_64.msi.
pandoc 3.1.13
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: C:\Users\KU\AppData\Roaming\pandoc
Copyright (C) 2006-2023 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.
Using Windows 10 Pro, Japanese edition.
Microsoft Windows [Version 10.0.19045.4291]