feat: sync auto-correction Lua
Sync the auto-correction Lua from 雾凇拼音
Mintimate committed Apr 26, 2024
1 parent 93ecf16 commit bf7a4e3
Showing 4 changed files with 150 additions and 94 deletions.
8 changes: 8 additions & 0 deletions double_pinyin_flypy.schema.yaml
@@ -176,13 +176,21 @@ reduce_english_filter:
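# Chinese lunar calendar configuration
chineseLunarCalendar_translator: lunar

# Lua configuration: formats the comment produced by corrector; the placeholder is {comment}
# With the default "{comment}", typing hun dun shows the comment hún tun next to 「馄饨」
# For example, wrapping it in parentheses, "({comment})", gives (hún tun)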
corrector: "{comment}"


translator:
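  # Dictionary file
  dictionary: rime_mint # dictionary file to use
  prism: double_pinyin_flypy # when several schemas share one dictionary, give each a distinct prism name to avoid conflicts
  spelling_hints: 8 # corrector.lua: show pinyin in the comment so the mispronunciation/wrong-character hints work for both full and double pinyin
  always_show_comments: true # corrector.lua: by default Rime hides the comment when it equals the preedit; force it on so corrector.lua can inspect it
  comment_format: # wrap the pinyin comment in markers so corrector.lua can recognize it
    - xform/^/[/
    - xform/$/]/
  preedit_format:
    - xform/([bpmfdtnljqx])n/$1iao/
    - xform/(\w)g/$1eng/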
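
The two `comment_format` rules in this hunk wrap the spelling hint in `[` and `]` so that the Lua filter can tell a pinyin hint apart from any other comment. A minimal sketch of that round trip in plain Lua follows; `mark` and `unmark` are hypothetical helper names used only for illustration and are not part of the commit.

```lua
-- Hypothetical stand-in for the two xform rules: prepend "[" and append "]".
local function mark(hint)
  return "[" .. hint .. "]"
end

-- Recover the pinyin from a marked comment.
-- "%[" and "%]" match literal brackets; "(.-)" captures what lies between them.
-- Returns nil when the comment is not wrapped in brackets.
local function unmark(comment)
  return comment:match("^%[(.-)%]$")
end

print(mark("hun dun"))          -- [hun dun]
print(unmark(mark("hun dun")))  -- hun dun
print(unmark("hún tun"))        -- nil
```

Brackets are magic characters in Lua patterns, which is why the sketch escapes them with `%`.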
220 changes: 126 additions & 94 deletions lua/corrector_filter.lua
@@ -6,102 +6,134 @@
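To make this Lua work with both full pinyin and double pinyin, the comment generated by `spelling_hints` (the full-pinyin spelling) is used as the common matching condition.
Thanks to @[Shewer Lu](https://github.com/shewer) for the idea.
The correction entries live in cn_dicts/others.dict.yaml; open an issue to suggest additions.
The correction entries live in dicts/rime_ice.others.dict.yaml and are synced periodically from 雾凇拼音; suggestions for additions can be filed as an issue on the 雾凇拼音 repository (hehe).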
--]]

local corrections = {
-- Mispronounced readings
["hun dun"] = { text = "馄饨", comment = "hun tun" },
["zhu jiao"] = { text = "主角", comment = "zhu jue" },
["jiao se"] = { text = "角色", comment = "jue se" },
["pi sa"] = { text = "比萨", comment = "bi sa" },
["chi pi sa"] = { text = "吃比萨", comment = "chi bi sa" },
["pi sa bing"] = { text = "比萨饼", comment = "bi sa bing" },
["shui fu"] = { text = "说服", comment = "shuo fu" },
["dao hang"] = { text = "道行", comment = "dao heng" },
["mo yang"] = { text = "模样", comment = "mu yang" },
["you mo you yang"] = { text = "有模有样", comment = "you mu you yang" },
["yi mo yi yang"] = { text = "一模一样", comment = "yi mu yi yang" },
["zhuang mo zuo yang"] = { text = "装模作样", comment = "zhuang mu zuo yang" },
["ren mo gou yang"] = { text = "人模狗样", comment = "ren mu gou yang" },
["mo ban"] = { text = "模板", comment = "mu ban" },
["a mi tuo fo"] = { text = "阿弥陀佛", comment = "e mi tuo fo" },
["na mo a mi tuo fo"] = { text = "南无阿弥陀佛", comment = "na mo e mi tuo fo" },
["nan wu a mi tuo fo"] = { text = "南无阿弥陀佛", comment = "na mo e mi tuo fo" },
["nan wu e mi tuo fo"] = { text = "南无阿弥陀佛", comment = "na mo e mi tuo fo" },
["gei yu"] = { text = "给予", comment = "ji yu" },
["bin lang"] = { text = "槟榔", comment = "bing lang" },
["zhang bai zhi"] = { text = "张柏芝", comment = "zhang bo zhi" },
["teng man"] = { text = "藤蔓", comment = "teng wan" },
["nong tang"] = { text = "弄堂", comment = "long tang" },
["xin kuan ti pang"] = { text = "心宽体胖", comment = "xin kuan ti pan" },
["mai yuan"] = { text = "埋怨", comment = "man yuan" },
["xu yu wei she"] = { text = "虚与委蛇", comment = "xu yu wei yi" },
["mu na"] = { text = "木讷", comment = "mu ne" },
["du le le"] = { text = "独乐乐", comment = "du yue le" },
["zhong le le"] = { text = "众乐乐", comment = "zhong yue le" },
["xun ma"] = { text = "荨麻", comment = "qian ma" },
["qian ma zhen"] = { text = "荨麻疹", comment = "xun ma zhen" },
["mo ju"] = { text = "模具", comment = "mu ju" },
["cao zhi"] = { text = "草薙", comment = "cao ti" },
["cao zhi jing"] = { text = "草薙京", comment = "cao ti jing" },
["cao zhi jian"] = { text = "草薙剑", comment = "cao ti jian" },
["jia ping ao"] = { text = "贾平凹", comment = "jia ping wa" },
["xue fo lan"] = { text = "雪佛兰", comment = "xue fu lan" },
["qiang jin"] = { text = "强劲", comment = "qiang jing" },
["tong ti"] = { text = "胴体", comment = "dong ti" },
["li neng kang ding"] = { text = "力能扛鼎", comment = "li neng gang ding" },
["ya lv jiang"] = { text = "鸭绿江", comment = "ya lu jiang" },
["da fu bian bian"] = { text = "大腹便便", comment = "da fu pian pian" },
["ka bo zi"] = { text = "卡脖子", comment = "qia bo zi" },
["zhi sheng"] = { text = "吱声", comment = "zi sheng" },
["chan he"] = { text = "掺和", comment = "chan huo" },
["chan huo"] = { text = "掺和", comment = "chan huo" },
["can he"] = { text = "掺和", comment = "chan huo" },
["cheng zhi"] = { text = "称职", comment = "chen zhi" },
["luo shi fen"] = { text = "螺蛳粉", comment = "luo si fen" },
["tiao huan"] = { text = "调换", comment = "diao huan" },
["tai xing shan"] = { text = "太行山", comment = "tai hang shan" },
["jie si di li"] = { text = "歇斯底里", comment = "xie si di li" },
["nuan he"] = { text = "暖和", comment = "nuan huo" },
["mo ling liang ke"] = { text = "模棱两可", comment = "mo leng liang ke" },
["pan yang hu"] = { text = "鄱阳湖", comment = "po yang hu" },
["bo jing"] = { text = "脖颈", comment = "bo geng" },
["bo jing er"] = { text = "脖颈儿", comment = "bo geng er" },
["jie zha"] = { text = "结扎", comment = "jie za" },
-- Wrong characters
["pu jie"] = { text = "扑街", comment = "仆街" },
["pu gai"] = { text = "扑街", comment = "仆街" },
["pu jie zai"] = { text = "扑街仔", comment = "仆街仔" },
["pu gai zai"] = { text = "扑街仔", comment = "仆街仔" },
["ceng jin"] = { text = "曾今", comment = "曾经" },
["an nai"] = { text = "按耐", comment = "按捺(na)" },
["an nai bu zhu"] = { text = "按耐不住", comment = "按捺(na)不住" },
["bie jie"] = { text = "别介", comment = "别价(jie)" },
["beng jie"] = { text = "甭介", comment = "甭价(jie)" },
["xue mai pen zhang"] = { text = "血脉喷张", comment = "血脉贲(ben)张 | 血脉偾(fen)张" },
["qi ke fu"] = { text = "契科夫", comment = "契诃(he)夫" },
["zhao cha"] = { text = "找茬", comment = "找碴" },
["zhao cha er"] = { text = "找茬儿", comment = "找碴儿" },
["da jia lai zhao cha"] = { text = "大家来找茬", comment = "大家来找碴" },
["da jia lai zhao cha er"] = { text = "大家来找茬儿", comment = "大家来找碴儿" },
["ci ya"] = { text = "龇牙", comment = "龇(zi)牙" },
["ci zhe ya"] = { text = "龇着牙", comment = "龇(zi)着牙" },
["ci ya lie zui"] = { text = "龇牙咧嘴", comment = "龇(zi)牙咧嘴" },
["cou huo"] = { text = "凑活", comment = "凑合(he)" },
}
local M = {}

local function corrector(input)
for cand in input:iter() do
-- cand.comment holds the full pinyin of the current candidate
local c = corrections[cand.comment]
if c and cand.text == c.text then
cand:get_genuine().comment = c.comment
elseif cand.type == "user_phrase" or cand.type == "phrase" or cand.type == "sentence" then
cand:get_genuine().comment = ""
end
yield(cand)
end
function M.init(env)
local config = env.engine.schema.config
local delimiter = config:get_string('speller/delimiter')
if delimiter and #delimiter > 0 and delimiter:sub(1,1) ~= ' ' then
env.delimiter = delimiter:sub(1,1)
end
env.name_space = env.name_space:gsub('^*', '')
M.style = config:get_string(env.name_space) or '{comment}'
M.corrections = {
-- Mispronounced readings
["hun dun"] = { text = "馄饨", comment = "hún tun" },
["zhu jiao"] = { text = "主角", comment = "zhǔ jué" },
["jiao se"] = { text = "角色", comment = "jué sè" },
["pi sa"] = { text = "比萨", comment = "bǐ sà" },
["chi pi sa"] = { text = "吃比萨", comment = "chī bǐ sà" },
["pi sa bing"] = { text = "比萨饼", comment = "bǐ sà bǐng" },
["shui fu"] = { text = "说服", comment = "shuō fú" },
["dao hang"] = { text = "道行", comment = "dào héng" },
["mo yang"] = { text = "模样", comment = "mú yàng" },
["you mo you yang"] = { text = "有模有样", comment = "yǒu mú yǒu yàng" },
["yi mo yi yang"] = { text = "一模一样", comment = "yī mú yī yàng" },
["zhuang mo zuo yang"] = { text = "装模作样", comment = "zhuāng mú zuò yàng" },
["ren mo gou yang"] = { text = "人模狗样", comment = "rén mú gǒu yàng" },
["mo ban"] = { text = "模板", comment = "mú bǎn" },
["a mi tuo fo"] = { text = "阿弥陀佛", comment = "ē mí tuó fó" },
["na mo a mi tuo fo"] = { text = "南无阿弥陀佛", comment = "nā mó ē mí tuó fó" },
["nan wu a mi tuo fo"] = { text = "南无阿弥陀佛", comment = "nā mó ē mí tuó fó" },
["nan wu e mi tuo fo"] = { text = "南无阿弥陀佛", comment = "nā mó ē mí tuó fó" },
["gei yu"] = { text = "给予", comment = "jǐ yǔ" },
["bin lang"] = { text = "槟榔", comment = "bīng láng" },
["zhang bai zhi"] = { text = "张柏芝", comment = "zhāng bó zhī" },
["teng man"] = { text = "藤蔓", comment = "téng wàn" },
["nong tang"] = { text = "弄堂", comment = "lòng táng" },
["xin kuan ti pang"] = { text = "心宽体胖", comment = "xīn kuān tǐ pán" },
["mai yuan"] = { text = "埋怨", comment = "mán yuàn" },
["xu yu wei she"] = { text = "虚与委蛇", comment = "xū yǔ wēi yí" },
["mu na"] = { text = "木讷", comment = "mù nè" },
["du le le"] = { text = "独乐乐", comment = "dú yuè lè" },
["zhong le le"] = { text = "众乐乐", comment = "zhòng yuè lè" },
["xun ma"] = { text = "荨麻", comment = "qián má" },
["qian ma zhen"] = { text = "荨麻疹", comment = "xún má zhěn" },
["mo ju"] = { text = "模具", comment = "mú jù" },
["cao zhi"] = { text = "草薙", comment = "cǎo tì" },
["cao zhi jing"] = { text = "草薙京", comment = "cǎo tì jīng" },
["cao zhi jian"] = { text = "草薙剑", comment = "cǎo tì jiàn" },
["jia ping ao"] = { text = "贾平凹", comment = "jiǎ píng wā" },
["xue fo lan"] = { text = "雪佛兰", comment = "xuě fú lán" },
["qiang jin"] = { text = "强劲", comment = "qiáng jìng" },
["tong ti"] = { text = "胴体", comment = "dòng tǐ" },
["li neng kang ding"] = { text = "力能扛鼎", comment = "lì néng gāng dǐng" },
["ya lv jiang"] = { text = "鸭绿江", comment = "yā lù jiāng" },
["da fu bian bian"] = { text = "大腹便便", comment = "dà fù pián pián" },
["ka bo zi"] = { text = "卡脖子", comment = "qiǎ bó zi" },
["zhi sheng"] = { text = "吱声", comment = "zī shēng" },
["chan he"] = { text = "掺和", comment = "chān huo" },
["can huo"] = { text = "掺和", comment = "chān huo" },
["can he"] = { text = "掺和", comment = "chān huo" },
["cheng zhi"] = { text = "称职", comment = "chèn zhí" },
["luo shi fen"] = { text = "螺蛳粉", comment = "luó sī fěn" },
["tiao huan"] = { text = "调换", comment = "diào huàn" },
["tai xing shan"] = { text = "太行山", comment = "tài háng shān" },
["jie si di li"] = { text = "歇斯底里", comment = "xiē sī dǐ lǐ" },
["nuan he"] = { text = "暖和", comment = "nuǎn huo" },
["mo ling liang ke"] = { text = "模棱两可", comment = "mó léng liǎng kě" },
["pan yang hu"] = { text = "鄱阳湖", comment = "pó yáng hú" },
["bo jing"] = { text = "脖颈", comment = "bó gěng" },
["bo jing er"] = { text = "脖颈儿", comment = "bó gěng er" },
["jie zha"] = { text = "结扎", comment = "jié zā" },
["hai shen wei"] = { text = "海参崴", comment = "hǎi shēn wǎi" },
["hou pu"] = { text = "厚朴", comment = "hòu pò" },
["da wan ma"] = { text = "大宛马", comment = "dà yuān mǎ" },
["ci ya"] = { text = "龇牙", comment = "zī yá" },
["ci zhe ya"] = { text = "龇着牙", comment = "zī zhe yá" },
["ci ya lie zui"] = { text = "龇牙咧嘴", comment = "zī yá liě zuǐ" },
["tou pi xue"] = { text = "头皮屑", comment = "tóu pí xiè" },
["liu an shi"] = { text = "六安市", comment = "lù ān shì" },
["liu an xian"] = { text = "六安县", comment = "lù ān xiàn" },
["an hui sheng liu an shi"] = { text = "安徽省六安市", comment = "ān huī shěng lù ān shì" },
["an hui liu an"] = { text = "安徽六安", comment = "ān huī lù ān" },
["an hui liu an shi"] = { text = "安徽六安市", comment = "ān huī lù ān shì" },
["nan jing liu he"] = { text = "南京六合", comment = "nán jīng lù hé" },
["nan jing liu he qu"] = { text = "南京六合区", comment = "nán jīng lù hé qū" },
["nan jing shi liu he qu"] = { text = "南京市六合区", comment = "nán jīng shì lù hé qū" },
-- Wrong characters
["pu jie"] = { text = "扑街", comment = "仆街" },
["pu gai"] = { text = "扑街", comment = "仆街" },
["pu jie zai"] = { text = "扑街仔", comment = "仆街仔" },
["pu gai zai"] = { text = "扑街仔", comment = "仆街仔" },
["ceng jin"] = { text = "曾今", comment = "曾经" },
["an nai"] = { text = "按耐", comment = "按捺(nà)" },
["an nai bu zhu"] = { text = "按耐不住", comment = "按捺(nà)不住" },
["bie jie"] = { text = "别介", comment = "别价(jie)" },
["beng jie"] = { text = "甭介", comment = "甭价(jie)" },
["xue mai pen zhang"] = { text = "血脉喷张", comment = "血脉贲(bēn)张 | 血脉偾(fèn)张" },
["qi ke fu"] = { text = "契科夫", comment = "契诃(hē)夫" },
["zhao cha"] = { text = "找茬", comment = "找碴" },
["zhao cha er"] = { text = "找茬儿", comment = "找碴儿" },
["da jia lai zhao cha"] = { text = "大家来找茬", comment = "大家来找碴" },
["da jia lai zhao cha er"] = { text = "大家来找茬儿", comment = "大家来找碴儿" },
["cou huo"] = { text = "凑活", comment = "凑合(he)" },
["ju hui"] = { text = "钜惠", comment = "巨惠" },
["mo xie zuo"] = { text = "魔蝎座", comment = "摩羯(jié)座" },
["nuo da"] = { text = "诺大", comment = "偌(ruò)大" },
}
end

return corrector
function M.func(input, env)
for cand in input:iter() do
-- cand.comment holds the full pinyin of the current candidate, wrapped in [ ] by comment_format
local pinyin = cand.comment:match("^%[(.-)%]$")
if pinyin and #pinyin > 0 then
if env.delimiter then
pinyin = pinyin:gsub(env.delimiter,' ')
end
local c = M.corrections[pinyin]
if c and cand.text == c.text then
cand:get_genuine().comment = string.gsub(M.style, "{comment}", c.comment)
else
cand:get_genuine().comment = ""
end
end
yield(cand)
end
end

return M
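
The lookup in `M.func` can be sketched outside Rime as follows. This is a minimal illustration, not the filter itself: `corrected_comment` is a hypothetical helper, the two table entries are copied from the commit, and the `"({comment})"` style is the parenthesized example mentioned in the schema comments.

```lua
-- Assumed style, as if the schema contained: corrector: "({comment})"
local style = "({comment})"

-- Two entries copied from the commit's corrections table.
local corrections = {
  ["hun dun"] = { text = "馄饨", comment = "hún tun" },
  ["ceng jin"] = { text = "曾今", comment = "曾经" },
}

-- Hypothetical helper: given a candidate's text and its plain pinyin,
-- return the comment to display, or "" when no correction applies.
local function corrected_comment(text, pinyin)
  local c = corrections[pinyin]
  if c and text == c.text then
    -- A function replacement keeps any "%" in the comment literal.
    local out = style:gsub("{comment}", function() return c.comment end)
    return out
  end
  return ""
end

print(corrected_comment("馄饨", "hun dun"))   -- (hún tun)
print(corrected_comment("曾今", "ceng jin"))  -- (曾经)
print(corrected_comment("混沌", "hun dun"))   -- prints an empty line
```

A hint is shown only when both the typed pinyin and the candidate text match an entry, so 混沌 typed as hun dun gets no comment.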
8 changes: 8 additions & 0 deletions rime_mint.schema.yaml
@@ -142,12 +142,20 @@ reduce_english_filter:
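# Chinese lunar calendar configuration
chineseLunarCalendar_translator: lunar

# Lua configuration: formats the comment produced by corrector; the placeholder is {comment}
# With the default "{comment}", typing hun dun shows the comment hún tun next to 「馄饨」
# For example, wrapping it in parentheses, "({comment})", gives (hún tun)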
corrector: "{comment}"

translator:
  # enable_correction: true # Rime's built-in key-press correction, suited to 26-key phone layouts; see https://github.com/rime/librime/pull/228
  # Dictionary file
  dictionary: rime_mint # dictionary file to use
  spelling_hints: 8 # corrector.lua: show pinyin in the comment so the mispronunciation/wrong-character hints work for both full and double pinyin
  always_show_comments: true # corrector.lua: by default Rime hides the comment when it equals the preedit; force it on so corrector.lua can inspect it
  comment_format: # wrap the pinyin comment in markers so corrector.lua can recognize it
    - xform/^/[/
    - xform/$/]/
  preedit_format:
    - xform/([nl])v/$1ü/
    - xform/([nl])ue/$1üe/
8 changes: 8 additions & 0 deletions rime_mint_flypy.schema.yaml
@@ -142,13 +142,21 @@ reduce_english_filter:
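# Chinese lunar calendar configuration
chineseLunarCalendar_translator: lunar

# Lua configuration: formats the comment produced by corrector; the placeholder is {comment}
# With the default "{comment}", typing hun dun shows the comment hún tun next to 「馄饨」
# For example, wrapping it in parentheses, "({comment})", gives (hún tun)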
corrector: "{comment}"

translator:
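  # enable_correction: true # Rime's built-in key-press correction, suited to 26-key phone layouts; see https://github.com/rime/librime/pull/228
  # Dictionary file
  dictionary: rime_mint # dictionary file to use
  prism: rime_mint_flypy # when several schemas share one dictionary, give each a distinct prism name to avoid conflicts
  spelling_hints: 8 # corrector.lua: show pinyin in the comment so the mispronunciation/wrong-character hints work for both full and double pinyin
  always_show_comments: true # corrector.lua: by default Rime hides the comment when it equals the preedit; force it on so corrector.lua can inspect it
  comment_format: # wrap the pinyin comment in markers so corrector.lua can recognize it
    - xform/^/[/
    - xform/$/]/
  preedit_format:
    - xform/([nl])v/$1ü/
    - xform/([nl])ue/$1üe/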
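
In lua/corrector_filter.lua, `M.init` reads `speller/delimiter` and keeps its first character when that character is not a space, and `M.func` then replaces it with a space before looking up the pinyin. The sketch below mirrors that logic in plain Lua under stated assumptions: `first_delimiter` and `normalize` are hypothetical helper names, and the escaping step is an addition of this sketch (the commit passes the delimiter to `gsub` unescaped).

```lua
-- Hypothetical helper mirroring the check in M.init: keep the first
-- delimiter character unless it is missing or already a space.
local function first_delimiter(delimiter)
  if delimiter and #delimiter > 0 and delimiter:sub(1, 1) ~= " " then
    return delimiter:sub(1, 1)
  end
  return nil
end

-- Hypothetical helper: replace every delimiter in the pinyin with a space.
local function normalize(pinyin, delim)
  if not delim then
    return pinyin
  end
  -- Escape punctuation so characters such as "." or "-" match literally.
  local escaped = delim:gsub("%p", "%%%0")
  local out = pinyin:gsub(escaped, " ")
  return out
end

print(normalize("hun'dun", first_delimiter("' ")))  -- hun dun
print(normalize("hun dun", first_delimiter(" '")))  -- hun dun
print(normalize("hun.dun", "."))                    -- hun dun
```

Without the escaping step, a delimiter such as `.` would act as a Lua pattern wildcard and turn every character into a space; with a delimiter such as `'` the two variants behave the same.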
