Support UTF32 character set #895

Closed
@ghost

Description

0 Note

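0.1 Encoding

  • UTF8, GBK, and similar encodings are variable-length: one character takes 1~4 bytes in UTF8 and 1~2 bytes in GBK.
    • Within the single-byte ASCII range (0~127), these encodings are compatible with each other.
  • UTF32 is a fixed-length 4-byte encoding.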
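
A minimal sketch of the point above, using only Python's built-in codecs. The helper name `encoded_lengths` is illustrative and not part of dtle or MySQL. It prints how many bytes one character occupies in each encoding, and shows that an ASCII character is a single identical byte in UTF8 and GBK but four bytes in UTF32.

```python
def encoded_lengths(ch, encodings=("utf-8", "gbk", "utf-32-be")):
    """Return the number of bytes one character occupies in each encoding."""
    return {name: len(ch.encode(name)) for name in encodings}


# "utf-32-be" is used instead of "utf-32" so that no byte-order mark is prepended.
ascii_lengths = encoded_lengths("A")
cjk_lengths = encoded_lengths("中")

# A character outside the Basic Multilingual Plane needs 4 bytes in UTF8.
emoji_utf8_length = len("\U0001F600".encode("utf-8"))

# UTF8 and GBK agree byte-for-byte on ASCII; UTF32 pads to 4 bytes.
ascii_compatible = "A".encode("utf-8") == "A".encode("gbk") == b"A"
utf32_bytes = "A".encode("utf-32-be")

print(ascii_lengths)
print(cjk_lengths)
print(emoji_utf8_length, ascii_compatible, utf32_bytes)
```

Because UTF32 is not byte-compatible with ASCII, a statement encoded in UTF32 cannot be read as plain ASCII text the way a UTF8 or GBK statement can.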

0.2 MySQL variables and their meanings

SELECT CHARACTER_SET_NAME, COLLATION_NAME, ID FROM INFORMATION_SCHEMA.COLLATIONS ORDER BY ID;
show variables like 'character%';
show variables like 'collation%';

https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html :

  • character_set_client: The character set for statements that arrive from the client.
  • character_set_connection: The character set used for literals specified without a character set introducer and for
    number-to-string conversion.
    • introducer: _utf8'abc' COLLATE utf8_danish_ci
  • character_set_database: The character set used by the default database. Set only by the server.
  • character_set_filesystem: This variable is used to interpret string literals that refer to file names.
    • e.g. LOAD DATA and SELECT ... INTO OUTFILE statements
    • Such file names are converted from character_set_client to character_set_filesystem
  • character_set_results: The character set used for returning query results to the client.
  • character_set_server: The server's default character set.
  • character_set_system: The value is always utf8.

Of these, character_set_system is global only. The rest are global/session.

0.3 dtle behavior

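  • Extractor.readMySqlCharsetSystemVariables() reads the server-level character set and collation variables (character_set_server, collation_server) and generates a SET statement from them.
  • Applier.ApplyEventQueries() executes the statement on every connection (a.dba.dbs[i]).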
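
The two steps above can be sketched roughly as follows. This is not dtle's code (dtle is written in Go). `build_set_statement`, `RecordingConnection`, and `apply_on_all` are hypothetical names, the variable values are made up, and the exact text of the SET statement dtle emits is an assumption.

```python
def build_set_statement(variables):
    """Build one SET statement from a {variable_name: value} mapping.

    The quoting and layout here are assumptions for illustration only.
    """
    if not variables:
        return ""
    assignments = ", ".join(
        f"{name} = '{value}'" for name, value in variables.items()
    )
    return f"SET {assignments}"


class RecordingConnection:
    """Stand-in for a database connection; it only records statements."""

    def __init__(self):
        self.executed = []

    def execute(self, statement):
        self.executed.append(statement)


def apply_on_all(connections, statement):
    """Run the same statement on every connection, skipping empty statements."""
    if not statement:
        return
    for conn in connections:
        conn.execute(statement)


# Hypothetical values as they might be read from the source server.
server_vars = {
    "character_set_server": "utf8mb4",
    "collation_server": "utf8mb4_general_ci",
}
statement = build_set_statement(server_vars)

# Plays the role of a.dba.dbs in the note above.
connections = [RecordingConnection() for _ in range(3)]
apply_on_all(connections, statement)

print(statement)
```

Applying the statement to every connection matters because these variables have session scope, so a connection that skips the statement keeps its own defaults.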
