support utf8mb3 charset #1424
base: master
[REVIEW NOTIFICATION] This pull request has not been approved. To complete the pull request process, please ask the reviewers in the list to review. Reviewers can indicate their review by submitting an approval review.
Welcome @cposture!
@bb7133 please review the code.
@@ -459,6 +462,7 @@ var collations = []*Collation{
 	{247, "utf8mb4", "utf8mb4_vietnamese_ci", false},
 	{255, "utf8mb4", "utf8mb4_0900_ai_ci", false},
 	{2048, "utf8mb4", "utf8mb4_zh_pinyin_tidb_as_cs", false},
+	{2049, "utf8mb3", "utf8mb3_general_ci", true},
This collation seems to be there under a different name already.
mysql> show collation where charset='utf8mb3' and Collation LIKE '%\_general\_ci';
+--------------------+---------+----+---------+----------+---------+---------------+
| Collation          | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+--------------------+---------+----+---------+----------+---------+---------------+
| utf8mb3_general_ci | utf8mb3 | 33 | Yes     | Yes      |       1 | PAD SPACE     |
+--------------------+---------+----+---------+----------+---------+---------------+
1 row in set (0.01 sec)

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.30    |
+-----------+
1 row in set (0.01 sec)
This entry already exists:
{33, "utf8", "utf8_general_ci", false},
It is similar to the new one:
{2049, "utf8mb3", "utf8mb3_general_ci", true},
utf8 is an alias for utf8mb3. I think the change in MySQL 8.0.28 (see https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8.html) might be related to this.
+1, I think it's better to keep utf8_general_ci (33) and just make utf8mb3_general_ci an alias.
If we want to introduce the alias, maybe we can introduce a new 'alias' table to represent the mapping relations:
utf8 <-> utf8mb3
utf8_bin <-> utf8mb3_bin
...
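The alias-table idea above could be sketched roughly as follows. This is only an illustration, not code from pingcap/parser: the `Collation` struct is a simplified stand-in for the entries in the diff, and `collationAliases` and `lookupCollation` are hypothetical names. The sketch assumes that aliases resolve to the existing entries (such as ID 33) instead of receiving a new ID like 2049. The `utf8_bin` entry with ID 83 is included only to illustrate a second mapping.

```go
package main

import (
	"fmt"
	"strings"
)

// Collation is a simplified stand-in for the entries in the diff:
// {ID, charset name, collation name, is-default}.
type Collation struct {
	ID        int
	Charset   string
	Name      string
	IsDefault bool
}

// collations lists canonical entries only; aliases get no ID of their own.
var collations = []*Collation{
	{33, "utf8", "utf8_general_ci", false},
	{83, "utf8", "utf8_bin", true},
}

// collationAliases is the hypothetical alias table: alias name -> canonical name.
var collationAliases = map[string]string{
	"utf8mb3_general_ci": "utf8_general_ci",
	"utf8mb3_bin":        "utf8_bin",
}

// lookupCollation resolves an alias first, then searches the canonical list.
// Names are compared case-insensitively.
func lookupCollation(name string) (*Collation, error) {
	name = strings.ToLower(name)
	if canonical, ok := collationAliases[name]; ok {
		name = canonical
	}
	for _, c := range collations {
		if c.Name == name {
			return c, nil
		}
	}
	return nil, fmt.Errorf("unknown collation: %q", name)
}

func main() {
	c, err := lookupCollation("utf8mb3_general_ci")
	if err != nil {
		panic(err)
	}
	// The alias resolves to the existing entry, so no second ID is needed.
	fmt.Println(c.ID, c.Name)
}
```

With this shape, adding another `utf8mb3_*` name is a one-line change to the alias map, and the ID space of the canonical list stays untouched.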
I think MySQL (and MariaDB?) are doing the mapping the other way around, so utf8 is an alias for utf8mb3. The result is that tables that are created with current versions and use utf8 show utf8mb3 in the output. Then the plan is to switch the utf8 alias to mean utf8mb4 in the future. Old tables will then still show utf8mb3 while new ones show utf8mb4, even if both were created with the same utf8 name, just on different versions.
See also:
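The direction of the mapping described in this comment can be illustrated with a small sketch. It is an assumption-laden toy, not MySQL's or TiDB's implementation: `charsetResolver` and `utf8Target` are hypothetical names that model "what the bare name utf8 currently means" as a per-version setting. The point is that the canonical name is fixed at creation time, so a later change of the alias target does not affect existing tables.

```go
package main

import (
	"fmt"
	"strings"
)

// charsetResolver is a hypothetical model of one server version.
// utf8Target is what the bare alias "utf8" resolves to in that version.
type charsetResolver struct {
	utf8Target string
}

// canonicalCharset returns the name that would be stored and displayed
// for a table created with the given charset name.
func (r charsetResolver) canonicalCharset(name string) string {
	name = strings.ToLower(name)
	if name == "utf8" {
		return r.utf8Target
	}
	return name
}

func main() {
	current := charsetResolver{utf8Target: "utf8mb3"} // utf8 is an alias for utf8mb3
	future := charsetResolver{utf8Target: "utf8mb4"}  // planned switch of the alias

	// Both tables were declared with "utf8", but on different versions.
	oldTable := current.canonicalCharset("utf8")
	newTable := future.canonicalCharset("utf8")
	fmt.Println(oldTable, newTable)
}
```

Storing the resolved name instead of the alias is what keeps old tables stable when the alias later changes meaning.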
@cposture could you sign the CLA?
It shows "You have agreed to the CLA for pingcap/parser".
Please have a look at pingcap/tidb#26226 and pingcap/tidb#31790 as they are related. Also, the parser has moved to a different repo (https://github.com/pingcap/tidb/tree/master/parser) as shown in the README. The pingcap/parser repo isn't archived yet because it is still used for older versions.
1f320e2 to 4aea0bc
OK.
What should I do for https://github.com/pingcap/tidb/issues/26226? Update the pingcap/tidb go.mod?
The first thing is to move this PR from pingcap/parser to pingcap/tidb. Then try to address the issues I mentioned as much as possible. We now require a linked issue for PRs on pingcap/tidb; you can use one or both of these issues for that instead of creating a new one.
support utf8mb3 charset
refer https://dev.mysql.com/doc/refman/8.0/en/charset-charsets.html