Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support utf8mb3 charset #1424

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

cposture
Copy link

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has not been approved.

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@CLAassistant
Copy link

CLAassistant commented Aug 31, 2022

CLA assistant check
All committers have signed the CLA.

@ti-chi-bot
Copy link
Member

Welcome @cposture!

It looks like this is your first PR to pingcap/parser 🎉.

I'm the bot to help you request reviewers, add labels and more, See available commands.

We want to make sure your contribution gets all the attention it needs!



Thank you, and welcome to pingcap/parser. 😃

@cposture
Copy link
Author

@bb7133 please review code

@@ -459,6 +462,7 @@ var collations = []*Collation{
{247, "utf8mb4", "utf8mb4_vietnamese_ci", false},
{255, "utf8mb4", "utf8mb4_0900_ai_ci", false},
{2048, "utf8mb4", "utf8mb4_zh_pinyin_tidb_as_cs", false},
{2049, "utf8mb3", "utf8mb3_general_ci", true},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This collation seems to be there under a different name already.

mysql> show collation where charset='utf8mb3' and Collation LIKE '%\_general\_ci';
+--------------------+---------+----+---------+----------+---------+---------------+
| Collation          | Charset | Id | Default | Compiled | Sortlen | Pad_attribute |
+--------------------+---------+----+---------+----------+---------+---------------+
| utf8mb3_general_ci | utf8mb3 | 33 | Yes     | Yes      |       1 | PAD SPACE     |
+--------------------+---------+----+---------+----------+---------+---------------+
1 row in set (0.01 sec)

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.30    |
+-----------+
1 row in set (0.01 sec)

There already is this:

	{33, "utf8", "utf8_general_ci", false},

That is similar to this:

	{2049, "utf8mb3", "utf8mb3_general_ci", true},

utf8 is an alias for utf8mb3. I think the change in MySQL 8.0.28 (see https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8.html ) might be related to this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, I think it's better to keep 'utf8_general_ci(33)' and just make utf8mb3_general_ci as an alias.

If we want to introduce the alias, maybe we can introduce a new 'alias' table to represent the mapping relations:

  • utf8 <-> utf8mb3
  • utf8_bin <-> utf8mb3_bin
    ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think MySQL (and MariaDB?) are doing the mapping the other way around, so utf8 is an alias for utf8mb3. The result is that tables that are created with current versions and use utf8 show utf8mb3 in the output. Then the plan is to switch the utf8 alias to mean utf8mb4 in the future. Then old tables still show utf8mb3 while new ones show utf8mb4 even if both were created with the same utf8 but just on different versions.

See also:

@dveeden
Copy link
Contributor

dveeden commented Aug 31, 2022

@cposture could you sign the CLA?

@cposture
Copy link
Author

@cposture could you sign the CLA?

It has show "You have agreed to the CLA for pingcap/parser"

@dveeden
Copy link
Contributor

dveeden commented Aug 31, 2022

Please have a look at pingcap/tidb#26226 and pingcap/tidb#31790 as they are related.

Also the parser has moved to a different repo ( https://github.com/pingcap/tidb/tree/master/parser ) as shown in the README. The reason that the pingcap/parser repo isn't archived yet is that it is still used for older versions.

@cposture
Copy link
Author

Please have a look at pingcap/tidb#26226 and pingcap/tidb#31790 as they are related.

Also the parser has moved to a different repo ( https://github.com/pingcap/tidb/tree/master/parser ) as shown in the README. The reason that the pingcap/parser repo isn't archived yet is that it is still used for older versions.

ok

@cposture
Copy link
Author

Please have a look at pingcap/tidb#26226 and pingcap/tidb#31790 as they are related.

Also the parser has moved to a different repo ( https://github.com/pingcap/tidb/tree/master/parser ) as shown in the README. The reason that the pingcap/parser repo isn't archived yet is that it is still used for older versions.

what should i do for https://github.com/pingcap/tidb/issues/26226?update pingcap/tidb go mod?
@dveeden

@dveeden
Copy link
Contributor

dveeden commented Sep 1, 2022

Please have a look at pingcap/tidb#26226 and pingcap/tidb#31790 as they are related.
Also the parser has moved to a different repo ( https://github.com/pingcap/tidb/tree/master/parser ) as shown in the README. The reason that the pingcap/parser repo isn't archived yet is that it is still used for older versions.

what should i do for https://github.com/pingcap/tidb/issues/26226?update pingcap/tidb go mod? @dveeden

The first thing is to move this PR from pingcap/parser to pingcap/tidb. Then try to consider the issues I mentioned as much as possible. We now require a linked issue for PRs on pingcap/tidb, you can use one or both of these issues for that instead of creating a new one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants