charset: support collation utf8mb4_unicode_ci and utf8_unicode_ci #17596

Closed
@bb7133

Description

Feature Request

Is your feature request related to a problem? Please describe:
Currently, TiDB does not support utf8mb4_unicode_ci or utf8_unicode_ci when the new collation framework is enabled.

tidb> set names utf8 collate utf8_unicode_ci;
ERROR 1273 (HY000): Unsupported collation when new collation is enabled: 'utf8_unicode_ci'.

unicode_ci is a widely used collation in MySQL, so it would be valuable for TiDB to support it.

Describe the feature you'd like:
Support the collations utf8mb4_unicode_ci and utf8_unicode_ci when the new collation framework is enabled.
Besides implementing the comparison algorithm for unicode_ci, we need to work out how to incorporate it into the current new collation framework.
For example, which collation should concat(general_ci_str, unicode_ci_str) produce? How should constant propagation work with unicode_ci?
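
As a rough illustration of the algorithmic part, the Go sketch below compares strings by per-character weight sequences, which is the general shape of a weight-based collation such as unicode_ci. The `weights` table, `sortKey`, and `compare` are hypothetical names with hand-picked values; they are not TiDB's collator API and not the real UCA weight table. The sketch only demonstrates two behaviours MySQL documents for unicode_ci: expansions (ß compares equal to ss) and ignoring trailing spaces (PAD SPACE).

```go
package main

import (
	"fmt"
	"strings"
)

// weights is a hypothetical, hand-written table for illustration only.
// A real unicode_ci implementation would take its weights from the UCA data.
var weights = map[rune][]uint32{
	'a': {1}, 'A': {1}, 'ä': {1}, 'Ä': {1}, // case and accent variants share one weight
	's': {2}, 'S': {2},
	'ß': {2, 2}, // expansion: one character maps to two weights ("ss")
}

// sortKey turns a string into a sequence of weights.
// Trailing spaces are dropped first to mimic PAD SPACE behaviour.
func sortKey(s string) []uint32 {
	s = strings.TrimRight(s, " ")
	key := make([]uint32, 0, len(s))
	for _, r := range s {
		if w, ok := weights[r]; ok {
			key = append(key, w...)
		} else {
			// Assumption for the toy table: unknown runes sort by code point,
			// offset so they never collide with the hand-picked weights.
			key = append(key, 1000+uint32(r))
		}
	}
	return key
}

// compare returns -1, 0, or 1 by comparing the two weight sequences
// element by element, then by length.
func compare(a, b string) int {
	ka, kb := sortKey(a), sortKey(b)
	for i := 0; i < len(ka) && i < len(kb); i++ {
		if ka[i] != kb[i] {
			if ka[i] < kb[i] {
				return -1
			}
			return 1
		}
	}
	switch {
	case len(ka) < len(kb):
		return -1
	case len(ka) > len(kb):
		return 1
	}
	return 0
}

func main() {
	fmt.Println(compare("ß", "SS"))  // 0: expansion plus case-insensitivity
	fmt.Println(compare("äs", "AS")) // 0: accent and case are ignored
	fmt.Println(compare("a ", "a"))  // 0: trailing space is ignored
	fmt.Println(compare("a", "s"))   // -1: weight 1 sorts before weight 2
}
```

Because ß expands to two weights here, it cannot be handled by a one-to-one character mapping, which is one reason unicode_ci needs its own comparison and sort-key logic rather than a reuse of the general_ci path.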

Mentor(s)

Contact the mentors: #ddl-sig channel in TiDB Community Slack Workspace

Recommended Skills

  • Golang
  • Rust

Learning Materials

Schedule

  • GanttStart: 2020-07-01
  • GanttDue: 2020-10-15
  • GanttProgress: 100%

Metadata

Assignees

Labels

  • component/charset
  • feature/accepted: This feature request is accepted by product managers.
  • priority/P0: The issue has P0 priority.
  • type/feature-request: Categorizes issue or PR as related to a new feature.
