sql: support database-level collations #16618

Open
eisenstatdavid opened this issue Jun 20, 2017 · 2 comments
Labels
A-schema-changes A-schema-descriptors Relating to SQL table/db descriptor handling. A-sql-encoding Relating to the SQL/KV encoding. A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-semantics C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) X-anchored-telemetry The issue number is anchored by telemetry references.

Comments

@eisenstatdavid

eisenstatdavid commented Jun 20, 2017

Feature request broken off from #2473.

root@:26257/> create database foo lc_collate = de;
pq: unsupported collation: de

Jira issue: CRDB-6061

@Enver-Yilmaz

I suggest you look at DUCET, the "Default Unicode Collation Element Table": https://en.wikipedia.org/wiki/Unicode_collation_algorithm
With this algorithm, you can achieve a sensible sort order that covers all languages. It isn't perfect, but it is acceptable for many. You should do this in a case-insensitive and accent-sensitive manner.
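To make the two sensitivity levels concrete, here is a minimal sketch in Python using only the standard `unicodedata` module. It is not an implementation of the Unicode Collation Algorithm or of DUCET: it only builds equality keys and says nothing about sort order. The function names are hypothetical, chosen for illustration.

```python
import unicodedata


def case_insensitive_key(s):
    """Equality key that ignores case but keeps accents (hypothetical helper)."""
    # Fold case first, then compose so that "e" + combining acute equals "é".
    return unicodedata.normalize("NFC", s.casefold())


def case_accent_insensitive_key(s):
    """Equality key that ignores both case and accents (hypothetical helper)."""
    # Decompose so accents become separate combining marks, then drop them.
    decomposed = unicodedata.normalize("NFD", s.casefold())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# Case-insensitive, accent-sensitive: case differences vanish, accents do not.
print(case_insensitive_key("Résumé") == case_insensitive_key("RÉSUMÉ"))
print(case_insensitive_key("resume") == case_insensitive_key("résumé"))

# Case- and accent-insensitive: both differences vanish.
print(case_accent_insensitive_key("resume") == case_accent_insensitive_key("RÉSUMÉ"))
```

A real collation would produce multi-level sort keys from a collation element table rather than comparing folded strings, so treat this only as an illustration of what "case insensitive, accent sensitive" means.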

The ICU library provides this as the "root" collation, and I guess Go has support for it in the https://github.com/golang/text repository. ICU has many options: for example, you can collate with the German locale, case- and accent-insensitively, using phonebook order, which is a special sort order used only in German phone books.

PostgreSQL has had ICU support since version 10, but it does not let you configure case and accent sensitivity. The main problem with PostgreSQL was that it used the OS libraries for collation handling. This is problematic in many ways, because the collation rules get updated outside the database's control and indexes become corrupted when the rules change. So they adopted the ICU library in order to be able to record the collation version on each index, which is impossible with glibc. But they still don't support case-insensitive collations, and that is a no-go for the many users who rely on ORM tools to manage schema and data access.
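The idea of versioning the collation on each index can be sketched as follows. This is a simplified illustration in Python, not how PostgreSQL or CockroachDB actually store this information; `IndexMeta`, `stale_indexes`, and the version strings are hypothetical names and placeholder values.

```python
from dataclasses import dataclass


@dataclass
class IndexMeta:
    """Metadata recorded when an index is built (hypothetical structure)."""
    name: str
    collation: str
    collation_version: str  # version of the collation rules used at build time


def stale_indexes(indexes, current_versions):
    """Return names of indexes built under rules that differ from the current ones.

    current_versions maps a collation name to the version the collation
    library reports now. A collation missing from the map is treated as
    changed, so its indexes are reported as stale as well.
    """
    stale = []
    for idx in indexes:
        if current_versions.get(idx.collation) != idx.collation_version:
            stale.append(idx.name)
    return stale


indexes = [
    IndexMeta("users_name_idx", "de", "v1"),
    IndexMeta("posts_title_idx", "und", "v1"),
]

# After a library upgrade that changes only the German rules,
# only the index built with the "de" collation needs rebuilding.
print(stale_indexes(indexes, {"de": "v2", "und": "v1"}))
```

Without a recorded version there is nothing to compare against, which is the situation described above for collations that come from the OS libraries.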

MySQL has had this since version 5.5, for the character set they call utf8mb4. You can look at http://mysqlserverteam.com/new-collations-in-mysql-8-0-0 for what's coming in version 8. They made it the default for new databases.

@eisenstatdavid
Author

Thanks, we already implemented support for collation at the column level via golang/text.

@knz knz added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) and removed feature labels Apr 24, 2018
@knz knz added A-sql-pgcompat Semantic compatibility with PostgreSQL A-sql-encoding Relating to the SQL/KV encoding. labels Apr 27, 2018
@knz knz added the A-schema-descriptors Relating to SQL table/db descriptor handling. label Apr 27, 2018
ivan added a commit to ludiosarchive/postgrex that referenced this issue Jul 9, 2018
This is necessary for CockroachDB support:
cockroachdb/cockroach#16618

CockroachDB's CREATE DATABASE currently supports only "C" and "C.UTF-8"
collations.
ivan added a commit to ludiosarchive/postgrex that referenced this issue Jul 11, 2018
This is necessary for CockroachDB support:
cockroachdb/cockroach#16618

CockroachDB's CREATE DATABASE currently supports only "C" and "C.UTF-8"
collations.
@petermattis petermattis removed this from the Later milestone Oct 5, 2018
@apantel apantel added the X-anchored-telemetry The issue number is anchored by telemetry references. label Dec 2, 2019
7 participants