Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating a Table With PRIMARY KEY in MySQL With DATETIME Column Results in a Wrong Precision Value in ClickHouse #462

Closed
Selfeer opened this issue Feb 15, 2024 · 3 comments · Fixed by #464
Assignees
Labels
bug Something isn't working GA-1 All the issues that are issues in release(Scheduled Dec 2023) qa-verified label to mark issues that were verified by QA

Comments

@Selfeer
Copy link
Collaborator

Selfeer commented Feb 15, 2024

When creating a table in MySQL with DATETIME columns and specifying precision, the table is replicated to ClickHouse with DateTime64 but with the wrong precision value. This seems to happen only when we specify PRIMARY KEY in MySQL table.

For example,

In MySQL

CREATE TABLE table_1 (id INT NOT NULL PRIMARY KEY, data DATETIME(3));

In ClickHouse

DESC table_1;

┌─name───────┬─type───────────────────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
│ id         │ Int32                          │              │                    │         │                  │                │
│ data       │ Nullable(DateTime64(0, 'UTC')) │              │                    │         │                  │                │
│ _version   │ UInt64                         │              │                    │         │                  │                │
│ is_deleted │ UInt8                          │              │                    │         │                  │                │
└────────────┴────────────────────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘

Auto create, creates a table with DateTime64(0) instead of DateTime64(3). This happens for all precision values 1-6.

One thing to note here is that the ClickHouse DateTime64 is created with precision 0 when we specify timezone configurations for sink-connector:

database.connectionTimeZone: "UTC"
database.serverTimezone: "UTC"
clickhouse.datetime.timezone: "UTC"

If we remove these configurations, the issue persists but the ClickHouse DateTime64 precision values always sets to 3 now.

┌─name───────┬─type───────────────────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
│ id         │ Int32                          │              │                    │         │                  │                │
│ data       │ Nullable(DateTime64(3))        │              │                    │         │                  │                │
│ _version   │ UInt64                         │              │                    │         │                  │                │
│ is_deleted │ UInt8                          │              │                    │         │                  │                │
└────────────┴────────────────────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘

The issue is gone if we do not specify PRIMARY KEY for the MySQL table.

@aadant
Copy link
Collaborator

aadant commented Feb 16, 2024

Thanks for reporting this. @subkanthi will take a look

@aadant aadant added bug Something isn't working GA-1 All the issues that are issues in release(Scheduled Dec 2023) labels Feb 16, 2024
@aadant
Copy link
Collaborator

aadant commented Feb 20, 2024

I checked that it works fine with the python ddl parser

$ python db_load/mysql_parser/mysql_parser.py test_bug1.sql
2024-02-20 10:49:50,749 - root - MainThread - INFO - source = CREATE TABLE table_1 (id INT NOT NULL PRIMARY KEY, data DATETIME(3))

2024-02-20 10:49:51,084 - root - MainThread - INFO - {'column_name': 'id', 'datatype': 'INT', 'nullable': False, 'mysql_datatype': 'INT', 'generated': False, 'has_is_deleted_column': False}
2024-02-20 10:49:51,085 - root - MainThread - INFO - {'column_name': 'data', 'datatype': 'DateTime64(3)', 'nullable': True, 'mysql_datatype': 'DATETIME(3)', 'generated': False, 'has_is_deleted_column': False}
2024-02-20 10:49:51,085 - root - MainThread - INFO - CREATE TABLE table_1 (id INT NOT NULL,
data DateTime64(3) NULL,
`_version` UInt64 DEFAULT 0,
`is_deleted` UInt8 DEFAULT 0
) engine=ReplacingMergeTree(_version,is_deleted)  order by id
2024-02-20 10:49:51,086 - root - MainThread - DEBUG - (sqlStatements (sqlStatement (ddlStatement (createTable CREATE TABLE (tableName (fullId (uid (simpleId table_1)))) (createDefinitions ( (createDefinition (fullColumnName (uid (simpleId id))) (columnDefinition (dataType INT) (columnConstraint (nullNotnull NOT NULL)) (columnConstraint PRIMARY KEY))) , (createDefinition (fullColumnName (uid (simpleId (keywordsCanBeId data)))) (columnDefinition (dataType DATETIME (lengthOneDimension ( (decimalLiteral 3) ))))) ))))))
2024-02-20 10:49:51,086 - root - MainThread - INFO - target = CREATE TABLE table_1 (id INT NOT NULL,
data DateTime64(3) NULL,
`_version` UInt64 DEFAULT 0,
`is_deleted` UInt8 DEFAULT 0
) engine=ReplacingMergeTree(_version,is_deleted)  order by id

@Selfeer
Copy link
Collaborator Author

Selfeer commented Feb 26, 2024

The issue seems to be fixed, tested on: altinityinfra/clickhouse-sink-connector:464-c979a8c2583ff701dd5b63efb8f36f7fff8111e7-lt

Link to TestFlows tests: https://github.com/Altinity/clickhouse-sink-connector/blob/develop/sink-connector-lightweight/tests/integration/tests/datatypes.py

@Selfeer Selfeer closed this as completed Feb 26, 2024
@Selfeer Selfeer added the qa-verified label to mark issues that were verified by QA label Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working GA-1 All the issues that are issues in release(Scheduled Dec 2023) qa-verified label to mark issues that were verified by QA
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants