ClickHouse: Support for special characters in column names #2705

Merged
merged 9 commits into from
Mar 17, 2025

Amogh-Bharadwaj
Contributor

@Amogh-Bharadwaj Amogh-Bharadwaj commented Mar 13, 2025

Avro field names may contain only alphanumeric characters and underscores, but Postgres column names have no such restriction.

  • This PR introduces a map from each column name to an Avro-compatible encoding of it. Invalid characters are replaced with underscores, and the resulting string is suffixed with a serial number.
  • The map is used only for the Avro-based initial load; for CDC, the columns are those of our raw table.
  • This PR implements this support only for ClickHouse target peers.
  • It also supports these columns in our normalize inserts by escaping single quotes in the column name argument of the JSONExtract* functions.
  • Functionally tested; the unprivileged columns E2E test has been extended with more columns to cover this PR.
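
For illustration, the two ideas above (the Avro-compatible name map and the quote escaping for JSONExtract* arguments) can be sketched roughly as follows. This is not the code from this PR: the function names `avro_field_names` and `json_extract_string`, the `raw_json` column, the use of the column's position as the serial number, and the leading-underscore guard for names that start with a digit are all assumptions made for the sketch.

```python
import re

# Avro names must match [A-Za-z_][A-Za-z0-9_]*, so anything outside
# [A-Za-z0-9_] counts as an invalid character here.
_INVALID_AVRO_CHAR = re.compile(r"[^A-Za-z0-9_]")


def avro_field_names(columns):
    """Map each source column name to an Avro-compatible field name (hypothetical helper)."""
    mapping = {}
    for serial, column in enumerate(columns):
        encoded = _INVALID_AVRO_CHAR.sub("_", column)
        # Assumption: Avro also rejects a leading digit, so prefix an underscore.
        if not encoded or encoded[0].isdigit():
            encoded = "_" + encoded
        # The serial suffix keeps names distinct when two columns sanitize
        # to the same string, e.g. "my col" and "my-col".
        mapping[column] = f"{encoded}_{serial}"
    return mapping


def json_extract_string(json_column, key):
    """Build a JSONExtractString call with the key escaped as a ClickHouse string literal."""
    # Escape backslashes first, then single quotes, so the key cannot
    # terminate the literal early.
    escaped = key.replace("\\", "\\\\").replace("'", "\\'")
    return f"JSONExtractString({json_column}, '{escaped}')"


names = avro_field_names(["id", "my col", "my-col", "1st"])
expr = json_extract_string("raw_json", "user's name")
```

The serial suffix is what makes the encoding collision-free: without it, `my col` and `my-col` would both become `my_col`.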

Fixes #2701

@Amogh-Bharadwaj Amogh-Bharadwaj force-pushed the ch-better-column-support branch from 79e7e94 to 19b848d Compare March 14, 2025 14:51
@serprex
Member

serprex commented Mar 14, 2025

#2701

@Amogh-Bharadwaj Amogh-Bharadwaj requested a review from serprex March 15, 2025 08:29
@Amogh-Bharadwaj Amogh-Bharadwaj merged commit b946864 into main Mar 17, 2025
9 checks passed
@Amogh-Bharadwaj Amogh-Bharadwaj deleted the ch-better-column-support branch March 17, 2025 13:07
Development

Successfully merging this pull request may close these issues.

support source column with spaces and other weird characters
2 participants