Skip to content

Conversation

@TeddyCr
Copy link
Collaborator

@TeddyCr TeddyCr commented Dec 19, 2025

Describe your changes:

Fixes #22689

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • ClickHouse data type support:
    • Added LOWCARDINALITY to supportedDataTypes for 8 test definitions that support STRING types
  • Test definitions updated:
    • Modified columnValueLengthsToBeBetween, columnValuesMissingCount, columnValuesToBeInSet, columnValuesToBeNotInSet, columnValuesToBeNotNull, columnValuesToBeUnique, columnValuesToMatchRegex, columnValuesToNotMatchRegex
  • Database migrations:
    • Created idempotent migration scripts for MySQL and PostgreSQL in 1.11.4 to update existing test_definition records
  • Additional improvements:
    • Added DATE, DATETIME, TIMESTAMP support to columnValuesToBeInSet for temporal type validation

This will update automatically on new commits.


pmbrull
pmbrull previously approved these changes Dec 19, 2025
@sonarqubecloud
Copy link

@TeddyCr TeddyCr merged commit 66f2cb4 into open-metadata:main Dec 20, 2025
24 of 26 checks passed
@github-actions
Copy link
Contributor

Failed to cherry-pick changes to the 1.11.4 branch.
Please cherry-pick the changes manually.
You can find more details here.

@gitar-bot
Copy link

gitar-bot bot commented Dec 20, 2025

Code Review ⚠️ Changes requested

Data type mismatch between migration scripts and JSON definitions for columnValuesToBeInSet test.

⚠️ Bug: Migration/JSON mismatch: DATE/DATETIME/TIMESTAMP added only in JSON

📄 openmetadata-service/src/main/resources/json/data/tests/columnValuesToBeInSet.json:8 📄 bootstrap/sql/migrations/native/1.11.4/mysql/postDataMigrationSQLScript.sql:19-24 📄 bootstrap/sql/migrations/native/1.11.4/postgres/postDataMigrationSQLScript.sql:37-46

In columnValuesToBeInSet.json, the PR adds not just LOWCARDINALITY but also DATE, DATETIME, and TIMESTAMP to the supportedDataTypes:

"supportedDataTypes": [..., "BOOLEAN", "LOWCARDINALITY", "DATE", "DATETIME", "TIMESTAMP"]

However, the MySQL and PostgreSQL migration scripts only add LOWCARDINALITY for this test definition:

WHERE name = 'columnValuesToBeInSet'
  AND JSON_CONTAINS(JSON_EXTRACT(json, '$.supportedDataTypes'), '"STRING"')
  AND NOT JSON_CONTAINS(JSON_EXTRACT(json, '$.supportedDataTypes'), '"LOWCARDINALITY"');

This means:

  • New installations will have DATE, DATETIME, TIMESTAMP support for this test
  • Existing installations (running migrations) will NOT get DATE, DATETIME, TIMESTAMP support

If the intent is to add these three data types, the migration scripts should be updated to also append them. If these were added by mistake in the JSON, they should be removed.

What Works Well

  • Consistent structure between MySQL and PostgreSQL migration scripts
  • Proper idempotent guards to prevent duplicate entries (checking for existing LOWCARDINALITY before adding)
  • Migration scripts correctly use database-specific JSON functions (JSON_ARRAY_APPEND for MySQL, jsonb_set with array concatenation for PostgreSQL)

Recommendations

  • Align the migration scripts with the JSON file for columnValuesToBeInSet - the JSON adds DATE, DATETIME, and TIMESTAMP in addition to LOWCARDINALITY, but migrations only add LOWCARDINALITY. Either update migrations to include these types or remove them from the JSON file.
Options

Auto-apply is off Gitar will not commit updates to this branch.
✅ Code review is on Gitar will review this change.
Display: compact Hiding non-applicable rules.

Comment with these commands to change:

Auto-apply ✅ Code review Compact
gitar auto-apply:on         
gitar code-review:off         
gitar display:verbose         

This comment will update automatically (Docs)

TeddyCr added a commit that referenced this pull request Dec 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs To release Will cherry-pick this PR into the release branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Clickhouse - Cannot select Column Test type for columns with data type lowcardinality<string>

2 participants