Skip to content

Conversation

@rishabh1815769
Copy link

@rishabh1815769 rishabh1815769 commented Oct 29, 2025

Summary
Fixes type casting issues for ClickHouse wrapper types such as Nullable(), Array(), and LowCardinality() by extracting the inner type before mapping to Rust Wren Engine column types.

Problem
Previously, ClickHouse columns with wrapper types like Nullable(String), Array(Int32) etc were not being properly recognized and would be marked as UNKNOWN type, causing issues with type inference and query execution.

Summary by CodeRabbit

  • New Features

    • Added support for datetime64 column type in ClickHouse metadata.
  • Bug Fixes

    • Improved handling of wrapped ClickHouse data types (Array, Nullable, LowCardinality).
    • Fixed column nullability detection based on data type wrappers.

@github-actions github-actions bot added ibis python Pull requests that update Python code labels Oct 29, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 29, 2025

Walkthrough

Enhanced ClickHouse metadata handling by introducing an inner-type extraction helper to recursively unwrap nested type wrappers (Nullable, Array, LowCardinality), extended type mapping to include "datetime64" as TIMESTAMP, and updated column parsing to derive nullability from type structure.

Changes

Cohort / File(s) Summary
ClickHouse Type Mapping & Inner-Type Extraction
ibis-server/app/model/metadata/clickhouse.py
Added _extract_inner_type() helper to recursively unwrap ClickHouse type wrappers and extract base types. Updated CLICKHOUSE_TYPE_MAPPING to include "datetime64" → TIMESTAMP. Modified _transform_column_type() to use extracted inner types for normalization. Updated get_table_list() to compute column nullability from Nullable(...) wrapper presence, setting Column.notNull accordingly.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Pay attention to the _extract_inner_type() logic to verify it correctly handles all nested wrapper combinations (Nullable, Array, LowCardinality)
  • Review the nullability derivation logic in get_table_list() to confirm the negation of nullable is applied correctly
  • Verify type mapping coverage for all ClickHouse types, especially edge cases where "datetime64" may interact with other wrappers

Poem

🐰✨ A rabbit hops through nested types with glee,
Unwrapping layers—Nullable, Array, see!
datetime64 now shines as TIMESTAMP so bright,
Inner types extracted with logic just right!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "fix(ibis): add support for ClickHouse wrapper types (Nullable, Array, LowCardinality)" is fully related to the main change in the changeset. The title clearly and specifically describes the primary objective: adding support for ClickHouse wrapper types, and it explicitly names the three wrapper types being addressed (Nullable, Array, LowCardinality). The changes directly reflect this intent by introducing a new _extract_inner_type helper method to extract base types from these wrappers and updating the type mapping logic accordingly. The title is concise, uses a conventional fix prefix, avoids vague language or noise, and provides sufficient specificity that a teammate reviewing the git history would immediately understand the core purpose of these changes.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c14e99 and add2d04.

📒 Files selected for processing (1)
  • ibis-server/app/model/metadata/clickhouse.py (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
ibis-server/app/model/metadata/clickhouse.py (2)
ibis-server/app/model/metadata/dto.py (2)
  • RustWrenEngineColumnType (13-58)
  • Column (61-67)
ibis-server/app/model/metadata/athena.py (1)
  • _transform_column_type (140-160)
🔇 Additional comments (2)
ibis-server/app/model/metadata/clickhouse.py (2)

34-34: Good addition: DateTime64 support.

Mapping "datetime64" to TIMESTAMP is correct for ClickHouse.


124-127: Using an inner-type extractor is the right approach.

This is fine once the extractor is correctly defined and always returns a string. Ensure the helper fix below is applied first.

Comment on lines +93 to 101
is_nullable = 'nullable(' in row["data_type"].lower()
unique_tables[schema_table].columns.append(
Column(
name=row["column_name"],
type=self._transform_column_type(row["data_type"]),
notNull=False,
notNull=not is_nullable,
description=row["column_comment"],
properties=None,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Nullability detection is incorrect for Array(Nullable(...)); only top-level Nullable should count.

Sub-string check flags arrays-of-nullable elements as nullable columns. Handle wrapper chain: Nullable(...), LowCardinality(Nullable(...)) → nullable; Array(Nullable(...)) → not nullable at column level.

Apply this minimal change here:

-            is_nullable = 'nullable(' in row["data_type"].lower()
+            is_nullable = self._is_column_nullable(row["data_type"])

Add this helper inside ClickHouseMetadata (see additional snippet below).

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In ibis-server/app/model/metadata/clickhouse.py around lines 93-101, the current
substring check flags types like Array(Nullable(...)) as nullable; add a helper
method on ClickHouseMetadata (e.g., is_top_level_nullable(data_type: str) ->
bool) that lowercases and trims the type and returns True only when the
outermost wrapper is Nullable(...) or when the outermost wrapper is
LowCardinality(...) whose immediate inner wrapper is Nullable(...); do not treat
Nullable(...) inside Array(...) as top-level nullable. Replace the current
'nullable = "nullable(" in row["data_type"].lower()' logic to call this helper
when computing notNull.

Comment on lines +138 to +158
def _extract_inner_type(self, data_type: str) -> str:
"""Extract the inner type from ClickHouse type definitions.
This handles types wrapped in Nullable(...), Array(...), etc.
Args:
data_type: The ClickHouse data type string
Returns:
The extracted inner type string
"""

if '(' in data_type and data_type.endswith(')'):
paren_start = data_type.find('(')
type_name = data_type[:paren_start].lower()
inner = data_type[paren_start + 1:-1]

if type_name in ['nullable', 'array', 'lowcardinality']:
return self._extract_inner_type(inner)
else:
return type_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Bug: _extract_inner_type is defined at module scope and can return None → runtime crash.

  • Defined outside the class but called as self._extract_inner_type(...) → AttributeError.
  • No default return when there are no parentheses → inner_type becomes None and .lower() at Line 126 crashes.

Replace this block with a method inside ClickHouseMetadata and guarantee a string return:

- def _extract_inner_type(self, data_type: str) -> str:
-         """Extract the inner type from ClickHouse type definitions.
- 
-         This handles types wrapped in Nullable(...), Array(...), etc.
- 
-         Args:
-             data_type: The ClickHouse data type string
- 
-         Returns:
-             The extracted inner type string
-         """
- 
-         if '(' in data_type and data_type.endswith(')'):
-             paren_start = data_type.find('(')
-             type_name = data_type[:paren_start].lower()
-             inner = data_type[paren_start + 1:-1]
- 
-             if type_name in ['nullable', 'array', 'lowcardinality']:
-                 return self._extract_inner_type(inner)
-             else:
-                 return type_name
+    def _extract_inner_type(self, data_type: str) -> str:
+        """Extract base type from ClickHouse wrappers like Nullable(...), Array(...), LowCardinality(...)."""
+        s = (data_type or "").strip()
+        while True:
+            if "(" in s and s.endswith(")"):
+                paren_start = s.find("(")
+                type_name = s[:paren_start].strip().lower()
+                inner = s[paren_start + 1 : -1].strip()
+                if type_name in ("nullable", "array", "lowcardinality"):
+                    s = inner
+                    continue
+                # Non-wrapper with params, e.g., Decimal(10,2) or FixedString(16)
+                return type_name
+            # No params/wrappers; return as-is (caller lowercases)
+            return s

Additionally, add this helper inside the same class to correctly compute column nullability:

def _is_column_nullable(self, data_type: str) -> bool:
    s = (data_type or "").strip()
    while True:
        s_l = s.lower()
        if s_l.startswith("nullable(") and s.endswith(")"):
            return True
        if s_l.startswith("lowcardinality(") and s.endswith(")"):
            # unwrap and continue; LowCardinality(Nullable(T)) => nullable
            s = s[s.find("(") + 1 : -1].strip()
            continue
        if s_l.startswith("array(") and s.endswith(")"):
            # Array(Nullable(T)) does not make the column nullable
            return False
        return False
🤖 Prompt for AI Agents
In ibis-server/app/model/metadata/clickhouse.py around lines 138 to 158, move
_extract_inner_type into the ClickHouseMetadata class (so it can be called as
self._extract_inner_type) and make it always return a string: when the input
contains parentheses unwrap permitted wrappers (nullable, array, lowcardinality)
recursively and return the inner type name lowercased; when there are no
parentheses simply return data_type.strip().lower() (never None). Also add the
_is_column_nullable helper as an instance method in the same class exactly as
described in the comment to correctly determine nullability (handle Nullable,
LowCardinality wrapping and Array semantics), and use these methods where inner
type and nullability are computed.

@goldmedal goldmedal changed the title Add support for ClickHouse wrapper types (Nullable, Array, LowCardinality) fix(ibis): add support for ClickHouse wrapper types (Nullable, Array, LowCardinality) Oct 30, 2025
@goldmedal
Copy link
Contributor

Thanks @rishabh1815769 for working on this. There is a ruff check failure. You can fix it using our formatting command.

  • Install just
  • Execute just fmt at ibis-server

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ibis python Pull requests that update Python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants