For a given query execution, allow data to be returned in columnar format and deserialized to numpy objects #120

Merged
merged 8 commits into from
Apr 4, 2025
docs: Update changelog and add documentation on set_query_execution_args
paxcodes committed Apr 3, 2025
commit 3ad3fd379b9d1c287b40c69d0612ec1bd28cd417
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,7 @@
### 1.4.0

- feat: #119 Allow query results returned in columns and deserialized to `numpy` objects

### 1.3.2

- feat(aggregation-function): add anyLast function.
55 changes: 55 additions & 0 deletions README.md
@@ -29,6 +29,7 @@ Read [Documentation](https://github.com/jayvynl/django-clickhouse-backend/blob/m
- Support most clickhouse data types.
- Support [SETTINGS in SELECT Query](https://clickhouse.com/docs/en/sql-reference/statements/select/#settings-in-select-query).
- Support [PREWHERE clause](https://clickhouse.com/docs/en/sql-reference/statements/select/prewhere).
- Support query results returned in columns and [deserialized to `numpy` objects](https://clickhouse-driver.readthedocs.io/en/latest/features.html#numpy-pandas-support).

**Notes:**

@@ -381,6 +382,60 @@ and [distributed table engine](https://clickhouse.com/docs/en/engines/table-engi
The following example assumes that a cluster defined by [docker compose in this repository](https://github.com/jayvynl/django-clickhouse-backend/blob/main/compose.yaml) is used.
The cluster is named `cluster`; it has 2 shards, and every shard has 2 replicas.

Query results returned as columns and/or deserialized into `numpy` objects
---

`clickhouse-driver` can return results as columns and/or deserialize them into
`numpy` objects. This backend exposes both options through the context manager
`Cursor.set_query_execution_args()`.

```python
import numpy as np
from django.db import connection

sql = """
SELECT toDateTime32('2022-01-01 01:00:05', 'UTC'), number, number*2.5
FROM system.numbers
LIMIT 3
"""
with connection.cursor() as cursorWrapper:
    # set_query_execution_args() lives on the underlying driver cursor,
    # which the Django cursor wrapper exposes as `.cursor`.
    with cursorWrapper.cursor.set_query_execution_args(
        columnar=True, use_numpy=True
    ) as cursor:
        cursor.execute(sql)
        np.testing.assert_equal(
            cursor.fetchall(),
            [
                np.array(
                    [
                        np.datetime64("2022-01-01T01:00:05"),
                        np.datetime64("2022-01-01T01:00:05"),
                        np.datetime64("2022-01-01T01:00:05"),
                    ],
                    dtype="datetime64[s]",
                ),
                np.array([0, 1, 2], dtype=np.uint64),
                np.array([0, 2.5, 5.0], dtype=np.float64),
            ],
        )

        cursor.execute(sql)
        # With columnar=True, fetchmany(2) yields the first two columns,
        # each holding all three values, rather than two rows.
        np.testing.assert_equal(
            cursor.fetchmany(2),
            [
                np.array(
                    [
                        np.datetime64("2022-01-01T01:00:05"),
                        np.datetime64("2022-01-01T01:00:05"),
                        np.datetime64("2022-01-01T01:00:05"),
                    ],
                    dtype="datetime64[s]",
                ),
                np.array([0, 1, 2], dtype=np.uint64),
            ],
        )
```
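
The expected output of `fetchmany(2)` in the example contains two whole columns rather than two rows. The sketch below illustrates the difference between the row-oriented and columnar layouts in plain Python. It is only an illustration: the data is made up to mirror the query, and `rows_to_columns` is a hypothetical helper, not part of `clickhouse-driver` or this backend.

```python
# Illustrative only: plain-Python stand-ins for the two result layouts.
# `rows_to_columns` is a hypothetical helper, not a library function.


def rows_to_columns(rows):
    """Transpose row-oriented results into one tuple per column."""
    return list(zip(*rows))


# Row-oriented layout: one tuple per result row (the usual DB-API shape).
rows = [
    ("2022-01-01T01:00:05", 0, 0.0),
    ("2022-01-01T01:00:05", 1, 2.5),
    ("2022-01-01T01:00:05", 2, 5.0),
]

# Columnar layout: one sequence per selected expression.
columns = rows_to_columns(rows)

# The second column holds every value of `number`.
print(columns[1])

# Taking the first two entries of a columnar result keeps whole columns,
# so each entry still has one value per row.
first_two = columns[:2]
print(len(first_two), [len(col) for col in first_two])
```

With `use_numpy=True`, each column in the README example is a `numpy` array instead of a tuple, but the orientation is the same.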

### Configuration

```python