An async http(s) ClickHouse client for python 3.6+ supporting type conversion in both directions, streaming, lazy decoding on select queries, and a fully typed interface.
You can use it with either aiohttp or httpx http connectors.
To use with aiohttp
install it with command:
> pip install aiochclient[aiohttp]
Or aiochclient[aiohttp-speedups]
to install with extra speedups.
To use with httpx
install it with command:
> pip install aiochclient[httpx]
Or aiochclient[httpx-speedups]
to install with extra speedups.
Installing with [*-speedups]
adds the following:
- cChardet for
aiohttp
speedup - aiodns for
aiohttp
speedup - ciso8601 for ultra-fast datetime
parsing while decoding data from ClickHouse for
aiohttp
andhttpx
.
Additionally the installation process attempts to use Cython for a speed boost (roughly 30% faster).
aiochclient
needs aiohttp.ClientSession
or httpx.AsyncClient
to connect to ClickHouse:
from aiochclient import ChClient
from aiohttp import ClientSession
async def main():
async with ClientSession() as s:
client = ChClient(s)
assert await client.is_alive() # returns True if connection is Ok
await client.execute(
"CREATE TABLE t (a UInt8, b Tuple(Date, Nullable(Float32))) ENGINE = Memory"
)
For INSERT queries you can pass values as *args
. Values should be
iterables:
await client.execute(
"INSERT INTO t VALUES",
(1, (dt.date(2018, 9, 7), None)),
(2, (dt.date(2018, 9, 8), 3.14)),
)
For fetching all rows at once use the
fetch
method:
all_rows = await client.fetch("SELECT * FROM t")
For fetching first row from result use the
fetchrow
method:
row = await client.fetchrow("SELECT * FROM t WHERE a=1")
assert row[0] == 1
assert row["b"] == (dt.date(2018, 9, 7), None)
You can also use
fetchval
method, which returns first value of the first row from query result:
val = await client.fetchval("SELECT b FROM t WHERE a=2")
assert val == (dt.date(2018, 9, 8), 3.14)
With async iteration on the query results stream you can fetch multiple rows without loading them all into memory at once:
async for row in client.iterate(
"SELECT number, number*2 FROM system.numbers LIMIT 10000"
):
assert row[0] * 2 == row[1]
Use fetch
/fetchrow
/fetchval
/iterate
for SELECT queries and execute
or
any of last for INSERT and all another queries.
All fetch queries return rows as lightweight, memory efficient objects. Before
v1.0.0
rows were only returned as tuples. All rows have a full mapping interface, where you can
get fields by names or indexes:
row = await client.fetchrow("SELECT a, b FROM t WHERE a=1")
assert row["a"] == 1
assert row[0] == 1
assert row[:] == (1, (dt.date(2018, 9, 8), 3.14))
assert list(row.keys()) == ["a", "b"]
assert list(row.values()) == [1, (dt.date(2018, 9, 8), 3.14)]
To check out the api docs, visit the readthedocs site..
aiochclient
automatically converts types from ClickHouse to python types and
vice-versa.
ClickHouse type | Python type |
---|---|
Bool |
bool |
UInt8 |
int |
UInt16 |
int |
UInt32 |
int |
UInt64 |
int |
UInt128 |
int |
UInt256 |
int |
Int8 |
int |
Int16 |
int |
Int32 |
int |
Int64 |
int |
Int128 |
int |
Int256 |
int |
Float32 |
float |
Float64 |
float |
String |
str |
FixedString |
str |
Enum8 |
str |
Enum16 |
str |
Date |
datetime.date |
DateTime |
datetime.datetime |
DateTime64 |
datetime.datetime |
Decimal |
decimal.Decimal |
Decimal32 |
decimal.Decimal |
Decimal64 |
decimal.Decimal |
Decimal128 |
decimal.Decimal |
IPv4 |
ipaddress.IPv4Address |
IPv6 |
ipaddress.IPv6Address |
UUID |
uuid.UUID |
Nothing |
None |
Tuple(T1, T2, ...) |
Tuple[T1, T2, ...] |
Array(T) |
List[T] |
Nullable(T) |
None or T |
LowCardinality(T) |
T |
Map(T1, T2) |
Dict[T1, T2] |
Nested(T1, T2, ...) |
List[Tuple[T1, T2, ...], Tuple[T1, T2, ...]] |
aiochclient
uses the
aiohttp.TCPConnector
to determine pool size. By default, the pool limit is 100 open connections.
It's highly recommended using uvloop
and installing aiochclient
with
speedups for the sake of speed. Some recent benchmarks on our
machines without parallelization:
- 180k-220k rows/sec on SELECT
- 50k-80k rows/sec on INSERT
Note: these benchmarks are system dependent