Merged
28 commits
24b735f
support_compression: initial
ilejn Aug 27, 2025
6a8badc
support_compression: slightly works
ilejn Aug 29, 2025
7b92ea0
support_compression: LZ4 seems to be working
ilejn Sep 17, 2025
da17f18
support_compression: code cleanup
ilejn Sep 17, 2025
a089e5d
support_compression: RowBinaryWithNamesAndTypes, tiny cleanup
ilejn Sep 17, 2025
64eab4d
support_compression: some doc, test
ilejn Sep 17, 2025
347db71
support_compression: need_more_input + minor improvements
ilejn Sep 19, 2025
937b78f
support_compression: minor things per windsurf-bot' review
ilejn Sep 19, 2025
db61e10
support_compression: irrelevant session_id change reverted, minor
ilejn Sep 25, 2025
b31874f
support_compression: one more null pointer check
ilejn Sep 25, 2025
b15b466
support_compression: destruction order fix (sanitizer complain)
ilejn Sep 30, 2025
19a0829
support_compression: add lz4 library
ilejn Oct 1, 2025
9e98da6
support_compression: include vector (for Win and mac)
ilejn Oct 2, 2025
638008d
support_compression: tiny cleanup
ilejn Oct 6, 2025
8c82dc3
support_compression: stream owning model modified
ilejn Oct 8, 2025
f0c1f5d
Revert "support_compression: stream owning model modified"
ilejn Oct 21, 2025
d2c8c05
Revert "support_compression: tiny cleanup"
ilejn Oct 21, 2025
fae23ed
Revert "support_compression: include vector (for Win and mac)"
ilejn Oct 21, 2025
3a6ce31
Revert "support_compression: add lz4 library"
ilejn Oct 21, 2025
d8942dd
Revert "support_compression: destruction order fix (sanitizer complain)"
ilejn Oct 21, 2025
76f625b
Revert "support_compression: one more null pointer check"
ilejn Oct 21, 2025
02dd8e4
Revert "support_compression: irrelevant session_id change reverted, m…
ilejn Oct 21, 2025
773ed70
Revert "support_compression: minor things per windsurf-bot' review"
ilejn Oct 21, 2025
6541bf9
Revert "support_compression: need_more_input + minor improvements"
ilejn Oct 21, 2025
1384159
Revert "support_compression: RowBinaryWithNamesAndTypes, tiny cleanup"
ilejn Oct 21, 2025
77e299c
Revert "support_compression: code cleanup"
ilejn Oct 21, 2025
a944fe6
Revert "support_compression: LZ4 seems to be working"
ilejn Oct 21, 2025
1ce050c
support_compression: reapplied everything non-LZ4 related
ilejn Oct 21, 2025
5 changes: 3 additions & 2 deletions README.md
@@ -109,8 +109,9 @@ The list of DSN parameters recognized by the driver is as follows:
| `HugeIntAsString` | `off` | Report integer column types that may underflow or overflow 64-bit signed integer (`SQL_BIGINT`) as a `String`/`SQL_VARCHAR` |
| `DriverLog` | `on` if `CMAKE_BUILD_TYPE` is `Debug`, `off` otherwise | Enable or disable the extended driver logging |
| `DriverLogFile` | `\temp\clickhouse-odbc-driver.log` on Windows, `/tmp/clickhouse-odbc-driver.log` otherwise | Path to the extended driver log file (used when `DriverLog` is `on`) |
| `AutoSessionId` | `off` | Auto generate session_id required to use some features of CH (e.g. TEMPORARY TABLE) |
| `ClientName` | empty | Sets additional information about the calling application. This string will be used as a prefix for the User-Agent header.
| `AutoSessionId` | `off` | Auto generate session_id required to use some features of CH (e.g. TEMPORARY TABLE) |
| `ClientName` | empty | Sets additional information about the calling application. This string will be used as a prefix for the User-Agent header. |
| `Compress` | `off` | Pass enable_http_compression=1 parameter to server |


### URL query string
4 changes: 3 additions & 1 deletion driver/config/config.cpp
@@ -253,7 +253,9 @@ key_value_map_t readDSNInfo(const std::string & dsn_utf8) {
INI_DRIVERLOG,
INI_DRIVERLOGFILE,
INI_AUTO_SESSION_ID,
INI_CLIENT_NAME
INI_CLIENT_NAME,
INI_COMPRESS,
INI_USE_COMPRESSION
}
) {
if (
4 changes: 4 additions & 0 deletions driver/config/ini_defines.h
@@ -33,6 +33,8 @@
#define INI_DRIVERLOGFILE "DriverLogFile"
#define INI_AUTO_SESSION_ID "AutoSessionId"
#define INI_CLIENT_NAME "ClientName"
#define INI_COMPRESS "Compress"
#define INI_USE_COMPRESSION "UseCompression"

#if defined(UNICODE)
# define INI_DSN_DEFAULT DSN_DEFAULT_UNICODE
@@ -54,6 +56,8 @@
#define INI_STRINGMAXLENGTH_DEFAULT "1048575"
#define INI_AUTO_SESSION_ID_DEFAULT "off"
#define INI_CLIENT_NAME_DEFAULT ""
#define INI_COMPRESS_DEFAULT "0"
#define INI_USE_COMPRESSION_DEFAULT "0"

#ifdef NDEBUG
# define INI_DRIVERLOG_DEFAULT "off"
21 changes: 21 additions & 0 deletions driver/connection.cpp
@@ -93,6 +93,7 @@ Poco::URI Connection::getUri() const {
bool database_set = false;
bool default_format_set = false;
bool session_id_set = false;
bool enable_http_compression_set = false;

for (const auto& parameter : uri.getQueryParameters()) {
if (Poco::UTF8::icompare(parameter.first, "default_format") == 0) {
@@ -118,6 +119,10 @@ Poco::URI Connection::getUri() const {
uri.addQueryParameter("session_id", session_id);
}

if (enable_http_compression && !enable_http_compression_set) {
uri.addQueryParameter("enable_http_compression", enable_http_compression ? "1" : "0");
}
Comment on lines +122 to +124
The current implementation only adds the enable_http_compression parameter to the URI when it's set to true. This means if compression is enabled by default on the server side and you want to explicitly disable it, setting enable_http_compression=0 in the connection string won't have any effect. Consider always adding the parameter when it's explicitly set (regardless of value):

Suggested change
if (enable_http_compression && !enable_http_compression_set) {
uri.addQueryParameter("enable_http_compression", enable_http_compression ? "1" : "0");
}
if (!enable_http_compression_set) {
uri.addQueryParameter("enable_http_compression", enable_http_compression ? "1" : "0");
}

Contributor Author

Yes. If compression is enabled by default on the server side, that is fine: the data will simply be compressed.
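
To illustrate the distinction raised in this thread, here is a minimal sketch, not the driver's code, in which the option is tri-state: unset, explicitly on, or explicitly off. The names `QueryParams`, `hasParam` and `addCompressionParam` are hypothetical; the driver itself stores a plain `bool` and builds the URL with `Poco::URI`, and it compares parameter names case-insensitively, which this sketch does not.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the URI query parameter list.
using QueryParams = std::vector<std::pair<std::string, std::string>>;

// Returns true if the parameter list already contains `name`
// (case-sensitive here, unlike the driver's icompare-based check).
inline bool hasParam(const QueryParams & params, const std::string & name) {
    for (const auto & p : params)
        if (p.first == name)
            return true;
    return false;
}

// Adds enable_http_compression only when the user set the option explicitly
// and the URL does not already define it. An unset option leaves the
// server-side default untouched.
inline void addCompressionParam(QueryParams & params, std::optional<bool> setting) {
    if (!setting.has_value())
        return;
    if (hasParam(params, "enable_http_compression"))
        return;
    params.emplace_back("enable_http_compression", *setting ? "1" : "0");
    assert(hasParam(params, "enable_http_compression"));
}
```

With this shape, an explicit "off" is sent as `enable_http_compression=0`, while an absent option sends nothing. Whether that extra state is worth carrying is a design choice; the merged code keeps the simpler boolean.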


return uri;
}

@@ -411,6 +416,20 @@ void Connection::setConfiguration(const key_value_map_t & cs_fields, const key_v
client_name = value;
}
}
else if (Poco::UTF8::icompare(key, INI_COMPRESS) == 0 || Poco::UTF8::icompare(key, INI_USE_COMPRESSION) == 0) {
Contributor

Do I get it right that:

  • "Compress" and "UseCompression" are synonymous?
    If yes, then please add that to the docs.

  • having either of those keys in the config without an actual value is equivalent to "NO"?
    If yes, that is somewhat frustrating, and it has to be either fixed or clearly specified in the docs.
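
The value handling asked about here can be sketched in isolation. This is a simplified, standard-library-only approximation of the validation in this hunk, not the driver's code: `parseCompressValue` and `toLower` are hypothetical names, and the set of accepted yes/no words is an assumption, since the driver's `isYes()` and `isYesOrNo()` helpers are not shown in this diff. The numeric check is also stricter than `Poco::NumberParser::tryParseUnsigned`, because it accepts only the literal strings "0" and "1".

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <optional>
#include <string>

// Lower-cases an ASCII string so comparisons are case-insensitive.
inline std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Returns std::nullopt for an invalid value, otherwise the parsed flag.
// The yes/no vocabulary below is assumed for illustration.
inline std::optional<bool> parseCompressValue(const std::string & value) {
    const std::string v = toLower(value);
    if (v.empty())
        return false;  // key present without a value: treated as "off"
    if (v == "1" || v == "yes" || v == "on" || v == "true")
        return true;
    if (v == "0" || v == "no" || v == "off" || v == "false")
        return false;
    return std::nullopt;  // anything else is rejected as a bad value
}
```

Under these assumptions, `Compress=` and `Compress=off` both yield `false`, which matches the behaviour the reviewer describes.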

recognized_key = true;
unsigned int typed_value = 0;
valid_value =
(value.empty() ||
(
Poco::NumberParser::tryParseUnsigned(value, typed_value) &&
(typed_value == 1 || typed_value == 0)
) ||
isYesOrNo(value));
if (valid_value) {
enable_http_compression = (typed_value == 1 || isYes(value));
}
}

return std::make_tuple(recognized_key, valid_value);
};
@@ -428,6 +447,8 @@ void Connection::setConfiguration(const key_value_map_t & cs_fields, const key_v
const auto & recognized_key = std::get<0>(res);
const auto & valid_value = std::get<1>(res);

// LOG("DSN: known attribute '" << key << "', valid value, '" << valid_value << "'");

if (recognized_key) {
if (!valid_value)
throw std::runtime_error("DSN: bad value '" + value + "' for attribute '" + key + "'");
1 change: 1 addition & 0 deletions driver/connection.h
@@ -43,6 +43,7 @@ class Connection
std::int32_t stringmaxlength = 0;
bool auto_session_id = false;
std::string client_name;
bool enable_http_compression = false;

public:
std::unique_ptr<Poco::Net::HTTPClientSession> session;
9 changes: 9 additions & 0 deletions driver/format/ODBCDriver2.cpp
@@ -254,6 +254,15 @@ ODBCDriver2ResultReader::ODBCDriver2ResultReader(const std::string & timezone_,
result_set = std::make_unique<ODBCDriver2ResultSet>(timezone, stream, releaseMutator());
}

ODBCDriver2ResultReader::ODBCDriver2ResultReader(const std::string & timezone_, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator, std::unique_ptr<std::istream> && inflating_input_stream)
: ResultReader(timezone_, raw_stream, std::move(mutator), std::move(inflating_input_stream))
{
if (stream.eof())
return;

result_set = std::make_unique<ODBCDriver2ResultSet>(timezone, stream, releaseMutator());
}

bool ODBCDriver2ResultReader::advanceToNextResultSet() {
// ODBCDriver2 format doesn't support multiple result sets in the response,
// so only a basic cleanup is done here.
2 changes: 1 addition & 1 deletion driver/format/ODBCDriver2.h
@@ -1,6 +1,5 @@
#pragma once

#include "driver/platform/platform.h"
#include "driver/result_set.h"

// Implementation of ResultSet for ODBCDriver2 wire format of ClickHouse.
@@ -63,6 +62,7 @@ class ODBCDriver2ResultReader
{
public:
explicit ODBCDriver2ResultReader(const std::string & timezone_, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator);
explicit ODBCDriver2ResultReader(const std::string & timezone_, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator, std::unique_ptr<std::istream> && inflating_input_stream);
virtual ~ODBCDriver2ResultReader() override = default;

virtual bool advanceToNextResultSet() override;
10 changes: 10 additions & 0 deletions driver/format/RowBinaryWithNamesAndTypes.cpp
@@ -314,6 +314,16 @@ RowBinaryWithNamesAndTypesResultReader::RowBinaryWithNamesAndTypesResultReader(c
result_set = std::make_unique<RowBinaryWithNamesAndTypesResultSet>(timezone, stream, releaseMutator());
}

RowBinaryWithNamesAndTypesResultReader::RowBinaryWithNamesAndTypesResultReader(const std::string & timezone_, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator, std::unique_ptr<std::istream> && inflating_input_stream)
: ResultReader(timezone_, raw_stream, std::move(mutator), std::move(inflating_input_stream))
{
if (stream.eof())
return;

result_set = std::make_unique<RowBinaryWithNamesAndTypesResultSet>(timezone, stream, releaseMutator());

}

bool RowBinaryWithNamesAndTypesResultReader::advanceToNextResultSet() {
// RowBinaryWithNamesAndTypes format doesn't support multiple result sets in the response,
// so only a basic cleanup is done here.
1 change: 1 addition & 0 deletions driver/format/RowBinaryWithNamesAndTypes.h
@@ -76,6 +76,7 @@ class RowBinaryWithNamesAndTypesResultReader
{
public:
explicit RowBinaryWithNamesAndTypesResultReader(const std::string & timezone, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator);
explicit RowBinaryWithNamesAndTypesResultReader(const std::string & timezone, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator, std::unique_ptr<std::istream> && inflating_input_stream);
virtual ~RowBinaryWithNamesAndTypesResultReader() override = default;

virtual bool advanceToNextResultSet() override;
38 changes: 32 additions & 6 deletions driver/result_set.cpp
@@ -1,4 +1,5 @@
#include "driver/result_set.h"
#include "driver/driver.h"
#include "driver/format/ODBCDriver2.h"
#include "driver/format/RowBinaryWithNamesAndTypes.h"

@@ -308,6 +309,13 @@ ResultReader::ResultReader(const std::string & timezone_, std::istream & raw_str
{
}

ResultReader::ResultReader(const std::string & timezone_, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator, std::unique_ptr<std::istream> && inflating_input_stream_)
: timezone(timezone_)
, stream(raw_stream, std::move(inflating_input_stream_))
, result_mutator(std::move(mutator))
{
}

bool ResultReader::hasResultSet() const {
return static_cast<bool>(result_set);
}
@@ -323,15 +331,33 @@ std::unique_ptr<ResultMutator> ResultReader::releaseMutator() {
return std::move(result_mutator);
}

std::unique_ptr<ResultReader> make_result_reader(const std::string & format, const std::string & timezone, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator) {
if (format == "ODBCDriver2") {
return std::make_unique<ODBCDriver2ResultReader>(timezone, raw_stream, std::move(mutator));
std::unique_ptr<ResultReader>
make_result_reader(const std::string &format, const std::string &timezone,
const std::string &compression, std::istream &raw_stream,
std::unique_ptr<ResultMutator> &&mutator) {
std::istream * stream_ptr = nullptr;
std::unique_ptr<std::istream> inflating_input_stream;
Contributor

Also, this ownership model seems a bit clumsy. Maybe it would be worth moving ownership of the wrapping stream to the AmortizedIStreamReader and passing it through the ResultReader? On the downside, that would make AmortizedIStreamReader a tiny bit more complex.

Contributor Author

Well spotted, the ownership model has proved to be error-prone.
It is implemented this way because ResultReader is a kind of wrapper around AmortizedIStreamReader that acts as a resource holder, and because I tried to keep AmortizedIStreamReader intact.

I moved the holder from ResultReader to AmortizedIStreamReader.
Besides this, I switched back to a reference for the raw_stream_ member of AmortizedIStreamReader. It does not change behavior; it just reduces the number of modifications.
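
The ownership arrangement discussed in this thread can be sketched with the standard library alone. This is an illustration under stated assumptions, not the driver's classes: `OwningReader` and `makeWrappedReader` are hypothetical names, and `std::istringstream` stands in for a decompressing wrapper such as `Poco::InflatingInputStream`.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical reader: always reads through the reference `in_`, and
// optionally keeps a wrapping stream alive via `holder_`. The holder is
// empty when the caller owns the stream.
class OwningReader {
public:
    explicit OwningReader(std::istream & in, std::unique_ptr<std::istream> holder = nullptr)
        : in_(in), holder_(std::move(holder)) {}

    std::string readLine() {
        std::string line;
        std::getline(in_, line);
        return line;
    }

    bool ownsStream() const { return static_cast<bool>(holder_); }

private:
    std::istream & in_;
    std::unique_ptr<std::istream> holder_;
};

// Builds a reader over a heap-allocated wrapper stream and hands the
// wrapper's ownership to the reader.
inline OwningReader makeWrappedReader(const std::string & payload) {
    auto wrapper = std::make_unique<std::istringstream>(payload);
    assert(wrapper != nullptr);
    std::istream & ref = *wrapper;  // take the reference before moving ownership
    return OwningReader(ref, std::move(wrapper));
}
```

The reference is taken before the `unique_ptr` is moved, so the reader never dereferences a moved-from pointer, and the heap-allocated stream keeps its address for as long as the reader holds it.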


if (compression == "gzip" || compression == "deflate") {
inflating_input_stream = make_unique<Poco::InflatingInputStream>(raw_stream, Poco::InflatingStreamBuf::STREAM_GZIP);
stream_ptr = inflating_input_stream.get();
} else {
if (!compression.empty())
LOG("Unknown compression method, assuming uncompressed");
stream_ptr = &raw_stream;
Contributor

Just in case: if the remote party sends compression="foobar", it is silently assumed to mean "no compression". I suggest either throwing an error or at least writing a warning to the log.
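
The stricter of the two options suggested here, rejecting an unrecognized encoding, could look like the sketch below. It is a standalone illustration, not part of this diff: `ContentEncoding` and `parseContentEncoding` are hypothetical names, and accepting "identity" as an alias for "no compression" is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

enum class ContentEncoding { Identity, Gzip, Deflate };

// Maps a Content-Encoding header value to a decoder choice and rejects
// anything unrecognized instead of guessing. Matching is case-insensitive.
inline ContentEncoding parseContentEncoding(std::string value) {
    std::transform(value.begin(), value.end(), value.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    if (value.empty() || value == "identity")
        return ContentEncoding::Identity;
    if (value == "gzip")
        return ContentEncoding::Gzip;
    if (value == "deflate")
        return ContentEncoding::Deflate;
    throw std::runtime_error("Unsupported Content-Encoding: '" + value + "'");
}
```

A caller would then pick the stream wrapper from the returned enumerator. In Poco terms, `Gzip` would presumably map to `Poco::InflatingStreamBuf::STREAM_GZIP` and `Deflate` to `STREAM_ZLIB`; that mapping is an assumption here, not something this diff states.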

}
else if (format == "RowBinaryWithNamesAndTypes") {

if (format == "ODBCDriver2") {
return std::make_unique<ODBCDriver2ResultReader>(
timezone, *stream_ptr, std::move(mutator), std::move(inflating_input_stream));
} else if (format == "RowBinaryWithNamesAndTypes") {
if (!isLittleEndian())
throw std::runtime_error("'" + format + "' format is supported only on little-endian platforms");
throw std::runtime_error(
"'" + format +
"' format is supported only on little-endian platforms");

return std::make_unique<RowBinaryWithNamesAndTypesResultReader>(timezone, raw_stream, std::move(mutator));
return std::make_unique<RowBinaryWithNamesAndTypesResultReader>(
timezone, *stream_ptr, std::move(mutator), std::move(inflating_input_stream));
}

throw std::runtime_error("'" + format + "' format is not supported");
13 changes: 11 additions & 2 deletions driver/result_set.h
@@ -14,6 +14,8 @@
#include <variant>
#include <vector>

#include "Poco/InflatingStream.h"

extern const std::string::size_type initial_string_capacity_g;

class ColumnInfo {
@@ -135,6 +137,7 @@ class ResultSet {
class ResultReader {
protected:
explicit ResultReader(const std::string & timezone_, std::istream & stream, std::unique_ptr<ResultMutator> && mutator);
explicit ResultReader(const std::string & timezone_, std::istream & stream, std::unique_ptr<ResultMutator> && mutator, std::unique_ptr<std::istream> && inflating_input_stream);

public:
virtual ~ResultReader() = default;
@@ -153,10 +156,16 @@ class ResultReader {
std::unique_ptr<ResultSet> result_set;
};

std::unique_ptr<ResultReader> make_result_reader(const std::string & format, const std::string & timezone, std::istream & raw_stream, std::unique_ptr<ResultMutator> && mutator);
std::unique_ptr<ResultReader> make_result_reader(
const std::string & format,
const std::string & timezone,
const std::string & compression,
std::istream & raw_stream,
std::unique_ptr<ResultMutator> && mutator);

template <typename ConversionContext>
SQLRETURN Field::extract(BindingInfo & binding_info, ConversionContext && context) const {
SQLRETURN Field::extract(BindingInfo & binding_info, ConversionContext && context) const
{
return std::visit([&binding_info, &context] (auto & value) {
if constexpr (std::is_same_v<DataSourceType<DataSourceTypeId::Nothing>, std::decay_t<decltype(value)>>) {
return fillOutputNULL(binding_info.value, binding_info.value_max_size, binding_info.indicator);
2 changes: 2 additions & 0 deletions driver/statement.cpp
@@ -134,6 +134,7 @@ void Statement::requestNextPackOfResultSets(std::unique_ptr<ResultMutator> && mu
request.setHost(uri.getHost());
request.setURI(uri.getPathEtc());
request.set("User-Agent", connection.buildUserAgentString());
request.set("Accept-Encoding", "gzip, deflate");

LOG(request.getMethod() << " " << request.getHost() << request.getURI() << " body=" << prepared_query
<< " UA=" << request.get("User-Agent"));
@@ -190,6 +191,7 @@ void Statement::requestNextPackOfResultSets(std::unique_ptr<ResultMutator> && mu
result_reader = make_result_reader(
response->get("X-ClickHouse-Format", connection.default_format),
response->get("X-ClickHouse-Timezone", Poco::Timezone::name()),
response->get("Content-Encoding", ""),
*in, std::move(mutator)
);

9 changes: 8 additions & 1 deletion driver/test/misc_it.cpp
@@ -264,7 +264,14 @@ INSTANTIATE_TEST_SUITE_P(

std::make_tuple("AllGood_ClientName_Empty", "ClientName=", FailOn::Never),
std::make_tuple("AllGood_ClientName_Plain", "ClientName=TestApp/0.1 (TestOS)", FailOn::Never),
std::make_tuple("AllGood_ClientName_Wrapped", "ClientName={TestApp/0.1; (TestOS)}", FailOn::Never)
std::make_tuple("AllGood_ClientName_Wrapped", "ClientName={TestApp/0.1; (TestOS)}", FailOn::Never),

std::make_tuple("AllGood_Compress_Empty", "Compress=", FailOn::Never),
std::make_tuple("AllGood_Compress_On", "Compress=on", FailOn::Never),
std::make_tuple("AllGood_Compress_Off", "Compress=off", FailOn::Never),
std::make_tuple("AllGood_UseCompression_Empty", "UseCompression=", FailOn::Never),
std::make_tuple("AllGood_UseCompression_On", "UseCompression=on", FailOn::Never),
std::make_tuple("AllGood_UseCompression_Off", "UseCompression=off", FailOn::Never)
),
[] (const auto & param_info) {
return std::get<0>(param_info.param);
6 changes: 6 additions & 0 deletions driver/utils/amortized_istream_reader.h
@@ -21,6 +21,11 @@ class AmortizedIStreamReader
{
}

explicit AmortizedIStreamReader(std::istream & raw_stream, std::unique_ptr<std::istream> stream_holder)
: raw_stream_(raw_stream), stream_holder_(std::move(stream_holder))
{
}

~AmortizedIStreamReader() {
// Put back any pre-read characters, just in case...
// (it should be done in reverse order)
@@ -126,4 +131,5 @@ class AmortizedIStreamReader
std::istream & raw_stream_;
std::size_t offset_ = 0;
std::string buffer_;
std::unique_ptr<std::istream> stream_holder_; // can be empty if ownership is managed externally
};
1 change: 1 addition & 0 deletions packaging/odbc.ini.sample
@@ -42,6 +42,7 @@ Description = DSN (localhost) for ClickHouse ODBC Driver (ANSI)
# # Path to the extended driver log file
# AutoSessionId = off # Auto generate session_id required to use some features of CH (e.g. TEMPORARY TABLE)
# ClientName = # Sets additional information about the calling application. This string will be used as a prefix for the User-Agent header.
# Compress = off # Pass enable_http_compression=1 parameter to server

[ClickHouse DSN (Unicode)]
Driver = ClickHouse ODBC Driver (Unicode)