-
Notifications
You must be signed in to change notification settings - Fork 618
Open
Labels
Description
Current Limitation
The insert() method only supports direct table insertion, not SELECT statements
with transformations.
Desired Enhancement
Allow insert() to accept SELECT statements with input() function:
client.insert()
.table("target_table")
.query("SELECT col1, col2, col1 + col2 as computed FROM input('col1 Int32, col2 Int32')")
.data(inputStream)
.format(ClickHouseFormat.CSV)
.send();Or alternative syntax:
client.insert("INSERT INTO target_table SELECT col1, col2, col1 + col2 as computed FROM input('col1 Int32, col2 Int32')")
.data(inputStream)
.format(ClickHouseFormat.CSV)
.send();Use Case
We need to stream data with computed columns during insertion:
- Stream large datasets efficiently
- Apply transformations/computed columns on the fly
- Leverage ClickHouse's input() function capabilities
- Maintain the performance benefits of the streaming insert() API
Business Value
This is a common pattern in ClickHouse for ETL workloads where data needs
transformation during insertion without materializing intermediate results.
Current Workaround (Working but Undocumented)
We're successfully using this pattern in production:
String loadQuery = tableName + "(" + headers + ")" +
" select " + parametersQuery + "* from input('" + csvHeadersStructure + "')";
try (InputStream inputStream = Files.newInputStream(Paths.get(filepath))) {
InsertResponse response = client.insert(loadQuery, inputStream, ClickHouseFormat.CSVWithNames)
.get(1200, TimeUnit.SECONDS);
}This works but:
- Not documented in official API docs
- Unclear if this is intended behavior or might break in future versions
- No IDE autocomplete/type safety for this pattern