Refactor and Improve streaming engines Kafka/RabbitMQ/NATS and data formats#42777
Avogar merged 33 commits into ClickHouse:master
/// Need for backward compatibility.
if (format_name == "Avro" && local_context->getSettingsRef().output_format_avro_rows_in_file.changed)
    max_rows = local_context->getSettingsRef().output_format_avro_rows_in_file.value;
This is temporary. I will make a PR to deprecate the setting output_format_avro_rows_in_file after this PR, because this setting doesn't make sense anymore.
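The backward-compatibility override in the snippet above can be sketched in isolation. This is only an illustration under stated assumptions: `SettingUInt64` and `chooseMaxRows` are hypothetical names, not ClickHouse's actual types or functions; the sketch only mirrors the `value`/`changed` pair that the diff reads from the setting.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a setting: its value plus a flag that records
// whether the user set it explicitly (assumption, not the real ClickHouse type).
struct SettingUInt64
{
    uint64_t value = 1;
    bool changed = false;
};

// Hypothetical helper: choose the rows-per-message limit for a format.
uint64_t chooseMaxRows(
    const std::string & format_name,
    uint64_t engine_max_rows,
    const SettingUInt64 & avro_rows_in_file)
{
    uint64_t max_rows = engine_max_rows;

    // Backward compatibility: an explicitly set legacy Avro setting
    // overrides the engine-level limit, and only for the Avro format.
    if (format_name == "Avro" && avro_rows_in_file.changed)
        max_rows = avro_rows_in_file.value;

    return max_rows;
}
```

Checking `changed` rather than comparing `value` against a default means the legacy setting only takes effect when a user actually set it, so existing configurations keep working until the setting is deprecated.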
Quite a big change; I have not finished reading it yet. We also need to run benchmarks before merging.
It would be great if you could help me with it.
I checked the performance difference in Kafka and found that async producing is less efficient. It seems the overhead of copying each message from the buffer to the queue outweighs all the benefits. I will think about how to optimize producing later. Let's make the Kafka producing process single-threaded, as it was before.
Hm, no. Initially, asynchronous writing seemed preferable because the RabbitMQ library is event-based and requires running event loops, but we do indeed need to compare and check the difference.
Revert some changes from #42777 to fix performance tests
Fixes: ClickHouse#42777 Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Fixes: ClickHouse#42777 Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> (cherry picked from commit 51d4f58)
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Refactor and Improve streaming engines Kafka/RabbitMQ/NATS and add support for all formats, also refactor formats a bit:
max_block_size.
kafka_max_rows_per_message/rabbitmq_max_rows_per_message/nats_max_rows_per_message. They control the number of rows formatted in one message in row-based formats. Default value: 1.
CC: @filimonov, @kssenii
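To illustrate what a rows-per-message limit means for row-based formats, here is a minimal sketch that groups already formatted rows into message payloads. `splitIntoMessages` is a hypothetical helper, and joining rows with a newline is an assumption made for the example; the actual engines format each message through the output format itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: pack formatted rows into messages, with at most
// max_rows_per_message rows in each. The default of 1 matches the default
// stated in the description above.
std::vector<std::string> splitIntoMessages(
    const std::vector<std::string> & rows,
    size_t max_rows_per_message = 1)
{
    // Treat 0 as 1 so the loop below always advances (assumption for the sketch).
    if (max_rows_per_message == 0)
        max_rows_per_message = 1;

    std::vector<std::string> messages;
    for (size_t begin = 0; begin < rows.size(); begin += max_rows_per_message)
    {
        const size_t end = std::min(rows.size(), begin + max_rows_per_message);

        // Each message is a complete unit: every row it holds is terminated.
        std::string payload;
        for (size_t i = begin; i < end; ++i)
        {
            payload += rows[i];
            payload += '\n';
        }
        messages.push_back(std::move(payload));
    }
    return messages;
}
```

With three rows and the default limit of 1, the sketch yields three messages; with a limit of 2 it yields two, the last one holding the single remaining row.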