Commit 989e314

Merge branch 'main' into aashishkohli-patch-4

2 parents: 741c356 + c28c0d7
3 files changed: +28 -11 lines changed

docs/cloud/manage/backups/export-backups-to-own-cloud-account.md

Lines changed: 11 additions & 2 deletions
@@ -12,12 +12,16 @@ import EnterprisePlanFeatureBadge from '@theme/badges/EnterprisePlanFeatureBadge
 ClickHouse Cloud supports taking backups to your own cloud service provider (CSP) account (AWS S3, Google Cloud Storage, or Azure Blob Storage).
 For details of how ClickHouse Cloud backups work, including "full" vs. "incremental" backups, see the [backups](overview.md) docs.

-Here we show examples of how to take full and incremental backups to AWS, GCP, Azure object storage as well as how to restore from the backups.
+Here we show examples of how to take full and incremental backups to AWS, GCP, and Azure object storage, as well as how to restore from those backups. The BACKUP commands listed below are run within the original service. The RESTORE commands are run from the new service into which the backup should be restored.

 :::note
 Users should be aware that any usage where backups are exported to a different region in the same cloud provider, or to another cloud provider (in the same or different region), will incur [data transfer](../network-data-transfer.mdx) charges.
 :::

+:::note
+Backup / restore into your own bucket is currently not supported for services using [TDE](https://clickhouse.com/docs/cloud/security/cmek#transparent-data-encryption-tde).
+:::
+
 ## Requirements {#requirements}

 You will need the following details to export/restore backups to your own CSP storage bucket.
@@ -58,7 +62,12 @@ You will need the following details to export/restore backups to your own CSP st

 <hr/>

-# Backup / restore
+# Backup / restore {#backup--restore}
+
+:::note
+1. To restore a backup from your own bucket into a new service, you will need to update the trust policy of your backup storage bucket to allow access from the new service.
+2. The backup / restore commands need to be run from the database command line. To restore to a new service, first create the service and then run the command.
+:::

 ## Backup / restore to AWS S3 bucket {#backup--restore-to-aws-s3-bucket}

docs/integrations/data-ingestion/clickpipes/kafka.md

Lines changed: 15 additions & 9 deletions
@@ -101,7 +101,7 @@ without an embedded schema id, then the specific schema ID or subject must be sp

 11. **Congratulations!** You have successfully set up your first ClickPipe. If this is a streaming ClickPipe, it will run continuously, ingesting data in real time from your remote data source.

-## Supported Data Sources {#supported-data-sources}
+## Supported data sources {#supported-data-sources}

 | Name |Logo|Type| Status | Description |
 |----------------------|----|----|-----------------|------------------------------------------------------------------------------------------------------|
@@ -114,14 +114,14 @@ without an embedded schema id, then the specific schema ID or subject must be sp

 More connectors will be added to ClickPipes; you can find out more by [contacting us](https://clickhouse.com/company/contact?loc=clickpipes).

-## Supported Data Formats {#supported-data-formats}
+## Supported data formats {#supported-data-formats}

 The supported formats are:
 - [JSON](../../../interfaces/formats.md/#json)
 - [AvroConfluent](../../../interfaces/formats.md/#data-format-avro-confluent)


-### Supported Data Types {#supported-data-types}
+### Supported data types {#supported-data-types}

 #### Standard types support {#standard-types-support}
 The following standard ClickHouse data types are currently supported in ClickPipes:
@@ -169,7 +169,7 @@ Note that you will have to manually change the destination column to the desired

 ClickPipes supports all Avro Primitive and Complex types, and all Avro Logical types except `time-millis`, `time-micros`, `local-timestamp-millis`, `local-timestamp-micros`, and `duration`. Avro `record` types are converted to Tuple, `array` types to Array, and `map` to Map (string keys only). In general the conversions listed [here](/interfaces/formats/Avro#data-types-matching) are available. We recommend using exact type matching for Avro numeric types, as ClickPipes does not check for overflow or precision loss on type conversion.

-#### Nullable Types and Avro Unions {#nullable-types-and-avro-unions}
+#### Nullable types and Avro unions {#nullable-types-and-avro-unions}

 Nullable types in Avro are defined by using a Union schema of `(T, null)` or `(null, T)`, where T is the base Avro type. During schema inference, such unions will be mapped to a ClickHouse "Nullable" column. Note that ClickHouse does not support
 `Nullable(Array)`, `Nullable(Map)`, or `Nullable(Tuple)` types. Avro null unions for these types will be mapped to non-nullable versions (Avro Record types are mapped to a ClickHouse named Tuple). Avro "nulls" for these types will be inserted as:
@@ -179,7 +179,7 @@ Nullable types in Avro are defined by using a Union schema of `(T, null)` or `(n

 ClickPipes does not currently support schemas that contain other Avro Unions (this may change in the future with the maturity of the new ClickHouse Variant and JSON data types). If the Avro schema contains a "non-null" union, ClickPipes will generate an error when attempting to calculate a mapping between the Avro schema and ClickHouse column types.

-#### Avro Schema Management {#avro-schema-management}
+#### Avro schema management {#avro-schema-management}

 ClickPipes dynamically retrieves and applies the Avro schema from the configured Schema Registry using the schema ID embedded in each message/event. Schema updates are detected and processed automatically.
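For readers skimming the diff, a small sketch of the union handling described in the two hunks above may help. The Avro schema fragments are written as Python dicts, and the field names are hypothetical.

```python
# Avro schema fragments as Python dicts; field names are hypothetical.

# A (null, T) union: inferred by ClickPipes as a Nullable(String) column.
nullable_field = {"name": "comment", "type": ["null", "string"]}

# A "non-null" union of two base types: not currently supported, and the
# schema mapping step will produce an error.
unsupported_field = {"name": "amount", "type": ["long", "string"]}
```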

@@ -190,7 +190,7 @@ The following rules are applied to the mapping between the retrieved Avro schema
 - If the Avro schema is missing a field defined in the ClickHouse destination mapping, the ClickHouse column will be populated with a "zero" value, such as 0 or an empty string. Note that [DEFAULT](/sql-reference/statements/create/table#default) expressions are not currently evaluated for ClickPipes inserts (this is a temporary limitation pending updates to the ClickHouse server default processing).
 - If the Avro schema field and the ClickHouse column are incompatible, inserts of that row/message will fail, and the failure will be recorded in the ClickPipes errors table. Note that several implicit conversions are supported (such as between numeric types), but not all (for example, an Avro `record` field cannot be inserted into an `Int32` ClickHouse column).

-## Kafka Virtual Columns {#kafka-virtual-columns}
+## Kafka virtual columns {#kafka-virtual-columns}

 The following virtual columns are supported for Kafka-compatible streaming data sources. When creating a new destination table, virtual columns can be added by using the `Add Column` button.

@@ -208,6 +208,12 @@ The following virtual columns are supported for Kafka compatible streaming data
 Note that the _raw_message column is only recommended for JSON data. For use cases where only the JSON string is required (such as using ClickHouse [`JsonExtract*`](/sql-reference/functions/json-functions#jsonextract-functions) functions to populate a downstream materialized
 view), it may improve ClickPipes performance to delete all the "non-virtual" columns.

+## Best practices {#best-practices}
+
+### Message compression {#compression}
+We strongly recommend using compression for your Kafka topics. Compression can result in a significant saving in data transfer costs with virtually no performance hit.
+To learn more about message compression in Kafka, we recommend starting with this [guide](https://www.confluent.io/blog/apache-kafka-message-compression/).
+
 ## Limitations {#limitations}

 - [DEFAULT](/sql-reference/statements/create/table#default) is not supported.
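To make the compression recommendation added above concrete, here is a minimal Python sketch using the kafka-python client. The broker address, topic name, and payload are hypothetical, and any codec supported by your brokers and consumers (gzip, snappy, lz4, zstd) works the same way.

```python
# Producer-side compression with kafka-python; broker and topic are placeholders.
# Note: the lz4/zstd codecs require the corresponding Python packages.
import json

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker.example.com:9092",
    compression_type="lz4",  # or "gzip", "snappy", "zstd"
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Each batch is compressed before it is sent to the topic.
producer.send("clickpipes_topic", {"id": 1, "message": "hello"})
producer.flush()
```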
@@ -269,7 +275,7 @@ Below is an example of the required IAM policy for Apache Kafka APIs for MSK:
 }
 ```

-#### Configuring a Trusted Relationship {#configuring-a-trusted-relationship}
+#### Configuring a trusted relationship {#configuring-a-trusted-relationship}

 If you are authenticating to MSK with an IAM role ARN, you will need to add a trusted relationship between your ClickHouse Cloud instance and the IAM role so the role can be assumed.

@@ -343,11 +349,11 @@ the ClickPipe will automatically restart the consumer and continue processing me

 - **What are the requirements for using ClickPipes for Kafka?**

-In order to use ClickPipes for Kafka, you will need a running Kafka broker and a ClickHouse Cloud service with ClickPipes enabled. You will also need to ensure that ClickHouse Cloud can access your Kafka broker. This can be achieved by allowing remote connection on the Kafka side, whitelisting [ClickHouse Cloud Egress IP addresses](/manage/security/cloud-endpoints-api) in your Kafka setup.
+In order to use ClickPipes for Kafka, you will need a running Kafka broker and a ClickHouse Cloud service with ClickPipes enabled. You will also need to ensure that ClickHouse Cloud can access your Kafka broker. This can be achieved by allowing remote connections on the Kafka side and whitelisting the [ClickHouse Cloud Egress IP addresses](/manage/security/cloud-endpoints-api) in your Kafka setup. Alternatively, you can use [AWS PrivateLink](/integrations/clickpipes/aws-privatelink) to connect ClickPipes for Kafka to your Kafka brokers.

 - **Does ClickPipes for Kafka support AWS PrivateLink?**

-AWS PrivateLink is supported. Please [contact us](https://clickhouse.com/company/contact?loc=clickpipes) for more information.
+AWS PrivateLink is supported. See [the documentation](/integrations/clickpipes/aws-privatelink) for more information on how to set it up.

 - **Can I use ClickPipes for Kafka to write data to a Kafka topic?**

docs/integrations/data-ingestion/etl-tools/dlt-and-clickhouse.md

Lines changed: 2 additions & 0 deletions
@@ -74,6 +74,8 @@ host = "localhost" # ClickHouse server host
 port = 9000 # ClickHouse HTTP port, default is 9000
 http_port = 8443 # HTTP Port to connect to ClickHouse server's HTTP interface. Defaults to 8443.
 secure = 1 # Set to 1 if using HTTPS, else 0.
+
+[destination.clickhouse]
 dataset_table_separator = "___" # Separator for dataset table names from dataset.
 ```
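For orientation, the `[destination.clickhouse]` section added above carries dlt's ClickHouse destination options (here the dataset/table name separator). A minimal Python pipeline that would pick up this configuration might look roughly like the following; the pipeline, dataset, and table names are hypothetical.

```python
# Minimal dlt pipeline sketch; names are placeholders, and connection details
# are read from dlt's .dlt/secrets.toml and .dlt/config.toml files.
import dlt

pipeline = dlt.pipeline(
    pipeline_name="clickhouse_example",
    destination="clickhouse",
    dataset_name="example_dataset",
)

# Load a couple of rows; dlt infers the schema and creates the destination table.
load_info = pipeline.run(
    [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}],
    table_name="users",
)
print(load_info)
```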
