update: add information about single availability zone for Kafka

aiven · Oct 22, 2024 · d8e5d45
1 parent 096265e
commit d8e5d45
Showing 1 changed file with 66 additions and 65 deletions.
diff --git a/docs/products/kafka/howto/best-practices.md b/docs/products/kafka/howto/best-practices.md
@@ -2,86 +2,87 @@
 title: Get the best from Apache Kafka®
 ---
 
-We recommend to follow these best practices to ensure that your Apache Kafka® service is fast and reliable.
+Follow these best practices to ensure that your Aiven for Apache Kafka® service is fast and reliable.
 
 ## Check your topic replication factors
 
-Apache Kafka services rely on replication between brokers to preserve
-data in case of the loss of a node. Consider how business critical the
-data in each topic is and make sure that replication is set high enough
-for it.
+Apache Kafka services use replication between brokers to preserve data in case of a
+node failure. Consider how critical the data in each topic is to your business, and set
+a replication factor high enough to ensure data protection.
 
-You can set the replication factor in [Aiven web
-console](https://console.aiven.io/) when you create a topic or edit
-an existing one.
+When creating or editing a topic, you can set the replication factor in the
+[Aiven Console](https://console.aiven.io/).
 
 :::note
-We do not allow to set the replication factor below 2 in order to
-prevent data loss from unexpected node termination.
+To prevent data loss from unexpected node terminations, replication factors below 2
+are not allowed.
 :::
 
 ## Choose a reasonable number of partitions for a topic
 
-Too few partitions can cause bottlenecks in data processing. In the most
-extreme case, a single partition means that messages are effectively
-processed sequentially. However, too many partitions causes strain on
-the cluster because of an additional overhead. As you cannot reduce the
-number of partitions for existing topics, it is usually best to start
-with a low number that allows efficient data processing and increase it
-if needed.
+Too few partitions can cause bottlenecks in data processing. In the extreme case, a
+single partition means that messages are processed sequentially. However, too many
+partitions strain the cluster due to overhead. As you cannot reduce the
+number of partitions for existing topics, it is best to start with a low number that
+supports efficient data processing and increase it as needed.
 
-As a general rule of thumb, the recommendation is to have max 4000
-partitions per broker, and max 200 000 partitions per cluster
+It is generally recommended to have a maximum of 4,000 partitions per broker and
+200,000 partitions per cluster
 ([source](https://blogsarchive.apache.org/kafka/entry/apache-kafka-supports-more-partitions)).
 
 :::note
-Ordering is guaranteed only per partition. If you require relative
-ordering of records, put that subset of data into the same
-partition.
+Ordering is only guaranteed within a partition. To maintain the order of related records,
+make sure they are placed in the same partition.
 :::
 
 ## Periodically examine topics with entity-based partitioning for imbalances
 
-If you partition messages based on an entity ID (for example, user ID),
-there is a risk of heavily imbalanced partitions. This results in uneven
-load in your cluster and reduces how effectively it can process messages
-in parallel.
-
-You can check the size of each partition in [Aiven web
-console](https://console.aiven.io/) in the topic details, in the
-*Partitions* tab.
-
-## Find the right balance between throughput and latency
-
-To find the right balance try different batch sizes in your producer and
-consumer configurations. Bigger batches increase throughput but also
-increase the latency for individual messages. Conversely, using smaller
-batches decreases message processing latency, but the overhead per
-message increases and the overall throughput decreases.
-
-You can, for example, set `batch.size` and `linger.ms` in the producer
-configuration of your application code (see [official Apache Kafka
-documentation](https://kafka.apache.org/documentation/) for reference).
-
-## Acknowledgements of received data
-
-You can specify a value for acknowledgements setting `acks` in the
-client producer configuration. This will have an impact on how the
-success of a write operation is determined.
-
-With `acks` equal to **0** after the producer sends the data, it does
-not wait for a confirmation from the broker. This will make
-communication faster. However, there is a potential loss of data in case
-of the broker being down when the producer sends the data. This
-configuration is only appropriate when you can afford loss of data.
-
-With `acks` equal to **1** (default value and recommended behaviour),
-the producer waits for the leader broker to acknowledge that the data
-was received. This mode partially prevents data loss, however, the data
-loss still can occur if the broker goes down between the moment it sent
-acknowledgement and the data was replicated.
-
-With `acks` equal to **all**, the leader and all the replicas will send
-confirmation of the received data. This configuration slows the
-communication, but ensures that there will be no data loss, since the
-replicas also confirm that the data was received.
+Partitioning messages based on an entity ID (such as a user ID) can lead to
+imbalanced partitions. This results in uneven load distribution and reduces the
+cluster's efficiency in processing messages in parallel.
+
+You can view the size of each partition in the **Partitions** tab under topic details
+in the [Aiven Console](https://console.aiven.io/).
+
+## Balance between throughput and latency
+
+To find the right balance between throughput and latency, adjust the batch sizes in
+your producer and consumer settings. Larger batches improve throughput but can increase
+the time it takes to process individual messages. Smaller batches reduce this time
+but increase the overhead, which may lower overall throughput.
+
+You can change settings like `batch.size` and `linger.ms` in your producer
+configuration. For more details, refer to the
+[Apache Kafka documentation](https://kafka.apache.org/documentation/).
+
+## Configure acknowledgments for received data
+
+The `acks` parameter in the client producer configuration controls how the success of a
+write operation is determined. Choose the appropriate setting based on your data
+reliability needs:
+
+- **`acks=0`**: The producer sends data without waiting for confirmation from the
+  broker. This speeds up communication, but there’s a risk of data loss if the broker
+  goes down during transmission. Use this setting only if some data loss is acceptable.
+
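+- **`acks=1` (recommended setting)**: The producer waits for the leader broker to
+  confirm receipt of the data. This reduces the chance of data loss, but data can
+  still be lost if the leader fails before the data is replicated to the followers.
+  The Apache Kafka Java producer used this value by default before version 3.0. Since
+  3.0 its default is `acks=all`, so set `acks=1` explicitly if you want this behavior.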
+
+- **`acks=all`**: The producer waits until the leader and all in-sync replicas have
+  acknowledged the data. This gives the strongest protection against data loss, subject
+  to the topic's `min.insync.replicas` setting, but can slow down communication.
+
+
+## Configure single availability zone (AZ) for BYOC customers
+
+For Bring Your Own Cloud (BYOC) customers looking to optimize costs, consider
+deploying Aiven for Apache Kafka® in a single availability zone (AZ) configuration.
+This option can significantly reduce infrastructure costs while still providing the
+full features of Aiven’s managed Kafka service. You can enable this setting during
+service creation by selecting the **single_zone.enabled** option
+under **Advanced configuration**.
+
+- Contact [Aiven support](mailto:support@aiven.io) to enable this feature for your
+  project before service creation.
+- You must configure single-AZ deployments when creating a new service. You cannot
+  change existing services to a single-AZ setup.
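
As a supplement to the entity-based partitioning section in the diff above, the following is a minimal sketch of how a single very active entity can skew partition load. It is not part of the commit. The function names and the workload are hypothetical, and CRC32 stands in for the murmur2 hash that Kafka's default partitioner applies to keys, so the partition numbers will not match a real cluster.

```python
import zlib
from collections import Counter


def assign_partition(key, num_partitions):
    """Map a key to a partition with a stable hash.

    Assumption: CRC32 is a stand-in for the murmur2 hash used by Kafka's
    default partitioner; only the skew pattern is meant to be realistic.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_counts(keys, num_partitions):
    """Count how many messages land in each partition."""
    counts = Counter(assign_partition(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


def imbalance_ratio(counts):
    """Return the busiest partition's load divided by the mean load."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean if mean else 0.0


if __name__ == "__main__":
    # Hypothetical workload: one very active user and 999 occasional users.
    keys = ["user-0"] * 5000 + [f"user-{i}" for i in range(1, 1000)]
    counts = partition_counts(keys, num_partitions=6)
    print(counts, round(imbalance_ratio(counts), 2))
```

A ratio close to 1 indicates evenly loaded partitions. A ratio well above 1 suggests that one partition carries a disproportionate share of the messages, which is the situation the **Partitions** tab check is meant to catch.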