Redpanda supports Apache Kafka®-compatible transaction semantics and APIs. For example, you can fetch messages starting from the last consumed offset and transactionally process them one by one, updating the last consumed offset and producing events at the same time.
A transaction can span partitions from different topics, and a topic can be deleted while there are active transactions on one or more of its partitions. In-flight transactions can detect deletion events, remove the deleted partitions (and related messages) from the transaction scope, and commit changes to the remaining partitions.
If a producer is sending multiple messages to the same or different partitions, and network connectivity or broker failure cause the transaction to fail, then it's guaranteed that either all messages are written to the partitions or none. This is important for applications that require strict guarantees, like financial services transactions.
Transactions guarantee both exactly-once semantics (EOS) and atomicity:
* EOS helps developers avoid the anomalies of at-most-once processing (with potential lost events) and at-least-once processing (with potential duplicated events). Redpanda supports EOS when transactions are used in combination with xref:develop:produce-data/idempotent-producers.adoc[idempotent producers].
* Atomicity additionally commits a set of messages across partitions as a unit: either all messages are committed or none. Encapsulated data received or sent across multiple topics in a single operation can only succeed or fail globally.
== Use transactions
By default, the `enable_transactions` cluster configuration property is set to true. However, in the following use cases, clients must explicitly use the Transactions API to perform operations within a transaction:
* xref:develop:transactions.adoc#atomic-publishing-of-multiple-messages[Atomic (all or nothing) publishing of multiple messages]
* Exactly-once streaming (consume-transform-loop)

The required `transactional.id` property acts as a producer identity. It enables reliability semantics that span multiple producer sessions by allowing the client to guarantee that all transactions issued by the client with the same ID have completed prior to starting any new transactions.
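
For example, a transactional producer can be created as follows (a minimal sketch: the broker address `localhost:9092` and the ID `fund-transfers-1` are placeholder assumptions, not values from this page):

[,java]
----
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

Properties props = new Properties();
// Placeholder broker address.
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// The transactional.id identifies this producer across sessions.
props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "fund-transfers-1");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

KafkaProducer<String, String> producer = new KafkaProducer<>(props);
// Fences off earlier producer sessions with the same transactional.id
// and waits for their transactions to complete before continuing.
producer.initTransactions();
----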
=== Atomic publishing of multiple messages
A banking IT system with an event-sourcing microservice architecture illustrates the necessity for transactions. A bank has multiple branches, and each branch is an independent microservice that manages its own non-intersecting set of accounts. Each branch keeps its own ledger, represented as a Redpanda partition. When a branch (microservice) starts, it replays its ledger to reconstruct the actual state.
Financial transactions such as money transfers require the following guarantees:
* A sender can't withdraw more than the account withdrawal limit.
* A recipient receives exactly the same amount sent.

Things get more complex with cross-branch financial transactions.

Redpanda natively supports transactions, so it's possible to atomically update several ledgers at the same time. For example:
.Show multi-ledger transaction example
[%collapsible]
====
[,java]
----
Properties props = new Properties();
// ...
// TIP: notify the initiator of a transaction about the success
}
----
====
When a transaction fails before a `commitTransaction` attempt completes, you can assume that it is not executed. When a transaction fails after a `commitTransaction` attempt completes, the true transaction status is unknown. Redpanda only guarantees that there isn't a partial result: either the transaction is committed and complete, or it is fully rolled back.

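A common client-side sketch for this, following the standard Kafka producer error-handling contract (the `ledger` topic and the record values below are placeholder assumptions), treats fencing and authorization errors as fatal and aborts on anything else:

[,java]
----
try {
    producer.beginTransaction();
    producer.send(new ProducerRecord<>("ledger", "acc-1", "-100"));
    producer.send(new ProducerRecord<>("ledger", "acc-2", "+100"));
    producer.commitTransaction();
} catch (ProducerFencedException | OutOfOrderSequenceException
        | AuthorizationException e) {
    // Fatal: another producer with the same transactional.id has taken
    // over, or the client is not authorized. Close and recreate the producer.
    producer.close();
} catch (KafkaException e) {
    // Retriable: abortTransaction() rolls back every message sent
    // since beginTransaction(), so the attempt can be repeated safely.
    producer.abortTransaction();
}
----

Note that if `commitTransaction` itself times out, the outcome is unknown as described above, and the application must check the true status before retrying.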
The transformation reads a record from `topic(1)`, processes it, and writes it to `topic(2)`. Without transactions, an intermittent error can cause a message to be lost or processed several times. With transactions, Redpanda guarantees exactly-once semantics. For example:
.Show exactly-once processing example
[%collapsible]
====
[,java]
----
var source = "source-topic";
// ...
}
}
----
====
To help avoid common pitfalls and optimize performance, consider the following when configuring transactional workloads in Redpanda:
* Ongoing transactions can prevent consumers from advancing. To avoid this, don't set the transaction timeout (`transaction.timeout.ms` in the Java client) to a high value: the higher the timeout, the longer consumers may be blocked. The default is about one minute, but the exact setting name and value depend on the client library.
ifndef::env-cloud[]
* When running transactional workloads from clients, tune xref:reference:cluster-properties#max_transactions_per_coordinator[`max_transactions_per_coordinator`] to the number of active transactions that you expect your clients to run at any given time (if your client transaction IDs are not reused).
+
The total number of transactions in the cluster at any one time is `max_transactions_per_coordinator * transaction_coordinator_partitions` (where `transaction_coordinator_partitions` defaults to 50). When the threshold is exceeded, Redpanda terminates old sessions. If an idle producer corresponding to a terminated session wakes up and produces, its batches are rejected with the message `invalid producer epoch` or `invalid_producer_id_mapping`, depending on where it is in the transaction execution phase.
+
Be aware that if you keep the default as 50 and your clients create a new ID for every transaction, the total continues to accumulate, which bloats memory.
* Transactional metadata is stored in the internal topic `kafka_internal/tx`. Over time, this topic can consume disk space. You can manage its disk usage by tuning the `transaction_coordinator_delete_retention_ms` and `transactional_id_expiration_ms` cluster properties.
+
See also: xref:manage:cluster-maintenance/disk-utilization.adoc#manage_transaction_coordinator_disk_usage[Manage Disk Space]
* When upgrading a self-managed deployment, make sure to use maintenance mode with a glossterm:rolling upgrade[].
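
As a sketch, such cluster properties can be changed with `rpk` (the values shown are illustrative assumptions, not recommendations):

[,bash]
----
# Cap concurrent transactions per coordinator (illustrative value).
rpk cluster config set max_transactions_per_coordinator 1000

# Expire unused transactional IDs sooner (illustrative value, in ms).
rpk cluster config set transactional_id_expiration_ms 3600000
----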
endif::[]
== Handle transaction failures
Different transactions require different approaches to handling failures within the application. Consider the approaches to failed or timed-out transactions in the provided use cases:
* Publishing of multiple messages: The request came from outside the system, and it is the application's responsibility to discover the true status of a timed-out transaction. (This example doesn't use consumer groups to distribute partitions between consumers.)
* Exactly-once streaming (consume-transform-loop): This is a closed system. Upon re-initialization of the consumer and producer, the system automatically discovers the moment it was interrupted and continues from that place. Additionally, this automatically scales by the number of partitions. Run another instance of the application, and it starts processing its share of partitions in the source topic.
== Transactions with compacted segments
Transactions are supported on topics with compaction configured. The compaction process removes the aborted transaction's data and all transactional control markers from the log. The resulting compacted segment contains only committed data batches (and potentially harmless gaps in the offsets due to skipped batches).