README.md
Lines changed: 21 additions & 1 deletion
@@ -26,7 +26,6 @@ The connector has the following limitations:
26
26
- Only row-level operations are produced (`INSERT`, `UPDATE`, `DELETE`):
27
27
- Partition deletes - those changes are ignored
28
28
- Row range deletes - those changes are ignored
29
-
- No support for collection types (`LIST`, `SET`, `MAP`) and `UDT` - columns with those types are omitted from generated messages
30
29
- No support for preimage and postimage - changes only contain those columns that were modified, not the entire row before/after change. More information [here](#cell-representation)
31
30
32
31
## Connector installation
@@ -192,6 +191,27 @@ If the operation did not modify the `v` column, the data event will contain the
192
191
193
192
See `UPDATE` example for full data change event's value.
194
193
194
+
#### Collections
195
+
The connector supports both frozen and non-frozen collections.
196
+
The format for frozen collections is as follows (these structs are stored in the "Cell" mentioned above):
197
+
- `List` and `Set` of type T are represented as `Schema.array(T)`. In the JSON format, this is also an array.
198
+
- `Map` with key type K and value type V is represented as `Schema.map(K, V)`. In JSON, this is an array (not an object!) of 2-element arrays (the first element is the key, the second is the value).
199
+
- `UDT` is represented as a struct. In JSON, this is an object.
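Because a frozen `Map` arrives in JSON as an array of pairs rather than an object, a consumer usually has to rebuild the mapping itself. A minimal Python sketch of that step follows; the function name `decode_frozen_map` and the sample payload are hypothetical and only illustrate the shape described above.

```python
def decode_frozen_map(pairs):
    """Turn a JSON array of [key, value] pairs into a dict.

    Assumes every key is hashable (e.g. a number or a string); a key that is
    itself a JSON array or object would need to be converted first.
    """
    return {key: value for key, value in pairs}


# Hypothetical JSON value of a frozen map<int, text> column, already parsed.
raw = [[1, "one"], [2, "two"]]
decoded = decode_frozen_map(raw)
```

The pair encoding exists because JSON object keys must be strings, while map keys in the table can be of other types.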
200
+
201
+
Non-frozen collections are a bit more complicated. The `scylla.collections.mode` configuration option defines which representation is used. Currently, only `delta` mode is supported. In the future, more modes (e.g. preimage / postimage) may be added.
202
+
203
+
##### Non-frozen collections: delta mode
204
+
Each non-frozen collection column is represented as a struct, with fields `mode` and `elements`. This struct will be stored in "Cell" described previously.
205
+
`mode` can be:
206
+
- `MODIFY` - elements were added or deleted.
207
+
- `OVERWRITE` - the whole content of the collection was removed, and new elements were added. If no elements were added (meaning the collection was just removed), this mode won't be used - instead, the whole struct (stored in the `value` field of the "Cell" struct, as mentioned previously) will be null.
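The three cases a consumer has to distinguish for a non-frozen collection column (column untouched, collection removed, collection changed) can be sketched in Python as below. This is an illustration only: the helper name `describe_collection_cell` is hypothetical, and it assumes the "Cell" has already been parsed from JSON into a dict with a single `value` key, or `None` when the column was not modified.

```python
def describe_collection_cell(cell):
    """Classify a parsed "Cell" of a non-frozen collection column."""
    if cell is None:
        # No Cell at all: the operation did not touch this column.
        return "unchanged"
    struct = cell["value"]
    if struct is None:
        # Cell present but its value is null: the collection was removed.
        return "removed"
    # Otherwise the struct carries `mode` ("MODIFY" or "OVERWRITE") and `elements`.
    return struct["mode"]
```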
208
+
209
+
The type of the `elements` field depends on the collection type:
210
+
- For `Set` of type T it will be `Schema.map(T, Schema.BOOLEAN_SCHEMA)`. The boolean value signals whether the value was added (true) or removed (false) from the set.
211
+
- For `List` of type T, it will be `Schema.map(Schema.STRING_SCHEMA, T)` - the key of this map is a timeuuid, as described in https://docs.scylladb.com/using-scylla/cdc/cdc-advanced-types/#lists. Removed elements are marked by a null value.
212
+
- For `Map` with key K and value V, it will be `Schema.map(K, V)` (same as in frozen collection). Removed elements are marked by null value.
213
+
- For `UDT` it will be a struct representing this UDT, but a bit different from a frozen UDT: each field of this struct is a "Cell" (a struct with a single field, `value`). "Cell" is used the same way as with columns - null means that the field wasn't changed, a "Cell" with a null value means the field was removed, and a "Cell" with a non-null value means the field was overwritten.
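To show how a consumer might replay these deltas onto its own copy of a collection, here is a hedged Python sketch for `Set` and `Map` columns. The function names are hypothetical. The sketch also assumes that `elements` has been parsed from JSON either as an object or as a list of `[key, value]` pairs; the exact JSON shape depends on the converter in use and is not confirmed here.

```python
def _pairs(elements):
    """Accept either a parsed JSON object or a list of [key, value] pairs."""
    return elements.items() if isinstance(elements, dict) else elements


def apply_set_delta(current, struct):
    """Apply a delta-mode struct of a Set column to a local copy of the set."""
    # OVERWRITE starts from an empty set; MODIFY starts from the old content.
    result = set() if struct["mode"] == "OVERWRITE" else set(current)
    for element, added in _pairs(struct["elements"]):
        if added:
            result.add(element)
        else:
            result.discard(element)
    return result


def apply_map_delta(current, struct):
    """Apply a delta-mode struct of a Map column to a local copy of the map."""
    result = {} if struct["mode"] == "OVERWRITE" else dict(current)
    for key, value in _pairs(struct["elements"]):
        if value is None:
            # A null value marks a removed entry.
            result.pop(key, None)
        else:
            result[key] = value
    return result
```

A `List` column could be handled with `apply_map_delta` as well, keyed by the timeuuid strings, but turning those keys back into list order is left out of this sketch. A null struct (collection removed) has to be handled by the caller before either function is used.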
214
+
195
215
#### ScyllaExtractNewState transformer
196
216
Connector provides one single message transformation (SMT), `ScyllaExtractNewState` (class: `com.scylladb.cdc.debezium.connector.transforms.ScyllaExtractNewState`).
197
217
This SMT works exactly like `io.debezium.transforms.ExtractNewRecordState` (in fact, it is called underneath), but also flattens the structure by extracting values from the aforementioned single-field structures.
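The flattening step can be pictured with a small Python sketch. It is not the connector's implementation: `flatten_cells` is a hypothetical helper that only mimics the idea of replacing each single-field `{"value": ...}` structure with the value it wraps, operating on a row that has already been parsed from JSON.

```python
def flatten_cells(row):
    """Replace every {"value": x} Cell in a parsed row with x itself."""
    flat = {}
    for name, cell in row.items():
        if isinstance(cell, dict) and set(cell) == {"value"}:
            flat[name] = cell["value"]
        else:
            # Null Cells and anything that is not a single-field struct pass through.
            flat[name] = cell
    return flat


# Hypothetical row: `pk` set to 1, `v` not modified, `w` set to null.
row = {"pk": {"value": 1}, "v": None, "w": {"value": None}}
flat = flatten_cells(row)
```

In this sketch an unmodified column and a column set to null both end up as `None`, so the distinction carried by the "Cell" wrapper is lost after flattening. The check is also purely structural, so a UDT whose only field is named `value` would be unwrapped as well.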