README.md: 22 additions & 1 deletion
@@ -26,7 +26,6 @@ The connector has the following limitations:
- Only row-level operations are produced (`INSERT`, `UPDATE`, `DELETE`):
    - Partition deletes - those changes are ignored
    - Row range deletes - those changes are ignored
-
- No support for collection types (`LIST`, `SET`, `MAP`) and `UDT` - columns with those types are omitted from generated messages
- No support for preimage and postimage - changes only contain those columns that were modified, not the entire row before/after change. More information [here](#cell-representation)
## Connector installation
@@ -192,6 +191,28 @@ If the operation did not modify the `v` column, the data event will contain the
See `UPDATE` example for full data change event's value.
+
#### Collections
+
The connector supports both frozen and non-frozen collections.
+
Frozen collections use the following format (these values are stored in the "Cell" struct mentioned above):
+
- `List` and `Set` of type T are represented as `Schema.array(T)`. In the JSON format, this is also an array.
+
- `Map` with key type K and value type V is represented as `Schema.map(K, V)`. In JSON, this is an array (not an object!) of 2-element arrays (the first element is the key, the second is the value).
+
- `UDT` is represented as a struct. In JSON, this is an object.
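
A consumer reading the JSON form of frozen collections could decode them roughly as in the sketch below. The column names (`tags`, `scores`, `addr`), the sample values, and the `pairs_to_dict` helper are made up for illustration; only the shapes (array, array of 2-element arrays, object, each wrapped in a "Cell" with a `value` field) follow the description above.

```python
import json

# Hypothetical fragment of a change event: three frozen columns, each
# wrapped in a "Cell" (a struct with a single field, `value`).
raw = """
{
  "tags":   {"value": ["a", "b"]},
  "scores": {"value": [[1, "low"], [2, "high"]]},
  "addr":   {"value": {"street": "Main", "zip": 12345}}
}
"""


def pairs_to_dict(pairs):
    """Turn a frozen map encoded as [[key, value], ...] into a dict."""
    return {key: value for key, value in pairs}


cells = json.loads(raw)
tags = cells["tags"]["value"]                     # frozen List/Set: JSON array
scores = pairs_to_dict(cells["scores"]["value"])  # frozen Map: array of pairs
addr = cells["addr"]["value"]                     # frozen UDT: JSON object
```

Encoding the map as an array of pairs keeps non-string keys (here integers) intact, which a plain JSON object could not do.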
+
+
Non-frozen collections are more complicated. The `scylla.collections.mode` configuration option defines which representation is used. Currently, only simple mode is supported; more modes (e.g. preimage / postimage) may be added in the future.
+
+
##### Non-frozen collections: simple mode
+
Each non-frozen collection column is represented as a struct with the fields `mode` and `elements`. This struct is stored in the "Cell" struct described previously.
+
`mode` can be:
+
- `ADD` - new elements were added to the collection.
+
- `OVERWRITE` - the whole content of the collection was removed, and new elements were added. If no elements were added (meaning the collection was just removed), this mode is not used; instead, the whole struct (stored in the `value` field of the "Cell" struct, as mentioned previously) is null.
+
- `REMOVE` - some elements were removed from the collection.
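
As a rough illustration of how a consumer might apply these modes, the sketch below keeps a local copy of one non-frozen `Set` column up to date. It takes a `Set` as the simplest case and assumes its `elements` decode to a plain list; `apply_set_change` is a hypothetical helper, not part of the connector.

```python
def apply_set_change(current, cell):
    """Apply one simple-mode change to a locally kept set.

    `cell` is the decoded "Cell" of the column:
      None                                    -> column was not modified
      {"value": None}                         -> collection was removed
      {"value": {"mode": ..., "elements": [...]}} -> see modes above
    """
    if cell is None:
        return set(current)          # nothing changed
    change = cell["value"]
    if change is None:
        return set()                 # whole collection removed
    elements = set(change["elements"])
    mode = change["mode"]
    if mode == "ADD":
        return set(current) | elements
    if mode == "REMOVE":
        return set(current) - elements
    if mode == "OVERWRITE":
        return elements
    raise ValueError(f"unexpected mode: {mode!r}")


state = {1, 2}
state = apply_set_change(state, {"value": {"mode": "ADD", "elements": [3]}})
state = apply_set_change(state, {"value": {"mode": "REMOVE", "elements": [1]}})
```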
+
+
The type of the `elements` field depends on the collection type:
+
- For a `Set` of type T, it is `Schema.array(T)` (same as for a frozen collection).
+
- For a `List` of type T, it is `Schema.map(Schema.STRING_SCHEMA, T)`. The key of this map is a timeuuid, as described in https://docs.scylladb.com/using-scylla/cdc/cdc-advanced-types/#lists. When removing elements (`REMOVE` mode), the values are always null.
+
- For a `Map` with key type K and value type V, it is `Schema.map(K, V)` (same as for a frozen collection). When removing elements (`REMOVE` mode), the values are always null.
+
- For a `UDT`, it is a struct representing this UDT, but a bit different from a frozen UDT: each field of this struct is a "Cell" (a struct with a single field, `value`). `UDT` never uses `REMOVE` mode, because some fields may be added or modified and others removed in the same operation. Instead, "Cell" is used the same way as with columns: null means that the field wasn't changed, a "Cell" with a null value means that the field was removed, and a "Cell" with a non-null value means that the field was overwritten.
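
The `Map` and `UDT` rules above can be sketched as follows. This is only an illustration: `apply_map_change` and `apply_udt_change` are hypothetical helpers, and the code assumes `elements` has already been decoded into a Python dict (for a `UDT`, a dict of field name to "Cell" or `None`).

```python
def apply_map_change(current, change):
    """Apply {"mode": ..., "elements": {...}} to a locally kept map."""
    elements = change["elements"]
    mode = change["mode"]
    if mode == "OVERWRITE":
        return dict(elements)
    if mode == "ADD":
        return {**current, **elements}
    if mode == "REMOVE":
        # Values are always null in REMOVE mode; only the keys matter.
        return {k: v for k, v in current.items() if k not in elements}
    raise ValueError(f"unexpected mode: {mode!r}")


def apply_udt_change(current, change):
    """Apply a UDT change; each entry of `elements` is a "Cell" or None."""
    # OVERWRITE starts from an empty UDT, ADD from the current one.
    updated = {} if change["mode"] == "OVERWRITE" else dict(current)
    for field, cell in change["elements"].items():
        if cell is None:
            continue                     # field was not changed
        updated[field] = cell["value"]   # None here means "field removed"
    return updated


prices = apply_map_change({"a": 1, "b": 2},
                          {"mode": "REMOVE", "elements": {"a": None}})
addr = apply_udt_change(
    {"street": "Main", "zip": 12345},
    {"mode": "ADD",
     "elements": {"street": None,            # unchanged
                  "zip": {"value": None},    # removed
                  "city": {"value": "Oslo"}}},
)
```

The `List` case works like the `Map` case, with the timeuuid strings acting as keys.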
+
#### ScyllaExtractNewState transformer
The connector provides one single message transformation (SMT), `ScyllaExtractNewState` (class: `com.scylladb.cdc.debezium.connector.transforms.ScyllaExtractNewState`).
This SMT works exactly like `io.debezium.transforms.ExtractNewRecordState` (in fact, it is called underneath), but also flattens the structure by extracting values from the aforementioned single-field structures.
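
The flattening idea can be pictured with the Python sketch below. It is not the SMT's actual code: `flatten_cells` is a hypothetical helper, and treating any dict whose only key is `value` as a "Cell" is a heuristic chosen for this example. Note that under this assumption an unchanged column (null "Cell") and a column set to null both end up as `None`.

```python
def flatten_cells(row):
    """Replace each single-field {"value": x} struct with x itself."""
    flat = {}
    for column, cell in row.items():
        if isinstance(cell, dict) and set(cell) == {"value"}:
            flat[column] = cell["value"]   # unwrap the "Cell"
        else:
            flat[column] = cell            # plain value or null "Cell"
    return flat


flat = flatten_cells({"pk": 1, "v": {"value": 5}, "w": None, "x": {"value": None}})
```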