Confluent JSON serde prefix the key and value data with the id of the schema used to serialise the data. Creek, by default, does not, making it incompatible with Confluent JSON serde.
Other Confluent serde, e.g. Avro and Protobuf, also prefix with the schema id. However, the difference is that this information is not actually needed for JSON. When JSON schema evolution is managed correctly, all data in the topic is compatible with consumer schemas, so there is no intrinsic need to track which schema version was used when producing.
There are also disadvantages to prefixing the data with the schema id:
- strictly speaking, the payload of the message is no longer serialised JSON: it's a binary schema-id prefix followed by JSON (sketched below).
- it's hard to evolve a key schema without changing the key-to-partition mapping, which is bad. There is no reason why adding a new optional property to a key schema should change which partition an existing key produces to.
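For context, the Confluent serde write a single "magic" byte followed by a 4-byte big-endian schema id before the payload. A minimal sketch of that prefixing (class and method names here are illustrative, not Creek or Confluent APIs):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Minimal sketch of the Confluent wire format: a zero "magic" byte,
// a 4-byte big-endian schema id, then the JSON payload itself.
// Anything reading the raw bytes as JSON will choke on the prefix.
public final class WireFormatSketch {

    private static final byte MAGIC_BYTE = 0x0;

    static byte[] prefixWithSchemaId(final int schemaId, final byte[] json) {
        return ByteBuffer.allocate(1 + Integer.BYTES + json.length)
                .put(MAGIC_BYTE)
                .putInt(schemaId)
                .put(json)
                .array();
    }

    public static void main(final String[] args) {
        final byte[] json = "{\"id\":42}".getBytes(StandardCharsets.UTF_8);
        final byte[] onTheWire = prefixWithSchemaId(123, json);
        // onTheWire is no longer valid JSON: the first 5 bytes are binary.
        System.out.println(onTheWire.length + " bytes: 5-byte prefix + JSON");
    }
}
```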
Of course, not being compatible with the stock Confluent Serde has its disadvantages too, mainly compatibility with existing data and tooling. Soo...
Enhance the JSON serde to have the option to prefix key and/or value payloads with the schema id.
While we're at it, we may as well also have the option to add the key and/or value schema id as a record header.
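For example, putting the schema id in a header rather than a payload prefix could look something like the sketch below, using the standard Kafka client API. The header name `value.schema.id` is made up for illustration, not an existing convention:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.producer.ProducerRecord;

public final class SchemaIdHeaderSketch {

    public static void main(final String[] args) {
        final int schemaId = 123; // id registered in the schema registry
        final byte[] key = "user-1".getBytes(StandardCharsets.UTF_8);
        final byte[] value = "{\"name\":\"Bob\"}".getBytes(StandardCharsets.UTF_8);

        // Payload stays plain JSON; the schema id travels in a header instead.
        final ProducerRecord<byte[], byte[]> record =
                new ProducerRecord<>("some.topic", key, value);
        record.headers().add(
                "value.schema.id", // hypothetical header name
                ByteBuffer.allocate(Integer.BYTES).putInt(schemaId).array());
    }
}
```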
Whether and where schema ids are persisted to Kafka should be part of the Kafka topic descriptor.
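A purely hypothetical sketch of what such a per-topic option might look like; none of the names below exist in Creek today, they just show the shape of the choice:

```java
// Hypothetical sketch only: the topic descriptor could declare, per part,
// whether the schema id is written as a payload prefix, a header, both, or not at all.
enum SchemaIdPlacement {
    NONE,             // plain JSON, no schema id persisted
    PAYLOAD_PREFIX,   // Confluent-compatible: magic byte + schema id + JSON
    HEADER,           // schema id in a record header, payload stays plain JSON
    PREFIX_AND_HEADER
}

interface JsonTopicPartOptions {
    SchemaIdPlacement keySchemaId();
    SchemaIdPlacement valueSchemaId();
}
```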