Apache Pulsar is supported Topic Compaction which is a key-based data retention mechanism.
Currently, we retain all null-key messages during topic compaction, which I don't think is necessary because when you use topic compaction, it means that you want to retain the value according to the key, so retaining null-key messages is meaningless.
Additionally, retaining all null-key messages will double the storage cost, and we'll never be able to clean them up since the compacted topic has not supported the retention policy yet.
In summary, I don't think we should retain null-key messages during topic compaction.
In order to avoid introducing break changes to release version, we need to add a configuration to control whether to retain null-key messages during topic compaction. If the configuration is true, we will retain null-key messages during topic compaction, otherwise, we will not retain null-key messages during topic compaction.
Add config to broker.conf/standalone.conf
topicCompactionRetainNullKey=false
- Make
topicCompactionRetainNullKey=false
default in the 3.2.0. - Cherry-pick it to a branch less than 3.2.0 make
topicCompactionRetainNullKey=true
default. - Delete the configuration
topicCompactionRetainNullKey
in 3.3.0 and don't supply an option to retain null-keys.
Make topicCompactionRetainNullKey=true
in broker.conf/standalone.conf.
Make topicCompactionRetainNullKey=false
in broker.conf/standalone.conf.
- Mailing List discussion thread: https://lists.apache.org/thread/68k6vrghfp3np601lrfx5mbfmghbbrjh
- Mailing List voting thread: https://lists.apache.org/thread/36rfmvz5rchgnvqb2wcq4wb64k6st90p