Closed
Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently the code uses a HashSet<ScalarValue>
to track unique keys, which is slow and uses more memory than needed.
Describe the solution you'd like
Use a typed array for storing the entries & keep a hashmap.
We can use the same approach as present in the dictionary builders (PrimitiveDictionaryBuilder
) or parquet Interner
contributed by @tustvold.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.