Description
Currently the `dense_vector` field supports only 32-bit float values. At a minimum, the field should also support 8-bit integer values.
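For illustration only, here is a minimal sketch of what such a mapping could look like, expressed as a Python dict. The `element_type` parameter and its `byte` value are assumptions made up for this sketch, not an existing option.

```python
# Hypothetical mapping sketch: an 8-bit integer dense_vector field.
# "element_type" and the "byte" value are placeholders for illustration;
# today the field only accepts 32-bit float values.
mapping = {
    "properties": {
        "embedding": {
            "type": "dense_vector",
            "dims": 768,
            "element_type": "byte",  # hypothetical parameter
        }
    }
}
```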
Background
There is a growing emphasis in the NLP and neural search landscapes on using quantization, along with other techniques, to improve efficiency while maintaining effectiveness. This is particularly evident in BERT/Transformer models [1][2] and in embeddings used for retrieval and ranking [3]. To support more efficient storage of and computation on embeddings, the `dense_vector` field needs to support a wider range of numeric types.
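As a rough sketch of the kind of quantization involved, the snippet below maps a float32 embedding onto int8 values with a simple symmetric linear scale (the scaling and rounding choices are illustrative, not a prescribed scheme), cutting storage by 4x at the cost of a small reconstruction error.

```python
import numpy as np

def quantize_int8(vector: np.ndarray) -> tuple[np.ndarray, float]:
    """Map a float32 embedding onto int8 values using one linear scale."""
    scale = float(np.abs(vector).max()) / 127.0
    if scale == 0.0:
        scale = 1.0  # avoid division by zero for all-zero vectors
    quantized = np.clip(np.round(vector / scale), -127, 127).astype(np.int8)
    return quantized, scale

def dequantize(quantized: np.ndarray, scale: float) -> np.ndarray:
    """Approximately reconstruct the original float32 embedding."""
    return quantized.astype(np.float32) * scale

embedding = np.random.default_rng(0).standard_normal(768).astype(np.float32)
q, scale = quantize_int8(embedding)
print(embedding.nbytes, q.nbytes)                        # 3072 bytes vs. 768 bytes
print(np.max(np.abs(embedding - dequantize(q, scale))))  # small quantization error
```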
[1] Q8BERT: Quantized 8Bit BERT
[2] Faster and smaller quantized NLP with Hugging Face and ONNX Runtime
[3] Sentence-Transformers; Model Distillation