| Research problem: | Post-Training Quantization. Flexible continuous modification for SOTA post-training quantization methods to make them lossless. |
| --- | --- |
| Type of work: | M1P |
| Author: | Anna Sedova |
| Scientific advisor: | Ilya Zharikov |
Neural network quantization makes it possible to run inference of large models on resource-constrained devices. Post-Training Quantization (PTQ) methods have become popular because they are simple and fast to apply: they do not require retraining the whole model and use only a small calibration set to estimate quantization parameters. However, these methods show a significant accuracy drop in low-bit settings. There are methods that recover model accuracy at the cost of increased computational complexity. In this paper, we propose a continuous modification for these methods and find a reasonable trade-off between computational complexity and performance.
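To make the PTQ setting concrete, below is a minimal sketch of the common min/max calibration baseline: a small calibration batch is used to estimate an affine quantizer's scale and zero-point, after which tensors can be quantized and dequantized without any retraining. This is a generic illustration of PTQ calibration, not the specific method proposed in this work; all function names here are illustrative.

```python
import numpy as np

def calibrate_affine(x, num_bits=8):
    """Estimate affine quantization parameters (scale, zero_point)
    from a calibration batch x using min/max statistics."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min = min(float(x.min()), 0.0)  # keep zero exactly representable
    x_max = max(float(x.max()), 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, num_bits=8):
    q = np.round(x / scale + zero_point)
    return np.clip(q, 0, 2 ** num_bits - 1).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

rng = np.random.default_rng(0)
calib = rng.normal(size=1024).astype(np.float32)   # small calibration set
scale, zp = calibrate_affine(calib)                # no retraining involved
x = rng.normal(size=8).astype(np.float32)
x_hat = dequantize(quantize(x, scale, zp), scale, zp)
err = float(np.abs(x - x_hat).max())               # bounded by ~scale/2 in range
```

In low-bit settings (e.g. `num_bits=2` or `4`) the grid becomes coarse and this simple calibration loses significant accuracy, which is exactly the regime the more computationally expensive PTQ methods target.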