Closed
Description
I noticed that we use conditions like this to check whether sampling is greedy:
https://github.com/WoosukKwon/cacheflow/blob/189ae231336857bcc4c6f6157bf7868cdf56fb5f/cacheflow/sampling_params.py#L45
However, I think this can cause a couple of problems:
- It is not recommended to use `==` for floating point numbers
- A small temperature will result in inf/nan
I typically use something like this https://github.com/lm-sys/FastChat/blob/a94fd259a97128f7f4483ddb760690f467888d84/fastchat/serve/inference.py#L227
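To make the idea concrete, here is a minimal sketch of a threshold-based check in plain Python. The names `_SAMPLING_EPS`, `is_greedy`, and `sample` are hypothetical, and the `1e-5` cutoff is only an assumed value for illustration; none of this is copied from cacheflow or FastChat.

```python
import math
import random

# Hypothetical cutoff: temperatures at or below this are treated as greedy.
# The exact value is an assumption and should be tuned for the project.
_SAMPLING_EPS = 1e-5


def is_greedy(temperature, eps=_SAMPLING_EPS):
    """Return True when the temperature is small enough to use argmax."""
    return temperature <= eps


def sample(logits, temperature, rng=random):
    """Pick a token index from a list of logits."""
    if is_greedy(temperature):
        # Skip the division entirely, so a tiny temperature never
        # produces inf/nan in the scaled logits.
        return max(range(len(logits)), key=logits.__getitem__)
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

With this shape, `temperature=0.0` and `temperature=1e-9` both take the argmax path instead of depending on an exact floating point match.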
@WoosukKwon, @zhuohan123 What do you think? If you are happy, I can change all "==" to "<=".