Describe the bug
As I mentioned in this issue, the default values of `top_p` and `temperature` are not guaranteed to be 1. Therefore, the code below receives already-modified logits, i.e., a distribution that has been processed according to the `generation_config` shipped with the checkpoint on the Hugging Face side.
LMFlow/src/lmflow/models/hf_decoder_model.py
Lines 382 to 405 in 1b223f7

```python
if self.use_accelerator:
    outputs = self.backend_model.generate(
        input_ids=inputs,
        pad_token_id=self.tokenizer.pad_token_id,
        *args,
        **kwargs
    )
else:
    if self.device == "gpu":
        outputs = self.ds_engine.module.generate(
            input_ids=inputs,
            synced_gpus=True,
            pad_token_id=self.tokenizer.pad_token_id,
            *args,
            **kwargs
        )
    elif self.device == "cpu":
        outputs = self.backend_model.generate(
            input_ids=inputs,
            synced_gpus=True,
            pad_token_id=self.tokenizer.pad_token_id,
            *args,
            **kwargs
        )
```
Worse still, `top_p` and `temperature` are applied again in `score_to_prob`, which produces an unexpected distribution:
LMFlow/src/lmflow/pipeline/inferencer.py
Lines 435 to 440 in 1b223f7

```python
for _ in range(num_new_tokens):
    pred = self.predict_next_token(model=model, input_ids=sequence, num_new_tokens=1)  # predict next one token
    prob = self.score_to_prob(pred.scores[0], temperature=temperature)
    sampled = self.sample(prob=prob, num_samples=1)
    new_tokens.append(sampled)
    sequence = torch.cat([sequence, sampled['sampled_token']], dim=1)
```
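A small numeric sketch of the double-scaling effect (this is not the LMFlow implementation, just an illustration of the math): if `generate` has already divided the logits by temperature `T` via its logits warper, and `score_to_prob` divides by `T` again, the sampling distribution effectively uses `T**2` instead of `T`.

```python
import torch

logits = torch.tensor([2.0, 1.0, 0.5])
T = 0.7

once = torch.softmax(logits / T, dim=-1)            # intended distribution
twice = torch.softmax((logits / T) / T, dim=-1)     # what actually happens
equivalent = torch.softmax(logits / (T * T), dim=-1)

print(torch.allclose(twice, equivalent))  # True: effective temperature is T**2
```

The same reasoning applies to `top_p`: truncating an already-truncated distribution a second time narrows it further than intended.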