[Serving][Refactor] Logit processor and logit bias support
This PR refactors the existing logit processing pipeline around a unified logit processor class. The logit processor class exposes two functions:

- `InplaceUpdateLogits`, which takes the raw logits produced by the model and applies, in order and only when needed, logit bias (introduced in this PR), presence/frequency/repetition penalties, and the token id mask.
- `ComputeProbsFromLogits`, which takes the updated logits and invokes softmax with temperature to compute the probability distribution.

The logit processor runs entirely on GPU. That is, the logit bias / penalty / mask application and the softmax are all backed by GPU kernels. This is the main difference from the logit processing prior to this PR, where the processing happened on CPU, and softmax also ran on CPU whenever any logit processing was needed.

With the unified logit processor, we simplified the interface for handling the model's output logits in engine actions. We also simplified the interface of Sampler.

Preliminary results show that LogitProcessor brings a small performance improvement when any processing is needed.
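The two-step flow described above can be sketched on the CPU in plain Python. This is only an illustration of the order of operations, not the GPU implementation from this PR: the function names mirror `InplaceUpdateLogits` and `ComputeProbsFromLogits`, but the parameter names, the penalty formulas (the commonly used subtract-style presence/frequency penalty and divide/multiply-style repetition penalty), applying both penalty styles in one pass, and the greedy handling of near-zero temperature are all assumptions.

```python
import math


def inplace_update_logits(logits, logit_bias=None, token_counts=None,
                          presence_penalty=0.0, frequency_penalty=0.0,
                          repetition_penalty=1.0, allowed_token_ids=None):
    """Mutate `logits` (a list of floats) in place: bias, penalties, then mask."""
    # Step 1: logit bias, given as a hypothetical {token_id: bias} mapping.
    if logit_bias:
        for token_id, bias in logit_bias.items():
            logits[token_id] += bias

    # Step 2: penalties on tokens that already appeared.
    # `token_counts` is a hypothetical {token_id: occurrence_count} mapping.
    # The formulas below are assumed, not taken from the PR.
    if token_counts:
        for token_id, count in token_counts.items():
            logits[token_id] -= presence_penalty + frequency_penalty * count
            if logits[token_id] > 0:
                logits[token_id] /= repetition_penalty
            else:
                logits[token_id] *= repetition_penalty

    # Step 3: token id mask. Disallowed tokens get -inf so softmax gives them 0.
    if allowed_token_ids is not None:
        allowed = set(allowed_token_ids)
        for token_id in range(len(logits)):
            if token_id not in allowed:
                logits[token_id] = -math.inf


def compute_probs_from_logits(logits, temperature=1.0):
    """Softmax with temperature; near-zero temperature is treated as greedy (assumption)."""
    if temperature < 1e-6:
        best = max(range(len(logits)), key=logits.__getitem__)
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Example call with made-up values.
example_logits = [1.0, 2.0, 3.0, 0.5]
inplace_update_logits(example_logits, logit_bias={0: 1.0}, token_counts={2: 2},
                      presence_penalty=0.5, frequency_penalty=0.25,
                      allowed_token_ids=[0, 1, 2])
example_probs = compute_probs_from_logits(example_logits, temperature=1.0)
```

Keeping the update step in place and separate from the softmax step matches the split described in the commit message: the first function only rewrites logits, and the second only turns them into a distribution.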
1 parent 43d38ee, commit d861a7d
Showing 23 changed files with 1,008 additions and 550 deletions.