[Model] Support math-shepherd-mistral-7b-prm model #9697
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Force-pushed from 493b4b2 to d3f0ead
Force-pushed from d3f0ead to 3e2c7f4
Force-pushed from 3e2c7f4 to e62f65c
vllm/model_executor/models/bert.py (outdated diff)
self._pooler = Pooler(
    pooling_type=PoolingType[pooler_config.pooling_type]
    if pooler_config.pooling_type is not None else PoolingType.CLS,
    normalize=pooler_config.pooling_norm or True,
    softmax=pooler_config.pooling_softmax or False,
    step_tag_id=pooler_config.pooling_step_tag_id,
    returned_token_ids=pooler_config.pooling_returned_token_ids,
)
Can we add a factory method to Pooler to automatically merge the config with model-specific defaults?
e.g. we should be able to write:

self._pooler = Pooler.from_config_with_defaults(
    pooler_config,
    # These values are overridden if they are set inside the config
    pooling_type=PoolingType.CLS,
    normalize=True,
)
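For reference, a minimal sketch of what such a factory method could look like, written as a standalone helper rather than the suggested classmethod so the snippet is self-contained; the import paths and PoolerConfig field names are assumed from the code under review and may not match the PR's final implementation:

```python
from typing import List, Optional

# Assumed import locations; in vLLM the Pooler and PoolingType live in the
# pooler layer module, and PoolerConfig is part of the engine config.
from vllm.config import PoolerConfig
from vllm.model_executor.layers.pooler import Pooler, PoolingType


def pooler_from_config_with_defaults(
    pooler_config: PoolerConfig,
    pooling_type: PoolingType,
    normalize: bool,
    softmax: bool = False,
    step_tag_id: Optional[int] = None,
    returned_token_ids: Optional[List[int]] = None,
) -> Pooler:
    """Build a Pooler, letting values set in the config override the
    model-specific defaults passed by the caller."""
    return Pooler(
        pooling_type=PoolingType[pooler_config.pooling_type]
        if pooler_config.pooling_type is not None else pooling_type,
        normalize=pooler_config.pooling_norm
        if pooler_config.pooling_norm is not None else normalize,
        softmax=pooler_config.pooling_softmax
        if pooler_config.pooling_softmax is not None else softmax,
        step_tag_id=pooler_config.pooling_step_tag_id
        if pooler_config.pooling_step_tag_id is not None else step_tag_id,
        returned_token_ids=pooler_config.pooling_returned_token_ids
        if pooler_config.pooling_returned_token_ids is not None
        else returned_token_ids,
    )
```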
Please check it
Looks good now, thanks for your effort and patience!
Now you just have to get the tests to pass.
Force-pushed from db2552b to 4e468a3
Head branch was pushed to by a user without write access
This pull request has merge conflicts that must be resolved before it can be merged.
Force-pushed from f5434e1 to d1b0f5b
Excuse me, the test produced the following error (as shown in the image). This doesn't seem to be caused by my code changes. Could you please advise on how to handle this? @DarkLight1337
I have retried the failing test, see if it passes this time.
Looks like this issue comes from the main branch; I have asked those with permissions to force-merge this.
Thanks so much!!!
Support peiyi9979/math-shepherd-mistral-7b-prm as an embedding model.
As mentioned in #9314, the Process-Supervised Reward Model (PRM), which provides reward scores for the intermediate steps generated by LLMs, can offer more fine-grained optimization for Reinforcement Learning (RL). This will help the community reproduce the OpenAI o1 model. PR #9424 allows any model that adds a pooler method to be used as an embedding model.
Therefore, this PR adds a pooler method to LlamaForCausalLM, introduces a pooling-type named "STEP", and adds a PoolerConfig class to make it easier for users to configure the pooler. In STEP mode, users can use the peiyi9979/math-shepherd-mistral-7b-prm model by setting the pooling-step-tag-id and pooling-returned-token-ids variables. pooling-returned-token-ids is a list of indices along the vocabulary dimension to be extracted, such as the token IDs of good_token and bad_token in the math-shepherd-mistral-7b-prm model. When pooling-step-tag-id is not None, only the scores at the pooling-step-tag-id positions in the generated sentence are returned; otherwise, the scores for all tokens are returned.

The model can be served with:
And a test corresponding to the example on the Hugging Face model page is:
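(A hedged sketch of such a check against the OpenAI-compatible server started above; it assumes the per-step scores come back through the /v1/embeddings response, and the question/solution strings are abridged from the model card rather than copied from this PR.)

```python
# Mirrors the math-shepherd-mistral-7b-prm model card example: each reasoning
# step ends with the step tag "ки", and the model should assign a probability
# close to 1.0 to correct steps and a low one to the wrong final step.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

question = "Janet's ducks lay 16 eggs per day. ..."  # abridged GSM8K question
good_solution = "Step 1: ... ки\nStep 2: ... ки\nStep 3: ... The answer is: 18 ки"
bad_solution = "Step 1: ... ки\nStep 2: ... ки\nStep 3: ... The answer is: 17 ки"

for solution in (good_solution, bad_solution):
    response = client.embeddings.create(
        model="peiyi9979/math-shepherd-mistral-7b-prm",
        input=f"{question} {solution}",
    )
    step_scores = response.data[0].embedding  # one score per step tag (assumed)
    assert all(0.0 <= score <= 1.0 for score in step_scores)
    print(step_scores)  # the last score should drop sharply for bad_solution
```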
Of course, you can also use it directly like this:
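(Again a sketch rather than the PR's exact snippet: the keyword arguments are assumed to mirror the pooling-* CLI flags, and 648, 387 and 12902 are taken from the model card as the Mistral token IDs of "+", "-" and the step tag "ки".)

```python
from vllm import LLM

# Hypothetical keyword arguments mirroring the pooling-* options; the exact
# names may differ in the final version of this PR.
llm = LLM(
    model="peiyi9979/math-shepherd-mistral-7b-prm",
    pooling_type="STEP",
    pooling_step_tag_id=12902,              # token id of the step tag "ки"
    pooling_returned_token_ids=[648, 387],  # token ids of "+" and "-"
)

prompt = "Janet's ducks lay 16 eggs per day. ... Step 1: ... ки\nStep 2: ... ки"
(output,) = llm.encode(prompt)
# With STEP pooling, the returned "embedding" holds one score per step tag.
print(output.outputs.embedding)
```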
Thank you for your time reviewing this PR :)