-
-
Notifications
You must be signed in to change notification settings - Fork 9.7k
[custom_op][vllm-plugin] update custom_op class to use op_registry #19164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @xuechendi, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello team, gemini-code-assist here to provide a summary of this pull request. This PR addresses an issue where the existing CustomOp
implementation didn't fully leverage the op_registry
for initialization, making it difficult for plugins to properly override default operator implementations without resorting to monkey patching. The core idea is to modify the CustomOp
class so that when it's instantiated, it looks up the actual implementation class in the op_registry
and instantiates that instead. Additionally, the @CustomOp.register
decorator is enhanced to allow plugin-provided custom ops to explicitly override existing registered ops, providing a cleaner mechanism for plugins to inject their own implementations.
Highlights
- Custom Op Instantiation Logic: A
__new__
method is added to the baseCustomOp
class. This method intercepts the class creation process, checks theop_registry
for a class registered under the same name, and delegates the actual object creation to the registered class. This ensures that when code tries to instantiateCustomOp
(or a subclass registered with a specific name), it gets the correct, potentially overridden, implementation from the registry. - Enhanced Registration for Overrides: The
@CustomOp.register
decorator is updated to accept an optionalcustom_op=True
argument. When this flag is set and a class with the given name is already registered, the decorator will now replace the existing entry in theop_registry
with the new class being registered. This provides a clear and intended way for plugins to override default operator implementations.
Changelog
- vllm/model_executor/custom_op.py
- Added
__new__
method toCustomOp
to enable dynamic instantiation from theop_registry
(lines 19-23). - Modified the
register
class method decorator to acceptcustom_op=True
and allow overriding existing registered ops (lines 150-161).
- Added
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
A decorator, a clever wrap,
Changes how functions clap.
With@register
's might,
Ops take new flight,
No more monkey patch trap.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant improvement to the CustomOp
registration and instantiation mechanism, enabling plugins to override existing operators by leveraging the op_registry
. The changes in __new__
and the register
decorator are central to this enhancement.
While the direction is good, there are a few critical and high-severity issues in the register
method's logic that could lead to unexpected behavior or errors, particularly concerning how overriding classes are handled and how name collisions are managed when custom_op=False
. Additionally, the __new__
method could be made more robust in handling unregistered classes.
Addressing these points will greatly improve the reliability and predictability of the custom operator system.
Summary of Findings
- Robustness of
__new__
method: The__new__
method inCustomOp
could lead to an unhandledAttributeError
ifcls.name
is not defined. Improved error handling is recommended. - Missing
name
attribute setting inregister
: When overriding an operator usingcustom_op=True
in theregister
decorator, thename
attribute is not set on the overriding class, which can cause issues during its instantiation. - Incorrect return value from
register
decorator: Theregister
decorator may return a different class than the one being decorated in cases of name collision withcustom_op=False
. This violates decorator conventions and can lead to subtle bugs. It should consistently return the decorated class or raise an error on invalid registration attempts.
Merge Readiness
The pull request makes valuable changes to the custom operator framework. However, there are critical and high-severity issues identified in the register
method's logic, along with a medium-severity robustness concern in the __new__
method. These issues should be addressed before merging to ensure the stability and predictability of the custom operator system. I am unable to approve this PR myself, and it should be reviewed and approved by others once these changes are made.
@simon-mo @MengqingCao @aurickq , please take a review. |
1190506
to
92f0202
Compare
@mgoin @youkaichao @wangxiyuan , please also have a review on this PR. As part of RFC #19161 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your efforts!
This change is overall lgtm, but there is one small suggestion I have commented.
And could you add test and docstring for the usage?
d44b398
to
1bb1ae9
Compare
@MengqingCao , I updated the register decorator arg_name and also add an UT for oot_custom_op, please review again, Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is reasonable to me, thanks for this. probably want to cc either @youkaichao or @houseroad for this?
Also cc @zou3519 as well. |
tests/plugins/vllm_add_dummy_platform/vllm_add_dummy_platform/dummy_custom_ops.py
Outdated
Show resolved
Hide resolved
@CustomOp.register("RotaryEmbedding", is_oot_custom_op=True) | ||
class DummyRotaryEmbedding(RotaryEmbedding): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some nits about the API. CustomOp.register is used for registering a new class of custom operators. What we're doing here with the is_oot_custom_op
flag is that is actually that whenever a RotaryEmbedding is initialized, it will actually be replaced with a DummyRotaryEmbedding.
It also sounds like you might want to replace special variants, like Llama4RotaryEmbedding with NPURotaryEmbedding. The first input to register
is supposed to be the "base name" of a custom operator (like "RotaryEmbedding"), but it sounds like in your case you may want to use "Llama4RotaryEmbedding" instead.
If we're going with this design, I think there should be a separate API because it seems like something separate from CustomOp.register.
@RotaryEmbedding.monkeypatch # or RotaryEmbedding.use_out_of_tree if you want it to be nicer.
class DummyRotaryEmbedding(RotaryEmbedding):
pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the seperate API will lead to bigger changes to existing vllm codes right? Currently, since most of the ops are already derived from custom_op, it is much smaller change in this PR to make things working.
And the main reason is also we want to avoid monkey patch, so we want to have a single entry which is CustomOp.
To clarify, we want do things like:
replace Llama4RotaryEmbedding to HPULlama4RotaryEmbedding
replace RotaryEmbedding to HPURotaryEmbedding
And HPULlama4RotaryEmbedding is different to HPURotaryEmbedding in impl
here is the actual impl: https://github.com/HabanaAI/vllm-hpu-extension/tree/plugin_poc/vllm_hpu/ops
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will be the major impact if we reuse custom_op to register oot_op vs using the way you suggested by decorate on actual layer class?
I think it makes more sense to us by reusing custom_op, this prevent massive update to vllm.
If you think reuse 'custom_op.register' might bring confusion, I can add a new decorator in custom_op and using 'custom_op.register_oot' instead, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zou3519 , I've updated the PR based on your suggestion. Now either using CustomOP.register_oot('RotaryEmbedding')
or RotaryEmbedding.register_oot
will register custom class to CustomOp
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
fd4678e
to
bf79f74
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, this looks reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the change. It's fine
@simon-mo , can you review the PR again, thanks |
@simon-mo is out this week, maybe @youkaichao or @WoosukKwon ? |
Signed-off-by: nie3e <adrcwiek@gmail.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> added notebooks to playground updates remoted verbatim HF secrets from all files updates [custom_op][vllm-plugin] update custom_op class to use op_registry (vllm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Export NaNs in logits to scheduler_stats if output is corrupted (vllm-project#18777) Signed-off-by: Vlad Mihailescu <vtmihailescu@gmail.com> [CPU][CI] Fallback sliding window to v0 and fix CPU pooling model tests (vllm-project#19901) Signed-off-by: jiang1.li <jiang1.li@intel.com> [Kernel] mark TorchSDPABackend swap_blocks NotImplementedError (vllm-project#19749)
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: juncheoll <th6re8e@naver.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: fhl <2410591650@qq.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: Will Eaton <weaton@redhat.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
…llm-project#19164) Signed-off-by: Chendi.Xue <chendi.xue@intel.com> Signed-off-by: avigny <47987522+avigny@users.noreply.github.com>
Co-Authored with @kzawora-intel
Essential Elements of an Effective PR Description Checklist
Purpose
POC for #19161
Problem:
Solution:
Test Plan
Verfied by a poc hpu plugin
Test Result
Verified that HPUUnquantizedFusedMoEMethod in hpu plugin is called.