-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support RegisterCustomOpsLibrary via the Python API #4764
Conversation
so2 = so1 | ||
|
||
# Model loading successfully indicates that the custom op node could be resolved successfully | ||
sess1 = onnxrt.InferenceSession(custom_op_model, so1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add custom_op_lib_paths as an optional arguments? So the user doesn't need to jump to another sessionOption to do that. And this looks more pythonic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to support an arg in InferenceSession ? That is difficult to do.
Also, the user doesn't need to jump to another session options. There is only one session options.
so = ort.SessionOptions()
so.register_custom_ops_library(shared_lib_path)
sess = ort.InferenceSession(model_path, so)
I think this is very simple and it aligns with all the other language bindings. Also, session options can be shared across multiple sessions. If we tie it to Session, we have to keep referencing the library in all the Sessions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added another session options just to demonstrate that it is "re-useable"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not be difficult as you think.
In InferenceSession.init python function, just add a few lines as following
if custom_op_lib_paths:
so.register_custom_ops_library(custom_op_lib_paths)
...
The overall C++ and pybinding interface keeps same, the only difference is to make python wrapper more friendly to avoid everyone write couple of same code lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if custom_op_lib_paths: so.register_custom_ops_library(custom_op_lib_paths)
I find this hacky and not providing too much value (saves 2 lines of code while potentially causing maintenance hell) for several reasons:
-
What is
so
here then ? Construct a SessionOptions instance in InferenceSession on the fly just to hold the custom op libs ? -
What if the user provides his/her own SessionOptions ? Then do we add the custom_op to that SessionOptions instance (and mutate something the user provided) or do we hold another "dummy" SessionOptions instance just to hold the custom_ops ?
-
I think this is unnecessary complications just to save
so = Ort.SessionOptions so.register_ops_library(path)
. This is more clear in that it ties the shared library to the SessionOptions instance and thereby allowing a user to share it across sessions if he/she chooses to do so. -
Also, by exposing the extra optional parameter in InferenceSession init, it does not keep it consistent with c/c++/c# - none of these language bindings will expose a Session constructor that takes a path to the shared library - they need to pass it in via SessionOptions as well. My suggestion is to stick with the same.
Is there any concern with instantiating a SessionOptions instance ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you think these two lines are problematic, I don't see what's the difference between putting it in ort code wrapper and the user code, right? And if you are the user, will it be helpful if there is a simpler API for you?
And No one expects C++ and Python interface are exactly same, only the parity features.
Is there any ETA on merging this PR? My concern is not a blocking issue here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Wenbing.I think it is problematic for the concerns/ reasons listed above. Also, I don’t want to set the precedence of exposing an optional parameter in the InferenceSession constructor for every Session option because that defeats the purpose of SessionOptions.
The PR will be merged once I solve all build issues- working on it now.
// If domain is empty, it is assumed to be part of the ONNX domain | ||
if (domain->domain_[0]) { | ||
// Add it to the DomainToVersion ONNX map if it doesn't already exist | ||
// For example, two sessions using the same session_options should not add the same custom op domain to the version map twice |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fails a debug assert in the ONNX code if we try to add an item that already exists in the map. hence, added this guard.
// A non-nullptr indicates some error | ||
// Free status and throw | ||
platform_env.UnloadDynamicLibrary(library_handle_); | ||
::free(status); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Free the returned OrtStatus if not nullptr (error scenario)
|
||
// Unload the library when the destructor is triggered | ||
CustomOpLibrary::~CustomOpLibrary() { | ||
platform_env.UnloadDynamicLibrary(library_handle_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unload the library when the corresponding struct that holds it goes out of scope
ORT_DISALLOW_COPY_ASSIGNMENT_AND_MOVE(CustomOpLibraries); | ||
|
||
private: | ||
std::vector<std::unique_ptr<CustomOpLibrary>> custom_op_libraries_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CustomOpLibraries will "own" the Customlibrary. The session options that needs to load the shared library needn't hold onto it after it has gotten what it needs from the library.
@hariharans29 Thanks a lot for working on this. This is going to be very useful. When will this be available in the release branch or which release branch? |
Hi Saurabh- It should be in the upcoming release (due to release soon this month). |
Description:
RegisterCustomOpsLibrary
is a mechanism via which custom kernels in shared libraries can be used to support a model containing ops not yet in the ONNX opset. This change supports it via the Python API.There are some challenges associated with this:
Freeing
OrtCustomOpDomain
s associated with the custom kernels (onnxruntime/onnxruntime/test/testdata/custom_op_library/custom_op_library.cc
Line 120 in fdc5c30
OrtReleaseCustomOpDomain()
. In Python, just like in C/C++, we rely on the implementer of the shared library to release the OrtCustomOpDomains created within it (i.e.) we don't take ownership and manage the resource for the user.Unloading the shared library
Currently the user has to manually unload the library via the provided handle while using the C/C++ API. In Python, we unload the library automatically (no action needed by the user) when the last session using the session options connected with the shared library goes out of scope.
TODO:
Update relevant documentation in the custom op md files
See how this plays with the shallow copy and the deep copying mechanism offered by the
copy
module (https://docs.python.org/3/library/copy.html). If we think it is unsafe to allow deep copying SessionOptions instances in Python (because they hold associated OrtCustomOpDomain pointers), explicit disallow it by overriding the method for now until we have support for it. Python aliases are okay and they are not copies. For example,so1 = Sessionoptions() so2 = so1
. This only creates aliases to the same object and preserves the object until all the referenced aliases go out of scope and this is supported.EDIT: It seems like currently
SessionOptions
isn't picklable. We need to investigate which instance attribute is causing the pickling issue and solve that. (Not sure if we even want to support pickling given that it could lead to security concerns). But given that it is not pickable, nobody will be able to shallow copy (copy.copy()
) or deep copy (copy.deepcopy()
) SessionOptions using the copy module and hence there is no action required.CC: @wenbingl
Motivation and Context
Fix #3958