[FEATURE] Converting traditional ML algorithms using Hummingbird and benchmark model performance. #123
Comments
I'd like to work on this |
Sure, please go ahead. |
So I played around with it in this notebook: https://www.kaggle.com/code/alibizhenis/hummingbird
|
I compared three transformative models here: https://www.kaggle.com/code/alibizhenis/hummingbird-pca
|
Thanks for your experiment. Can you also please try running the same algorithms with |
I tested them in the same notebook with skl2onnx. The results for PCA and KernelPCA matched, while the converted TruncatedSVD model produced completely different results for some reason. |
Thanks for the investigation. Can we also try to find any ARIMA model to convert to After doing that I would like you to wrap up all your investigations in this package's
Thanks for your hard work. |
Which framework would you like me to use for time series models? Because I don't think sklearn supports any. There is a statsmodels package, but it's not supported by hummingbird. |
Yeah, agreed. This part also needs some investigation. |
I added the notebooks for PCA and classification. Upon further research on time series forecasting, I concluded the following:
|
Upon further research, I couldn't find ways to convert models from popular time series packages like statsmodels. Nonetheless, I found ways to use some models in TorchScript and ONNX (mostly deep learning):
|
Can we perform the following experiments:
|
Currently, built-in ML algorithms in OpenSearch have to be written in Java, which is sometimes more time-consuming because Java does not have enough ML support.
One initiative we started is to write the algorithm in TorchScript, trace it into a TorchScript file, and then load the model file in OpenSearch using ML Commons' model serving framework.
One bottleneck is that TorchScript can't import any third-party library like scikit-learn, so to include scikit-learn models in OpenSearch we would have to rewrite the algorithm in TorchScript, which is not an ideal solution.
To solve this problem, we can use Hummingbird, which converts traditional machine learning algorithms into neural-network-based algorithms for faster execution. At the same time, we should be able to convert the algorithm to TorchScript or ONNX so that we can load the model in OpenSearch.
In this issue, we would like to investigate whether Hummingbird will solve this problem.
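One way to frame such an investigation is a small, backend-agnostic check that compares the outputs of an original and a converted model and times both. The sketch below uses only the Python standard library. `original_predict` and `converted_predict` are hypothetical stand-ins for the real prediction functions (for example, a scikit-learn model and its converted counterpart), and the tolerances are illustrative assumptions.

```python
import math
import statistics
import time
from typing import Callable, List, Sequence

Batch = Sequence[Sequence[float]]


def outputs_match(reference: Batch, candidate: Batch,
                  rel_tol: float = 1e-5, abs_tol: float = 1e-6) -> bool:
    """Return True if both outputs have the same shape and close values."""
    if len(reference) != len(candidate):
        return False
    for ref_row, cand_row in zip(reference, candidate):
        if len(ref_row) != len(cand_row):
            return False
        for r, c in zip(ref_row, cand_row):
            if not math.isclose(r, c, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


def median_latency(predict: Callable[[Batch], Batch], batch: Batch,
                   repeats: int = 5) -> float:
    """Median wall-clock time in seconds of predict(batch) over several runs."""
    timings: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


# Hypothetical stand-ins for the original and converted models.
def original_predict(batch: Batch) -> Batch:
    return [[2.0 * x for x in row] for row in batch]


def converted_predict(batch: Batch) -> Batch:
    return [[x + x for x in row] for row in batch]


batch = [[0.5, 1.5], [2.0, -3.0]]
same = outputs_match(original_predict(batch), converted_predict(batch))
original_time = median_latency(original_predict, batch)
converted_time = median_latency(converted_predict, batch)
print(f"outputs match: {same}")
print(f"original: {original_time:.6f}s, converted: {converted_time:.6f}s")
```

Using the median rather than the mean keeps a single slow run (for example, a warm-up pass) from dominating the reported latency.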
The following steps can be done in the investigation