[tflite] TFL_VAR_HANDLE and TFL_READ_VARIABLE are not supported #2059
I don't have permission to push my changes to your private branch. Please try adding the two lines below into the tf2onnx/tflite_handlers/tfl_direct.py file after line 93:

`@tfl_op("TFL_READ_VARIABLE", tf_op="ReadVariableOp")`

After this, please try your patch again.
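For illustration, here is a minimal sketch of what the suggested patch region might look like. Only the `TFL_READ_VARIABLE` line is quoted in the comment above; the `TFL_VAR_HANDLE` line is an inference from the "Unsupported ops" error later in this thread (it names both `VarHandleOp` and `ReadVariableOp`), and the class name is illustrative, not the file's exact class:

```python
# Sketch of the suggested edit to tf2onnx/tflite_handlers/tfl_direct.py.
# tfl_op is the mapping decorator already used throughout that file.

@tfl_op("TFL_VAR_HANDLE", tf_op="VarHandleOp")        # assumed second line
@tfl_op("TFL_READ_VARIABLE", tf_op="ReadVariableOp")  # quoted in the comment above
class TflDirect:
    # ... existing direct-mapping handler body in tfl_direct.py ...
    pass
```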
@fatcat-z Thanks for your response! Do you mean after line 91? If so, I tried that and, weirdly, it didn't work: the same error messages were logged as before: https://colab.research.google.com/gist/josephrocca/5af909bd240264cdecd4598903be8dfa
I've added you as a collaborator to that repo in case you wanted to try any changes yourself, but please feel free to suggest other things for me to try. Thanks for your help with this 🙏
Yes, after line 91. Did you run `python setup.py develop` to install the local tf2onnx version in your test environment?
@fatcat-z Oh, I was installing it like this, as you can see in the linked Colab above:
but I just tried this:
and the same errors occurred. I'm a bit of a Python noob (I come from the web/JS world), so please excuse my incompetence here 😬
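One likely cause of a local patch "not working" is that Python is still importing a previously pip-installed copy of tf2onnx rather than the edited clone. A quick sanity check:

```python
# Confirm which tf2onnx copy Python actually imports: the path should
# point into the local clone, not into site-packages.
import tf2onnx

print(tf2onnx.__file__)     # expect a path inside your local checkout
print(tf2onnx.__version__)  # compare against the PyPI version you had installed
```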
I'm using your branch to convert that tflite model to onnx and got the error below:

2022-10-14 14:49:15,567 - ERROR - Unsupported ops: Counter({'VarHandleOp': 14, 'ReadVariableOp': 14})

This is expected, because we haven't implemented these 2 TF ops yet. I'll try to see if it can be done soon.
Please try the code in this branch, this commit. Those ops are designed for training, which is not supported by tf2onnx, and they won't impact the inference results. Removing them should have no impact on the final inference results. Please use it to generate a new onnx file and see if the results are correct.
Thanks!! There are some other downstream issues in ORT Web preventing the model from running correctly, but I think they might be specific to ORT Web rather than to this conversion process. I'll post a separate issue if I can't work out what's going wrong there. (I'll leave it to you to re-open this if you'd like to keep it open until it's fully confirmed that these changes didn't affect the correctness of the inference results.)
Do you mind running ORT on your local machine to confirm the correctness?
👍 I've added a correctness check (and a reminder to comment here) to the todo list for this project.
@fatcat-z An update on this: I tried to check correctness, and while the tflite file works fine, both ORT Python and ORT Web throw an error when using the converted ONNX model. Here's a full minimal reproduction of the tflite inference --> conversion to onnx --> onnx inference pipeline:
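For reference, the shape of that repro pipeline looks roughly like this. File names, input values, and shapes are placeholders; the thread's Colab uses the CLI equivalent (`python -m tf2onnx.convert --tflite ... --output ...`):

```python
import numpy as np
import tensorflow as tf
import tf2onnx
import onnxruntime as ort

# 1. TFLite inference on a dummy input.
interp = tf.lite.Interpreter(model_path="soundstream_encoder.tflite")
interp.allocate_tensors()
inp = interp.get_input_details()[0]
x = np.zeros(inp["shape"], dtype=inp["dtype"])
interp.set_tensor(inp["index"], x)
interp.invoke()
tflite_out = interp.get_tensor(interp.get_output_details()[0]["index"])

# 2. Conversion via tf2onnx's Python API (this is the step that fails
#    on these models without variable-op support).
model_proto, _ = tf2onnx.convert.from_tflite(
    "soundstream_encoder.tflite", output_path="soundstream_encoder.onnx")

# 3. ONNX Runtime inference on the same input.
sess = ort.InferenceSession("soundstream_encoder.onnx")
onnx_out = sess.run(None, {sess.get_inputs()[0].name: x})[0]
print("max abs diff:", np.abs(tflite_out - onnx_out).max())
```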
I originally posted a question about this here: microsoft/onnxruntime#13383. But it looks like it might be a conversion issue rather than a runtime issue, since I see a
This appears to be a conversion issue where some information was lost. Working on a fix.
Hello, I am trying to convert a tflite float16 model to ONNX. I tried the change mentioned above to avoid the output-names issue, but I still got the ReadVariableOp error. Here is the error information:
Although the tflite fp16 model is converted to ONNX, the result doesn't seem to be as expected. Any updates on the fix? Thanks.
I have a solution worked out and am writing the code. The current ETA is this week, due to some unexpected delays.
Thank you for the update. Looking forward to it.
I believe the variable operations are used not only for training but also for inference. The model uses them as memory for each inference: the value from the previous inference is concatenated into the current one. The shape problem is most likely coming from the fact that these [1,48,1] values from the ReadVariableOp are missing. Unfortunately, I think these operations will have to be supported if you want to convert your model.
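To make that pattern concrete, here is a conceptual sketch (not the actual model code) of a variable acting as memory across calls. The (1, 48, 1) shape comes from the comment above; everything else is assumed:

```python
import numpy as np

state = np.zeros((1, 48, 1), dtype=np.float32)  # the value VAR_HANDLE refers to

def encode_step(frame):
    """One inference step: mix the remembered value with the new frame."""
    global state
    previous = state                                   # ~ TFL_READ_VARIABLE
    mixed = np.concatenate([previous, frame], axis=1)  # past + present
    state = frame                                      # ~ variable assignment
    return mixed

out = encode_step(np.ones((1, 48, 1), dtype=np.float32))
print(out.shape)  # (1, 96, 1)
```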
You can probably access the private branch for local debugging.
@fatcat-z That seems to have worked; it definitely fixed the error messages I was getting previously. Here's a conversion notebook that includes an inference comparison between the original tflite model and the new onnx model: https://colab.research.google.com/gist/josephrocca/ecb5a2faf54b06eb700ecc562557c6a9/onnx-runtime-python-inference.ipynb#scrollTo=6dhtRr-ru03Q

TFLite outputs:
ONNX outputs:
Some of those numbers are almost exactly the same (to ~3 decimal places), while others are quite far off. Is that expected? I'm guessing it depends a lot on the particular model and the sorts of ops it contains? I'll continue to investigate this. In any case, please feel free to close this if you're satisfied that the TFL_VAR/READ support is solved based on the notebook outputs above. Thanks for your work on this!
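One way to quantify "the same to ~3 decimal places" is an explicit tolerance check on the two output arrays. The values below are placeholders standing in for the notebook's outputs:

```python
import numpy as np

tflite_out = np.array([0.123, 0.456, 0.789])  # placeholder TFLite outputs
onnx_out   = np.array([0.123, 0.455, 0.790])  # placeholder ONNX outputs

print("max abs diff:", np.max(np.abs(tflite_out - onnx_out)))
# Raises AssertionError if any element differs by more than atol + rtol*|expected|.
np.testing.assert_allclose(tflite_out, onnx_out, rtol=1e-3, atol=1e-3)
```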
Okay, I think what might have been happening there was that I was feeding
If I understand the fix correctly, it will remove some VAR_OPS and fill others with zeros. This will fix your shape issue, but it won't work with the architecture of the model. As you can see below from the snippet of the encoder, the model uses the VAR_OPS to keep track of the previous inference: VAR_HANDLE should handle the mapping to the variable. Without these, the output of the model will be different from the original tensorflow model.
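As an illustration of the semantics described here (a conceptual sketch, not tf2onnx code): VAR_HANDLE maps a name to a storage slot, and READ_VARIABLE fetches that slot's current value.

```python
import numpy as np

class VariableStore:
    """Toy emulation of TFLite's variable ops."""

    def __init__(self):
        self._slots = {}

    def var_handle(self, name, shape, dtype=np.float32):
        # Create the slot on first use; the "handle" here is just the name.
        self._slots.setdefault(name, np.zeros(shape, dtype=dtype))
        return name

    def read_variable(self, handle):
        return self._slots[handle]

    def assign_variable(self, handle, value):
        self._slots[handle] = value

store = VariableStore()
h = store.var_handle("enc_state", (1, 48, 1))
print(store.read_variable(h).shape)  # (1, 48, 1)
```

Replacing the reads with zeros, as the fix does, breaks this name-to-value mapping, which is why the outputs diverge from the original model.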
@dramaticlama Looks like you're correct!
@fatcat-z I'm wondering if ONNX supports this sort of thing? If not, maybe the converter could just turn variable nodes into input/output nodes, or something like that? |
When I prepared this PR, I was using another version of the soundstream_encoder.tflite you shared. For that one, the PR happened to work. @dramaticlama's thoughts are correct, so in the end we probably won't be able to find a way to convert such a model successfully.
@fatcat-z Ah I see. Can you see any possible path forward here to convert this into a stateless ONNX model? For example, could we turn the variable nodes into extra input/output nodes? The user would then have to manually pipe the output variables back into the inputs during the next inference, according to the details logged during conversion, as in the sketch below. (I guess, ideally, all the "stateful" variables could be "bundled" into a single output/input, so the model would just have to pipe that one extra output back into the one extra input.)
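A sketch of that idea, assuming a hypothetical converter that exposes each variable as an extra input/output pair. The names "audio_in", "encoded", "state_in", "state_out" and the stateless .onnx file are assumptions, not something tf2onnx produces today:

```python
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("soundstream_encoder_stateless.onnx")  # hypothetical file
state = np.zeros((1, 48, 1), dtype=np.float32)  # initial variable value

frames = [np.zeros((1, 48, 1), dtype=np.float32) for _ in range(3)]  # dummy frames
for frame in frames:
    encoded, state = sess.run(
        ["encoded", "state_out"],                # assumed output names
        {"audio_in": frame, "state_in": state})  # pipe state back into the next call
```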
New Operator

The TFL_VAR_HANDLE and TFL_READ_VARIABLE operators are used in Lyra's "soundstream encoder" model and "lyragan" model. I don't know the specifics of what these operators are for, other than what is implied by their names, and I probably can't contribute them myself. Here are the two models:

And here's a minimal reproduction of the errors:
https://colab.research.google.com/gist/josephrocca/5af909bd240264cdecd4598903be8dfa
Note that the above Colab uses this patch of tf2onnx (the only change is here) to hackily avoid this problem.
Here are the full outputs of the conversion commands:
soundstream_encoder.tflite
lyragan.tflite
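As a side note, a quick way to confirm which ops a .tflite file contains (e.g., to verify VAR_HANDLE / READ_VARIABLE are present before attempting conversion) is TF's model analyzer, available since TF 2.7. The file name is a placeholder:

```python
import tensorflow as tf

# Prints the model's subgraphs and their op listing; for the models in
# this issue the listing should include VAR_HANDLE and READ_VARIABLE.
tf.lite.experimental.Analyzer.analyze(model_path="soundstream_encoder.tflite")
```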