Hello, is it possible to load an open_clip text embedding model as a backend on Triton Inference Server? I am not sure how I would export the weights of the model to TRT and formulate the config. I am also not sure how orchestration would work if I used the Python backend instead. If anyone has experience with this, please let me know, and apologies if this discussion does not belong here (I also asked in the open_clip repo).

thanks

Replies: 1 comment
Hi! I think you should start by using the Python backend and calling the open_clip_torch package as is. In `initialize` you would prepare the model and tokenizer, and in `execute` you would call `encode_text`. You also need to use the `triton_python_backend_utils` module to read the input tensors and set the output tensors. You could use a config.pbtxt like this:
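A minimal sketch, assuming the tensor names `TEXT` and `EMBEDDING` and a 512-dimensional embedding (the width of ViT-B-32; adjust `dims` to your model):

```
name: "clip_text"
backend: "python"
max_batch_size: 32

input [
  {
    name: "TEXT"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "EMBEDDING"
    data_type: TYPE_FP32
    dims: [ 512 ]
  }
]

instance_group [
  { kind: KIND_GPU }
]
```

And a matching `model.py` sketch; the checkpoint and tensor names are assumptions that must agree with the config above:

```python
# model.py -- minimal Triton Python backend for open_clip text embeddings.
# Assumes the ViT-B-32 / laion2b_s34b_b79k checkpoint and the tensor names
# TEXT / EMBEDDING from the config.pbtxt sketch above.
import numpy as np
import torch
import open_clip
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Load the model and tokenizer once, at model load time.
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model, _, _ = open_clip.create_model_and_transforms(
            "ViT-B-32", pretrained="laion2b_s34b_b79k"
        )
        self.model = self.model.to(self.device).eval()
        self.tokenizer = open_clip.get_tokenizer("ViT-B-32")

    def execute(self, requests):
        responses = []
        for request in requests:
            # TYPE_STRING inputs arrive as numpy arrays of bytes objects.
            text_tensor = pb_utils.get_input_tensor_by_name(request, "TEXT")
            texts = [t.decode("utf-8") for t in text_tensor.as_numpy().reshape(-1)]

            tokens = self.tokenizer(texts).to(self.device)
            with torch.no_grad():
                embeddings = self.model.encode_text(tokens)
                # L2-normalize, the usual form for CLIP retrieval (optional).
                embeddings = embeddings / embeddings.norm(dim=-1, keepdim=True)

            out = pb_utils.Tensor(
                "EMBEDDING", embeddings.cpu().numpy().astype(np.float32)
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```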
After you have this working, you could consider converting the weights to ONNX or TensorRT and using ensembles to orchestrate the process. You can check my repo https://github.com/joaquincabezas/clip_is_awesome, where I will be including serving recipes.
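For the ONNX route, here is a hedged sketch of exporting just the text encoder. The wrapper module is needed because `torch.onnx.export` traces `forward()`, and open_clip's `forward()` expects both image and text; the model name, file name, and opset version below are assumptions:

```python
import torch
import open_clip


class TextEncoder(torch.nn.Module):
    """Wraps encode_text so torch.onnx.export can trace it as forward()."""

    def __init__(self, clip_model):
        super().__init__()
        self.clip_model = clip_model

    def forward(self, tokens):
        return self.clip_model.encode_text(tokens)


model, _, _ = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
model.eval()

# CLIP tokenizers pad every prompt to a fixed context length (77 here),
# so only the batch dimension needs to be dynamic.
dummy_tokens = torch.zeros((1, 77), dtype=torch.long)
torch.onnx.export(
    TextEncoder(model),
    dummy_tokens,
    "clip_text_encoder.onnx",
    input_names=["TOKENS"],
    output_names=["EMBEDDING"],
    dynamic_axes={"TOKENS": {0: "batch"}, "EMBEDDING": {0: "batch"}},
    opset_version=17,
)
```

The tokenizer is not part of the exported graph, so in an ensemble you would keep tokenization in a small Python backend model and feed the token IDs to the ONNX or TensorRT model.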