Hello, is it possible to load an open_clip text embedding model as a backend on Triton Inference Server? I am not sure how I would export the weights of the model to TRT and formulate the config. I am also not sure how orchestration would work if I used the Python backend instead. If anyone has experience with this, please let me know, and apologies if this discussion does not belong here (I also asked in the open_clip repo).

thanks

Replies: 1 comment
Hi! I think you should start by using the Python backend and calling the open_clip_torch package as is. In `initialize` you would prepare the model and tokenizer, and in `execute` you would call `encode_text`. You also need to use the `triton_python_backend_utils` module to read the input tensors and set the output tensors. You could use a config.pbtxt like this:
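A minimal sketch, assuming the tensor names `TEXT` and `EMBEDDING` and a 512-dimensional embedding (the width of ViT-B-32; adjust `dims` to your model):

```
name: "clip_text"
backend: "python"
max_batch_size: 32

input [
  {
    name: "TEXT"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "EMBEDDING"
    data_type: TYPE_FP32
    dims: [ 512 ]
  }
]

instance_group [
  { kind: KIND_GPU }
]
```

And a matching `model.py` sketch; the checkpoint and tensor names are assumptions that must agree with the config above:

```python
# model.py -- minimal Triton Python backend for open_clip text embeddings.
# Assumes the ViT-B-32 / laion2b_s34b_b79k checkpoint and the tensor names
# TEXT / EMBEDDING from the config.pbtxt sketch above.
import numpy as np
import torch
import open_clip
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Load the model and tokenizer once, at model load time.
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.model, _, _ = open_clip.create_model_and_transforms(
            "ViT-B-32", pretrained="laion2b_s34b_b79k"
        )
        self.model = self.model.to(self.device).eval()
        self.tokenizer = open_clip.get_tokenizer("ViT-B-32")

    def execute(self, requests):
        responses = []
        for request in requests:
            # TYPE_STRING inputs arrive as numpy arrays of bytes objects.
            text_tensor = pb_utils.get_input_tensor_by_name(request, "TEXT")
            texts = [t.decode("utf-8") for t in text_tensor.as_numpy().reshape(-1)]

            tokens = self.tokenizer(texts).to(self.device)
            with torch.no_grad():
                embeddings = self.model.encode_text(tokens)
                # L2-normalize, the usual form for CLIP retrieval (optional).
                embeddings = embeddings / embeddings.norm(dim=-1, keepdim=True)

            out = pb_utils.Tensor(
                "EMBEDDING", embeddings.cpu().numpy().astype(np.float32)
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```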
After you have this working, you could consider converting the weights to ONNX or TensorRT and using ensembles to orchestrate the process. You can check my repo https://github.com/joaquincabezas/clip_is_awesome, where I will be including serving recipes.
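For the ONNX route, here is a hedged sketch of exporting just the text encoder. The wrapper module is needed because `torch.onnx.export` traces `forward()`, and open_clip's `forward()` expects both image and text; the model name, file name, and opset version below are assumptions:

```python
import torch
import open_clip


class TextEncoder(torch.nn.Module):
    """Wraps encode_text so torch.onnx.export can trace it as forward()."""

    def __init__(self, clip_model):
        super().__init__()
        self.clip_model = clip_model

    def forward(self, tokens):
        return self.clip_model.encode_text(tokens)


model, _, _ = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
model.eval()

# CLIP tokenizers pad every prompt to a fixed context length (77 here),
# so only the batch dimension needs to be dynamic.
dummy_tokens = torch.zeros((1, 77), dtype=torch.long)
torch.onnx.export(
    TextEncoder(model),
    dummy_tokens,
    "clip_text_encoder.onnx",
    input_names=["TOKENS"],
    output_names=["EMBEDDING"],
    dynamic_axes={"TOKENS": {0: "batch"}, "EMBEDDING": {0: "batch"}},
    opset_version=17,
)
```

The tokenizer is not part of the exported graph, so in an ensemble you would keep tokenization in a small Python backend model and feed the token IDs to the ONNX or TensorRT model.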