How can I use CLIP model to obtain the image and text embeddings? #260

zhujiajian98 opened this issue Apr 18, 2024 · 3 comments

@zhujiajian98

I want to fine-tune YOLO-World using the prompt training scheme. According to docs/prompt_yolo_world.md, I need to extract the image and text embeddings with the CLIP model and save them as .npy files. How can I use the CLIP model to obtain these image and text embeddings?

@wondervictor (Collaborator)

Hi @zhujiajian98, text embeddings can be extracted with generate_text_prompts.py. I'll provide the script for image embeddings within a day (before next Monday).
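
For reference while waiting on the script, below is a minimal sketch of CLIP text-embedding extraction saved to .npy. It assumes the HuggingFace transformers CLIP classes and the openai/clip-vit-base-patch32 checkpoint; generate_text_prompts.py in this repo is the authoritative version and may differ in model choice and output layout.

```python
# Minimal sketch: extract CLIP text embeddings and save them as a .npy file.
# Assumptions: HuggingFace transformers CLIP classes and the
# openai/clip-vit-base-patch32 checkpoint (not necessarily what the repo uses).
import numpy as np
import torch
from transformers import CLIPTextModelWithProjection, CLIPTokenizer

model_name = "openai/clip-vit-base-patch32"  # assumed checkpoint
tokenizer = CLIPTokenizer.from_pretrained(model_name)
model = CLIPTextModelWithProjection.from_pretrained(model_name).eval()

texts = ["person", "bicycle", "car"]  # example category names

with torch.no_grad():
    inputs = tokenizer(texts, padding=True, return_tensors="pt")
    # text_embeds are the projected per-text embeddings; L2-normalize them,
    # as CLIP does before computing similarities
    embeds = model(**inputs).text_embeds
    embeds = embeds / embeds.norm(dim=-1, keepdim=True)

np.save("text_embeddings.npy", embeds.cpu().numpy())  # shape: (num_texts, 512)
```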

@liping-ren

Hello, I am interested in using images for fine-tuning. Could you provide the script that generates the image embeddings? Thank you very much.

@wondervictor (Collaborator)

Hi @liping-ren, the script has been added at tools/generate_image_prompts.py.
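
For reference, here is a minimal sketch of the image-side extraction under the same assumptions (HuggingFace CLIP classes, the openai/clip-vit-base-patch32 checkpoint, and a hypothetical example.jpg path); tools/generate_image_prompts.py remains the authoritative version.

```python
# Minimal sketch: extract a CLIP image embedding and save it as a .npy file.
# Assumptions: HuggingFace transformers CLIP classes, the
# openai/clip-vit-base-patch32 checkpoint, and a hypothetical image path.
import numpy as np
import torch
from PIL import Image
from transformers import CLIPImageProcessor, CLIPVisionModelWithProjection

model_name = "openai/clip-vit-base-patch32"  # assumed checkpoint
processor = CLIPImageProcessor.from_pretrained(model_name)
model = CLIPVisionModelWithProjection.from_pretrained(model_name).eval()

image = Image.open("example.jpg").convert("RGB")  # hypothetical image path

with torch.no_grad():
    inputs = processor(images=image, return_tensors="pt")
    # image_embeds is the projected image embedding; L2-normalize it to match
    # the normalized text embeddings
    embeds = model(**inputs).image_embeds
    embeds = embeds / embeds.norm(dim=-1, keepdim=True)

np.save("image_embedding.npy", embeds.cpu().numpy())  # shape: (1, 512)
```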
