I am trying to use the prompt training scheme to fine-tune YOLO-World. According to docs/prompt_yolo_world.md, I need to extract the image/text embeddings with the CLIP model and save them as the corresponding .npy files. How can I use the CLIP model to obtain the image and text embeddings?
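For reference, here is a minimal sketch of extracting CLIP embeddings and saving them as .npy files. This is not the repository's official script; it uses the Hugging Face `transformers` CLIP implementation, and the checkpoint name, prompt list, output file names, and the dummy input image are all illustrative assumptions (YOLO-World may expect a specific CLIP variant and normalization, so check docs/prompt_yolo_world.md for the exact format):

```python
import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Assumed checkpoint; swap in whichever CLIP variant YOLO-World expects.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

# --- Text embeddings: one L2-normalized vector per class prompt ---
texts = ["person", "bicycle"]  # illustrative class names
text_inputs = processor(text=texts, return_tensors="pt", padding=True)
with torch.no_grad():
    text_feats = model.get_text_features(**text_inputs)
text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
np.save("text_embeddings.npy", text_feats.cpu().numpy())  # shape (2, 512)

# --- Image embeddings: same projection space as the text features ---
# Dummy gray image stands in for a real prompt image here.
image = Image.new("RGB", (224, 224), color=(128, 128, 128))
image_inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    image_feats = model.get_image_features(**image_inputs)
image_feats = image_feats / image_feats.norm(dim=-1, keepdim=True)
np.save("image_embeddings.npy", image_feats.cpu().numpy())  # shape (1, 512)
```

For ViT-B/32 the projected embedding dimension is 512; both outputs live in the same joint space, which is why L2-normalizing before saving is a common convention.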
zhujiajian98 changed the title from "How can I use CLIP model to obtains the image and text embeddings?" to "How can I use CLIP model to obtain the image and text embeddings?" on Apr 19, 2024.
Hi @zhujiajian98, text embeddings can be extracted with generate_text_prompts.py. I'll provide the script for image embeddings within a day (before next Monday).
Hello, I am interested in using images for fine-tuning and would like to ask whether you can provide the script that generates the image embeddings. Thank you very much.