
Question: Support for colbertv2.0 ? #355

Open
3 tasks
shatealaboxiaowang opened this issue Sep 10, 2024 · 8 comments

Comments

@shatealaboxiaowang

Model description

Hi,
Thanks for your source code. Can you add support for colbertv2.0 deployment?

Thank you!

Open source status

  • The model implementation is available on transformers
  • The model weights are available on huggingface-hub
  • I verified that the model is currently not running in the latest version (pip install infinity_emb[all] --upgrade)

Provide useful links for the implementation

No response

@michaelfeil
Owner

Colbert is a late-interaction model (stateful).

Please provide some example code using only torch and the transformers library. I think it requires some client-side computation (late interaction). Don't use any third-party packages such as the colbert package.
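For reference, a minimal sketch of the client-side late-interaction (MaxSim) scoring step in plain torch. The random tensors below are stand-ins for the per-token embeddings a ColBERT checkpoint would produce; loading the actual model via transformers is out of scope here.

```python
import torch
import torch.nn.functional as F

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> float:
    """Late-interaction (MaxSim) score between one query and one document.

    query_emb: [num_query_tokens, dim] token embeddings
    doc_emb:   [num_doc_tokens, dim] token embeddings
    """
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    sim = q @ d.T  # [num_query_tokens, num_doc_tokens] cosine similarities
    # for each query token, take its best-matching document token, then sum
    return sim.max(dim=1).values.sum().item()

# stand-in embeddings; in practice these come from the ColBERT checkpoint
torch.manual_seed(0)
query = torch.randn(8, 128)
docs = [torch.randn(50, 128), torch.randn(60, 128)]
scores = [maxsim_score(query, d) for d in docs]
```

As a sanity check, scoring a query against its own embeddings yields exactly num_query_tokens, since each per-token max cosine similarity is 1.0.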

@wirthual
Collaborator

Hi @shatealaboxiaowang ,

You are able to run colbertv2 with infinity like so:

infinity_emb v2 --model-id colbert-ir/colbertv2.0
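Once the server is up (it listens on port 7997 by default), it exposes an OpenAI-compatible embeddings route; a request might look like the sketch below. Check infinity's docs for the exact response shape of late-interaction models.

```shell
# assumes the command above is running locally on the default port 7997
curl http://localhost:7997/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "colbert-ir/colbertv2.0", "input": ["What is ColBERT?"]}'
```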

@simjak

simjak commented Jan 2, 2025

@wirthual can you run it on RunPod?

@michaelfeil
Owner

@simjak Same as Colleen: you can't use infinity serverless yet, but you can spin up your own serverful RunPod instance, etc.

@simjak

simjak commented Jan 2, 2025

@michaelfeil any plans to support serverless ColPali?
Is there an example of spinning up a ColPali pod?

@michaelfeil
Owner

port=7997
model1=michaelfeil/colqwen2-v0.1
model2=colbert-ir/colbertv2.0

# needs 16GB+
docker run -it --gpus all \
 -p $port:$port \
 michaelf34/infinity:latest \
 v2 \
 --model-id $model1 \
 --model-id $model2 \
 --port $port \
 --dtype bfloat16 \
 --batch-size 8 \
 --device cuda

@simjak

simjak commented Jan 3, 2025

@michaelfeil I tried to run on runpod, but got:

2025-01-03T13:14:45.429197971Z huggingface_hub.errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-6777e2c5-19b02a382e0c73337abbfc1f;bd37d90f-3a39-4744-a1bd-c70bb380dfba)
2025-01-03T13:14:45.429203568Z Entry Not Found for url: https://huggingface.co/vidore/colqwen2-v1.0/resolve/main/config.json.
2025-01-03T13:14:45.429208543Z ERROR:    Application startup failed. Exiting.

Is there something wrong with this model? https://huggingface.co/vidore/colqwen2-v1.0


@simjak

simjak commented Jan 3, 2025

Oh, I needed to use the merged version https://huggingface.co/vidore/colqwen2-v1.0-merged (the non-merged repo apparently ships only adapter weights, hence the missing config.json).
