
Refine model file download for python backend #526

Open · wants to merge 4 commits into base: main
Conversation

kaixuanliu
Contributor

For models like jinaai/jina-embeddings-v2-base-code, the Rust side downloads 2 separate directories, models--jinaai--jina-embeddings-v2-base-code/ and models--jinaai--jina-bert-v2-qk-post-norm/, while the existing implementation only passes 1 Path to the Python backend. This PR refines the model file download so that the models below can run using the Python backend:
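For context, the second directory comes from the model's `auto_map` entries in config.json, which can point at code in another repository using the `org/repo--module.Class` syntax. A minimal sketch of detecting such cross-repo references (the config values here are illustrative, modeled on jinaai/jina-embeddings-v2-base-code; this is not the PR's actual code):

```python
# Illustrative sketch: find external repos referenced by a config's auto_map.
config = {
    "auto_map": {
        "AutoModel": "jinaai/jina-bert-v2-qk-post-norm--modeling_bert.JinaBertModel",
    }
}

def external_repos(cfg):
    repos = set()
    for target in cfg.get("auto_map", {}).values():
        # Cross-repo entries use the "org/repo--module.Class" form;
        # same-repo entries are just "module.Class" with no "--".
        if "--" in target:
            repos.add(target.split("--")[0])
    return sorted(repos)

print(external_repos(config))  # ['jinaai/jina-bert-v2-qk-post-norm']
```

Each external repo found this way would need its own snapshot directory in the cache, which is why a single Path is not enough for such models.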

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu
Contributor Author

@regisss @Narsil please help review

@Narsil
Collaborator

Narsil commented Mar 26, 2025

I'm sorry, but I do not understand: how does it download 2 models? I cannot reproduce this on my side.

Can you provide steps to reproduce? It seems the candle backend properly downloads only 1 model.

Propagating model_id is not desirable a priori; resolving the model id first and then using only a path usually makes debugging much simpler.
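The resolve-then-path flow described here can be sketched as follows. This is an illustrative sketch, not TEI's actual code; the cache-directory naming mirrors the huggingface_hub layout visible in the directory names above (in real code the resolution step would be a call to huggingface_hub.snapshot_download):

```python
from pathlib import Path

def cache_dir_name(repo_id: str) -> str:
    # huggingface_hub cache layout names repo directories "models--{org}--{name}"
    return "models--" + repo_id.replace("/", "--")

def resolve(model_id: str, cache_root: str = "/data") -> Path:
    # Illustrative only: compute the top-level cache directory the
    # download would land in, then hand just this path to the backend.
    return Path(cache_root) / cache_dir_name(model_id)

print(resolve("jinaai/jina-embeddings-v2-base-code"))
# /data/models--jinaai--jina-embeddings-v2-base-code
```

The debugging benefit is that the backend only ever sees a concrete filesystem path, never an unresolved model id.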

@kaixuanliu
Contributor Author

Steps to reproduce on CPU:

  1. Build the docker container: docker build --build-arg PLATFORM="cpu" -f Dockerfile-intel -t tei_cpu .
  2. Start the backend:
model='jinaai/jina-embeddings-v2-base-code'
volume=$PWD/data
docker run -p 8080:80 -v $volume:/data -e TRUST_REMOTE_CODE=True tei_cpu --model-id $model

@Narsil
Collaborator

Narsil commented Mar 26, 2025

I can now reproduce. The trust-remote-code behavior of this model is surprising to me; looking into what's going on.

@Narsil
Collaborator

Narsil commented Mar 27, 2025

Thanks for surfacing this. Discussing internally, we figured there are some security implications to this behavior, which we're most likely going to close, so this behavior will go away (and force every repo to own its own remote code, so that trust_remote_code cannot be abused as much).

Before we go through with this, we're trying to figure out the implications. Are you aware of other models with that behavior?

@alvarobartt
Member

alvarobartt commented Mar 27, 2025

Before we're going through with this, we're trying to figure out the implications, are you aware of other models with that behavior ?

I found that https://huggingface.co/mims-harvard/ToolRAG-T1-GTE-Qwen2-1.5B uses auto_map to reference code in another repository too.

The following filter may not be fully accurate, but it may help: https://huggingface.co/models?other=custom_code,text-embeddings-inference&sort=trending

@kaixuanliu
Contributor Author

I have not encountered other models with this unexpected behavior. But the three models listed below are in the TEI README, and our customers are asking for support for them:
Alibaba-NLP/gte-Qwen2-7B-instruct
Alibaba-NLP/gte-Qwen2-1.5B-instruct
jinaai/jina-embeddings-v2-base-code
