add paddle nv-embed-v1 #8785

Li-Z-Q · 2024-07-19T08:25:14Z

PR types

New features

PR changes

Models

Description

Add the Paddle NV-Embed-v1 embedding model and integrate it into the MTEB evaluation framework.

paddle-bot · 2024-07-19T08:25:19Z

Thanks for your contribution!

codecov · 2024-07-19T08:56:32Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 55.58%. Comparing base (57000fa) to head (6815387).
Report is 9 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #8785      +/-   ##
===========================================
+ Coverage    55.44%   55.58%   +0.14%     
===========================================
  Files          626      630       +4     
  Lines        98065    98382     +317     
===========================================
+ Hits         54368    54683     +315     
- Misses       43697    43699       +2

sijunhe · 2024-07-24T13:49:57Z

legacy/pipelines/examples/contrastive_training/evaluation/mteb/eval_mteb.py

+        hf_model = LoRAModel.from_pretrained(base_model, peft_model_name, lora_config=lora_config, dtype="bfloat16")
+        return hf_model


hf_model -> model

sijunhe · 2024-07-24T13:54:27Z

legacy/pipelines/examples/contrastive_training/evaluation/mteb/mteb_models_nv.py

+        self.cross_attend_blocks_0_fn_to_kv = paddle.nn.Linear(in_features=4096, out_features=65536, bias_attr=False)
+        self.cross_attend_blocks_0_fn_to_out = paddle.nn.Linear(in_features=32768, out_features=4096, bias_attr=False)
+        self.cross_attend_blocks_0_fn_to_q = paddle.nn.Linear(in_features=4096, out_features=32768, bias_attr=False)
+        self.cross_attend_blocks_0_norm = paddle.nn.LayerNorm(4096)
+        self.cross_attend_blocks_0_norm_context = paddle.nn.LayerNorm(4096)
+
+        self.cross_attend_blocks_1_fn_net_0 = paddle.nn.Linear(in_features=4096, out_features=32768)
+        self.cross_attend_blocks_1_fn_net_2 = paddle.nn.Linear(in_features=16384, out_features=4096)
+        self.cross_attend_blocks_1_norm = paddle.nn.LayerNorm(4096)


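Why are the parameter sizes in this model hard-coded? Since it is based on Mistral, they should come from the sizes in MistralConfig.

Changed: the sizes now use the parameter values from config.json.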
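To illustrate the change requested above, here is a minimal sketch of deriving the layer shapes from configuration values instead of literals. The field names (`hidden_size`, `num_cross_heads`, `cross_dim_head`, `ff_mult`) and the 8 x 4096 factorization are assumptions chosen so that the results match the hard-coded numbers in the diff; they are not taken from the PR's actual config.json. The factor of 2 on the feed-forward input is also an assumption, inferred from the first layer emitting 32768 features while the second consumes 16384.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatentAttentionDims:
    """Hypothetical config fields; names are illustrative, not the PR's."""

    hidden_size: int = 4096     # model width, e.g. read from the Mistral config
    num_cross_heads: int = 8    # assumed number of cross-attention heads
    cross_dim_head: int = 4096  # assumed width of each head
    ff_mult: int = 4            # assumed feed-forward expansion factor

    @property
    def inner_dim(self) -> int:
        # Total width of the attention projections across all heads.
        return self.num_cross_heads * self.cross_dim_head

    def linear_shapes(self) -> dict:
        """Return (in_features, out_features) for each projection."""
        ff_hidden = self.hidden_size * self.ff_mult
        return {
            "to_q": (self.hidden_size, self.inner_dim),
            # Keys and values share one projection, hence twice the width.
            "to_kv": (self.hidden_size, 2 * self.inner_dim),
            "to_out": (self.inner_dim, self.hidden_size),
            # Assumed gated activation: half of this output gates the other half.
            "ff_in": (self.hidden_size, 2 * ff_hidden),
            "ff_out": (ff_hidden, self.hidden_size),
        }


shapes = LatentAttentionDims().linear_shapes()
for name, (n_in, n_out) in shapes.items():
    print(f"{name}: in_features={n_in}, out_features={n_out}")
```

Each pair could then be passed to `paddle.nn.Linear(in_features=..., out_features=...)`, so that a model with a different width only needs a different config.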

sijunhe · 2024-07-24T14:00:56Z

legacy/pipelines/examples/contrastive_training/README.md

+```
+export CUDA_VISIBLE_DEVICES=0
+python eval_mteb.py \
+       --base_model_name_or_path NV-Embed-v1 \


How was this NV-Embed-v1 obtained? Was it converted from torch?

Yes. The file Teacher Lu sent you contains the Paddle version of the NV-Embed-v1 model weights, converted from torch.

sijunhe

lgtm

add paddle nv-embed-v1

f9b61a6

paddle-bot bot added the contributor label Jul 19, 2024

paddle-bot bot assigned ZHUI Jul 19, 2024

sijunhe reviewed Jul 24, 2024

rename hf_model and use config in models

6815387

sijunhe approved these changes Jul 28, 2024

sijunhe merged commit ee4944e into PaddlePaddle:develop Jul 28, 2024
11 of 12 checks passed

sijunhe added the Beijing Innovation Consortium label Jul 28, 2024
