MTEB: Massive Text Embedding Benchmark
Updated Mar 5, 2026 - Python
[ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
Codebase for RetroMAE and beyond.
Code & data accompanying the KDD 2017 paper "KATE: K-Competitive Autoencoder for Text"
Efficient LLM inference on Slurm clusters.
Simple customizable evaluation for text retrieval performance of Sentence Transformers embedders on PDFs
Simple script to compute CLIP-based scores given a DALL-E trained model.
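StickerSelector is a sticker selection system based on semantic vector matching that lets an AI send stickers that genuinely fit the context of a chat. Unlike traditional approaches that rely on keywords or rules, StickerSelector semantically matches the AI-generated "sticker intent description" against a local sticker collection, choosing the sticker that is most natural in the current context and closest to what a real person would use. It suits scenarios that need "human-like expression", such as AI chatbots, automated chat on QQ or WeChat, and AI girlfriends.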
HSTU-BLaIR: Lightweight Contrastive Text Embedding for Generative Recommender 🌱
Simple script to re-rank images using OpenAI's CLIP https://github.com/openai/CLIP.
Topic Embedding, Text Generation and Modeling using diffusion
Flask API for generating text embeddings using OpenAI or sentence_transformers
Contextual embedding for text blobs.
Chatbot with LLM + RAG
PL-MTEB: Polish Massive Text Embedding Benchmark
Utilities for working with colors: cast color types, find the closest color using text embeddings, and find a color's complement.
Embeds text into a vector using pre-trained BERT word embeddings and pooling layers, for the purpose of measuring text similarity
CLI that helps with docs splitting, embedding and exposing them in a seamless manner
Automatic generation of descriptive radiological reports from X-RAY scans