🎶Multi-modal
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image (see the minimal usage sketch after this list)
This repo hosts the code and model of "Separate What You Describe: Language-Queried Audio Source Separation", Interspeech 2022
[ICASSP 2023] FedAudio: A Federated Learning Benchmark for Audio and Speech Tasks
Code for the paper "Learning Audio-Visual Dereverberation"
Toolkits for Multimodal Emotion Recognition
[CVPR 2023] iQuery: Instruments as Queries for Audio-Visual Sound Separation
Using Segment-Anything and CLIP to generate pixel-aligned semantic features.
A toolkit for researchers working on multimodal sound separation.
Survey Paper List - Efficient LLM and Foundation Models
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Large World Model -- Modeling Text and Video with Million-Token Context
Code for the paper: "SuS-X: Training-Free Name-Only Transfer of Vision-Language Models" [ICCV'23]
DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency (AAAI 2024)
A high-quality tool for converting PDF to Markdown and JSON. A one-stop, open-source, high-quality data extraction tool that converts PDF into Markdown and JSON formats.
A data annotation toolbox that supports image, audio, and video data.
[ECCV 2024] The official implementation of "AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection"
[ACL 2024] Official resources of "ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models".
[NeurIPS 2024, spotlight] Scaling Out-of-Distribution Detection for Multiple Modalities
Learning Cross-Modal Retrieval with Noisy Labels (CVPR 2021, PyTorch Code)
[NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
Get your documents ready for gen AI
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
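For the CLIP entry above, here is a minimal sketch of the described usage (ranking candidate text snippets by relevance to an image). It assumes the Hugging Face `transformers` port of CLIP and a hypothetical local image file `example.jpg`; it is an illustration, not the repo's official example.

```python
# Minimal sketch (assumptions: Hugging Face `transformers` port of CLIP,
# a local file "example.jpg"); not the official example from the CLIP repo.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # hypothetical input image
texts = ["a dog playing in snow", "a plate of food", "a city skyline at night"]

# Encode the image and all candidate snippets in one batch
inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# logits_per_image holds image-to-text similarity scores; softmax ranks the snippets
probs = outputs.logits_per_image.softmax(dim=-1)
print(texts[probs.argmax().item()])  # most relevant snippet for the image
```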