- The Hong Kong University of Science and Technology
- Hong Kong SAR
- http://chendelong.world/
- @Delong0_0
- in/chendelong
Stars
Official repo of paper "Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models" in NeurIPS 2024 Workshop on Foundation Model Interventions (MINT)
Official repository for paper "High-Dimension Human Value Representation in Large Language Models"
Multimodal Large Language Models for Remote Sensing (RS-MLLMs): A Survey
EntitySeg Toolbox: Towards Open-World and High-Quality Image Segmentation
Efficient vision foundation models for high-resolution generation and perception.
⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet). If you find it useful, please star this project, thanks~
This is the official code for the MobileSAM project that makes SAM lightweight for mobile applications and beyond!
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
A batched, offline-inference-oriented version of segment-anything
Unofficial edge detection implementation using the Automatic Mask Generation (AMG) of the Segment Anything Model (SAM).
The official repo of Qwen-VL (通义千问-VL), the chat & pretrained large vision-language model proposed by Alibaba Cloud.
🛰️ Official repository of paper "RemoteCLIP: A Vision Language Foundation Model for Remote Sensing" (IEEE TGRS)
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
🦩 Visual Instruction Tuning with Polite Flamingo - training multi-modal LLMs to be both clever and polite! (AAAI-24 Oral)
Collection of Remote Sensing Vision-Language Models
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
Code and documentation to train Stanford's Alpaca models, and generate the data.
Instruct-tune LLaMA on consumer hardware
Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]
A Benchmark for Efficient and Compositional Visual Reasoning
An open-source framework for training large multimodal models.