Official implementation of "Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection"

Python 48 Updated Jan 20, 2025

Official implementation of "Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection"

Python 18 Updated Jan 20, 2025
27 1 Updated Sep 27, 2024

[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Python 212 9 Updated Dec 22, 2024

DeepSpeed tutorials, annotated examples, and study notes (efficient training of large models)

Python 147 1 Updated Sep 7, 2023

Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek3, ...) and 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…

Python 5,146 445 Updated Jan 24, 2025

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,829 124 Updated Oct 30, 2024

Official Code for 'TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction'

53 Updated Dec 26, 2024

Official PyTorch Code for "ATPrompt: Textual Prompt Learning with Embedded Attributes"

Python 18 Updated Dec 23, 2024

PyTorch Implementation of "V* : Guided Visual Search as a Core Mechanism in Multimodal LLMs"

Python 559 38 Updated Jan 7, 2024

A Survey on Benchmarks of Multimodal Large Language Models

83 6 Updated Jan 2, 2025
15 Updated Dec 11, 2024

Official repository of the paper "MaskCLIP++: A Mask-Based CLIP Fine-tuning Framework for Open-Vocabulary Image Segmentation"

Python 17 1 Updated Jan 17, 2025

Official implementation of ICML 2024 Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation

Python 46 3 Updated Aug 15, 2024

Official code for "SRFormer: Permuted Self-Attention for Single Image Super-Resolution" (ICCV 2023) and SRFormerV2

Python 251 22 Updated Aug 18, 2024
Python 171 13 Updated Jan 2, 2025
Python 90 15 Updated Dec 21, 2024

[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"

Python 783 42 Updated Aug 5, 2024

Official code implementation of Vary-toy (Small Language Model Meets with Reinforced Vision Vocabulary)

Python 613 46 Updated Dec 30, 2024

[ECCV 2024] Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation

Python 28 2 Updated Nov 1, 2024

EVA Series: Visual Representation Fantasies from BAAI

Python 2,397 173 Updated Aug 1, 2024

The official VOT Challenge evaluation and analysis toolkit

Python 173 48 Updated Dec 18, 2024

[ECCV'18] Long-term Tracking in the Wild: A Benchmark

Python 180 37 Updated Dec 26, 2019

OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,214 44 Updated Dec 11, 2024

🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.

14,114 1,416 Updated Feb 13, 2023

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 38,473 5,047 Updated Jan 23, 2025

Visual tracking library based on PyTorch.

Python 3,308 608 Updated Aug 8, 2024

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 10,187 991 Updated Nov 18, 2024

Accepted as [NeurIPS 2024] Spotlight Presentation Paper

Jupyter Notebook 6,139 616 Updated Sep 26, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 503 31 Updated May 8, 2024