  • Peking University & Peng Cheng Laboratory
  • Shenzhen, China


A high-throughput and memory-efficient inference and serving engine for LLMs

Python 36,379 5,501 Updated Feb 6, 2025

Official implementation of "ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis"

Python 1,130 42 Updated Nov 6, 2024

[NeurIPS 2024] Mitigating Object Hallucination via Concentric Causal Attention

Python 47 1 Updated Dec 24, 2024

Python 188 15 Updated Dec 22, 2024

[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

Python 307 26 Updated Aug 24, 2024

[CVPRW 2024] TrafficVLM: A Controllable Visual Language Model for Traffic Video Captioning. Official code for the 3rd place solution of the AI City Challenge 2024 Track 2.

Python 30 3 Updated Jun 17, 2024

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

JavaScript 67,957 8,057 Updated Feb 5, 2025

Ollama Python library

Python 6,261 533 Updated Jan 28, 2025

[ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning

Python 347 29 Updated Sep 6, 2024

🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch and FLAX.

Python 27,387 5,619 Updated Feb 6, 2025

Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.

Go 120,093 9,611 Updated Feb 6, 2025

Python 3,336 302 Updated Oct 16, 2024

ProtTrans provides state-of-the-art pretrained language models for proteins. ProtTrans was trained on thousands of GPUs from Summit and hundreds of Google TPUs using Transformer models.

Jupyter Notebook 1,170 157 Updated Jan 22, 2025

A Flexible and Powerful Parameter Server for large-scale machine learning

Java 6,752 1,597 Updated Jan 16, 2024

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,124 1,299 Updated Jan 27, 2025

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 13,905 1,403 Updated Dec 25, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance

Python 6,932 529 Updated Dec 25, 2024

Ultralytics YOLO11 🚀

Python 36,081 6,952 Updated Feb 5, 2025

Official PyTorch Implementation for “DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video”

Python 458 44 Updated Nov 23, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,837 126 Updated Oct 30, 2024

Python 92 9 Updated Jul 6, 2024

Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from DeepMind, in PyTorch

Python 1,790 248 Updated Jul 15, 2024

Making large AI models cheaper, faster and more accessible

Python 39,042 4,363 Updated Feb 3, 2025

PyTorch code and models for the DINOv2 self-supervised learning method.

Jupyter Notebook 9,742 871 Updated Aug 7, 2024

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

Python 269 8 Updated Nov 13, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 5,384 416 Updated Aug 7, 2024

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

Python 16,521 1,363 Updated Feb 1, 2025

[CVPR 2024 Highlight] Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,693 134 Updated Jan 23, 2025

Radiology Report Generation with Frozen LLMs

Python 64 7 Updated Apr 19, 2024

A method to increase the speed and lower the memory footprint of existing vision transformers.

Python 1,003 72 Updated Jun 17, 2024