- Nankai University
- Tianjin
- http://ajupytetr.blog.csdn.net
- https://www.yuque.com/ajupyter
- https://www.zhihu.com/people/grit-35-86/posts
Stars
[ACL'25] SocialEval: Evaluating Social Intelligence of Large Language Models
Trinity-RFT is a general-purpose, flexible, and scalable framework designed for reinforcement fine-tuning (RFT) of large language models (LLMs).
aJupyter / LLM-RLHF-Tuning
Forked from Joyce94/LLM-RLHF-Tuning
LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)
aJupyter / LLM-Tuning
Forked from beyondguo/LLM-Tuning
Tuning LLMs with no tears💦; Sample Design Engineering (SDE) for more efficient downstream-tuning.
Recent Advances on MLLM's Reasoning Ability
My learning notes/code for ML systems (ML SYS).
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools.
slime is an LLM post-training framework for RL scaling.
Official Repo for Open-Reasoner-Zero
A simple tech-explainer tutorial project, focused on explaining interesting, cutting-edge technical concepts and principles. Each article aims to be readable within 5 minutes.
[ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs
MMSearch-R1 is an end-to-end RL framework that enables LMMs to perform on-demand, multi-turn search with real-world multimodal search tools.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
A multimodal large model built from scratch, named Reyes (睿视): R for 睿 ("insight"), eyes for 眼 ("eyes"). Reyes has 8B parameters, uses InternViT-300M-448px-V2_5 as its vision encoder and Qwen2.5-7B-Instruct on the language-model side, and connects the vision encoder to the language model through a two-layer MLP projection layer (see the sketch after this list).
Minimal reproduction of DeepSeek R1-Zero
The simplest, fastest repository for training/finetuning small-sized VLMs.
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]
The code used to train and run inference with the ColVision models, e.g. ColPali, ColQwen2, and ColSmol.
A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v etc.
"A Cookbook for Open-Source LLMs" (开源大模型食用指南): a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying domestic and international open-source large language models (LLMs) / multimodal large models (MLLMs) in a Linux environment.
Automatically crawls arXiv papers daily, summarizes them using AI, and presents them via GitHub Pages.
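
The Reyes entry above describes the common LLaVA-style recipe: a vision encoder joined to an LLM through a small projector. Below is a minimal PyTorch sketch of such a two-layer MLP projector. The feature dimensions are assumptions for illustration (the description does not give the exact InternViT-300M or Qwen2.5-7B hidden sizes), not values taken from the Reyes codebase.

```python
import torch
import torch.nn as nn

VIT_DIM = 1024   # assumed vision-encoder output dim (InternViT-300M-class)
LLM_DIM = 3584   # assumed LLM hidden dim (Qwen2.5-7B-class)

class MLPProjector(nn.Module):
    """Two-layer MLP mapping vision patch features into the LLM's
    embedding space, in the style the Reyes description outlines."""
    def __init__(self, vit_dim: int = VIT_DIM, llm_dim: int = LLM_DIM):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vit_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, vision_feats: torch.Tensor) -> torch.Tensor:
        # vision_feats: (batch, num_patches, vit_dim)
        # returns:      (batch, num_patches, llm_dim), ready to be
        # concatenated with text token embeddings before the LLM.
        return self.proj(vision_feats)

# Usage: project dummy patch features and check the output shape.
feats = torch.randn(1, 256, VIT_DIM)
print(MLPProjector()(feats).shape)  # torch.Size([1, 256, 3584])
```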