
- Beijing, China
- https://sprinter1999.github.io/website/
Lists (30)
⏩Acceleration
AIGC
🎵Audio
🚗Auto-drive & 🤖Embodied
Awesome List
🏫Course
🐱CV
🏘FL
Foundation Model
🌐Graph
✨ Inspiration
👾Interesting
📚IR
iris
LT-experiments
MLsys
🎶Multi-modal
My track
📕NLP
🏴OOD&OD
💻RecSys
🤖RL
Security
Self-Supervised
Series learning
Tabular
Transfer Learning
👻Util
⚡Workspace
Σ Math Inspired
Starred repositories
A curated collection of research and techniques for protecting intellectual property of large language models, including watermarking, fingerprinting, and more.
[EMNLP 2025] UniFilter: A Unified Multimodal Data Quality Classifier for High-Quality Image-Text Interleaved and Caption Data Curation
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A paper list of recent works on token compression for ViT and VLM
Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)
A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.
[ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
[RelKD'24] Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models
[KDD'25] Flow Matching for Collaborative Filtering
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
GRID: Generative Recommendation with Semantic IDs
Efficient vision foundation models for high-resolution generation and perception.
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Wan: Open and Advanced Large-Scale Video Generative Models
A repo for open research on building large reasoning models
[ICML'25] EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling.
Official implementation of Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
Efficient Reasoning Vision Language Models
Imitation learning algorithms with Co-training for Mobile ALOHA: ACT, Diffusion Policy, VINN
PyTorch Lightning implementation of Generative Recommenders
Official implementation of "Traceable Evidence Enhanced Visual Grounded Reasoning: Evaluation and Methodology"
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
[CVPR 2025] "DiC: Rethinking Conv3x3 Designs in Diffusion Models", a performant & speedy Conv3x3 diffusion model.
[CVPR 2025] HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation