
Daixuan Cheng

  • I am a Ph.D. student at the Gaoling School of AI, Renmin University of China, where I am fortunate to be advised by Xin Zhao.

  • Since 2021, I have been a research student advised by Shaohan Huang and Furu Wei in the GenAI Group of Microsoft Research, with whom I have completed many of my representative works.

  • I was previously a research assistant in the CoAI Group at Tsinghua University, where I was fortunate to be advised by Yuxian Gu and Minlie Huang. I also worked as a research engineer at BIGAI, where I collaborated with Xuekai Zhu.

Recent Focus:
My current research focuses on Reinforcement Learning for LLM Reasoning, especially exploration mechanisms.
Check out our work: Reasoning with Exploration: An Entropy Perspective (AAAI 2026), FlowRL, and STILL.
Feel free to reach out if you are interested in collaboration or discussions!

Contact

Education

  • Ph.D. Student in Artificial Intelligence, Gaoling School of AI, Renmin University of China (2025 – Present)

  • M.S. in Computer Science, School of Computer Science, Beijing University of Posts and Telecommunications (2020 – 2023)

  • B.S. in Communication Engineering, School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (2016 – 2020)

Research Interests

I am dedicated to enhancing Large Language Models (LLMs) across their entire lifecycle, including:

Selected Papers

(Full list on Google Scholar)

  • Reasoning with Exploration: An Entropy Perspective
    Daixuan Cheng, Shaohan Huang, Xuekai Zhu, Bo Dai, Wayne Xin Zhao, Zhenliang Zhang, Furu Wei
    (AAAI 2026 — Earliest Research on Exploration of RL in LLM reasoning, Relation between Entropy and Exploration, Proposed Entropy Advantage, Significant Pass@K Gain) pdf

  • FlowRL: Matching Reward Distributions for LLM Reasoning
    Xuekai Zhu, Daixuan Cheng, Dinghuai Zhang, Hengli Li, Kaiyan Zhang, Che Jiang, Youbang Sun, Ermo Hua, Yuxin Zuo, Xingtai Lv, Qizheng Zhang, Lin Chen, Fanghao Shao, Bo Xue, Yunchong Song, Zhenjie Yang, Ganqu Cui, Ning Ding, Jianfeng Gao, Xiaodong Liu, Bowen Zhou, Hongyuan Mei, Zhouhan Lin
    (arXiv Preprint, 2025 — Exploration of RL in LLM reasoning, 🤗 #1 Paper of the Day, Recipe at VERL) pdf code

  • Adapting Large Language Models via Reading Comprehension
    Daixuan Cheng, Shaohan Huang, Furu Wei
    (ICLR 2024 — Earliest Research on Domain LLMs, 500K+ Downloads on Hugging Face, #1 Trending of ALL Domain LLMs on Huggingface, 🤗 #2 Paper of the Day) pdf code huggingface

  • Instruction Pre-Training: Language Models are Supervised Multitask Learners
    Daixuan Cheng, Yuxian Gu, Shaohan Huang, Junyu Bi, Minlie Huang, Furu Wei
    (EMNLP 2024 (Main, Long Paper) — LLM pre-training, Recommended by Sebastian Raschka, 200K+ Downloads on Hugging Face, #2 Trending of ALL Huggingface Datasets, 🤗 #2 Paper of the Day) pdf code

  • Uprise: Universal Prompt Retrieval for Improving Zero-Shot Evaluation
    Daixuan Cheng, Shaohan Huang, Junyu Bi, Yuefeng Zhan, Jianfeng Liu, Yujing Wang, Hao Sun, Furu Wei, Denvy Deng, Qi Zhang
    (EMNLP 2023 (Main, Long Paper) — Early Research on RAG for LLMs, Top ML Papers of the Week (along with GPT-4)) pdf code

  • On Domain-Adaptive Post-Training for Multimodal Large Language Models
    Daixuan Cheng, Shaohan Huang, Ziyu Zhu, Xintong Zhang, Wayne Xin Zhao, Zhongzhi Luan, Bo Dai, Zhenliang Zhang
    (EMNLP 2025 (Findings, Long Paper) — Earliest Research on Domain MLLMs) pdf code huggingface

  • How to Synthesize Text Data without Model Collapse?
    Xuekai Zhu, Daixuan Cheng, Hengli Li, Kaiyan Zhang, Ermo Hua, Xingtai Lv, Ning Ding, Zhouhan Lin, Zilong Zheng, Bowen Zhou
    (ICML 2025 — Synthetic data for LLMs) pdf code

  • VL-Match: Enhancing Vision-Language Pretraining with Token-Level and Instance-Level Matching
    Junyu Bi, Daixuan Cheng, Ping Yao, Bochen Pang, Yuefeng Zhan, Chuanguang Yang, Yujing Wang, Hao Sun, Weiwei Deng, Qi Zhang
    (ICCV 2023 — Pre-training of Vision-language Models) pdf

  • Snapshot-guided domain adaptation for ELECTRA
    Daixuan Cheng, Shaohan Huang, Jianfeng Liu, Yuefeng Zhan, Hao Sun, Furu Wei, Denvy Deng, Qi Zhang
    (EMNLP 2022 (Findings, Short Paper) — Domain Adaptation of LM) pdf

Honors & Awards

  • Outstanding Reviewer of EMNLP (Top 0.5%)
  • 1st Place in the PhD Entrance Exam (Preliminary) at the GSAI, Renmin University of China
  • National Scholarship for Master Students (Top 1%)
  • 1st Prize in the National English Competition (Top 0.5%)

Pinned

  1. microsoft/LMOps (Public)

    General technology for enabling AI capabilities w/ LLMs and MLLMs

    Python, 4.2k stars, 353 forks

  2. Xuekai-Zhu/FlowRL (Public)

    Python, 120 stars, 11 forks