
LLM Model

Survey

LLM Models

  • TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters, arXiv, 2410.23168, arxiv, pdf, citation: -1

    Haiyang Wang, Yue Fan, Muhammad Ferjad Naeem, ..., Federico Tombari, Bernt Schiele · (TokenFormer - Haiyang-W)

  • MrT5: Dynamic Token Merging for Efficient Byte-level Language Models, arXiv, 2410.20771, arxiv, pdf, citation: -1

    Julie Kallini, Shikhar Murty, Christopher D. Manning, ..., Christopher Potts, Róbert Csordás

  • Scaling Diffusion Language Models via Adaptation from Autoregressive Models, arXiv, 2410.17891, arxiv, pdf, citation: -1

    Shansan Gong, Shivam Agarwal, Yizhe Zhang, ..., Hao Peng, Lingpeng Kong

    · (arxiv) · (DiffuLLaMA - HKUNLP) · (huggingface)

State Space Model

Projects

Misc