This document provides an overview of the reviewed research papers, organized by research field. Each section opens with a brief description and includes a table listing each paper's review date, publication venue, publication year, and a link to the presentation.
This section covers foundational work on attention mechanisms, pretraining strategies, and unified text-to-text frameworks that have significantly influenced NLP.
| # | Paper Title | Review Date | Conference / Venue | Publication Year | Link |
|---|---|---|---|---|---|
| 1 | Neural Machine Translation by Jointly Learning to Align and Translate (Seq2Seq with Attention) | 2024.06.29 | ICLR | 2015 | Link |
| 2 | Attention Is All You Need | 2025.01.17 | NeurIPS | 2017 | Link |
| 3 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 2025.01.21 | NAACL | 2019 | Link |
| 4 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 2025.01.25 | JMLR | 2020 | Link |
| 5 | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | 2025.01.30 | ACL | 2020 | Link |
This section covers research on efficient and scalable deep learning architectures, ranging from sparsely-gated mixture-of-experts layers to quantization-aware parameter-efficient fine-tuning.
| # | Paper Title | Review Date | Conference / Venue | Publication Year | Link |
|---|---|---|---|---|---|
| 1 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | 2025.02.11 | ICLR | 2017 | Link |
| 2 | LoftQ: LoRA-Fine-Tuning-Aware Quantization | 2025.02.19 | ICLR | 2024 | Link |
This section covers reinforcement learning techniques for aligning language models with human preferences, along with studies of reasoning-focused model architectures and their applications to complex reasoning tasks.
| # | Paper Title | Review Date | Conference / Venue | Publication Year | Link |
|---|---|---|---|---|---|
| 1 | Direct Preference Optimization: Your Language Model is Secretly a Reward Model | 2025.02.25 | NeurIPS | 2023 | Link |
| 2 | Reasoning Model | 2025.04.02 | - | - | Link |
| 3 | DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning | 2025.05.02 | DeepSeek | 2025 | Link |
| 4 | Training language models to follow instructions with human feedback | 2025.07.11 | NeurIPS | 2022 | Link |
| 5 | OpenAI o1 Model | 2025.09.17 | OpenAI | 2024 | Link |
This section provides an overview of research papers focusing on the Llama family of large language models developed by Meta AI.
| # | Paper Title | Review Date | Conference / Venue | Publication Year | Link |
|---|---|---|---|---|---|
| 1 | The Llama 3 Herd of Models | 2025.02.24 | Meta | 2024 | Link |
| 2 | Llama 3 Code Review | 2025.03.03 | Meta | 2024 | Link |
| 3 | The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation | 2025.04.07 | Meta | 2025 | Link |
Together, these sections provide a structured reference for the reviewed papers, highlighting key contributions and methodologies in pretrained transformer models, scalable architectures, reinforcement learning from human feedback, and the Llama family of models.