Aligning Large Language Models

Repository of the papers:

Aligning Large and Small Language Models via Chain-of-Thought Reasoning (EACL2024)

Citation

@inproceedings{ranaldi-freitas-2024-aligning,
    title = "Aligning Large and Small Language Models via Chain-of-Thought Reasoning",
    author = "Ranaldi, Leonardo  and
      Freitas, Andre",
    editor = "Graham, Yvette  and
      Purver, Matthew",
    booktitle = "Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = mar,
    year = "2024",
    address = "St. Julian{'}s, Malta",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.eacl-long.109",
    pages = "1812--1827",
    abstract = "Chain-of-Thought (CoT) prompting empowers the reasoning abilities of Large Language Models (LLMs), eliciting them to solve complex reasoning tasks in a step-wise manner. However, these capabilities appear only in models with billions of parameters, which represent an entry barrier for many users who are constrained to operate on a smaller model scale, i.e., Small Language Models (SLMs). Although many companies are releasing LLMs of the same family with fewer parameters, these models tend not to preserve all the reasoning capabilities of the original models, including CoT reasoning. In this paper, we propose a method for aligning and transferring reasoning abilities between larger to smaller Language Models. By using an Instruction-tuning-CoT method, that is, an Instruction-tuning designed around CoT-Demonstrations, we enable the SLMs to generate multi-step controlled reasoned answers when they are elicited with the CoT mechanism. Hence, we instruct a smaller Language Model using outputs generated by more robust models belonging to the same family or not, evaluating the impact across different types of models. Results obtained on question-answering and mathematical reasoning benchmarks show that LMs instructed via the Instruction-tuning CoT method produced by LLMs outperform baselines within both in-domain and out-domain scenarios.",
}

Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models (EMNLP2024)

Citation

@misc{ranaldi2024selfrefineinstructiontuningaligningreasoning,
      title={Self-Refine Instruction-Tuning for Aligning Reasoning in Language Models}, 
      author={Leonardo Ranaldi and André Freitas},
      year={2024},
      eprint={2405.00402},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2405.00402}, 
}
