This unguided research project served as the final paper for the Natural Language Processing graduate course at UT Austin. My partner and I sought to contribute to the field of large language models by attacking an ELECTRA-small model with adversarial data that mimics the imperfect inputs of a production environment. We then analyzed the errors these adversarial attacks produced and proposed methods for improving the robustness and safety of consumer-facing LLMs.
To learn more, please read the full report, whose abstract is reproduced below. Unfortunately, because this was an assignment for an active class at the University of Texas, sharing the project files would breach the Academic Honesty agreement, but the paper should include enough detail for an interested party to recreate the work.
Question answering is a popular NLP task, driven in part by popular interest in commercializing recent advances in LLMs; however, the excellent performance of these models on common academic QA benchmarks does not always transfer cleanly to industrial contexts (Ribeiro et al. 2020). One egregious example is that seemingly innocuous changes to the input (e.g., a typo or a missing word) can drastically reduce performance (Gardner et al. 2020). Such model “blind spots” are commonly referred to as dataset artifacts. In this paper we first identify some dataset artifacts that approximate the data imperfections and difficulties these models might encounter when launched into commercial production. Then we explore methods to mitigate those artifacts during the fine-tuning process of an ELECTRA transformer model on the SQuAD QA benchmark (Clark et al. 2020; Rajpurkar et al. 2016).
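To make the kind of input perturbation described above concrete, here is a minimal sketch of the two perturbations the abstract mentions, a typo and a dropped word, applied to a question string. This is an illustrative example only, not the project's actual attack code; the function names and the sample question are my own.

```python
import random


def inject_typo(text: str, rng: random.Random) -> str:
    """Swap two adjacent characters in a random word to simulate a typo."""
    words = text.split()
    candidates = [i for i, w in enumerate(words) if len(w) > 3]
    if not candidates:
        return text
    i = rng.choice(candidates)
    w = words[i]
    j = rng.randrange(len(w) - 1)
    words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)


def drop_word(text: str, rng: random.Random) -> str:
    """Delete one randomly chosen word to simulate a missing word."""
    words = text.split()
    if len(words) < 2:
        return text
    del words[rng.randrange(len(words))]
    return " ".join(words)


if __name__ == "__main__":
    rng = random.Random(0)
    question = "What is the capital city of Texas?"  # hypothetical SQuAD-style question
    print(inject_typo(question, rng))
    print(drop_word(question, rng))
```

Feeding perturbed questions like these to a fine-tuned QA model and comparing its answers against the originals is one simple way to surface the "blind spots" the paper investigates.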