This repository contains the code and data for the paper "Do Large Language Models Understand Word Senses?" accepted at the EMNLP 2025 main conference.
In our paper, we investigate whether Large Language Models truly understand word senses in context. We evaluate a wide range of models on classic Word Sense Disambiguation benchmarks and novel generative settings, showing that top LLMs match state-of-the-art systems in WSD and achieve up to 98% accuracy in free-form sense explanation tasks.
If you find our paper, code, or framework useful, please cite this work:
@inproceedings{meconi-etal-2025-large,
title = "Do Large Language Models Understand Word Senses?",
author = "Meconi, Domenico and
Stirpe, Simone and
Martelli, Federico and
Lavalle, Leonardo and
Navigli, Roberto",
editor = "Christodoulopoulos, Christos and
Chakraborty, Tanmoy and
Rose, Carolyn and
Peng, Violet",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.emnlp-main.1720/",
pages = "33885--33904",
ISBN = "979-8-89176-332-6"
}
Clone the repository:
git clone https://github.com/Babelscape/LLM-WSD.git
cd LLM-WSD
Download the datasets:
pip install gdown
gdown 110XFfCq93zTGQHr65lNsOXzn-KXDFXdb -O LLM-WSD-datasets.zip
unzip LLM-WSD-datasets.zip
rm LLM-WSD-datasets.zip
We suggest installing conda and then creating a new environment with this command:
conda create -n llm-wsd python=3.11
Then, activate the newly-created environment and install the required libraries:
conda activate llm-wsd
pip install -r requirements.txt
Optionally, add your API keys to the .env file, plus the path to your saved models if you run on an HPC cluster:
OPENAI_API_KEY=your-openai-key-here # if you need to evaluate gpt
DEEPSEEK_KEY=your-deepseek-key-here # if you need to evaluate deepseek
HPC_PATH=path-to-your-hpc-saved-models # if you run on HPC
The repository is organized as follows:
LLM-WSD/
├── data/
│ ├── development/ # Development datasets
│ └── evaluation/ # Evaluation datasets
├── src/
│ ├── disambiguate.py # Main WSD evaluation script
│ ├── score.py # Results evaluation and metrics
│ ├── generate_dataset_from_xml.py # Dataset preprocessing
│ ├── utils.py # Core utilities and functions
│ ├── variables.py # Model configs and prompt templates
│ └── env.py # Environment variable loading
├── .env # Environment variables
├── requirements.txt # Python dependencies
└── README.md # This file
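The .env file listed above holds plain KEY=value pairs. As a minimal sketch of how such a file could be read with the standard library only, the snippet below parses those pairs and exports them to the process environment. It is an illustration under stated assumptions, not the code in src/env.py, which may use a different mechanism; the function names `parse_env_text` and `load_env_file` are hypothetical.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and full-line comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline note such as "# if you run on HPC".
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read the file if it exists and export values that are not already set."""
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    values = parse_env_text(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


if __name__ == "__main__":
    sample = (
        "OPENAI_API_KEY=your-openai-key-here # if you need to evaluate gpt\n"
        "HPC_PATH=path-to-your-hpc-saved-models # if you run on HPC\n"
    )
    print(parse_env_text(sample))
```

Using `os.environ.setdefault` means a variable already exported in the shell takes precedence over the file, which is a design choice of this sketch rather than documented behavior of the repository.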
Convert datasets that follow the XML format introduced by Raganato et al. (2017) to JSON:
python src/generate_dataset_from_xml.py \
--data_path path/to/your/dataset.data.xml \
--gold_path path/to/your/dataset.gold.key.txt \
--highlight_target \
--shuffle_candidates
The last three flags are optional: pass --gold_path if you have a gold key file, --highlight_target to highlight the target word, and --shuffle_candidates to create a dataset with candidates in random order.
Run the selection subtask:
python src/disambiguate.py \
--subtask selection \
--approach {zero_shot|one_shot|few_shot|perplexity} \
--shortcut_model_name model_name # see src/variables.py L22 for the supported models
Run the generation subtask:
python src/disambiguate.py \
--subtask generation \
--approach zero_shot \
--shortcut_model_name model_name \
--prompt_number {1|2|3}
Here --prompt_number selects the generation prompt: 1 (Definition Generation), 2 (Free-form Explanation) or 3 (Example Generation). See src/variables.py L22 for the supported models.
Score the results:
python src/score.py \
--approach {zero_shot|one_shot|few_shot|perplexity} \
--shortcut_model_name model_name \
--pos {ALL|NOUN|ADJ|VERB|ADV}
See src/variables.py L22 for the supported models.
Optional flags for the selection, generation, and scoring commands:
- --is_devel: Use SemEval-2007 development data
- --prompt_number: If --is_devel is selected, insert a number from 1 to 20
- --more_context: Extended context sentences
- --shuffle_candidates: Randomized definition order
- --hard: Challenging cases (hardEN dataset)
- --domain: Domain-specific evaluation (42D dataset)
- --custom_dataset_path "/path/to/your/preprocessed/dataset.json": If you want to test on your custom dataset
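When preparing a custom dataset, it can help to see what the Raganato et al. (2017) layout looks like once parsed: sentences made of `<wf>` context tokens and `<instance>` target tokens, plus a gold key file with one instance id and its sense keys per line. The sketch below reads both with the standard library. The output field names (`id`, `lemma`, `pos`, `target_index`, `sentence`, `gold`) are hypothetical and are not the JSON schema produced by src/generate_dataset_from_xml.py; the sample sentence and sense key are illustrative only.

```python
import json
import xml.etree.ElementTree as ET


def read_gold_keys(gold_text):
    """Map each instance id to its list of gold sense keys."""
    gold = {}
    for line in gold_text.splitlines():
        parts = line.split()
        if parts:
            gold[parts[0]] = parts[1:]
    return gold


def xml_to_records(xml_text, gold=None):
    """Return one record per <instance> element in the corpus."""
    gold = gold or {}
    records = []
    for sentence in ET.fromstring(xml_text).iter("sentence"):
        tokens = [(tok.text or "").strip() for tok in sentence]
        for position, tok in enumerate(sentence):
            if tok.tag != "instance":
                continue  # plain <wf> tokens only provide context
            records.append({
                "id": tok.get("id"),
                "lemma": tok.get("lemma"),
                "pos": tok.get("pos"),
                "target_index": position,
                "sentence": " ".join(tokens),
                "gold": gold.get(tok.get("id"), []),
            })
    return records


# Illustrative input, not taken from any of the benchmark files.
SAMPLE_XML = """<corpus lang="en">
  <text id="d000">
    <sentence id="d000.s000">
      <wf lemma="she" pos="PRON">She</wf>
      <wf lemma="sit" pos="VERB">sat</wf>
      <wf lemma="on" pos="ADP">on</wf>
      <wf lemma="the" pos="DET">the</wf>
      <instance id="d000.s000.t000" lemma="bank" pos="NOUN">bank</instance>
    </sentence>
  </text>
</corpus>"""
SAMPLE_GOLD = "d000.s000.t000 bank%1:17:01::\n"

if __name__ == "__main__":
    records = xml_to_records(SAMPLE_XML, read_gold_keys(SAMPLE_GOLD))
    print(json.dumps(records, indent=2))
```

Instances without an entry in the gold file get an empty `gold` list here, mirroring the fact that the gold key file is optional in the conversion command.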
This work is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
We welcome contributions! Please feel free to:
- Report bugs and issues
- Suggest new features or improvements
For major changes, please open an issue first to discuss what you would like to change.
For questions about this research, please contact:
- Domenico Meconi: meconi@babelscape.com
- Roberto Navigli: navigli@babelscape.com
