MoverScore-Modern

MoverScore-Modern is a refactored and modernized implementation of the MoverScore metric. This repository addresses several outdated aspects of the original 2019 codebase, ensuring compatibility with modern Python environments, diverse hardware architectures, and current deep learning libraries.

This version was developed to support the HyTE (Hybrid Triage Evaluation) framework for the evaluation of meteorological narratives at the Western Norway University of Applied Sciences (HVL).

Motivation

The original MoverScore implementation remains a highly effective semantic metric. However, the legacy codebase presents several barriers for modern researchers:

Hardcoded CUDA requirements that prevent execution on CPU-only environments or Apple Silicon (MPS).
Incompatibility with Python 3.10+ and recent NumPy releases, caused by removed NumPy type aliases and removed collections aliases (now only available in collections.abc).
Outdated transformers API calls that trigger errors in recent versions.

Technical Modernizations

The following changes have been implemented in this version:

1. Hardware Autodetect

Removed hardcoded cuda:0 device selection. The script now utilizes torch.device to automatically detect the best available backend (CUDA, MPS, or CPU).
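The backend selection described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names pick_device_name and detect_device are hypothetical, and the CUDA, then MPS, then CPU priority order is an assumption.

```python
def pick_device_name(cuda_available: bool, mps_available: bool) -> str:
    """Return a backend name, preferring CUDA, then MPS, then CPU (assumed order)."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device():
    """Build a torch.device for the best backend found on this machine."""
    import torch  # imported here so the selection logic above has no hard dependency

    # torch.backends.mps is missing in older PyTorch builds, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return torch.device(pick_device_name(torch.cuda.is_available(), mps_ok))
```

Calling detect_device() once and passing the result to model.to(...) and to every tensor keeps the rest of the code free of device-specific branches.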

2. NumPy 2.0 and Python 3.10+ Compatibility

Fixed crashes caused by NumPy type aliases that were deprecated in NumPy 1.20 and removed in NumPy 1.24. All instances of np.float, np.int, and np.bool have been replaced with standard dtypes (e.g., np.float64).
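For readers porting other legacy code, the same kind of replacement can be approximated with a small standard-library helper. This is a sketch under assumptions: the function names are hypothetical, and the mapping to np.float64, np.int64, and np.bool_ is one reasonable choice rather than necessarily the one used in this repository.

```python
import re

# Assumed replacement targets; the built-in float/int/bool are also valid choices.
REPLACEMENTS = {"float": "np.float64", "int": "np.int64", "bool": "np.bool_"}

# The trailing \b keeps sized or suffixed names such as np.float64 or np.bool_ untouched.
ALIAS_PATTERN = re.compile(r"\bnp\.(float|int|bool)\b")


def find_removed_aliases(source: str) -> list:
    """Return (line_number, alias) pairs for every legacy alias found in the source text."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in ALIAS_PATTERN.finditer(line):
            hits.append((line_no, match.group(0)))
    return hits


def modernize(source: str) -> str:
    """Rewrite legacy aliases to the explicit dtypes listed in REPLACEMENTS."""
    return ALIAS_PATTERN.sub(lambda m: REPLACEMENTS[m.group(1)], source)
```

The helper only matches the np. prefix, so code that imports NumPy under another name would need a different pattern.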

3. Transformers API Updates

Updated get_bert_embedding to interface directly with last_hidden_state.
Implemented a lazy-loading singleton pattern for the model and tokenizer to optimize memory usage and avoid import-time initialization failures.
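A lazy-loading singleton that reads last_hidden_state might look roughly like the sketch below. It is illustrative only: the function names, the use of functools.lru_cache, and the distilbert-base-uncased checkpoint are assumptions, and the real get_bert_embedding may be structured differently.

```python
from functools import lru_cache

MODEL_NAME = "distilbert-base-uncased"  # assumed checkpoint, for illustration only


@lru_cache(maxsize=1)
def get_model_and_tokenizer():
    """Load the model and tokenizer on first use, then reuse the cached pair."""
    # Deferred import: importing this module never touches transformers or the network.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()  # inference only, disables dropout
    return model, tokenizer


def embed(sentences):
    """Return token embeddings and the attention mask for a list of sentences."""
    import torch

    model, tokenizer = get_model_and_tokenizer()
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**batch)
    # last_hidden_state has shape (batch, sequence_length, hidden_size).
    return outputs.last_hidden_state, batch["attention_mask"]
```

Because loading is deferred to the first call, a missing checkpoint or an offline machine causes a failure at scoring time rather than at import time.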

4. Robustness and Logging

Added granular logging controls to suppress telemetry and non-critical warnings from the Hugging Face hub and HTTPX libraries.
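One way to get this kind of quieting with the standard library is sketched below. The function name and the chosen log level are assumptions; the logger names and the HF_HUB_DISABLE_TELEMETRY variable are the ones those libraries use, but the repository may configure them differently.

```python
import logging
import os

# Loggers that tend to emit request-level or informational messages during model download.
NOISY_LOGGERS = ("httpx", "huggingface_hub", "transformers")


def quiet_third_party_logs(level: int = logging.ERROR) -> None:
    """Raise the threshold of noisy third-party loggers and opt out of hub telemetry."""
    # Has to be set before huggingface_hub is imported to take effect.
    os.environ.setdefault("HF_HUB_DISABLE_TELEMETRY", "1")
    for name in NOISY_LOGGERS:
        logging.getLogger(name).setLevel(level)
```

Calling quiet_third_party_logs() at the top of a script, before any Hugging Face import, leaves the application's own loggers untouched.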

Usage

Prerequisites

Python 3.9+
PyTorch 2.0+
Transformers
PyEMD

Basic Example

from moverscore_v2 import word_mover_score, get_idf_dict

# Example data
references = ["High pressure building over the North Sea."]
hypotheses = ["A high pressure system is developing in the North Sea."]

# Pre-calculate IDF dictionaries as required by the metric
idf_dict_ref = get_idf_dict(references)
idf_dict_hyp = get_idf_dict(hypotheses)

# Compute scores
scores = word_mover_score(references, hypotheses, idf_dict_ref, idf_dict_hyp)
print(f"MoverScore: {scores[0]}")
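To give an intuition for what the IDF dictionaries contain, here is a simplified stand-in. It assumes a smoothed log inverse document frequency and uses whitespace tokens in place of the model's subword tokens; toy_idf_dict is a hypothetical name, and the real get_idf_dict may differ in both tokenization and formula.

```python
import math
from collections import Counter, defaultdict


def toy_idf_dict(documents):
    """Map each token to log((N + 1) / (df + 1)), where df is its document frequency."""
    num_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each token once per document, regardless of repetitions.
        doc_freq.update(set(doc.lower().split()))
    # Tokens never seen in the corpus get the largest weight, log(N + 1).
    idf = defaultdict(lambda: math.log(num_docs + 1))
    idf.update({tok: math.log((num_docs + 1) / (df + 1)) for tok, df in doc_freq.items()})
    return idf
```

Under this weighting, tokens that appear in every document receive a weight of zero, while rare tokens carry more weight in the comparison.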

Academic Attribution

This implementation is based on the research presented at EMNLP 2019. If you use this metric in your research, please cite the original authors below. Original Paper:

@inproceedings{zhao2019moverscore,
  title = {MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance},
  month = {August},
  year = {2019},
  author = {Wei Zhao and Maxime Peyrard and Fei Liu and Yang Gao and Christian M. Meyer and Steffen Eger},
  address = {Hong Kong, China},
  publisher = {Association for Computational Linguistics},
  booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing},
}

Original Repository: AIPHES/emnlp19-moverscore

Maintenance

This refactored version is maintained by Philipp Stahlberg as part of a Master's Thesis project at the Western Norway University of Applied Sciences. It is provided "as-is" to assist the community in running MoverScore on modern stacks.
