An abbreviation expansion module based on n-gram matching, developed at the Institute for Medical Informatics, Statistics and Documentation at the Medical University of Graz (Austria).
Note that a more recent version of this work, which uses word embeddings for acronym expansion, is available at https://github.com/bst-mug/acres.
If you use the data or code from this repository in your work, please cite our MEDINFO 2017 paper:
@inproceedings{oleynik2017unsupervised,
author = {Michel Oleynik and
Markus Kreuzthaler and
Stefan Schulz},
editor = {Adi V. Gundlapalli and
Marie{-}Christine Jaulent and
Dongsheng Zhao},
title = {Unsupervised Abbreviation Expansion in Clinical Narratives},
booktitle = {{MEDINFO} 2017: Precision Healthcare through Informatics - Proceedings
of the 16th World Congress on Medical and Health Informatics, Hangzhou,
China, 21-25 August 2017},
series = {Studies in Health Technology and Informatics},
volume = {245},
pages = {539--543},
publisher = {{IOS} Press},
year = {2017},
url = {https://doi.org/10.3233/978-1-61499-830-3-539},
doi = {10.3233/978-1-61499-830-3-539}
}
- Tab-separated (TSV) bigram list generated from the corpus
- Tab-separated (TSV) unigram list generated from the corpus
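
As a rough illustration of how such TSV lists might be read, the sketch below loads n-gram counts into a dictionary-like structure. The column layout (count first, then the n-gram), the absence of a header row, and the function name `load_ngram_counts` are assumptions made for this example, not something this README specifies; check the actual files and adjust `count_first` accordingly.

```python
import csv
import io
from collections import Counter


def load_ngram_counts(handle, count_first=True):
    """Read tab-separated n-gram rows into a Counter keyed by the n-gram string.

    Assumes two columns per row and no header line (hypothetical layout).
    """
    counts = Counter()
    # QUOTE_NONE keeps quote characters in clinical text from being interpreted.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        count, ngram = (row[0], row[1]) if count_first else (row[1], row[0])
        counts[ngram] += int(count)
    return counts


# Made-up sample rows standing in for a real bigram file.
sample = io.StringIO("12\tatrial fibrillation\n3\tfibrillation with\n")
bigrams = load_ngram_counts(sample)
```

For a file on disk, the same function could be called with a handle from `open(path, encoding="utf-8", newline="")`, where `path` points to the bigram or unigram list.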