Introduction

This repository contains the data and source code for the EACL 2023 paper "An Empirical Study of Clinical Note Generation from Doctor-Patient Encounters":

- An Empirical Study of Clinical Note Generation from Doctor-Patient Encounters. 
- Asma Ben Abacha, Wen-wai Yim, Yadan Fan and Thomas Lin. 
- EACL, May 3-5, 2023, Dubrovnik, Croatia. 

    @inproceedings{mts-dialog,
        title = "An Empirical Study of Clinical Note Generation from Doctor-Patient Encounters",
        author = "Ben Abacha, Asma and
          Yim, Wen-wai and
          Fan, Yadan and
          Lin, Thomas",
        booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics",
        month = may,
        year = "2023",
        address = "Dubrovnik, Croatia",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2023.eacl-main.168",
        pages = "2291--2302"
    }

Datasets, Code & Annotations

Main Dataset

The MTS-Dialog dataset is a new collection of 1.7k short doctor-patient conversations and corresponding summaries (section headers and contents).
  • The training set consists of 1,201 pairs of conversations and associated summaries.

  • The validation set consists of 100 pairs of conversations and their summaries.

  • MTS-Dialog includes two test sets, each consisting of 200 conversations with their associated section headers and contents.

The full list of normalized section headers:

    1. fam/sochx [FAMILY HISTORY/SOCIAL HISTORY]
    2. genhx [HISTORY OF PRESENT ILLNESS]
    3. pastmedicalhx [PAST MEDICAL HISTORY]
    4. cc [CHIEF COMPLAINT]
    5. pastsurgical [PAST SURGICAL HISTORY]
    6. allergy
    7. ros [REVIEW OF SYSTEMS]
    8. medications
    9. assessment
    10. exam
    11. diagnosis
    12. disposition
    13. plan
    14. edcourse [EMERGENCY DEPARTMENT COURSE]
    15. immunizations
    16. imaging
    17. gynhx [GYNECOLOGIC HISTORY]
    18. procedures
    19. other_history
    20. labs
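For scripts that need to validate or display section headers, the list above can be captured as a lookup table. This is a minimal sketch rather than code from the repository: the names `SECTION_HEADERS` and `describe_header` are hypothetical, and headers that have no bracketed expansion in the list are mapped to `None`.

```python
from typing import Dict, Optional

# Normalized section headers and their expansions, copied from the list above.
# Headers listed without a bracketed expansion map to None.
# SECTION_HEADERS and describe_header are illustrative names (assumptions).
SECTION_HEADERS: Dict[str, Optional[str]] = {
    "fam/sochx": "FAMILY HISTORY/SOCIAL HISTORY",
    "genhx": "HISTORY OF PRESENT ILLNESS",
    "pastmedicalhx": "PAST MEDICAL HISTORY",
    "cc": "CHIEF COMPLAINT",
    "pastsurgical": "PAST SURGICAL HISTORY",
    "allergy": None,
    "ros": "REVIEW OF SYSTEMS",
    "medications": None,
    "assessment": None,
    "exam": None,
    "diagnosis": None,
    "disposition": None,
    "plan": None,
    "edcourse": "EMERGENCY DEPARTMENT COURSE",
    "immunizations": None,
    "imaging": None,
    "gynhx": "GYNECOLOGIC HISTORY",
    "procedures": None,
    "other_history": None,
    "labs": None,
}


def describe_header(header: str) -> str:
    """Return a readable name for a normalized section header.

    Lookup ignores case and surrounding whitespace. Headers without an
    expansion are returned in upper case; unknown headers raise KeyError.
    """
    key = header.strip().lower()
    if key not in SECTION_HEADERS:
        raise KeyError(f"unknown section header: {header!r}")
    expansion = SECTION_HEADERS[key]
    return expansion if expansion is not None else key.upper()
```

For example, `describe_header("cc")` returns `"CHIEF COMPLAINT"`, while an unrecognized header raises `KeyError` so that labelling errors surface early.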

Augmented dataset

The augmented dataset consists of 3.6k pairs of medical conversations and associated summaries. It was created from the original 1.2k training pairs via back-translation through two pivot languages, French and Spanish, as described in the paper (cf. Section 4.2).

We provide the full augmented training set that we used in the experiments, as well as the separate datasets created using the French and Spanish translation models.
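The back-translation idea can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the paper: `translate` is a hypothetical callable standing in for whatever translation model is available, the pair layout `(conversation, summary)` is assumed, and back-translating both the conversation and the summary is also an assumption. Keeping the originals and adding one copy per pivot language is what turns N pairs into 3N, consistent with 1.2k growing to 3.6k.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical translator signature: translate(text, source_lang, target_lang).
Translator = Callable[[str, str, str], str]
Pair = Tuple[str, str]  # (conversation, summary); assumed layout


def back_translate(text: str, pivot: str, translate: Translator) -> str:
    """Translate English text to the pivot language and back to English."""
    forward = translate(text, "en", pivot)
    return translate(forward, pivot, "en")


def augment(
    pairs: Sequence[Pair],
    translate: Translator,
    pivots: Sequence[str] = ("fr", "es"),
) -> List[Pair]:
    """Return the original pairs plus one back-translated copy per pivot."""
    augmented: List[Pair] = list(pairs)
    for pivot in pivots:
        for conversation, summary in pairs:
            augmented.append(
                (
                    back_translate(conversation, pivot, translate),
                    back_translate(summary, pivot, translate),
                )
            )
    return augmented


def identity_translate(text: str, source_lang: str, target_lang: str) -> str:
    """Placeholder translator that returns its input unchanged (demo only)."""
    return text


# Invented example pair, not taken from the dataset.
demo_pairs = [("Doctor: How are you? Patient: Fine.", "Patient feels fine.")]
demo_augmented = augment(demo_pairs, identity_translate)
```

With the placeholder translator the added copies are identical to the originals; a real translation model would produce paraphrases instead.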

Source Code

The repository includes the source code for summarizing doctor-patient conversations and automatically generating clinical notes.

Manual Scores for Correlation Study

  • Manual fact-based scores for 400 automatic summaries, generated by four summarization models on the validation set of 100 conversations and notes.

  • The factual precision, recall, and F1 (P/R/F1) scores, the hallucination and omission rates, and the Levenshtein edit distance are computed from the fact-based manual counts and corrections.

  • We used the manual scores to evaluate several automatic evaluation metrics (e.g., ROUGE, BERTScore, and BLEURT) by computing the Pearson correlation coefficients between the automatic and manual scores, as described in the paper (cf. Section 5.2 and Section 5.3).

  • We provide all the data needed to perform this correlation study on other evaluation metrics.
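To run the same kind of correlation study on another metric, the core computation is a Pearson correlation between per-summary metric scores and the manual scores. The sketch below uses only the standard library. The function name `pearson` and the example values are illustrative assumptions; how the scores are loaded from the provided files is left out because it depends on the file layout.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up placeholder values, NOT scores from this repository.
manual_scores = [0.2, 0.4, 0.6, 0.8]   # e.g. manual factual F1 per summary
metric_scores = [0.1, 0.3, 0.5, 0.7]   # e.g. an automatic metric per summary
r = pearson(metric_scores, manual_scores)
```

A coefficient near 1 means the automatic metric ranks and scales summaries much like the manual scores; a value near 0 means it carries little linear information about them.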

Challenges & Evaluation Scripts

License

Contact

- Asma Ben Abacha (abenabacha at microsoft dot com)
- Wen-wai Yim (yimwenwai at microsoft dot com)