DL4H-Automated-ICD-9-Coding

Author: Ryan Fogle

Overview

The project is split up into 7 parts.

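Prepare-data (01-prepare-data.ipynb)
SVM Training (02-SVM.ipynb)
Word2Vec Training (03-Word2Vec.ipynb)
Doc2Vec Training (04-Doc2Vec.ipynb)
DeepLabeler Training (05-DeepLabeler.ipynb)
DeepLabeler minus Doc2Vec Training (06-DeepLabeler-minus-d2v.ipynb)
DeepLabeler with Embedding Layer (07-Embedding-CNN.ipynb)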

These steps run sequentially, so please go through the notebooks one by one, in the order listed.
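If you prefer to run the notebooks from a script instead of opening each one, the ordering above could be sketched as follows. This is only an illustration: the notebooks appear to be written for interactive use, running them headlessly through `jupyter nbconvert` is an assumption, and the helper names `build_commands` and `run_all` are hypothetical. The notebooks also need access to the MIMIC-III data, which is not included in the repository.

```python
import shlex
import subprocess

# Notebook file names as they appear in the repository, in pipeline order.
NOTEBOOKS = [
    "01-prepare-data.ipynb",
    "02-SVM.ipynb",
    "03-Word2Vec.ipynb",
    "04-Doc2Vec.ipynb",
    "05-DeepLabeler.ipynb",
    "06-DeepLabeler-minus-d2v.ipynb",
    "07-Embedding-CNN.ipynb",
]


def build_commands(notebooks=NOTEBOOKS):
    """Return one `jupyter nbconvert` command per notebook, preserving order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb]
        for nb in notebooks
    ]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing notebook,
            # since later notebooks depend on earlier outputs.
            subprocess.run(cmd, check=True)


# Dry run by default: this only prints the commands that would be executed.
run_all()
```

Calling `run_all(dry_run=False)` would actually execute the notebooks. Training can take a long time, so a dry run first is a cheap way to confirm the order.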

Environment setup

This project was developed with Python 3.10 on a 6-core CPU and an NVIDIA 3080 (10 GB) GPU. You'll need to install the dependencies as well.

cd DL4H-Automated-ICD-9-Coding
pip install -r requirements.txt
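After installing, a quick sanity check like the one below can confirm that the interpreter and the main libraries are in place. It is a minimal sketch: the module names are inferred from the References section (Gensim, scikit-learn, PyTorch) and are an assumption, requirements.txt remains the authoritative list, and Python 3.10 is the version used here rather than a documented minimum.

```python
import importlib.util
import sys

# Assumed import names, inferred from the References section of this README.
# requirements.txt is the authoritative dependency list.
ASSUMED_MODULES = ["gensim", "sklearn", "torch"]


def missing_modules(names=ASSUMED_MODULES):
    """Return the subset of module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_at_least(minimum=(3, 10)):
    """True if the running interpreter is at least the given (major, minor)."""
    return sys.version_info[:2] >= minimum


print("Python version:", ".".join(map(str, sys.version_info[:3])))
print("Python >= 3.10:", python_at_least())
print("Missing modules:", missing_modules() or "none")
```

The check only looks modules up without importing them, so it is fast and does not initialize the GPU.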

Report

The full report is available in report.pdf.

Computation Times

Model Metrics

Summary of Paper

ICD-9 coding is a time-consuming task that requires specialized skill to perform accurately. Using the discharge summaries and ICD-9 diagnostic codes provided in the MIMIC-III dataset, the paper "Automated ICD-9 Coding via A Deep Learning Approach" (Li et al., 2019) treats code assignment as a multi-label classification problem. The authors introduce a deep learning model called DeepLabeler, which incorporates two models: a Word2Vec model (Mikolov et al., 2013) and a Doc2Vec model (Le and Mikolov, 2014). The paper claims that DeepLabeler achieves a higher F1 score than a traditional natural language processing model such as a support vector machine.
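To make the multi-label framing concrete: each discharge summary can carry several ICD-9 codes at once, so the target for one document is a 0/1 vector with one slot per code. The sketch below shows that encoding on toy data. The code lists are made up for illustration (they are not taken from MIMIC-III), and the helper names `build_label_index` and `multi_hot` are hypothetical, not the functions used in the notebooks.

```python
def build_label_index(code_lists):
    """Map every distinct ICD-9 code to a fixed column position."""
    codes = sorted({code for codes in code_lists for code in codes})
    return {code: position for position, code in enumerate(codes)}


def multi_hot(codes, label_index):
    """Encode one document's codes as a 0/1 vector over all known codes."""
    vector = [0] * len(label_index)
    for code in codes:
        vector[label_index[code]] = 1
    return vector


# Toy example: each inner list holds the codes assigned to one discharge summary.
admissions = [
    ["401.9", "428.0"],
    ["250.00"],
    ["401.9", "250.00", "584.9"],
]

label_index = build_label_index(admissions)
targets = [multi_hot(codes, label_index) for codes in admissions]

for codes, target in zip(admissions, targets):
    print(codes, "->", target)
```

A classifier for this setup predicts one score per column, which is what distinguishes multi-label classification from choosing a single class per document.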

Summary of Findings

The results of this report do not support all of the claims made in the paper. The paper's main premise is that adding a deep learning architecture increases the micro-F1 score; my findings do not support that claim. However, this report does support the claim that adding the Doc2Vec vectors increases the micro-F1 score.
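For readers unfamiliar with the metric, micro-F1 pools true positives, false positives, and false negatives over every (document, code) pair before computing F1, so frequent codes weigh more than rare ones. The following is a minimal pure-Python sketch of that definition on toy 0/1 label matrices. It is written for illustration and is not necessarily how the notebooks compute their scores.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label 0/1 matrices (lists of rows)."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for truth, pred in zip(true_row, pred_row):
            if truth and pred:
                tp += 1      # code assigned and predicted
            elif pred:
                fp += 1      # code predicted but not assigned
            elif truth:
                fn += 1      # code assigned but missed
    denominator = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when nothing is positive.
    return 2 * tp / denominator if denominator else 0.0


# Toy example: 2 documents, 4 possible codes.
y_true = [[0, 1, 1, 0], [1, 0, 0, 0]]
y_pred = [[0, 1, 0, 0], [1, 1, 0, 0]]

# Here TP = 2, FP = 1, FN = 1, so micro-F1 = 4 / 6.
print(f"micro-F1 = {micro_f1(y_true, y_pred):.4f}")
```

On label-indicator matrices like these, this is the same quantity that scikit-learn reports from `f1_score(y_true, y_pred, average="micro")`.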

References

Gensim library: https://radimrehurek.com/gensim/

scikit-learn library: https://scikit-learn.org/stable/

PyTorch library: https://pytorch.org/

