# 🦙 👀 ⛑ DeepDR-LLM: Integrated Image-based Deep Learning and Language Models for Primary Diabetes Care
DeepDR-LLM offers a holistic approach to primary diabetes care by combining image-based deep learning with advanced language models. This repository includes code for utilizing the Vision Transformer (ViT) for image analysis, alongside fine-tuned LLaMA models to produce detailed management suggestions for patients with diabetes. Here, we employ the LLaMA-7B model as the foundational language model.
- Requirements
- Environment Setup
- Linux System
- Dataset Preparation
- Model Training and Evaluation
## Requirements

This software runs on Linux, specifically Ubuntu 20.04 (compatibility with other versions has not been tested), and requires Python 3.9, 64GB of RAM, and 1TB of disk storage. Performance benchmarks are based on an Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz and an NVIDIA A100 GPU.
The following Python packages are required; they are also listed in `requirements.txt`:

```
numpy>=1.25.0
datasets>=2.13.1
deepspeed>=0.10.0
huggingface-hub>=0.15.1
sentencepiece>=0.1.97
tokenizers>=0.13.1
torch>=2.0.1
transformers>=4.28.1
```
## Environment Setup

- Open a terminal, or press Ctrl+Alt+F1 to access the command-line interface.
- Clone this repository into your home directory:
  ```bash
  git clone https://github.com/DeepPros/DeepDR-LLM.git
  ```
- Navigate to the cloned repository's directory:
  ```bash
  cd DeepDR-LLM
  ```
- Install the required Python packages:
  ```bash
  python3 -m pip install --user -r requirements.txt
  ```
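Optionally, a quick post-install sanity check (not part of the repository) to confirm that the core packages import and a GPU is visible:

```python
# Optional post-install check: core dependencies and GPU visibility.
import torch
import transformers

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
```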
## Supported Image File Formats

JPEG, PNG, and TIFF files are supported and have been tested. Other formats compatible with OpenCV should also work. The input image must be a 3-channel color fundus image whose shorter side is greater than 448 pixels.
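As a pre-flight check, the sketch below validates these constraints with OpenCV before an image is handed to the pipeline. It is a minimal, hypothetical helper (not part of the repository); the function name and example path are invented.

```python
import cv2

def check_fundus_image(path: str) -> None:
    """Reject images that violate the input constraints described above."""
    img = cv2.imread(path, cv2.IMREAD_UNCHANGED)  # keep original channels
    if img is None:
        raise ValueError(f"OpenCV could not decode {path}")
    if img.ndim != 3 or img.shape[2] != 3:
        raise ValueError("expected a 3-channel color fundus image")
    h, w = img.shape[:2]
    if min(h, w) <= 448:
        raise ValueError("shorter side must be greater than 448 pixels")

check_fundus_image("fundus_example.jpg")  # hypothetical file
```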
## Module 1

Module 1 leverages the LLaMA model to generate comprehensive diagnostic and treatment recommendations, and is designed for easy integration with the outputs of Module 2.
### Dataset Preparation

- For training: ensure your dataset is formatted as shown in `DeepDR-LLM/Module1/Minimum/train_set/train_set.json`. Sample format: `[{"instruction":"...","input":"...","output":"..."}]` (a sketch of writing this format follows the list).
- For validation: format your dataset according to the structure shown in `DeepDR-LLM/Module1/Minimum/valid_set.json`. Sample format: `[{"instruction":"...","input":"...","output":"..."}]`.
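For illustration, a minimal sketch that writes one record in this format; the field contents below are invented placeholders, not real samples from the dataset:

```python
import json

# Placeholder record in the [{"instruction", "input", "output"}] format.
records = [
    {
        "instruction": "Generate management recommendations for this patient with diabetes.",
        "input": "Sex: Female; Age: 47; BMI: 22.13 kg/m^2; DR Grade: 0; DME Grade: 0.",
        "output": "Recommend routine follow-up and continued glycemic control.",
    },
]

with open("train_set.json", "w") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)
```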
### Training

- Note: make sure the `llama-7b` model weights are downloaded from https://huggingface.co/huggyllama/llama-7b and saved in `DeepDR-LLM/Module1/llama-7b-weights` (a download sketch follows this list).
- Run `DeepDR-LLM/Module1/scripts/run_train.sh` to start training.
- Please review the settings in `run_train.sh`, particularly the `paths` configuration.
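One way to fetch those weights is with `huggingface_hub` (already in `requirements.txt`); a minimal sketch, assuming the target directory noted above:

```python
from huggingface_hub import snapshot_download

# Download the base llama-7b weights into the directory the training
# scripts expect (per the note above).
snapshot_download(
    repo_id="huggyllama/llama-7b",
    local_dir="DeepDR-LLM/Module1/llama-7b-weights",
)
```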
### Inference

See `DeepDR-LLM/Module1/scripts/inference.py` for guidance, and be sure to configure the necessary arguments properly. The input format should match that of `DeepDR-LLM/Module1/Minimum/train_set/train_set.json`.
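As a rough illustration of what such an inference call can look like with the `transformers` API: a minimal sketch, not the repository's `inference.py`. The prompt template and generation settings are assumptions, and a fine-tuned checkpoint would replace the base weights loaded below.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Paths, prompt template, and generation settings are assumptions for
# illustration; inference.py defines what the repository actually uses.
model_dir = "DeepDR-LLM/Module1/llama-7b-weights"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(model_dir, torch_dtype=torch.float16)
model = model.cuda().eval()

# One record in the train_set.json format (placeholder contents).
record = {
    "instruction": "Generate management recommendations for this patient with diabetes.",
    "input": "Sex: Female; Age: 47; BMI: 22.13 kg/m^2; DR Grade: 0; DME Grade: 0.",
}
prompt = record["instruction"] + "\n" + record["input"] + "\n"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```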
## Module 2

Module 2 focuses on analyzing and predicting outcomes from fundus images.
### Dataset Preparation

Module 2 covers classification and segmentation tasks. Datasets for both are described by .txt files in which each line corresponds to one image. For classification, the line format is `imagepath classindex`; for segmentation, it is `imagepath maskpath`, with segmentation labels shaped [C,H,W], where C includes the background category.
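A minimal sketch of how such list files can be parsed (a hypothetical helper, not the repository's loader; the file name in the usage line is invented):

```python
def read_list_file(path: str) -> list[tuple[str, str]]:
    """Parse a dataset list file where each line is 'imagepath classindex'
    (classification) or 'imagepath maskpath' (segmentation).
    Assumes image paths contain no whitespace."""
    pairs = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            image_path, target = line.split(maxsplit=1)
            pairs.append((image_path, target))
    return pairs

# Hypothetical usage for a classification list:
# [('images/img001.jpg', '2'), ('images/img002.jpg', '0'), ...]
print(read_list_file("train_cla.txt"))
```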
### Training

- For classification models, use `DeepDR-LLM/Module2/train_cla.py`.
- For segmentation models, use `DeepDR-LLM/Module2/train_seg.py`.
- **Note**: obtain pretrained `vit-base` model weights from ImageNet before training (https://download.pytorch.org/models/vit_b_16-c867db91.pth); see the sketch after this list.
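A minimal sketch of fetching that checkpoint with PyTorch; the output filename is an assumption, so check `train_cla.py`/`train_seg.py` for the path they actually load:

```python
import torch

# Download the ImageNet-pretrained ViT-B/16 checkpoint referenced above.
url = "https://download.pytorch.org/models/vit_b_16-c867db91.pth"
state_dict = torch.hub.load_state_dict_from_url(url, progress=True)

# Save locally under an assumed filename for the training scripts to load.
torch.save(state_dict, "vit_b_16_imagenet.pth")
print(f"saved {len(state_dict)} tensors")
```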
### Inference

Use `DeepDR-LLM/Module2/test.py` for evaluation, ensuring that trained models are loaded correctly. Outputs will be stored as specified.
## Example Workflow

- Starting point: a fundus image is obtained from a standard or portable imaging device, along with aligned clinical metadata, following the example structure in `DeepDR-LLM/Module1/Minimum/train_set/train_set.json`.
  - Example: {Sex: Female; Age: 47; BMI: 22.13 kg/m^2; ....}
- Module 2 predicts the quality of the fundus image, the DR grade, the DME grade, and the presence of retinal lesions, which are appended to the metadata.
  - Example: {Sex: Female; Age: 47; BMI: 22.13 kg/m^2; ....; Fundus Image Quality: Gradable; DR Grade: 0; DME Grade: 0; Presence of Retinal Lesions: No microaneurysms, no cotton-wool spots, no hard exudates, no hemorrhages.}
- The enriched input is then passed to Module 1 to generate individualized management recommendations.
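To make the hand-off concrete, a minimal sketch of merging the clinical metadata with Module 2's predictions into the semicolon-separated input string shown above (the variable names are invented glue code, not the repository's pipeline):

```python
# Hypothetical glue code: merge the clinical metadata with Module 2's
# predictions into the "input" string consumed by Module 1.
metadata = {"Sex": "Female", "Age": 47, "BMI": "22.13 kg/m^2"}
module2_predictions = {
    "Fundus Image Quality": "Gradable",
    "DR Grade": 0,
    "DME Grade": 0,
    "Presence of Retinal Lesions": (
        "No microaneurysms, no cotton-wool spots, "
        "no hard exudates, no hemorrhages."
    ),
}

merged = {**metadata, **module2_predictions}
module1_input = "; ".join(f"{k}: {v}" for k, v in merged.items())
print(module1_input)
```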