[UI Repository](https://github.com/realranjan/DOCASSIST-AI) - Frontend implementation
In the current medical landscape, doctors face significant challenges in processing vast amounts of patient data to make treatment decisions. This can lead to:
- Delays in treatment decisions
- Inconsistencies in recommendations
- Suboptimal treatment choices
- Increased cognitive load on healthcare providers
DocAssist addresses these challenges by providing an AI-powered decision support tool that assists healthcare professionals in making informed, data-driven clinical decisions.

DocAssist AI System Architecture: Integrating Healthcare Support, Data Analysis, and Personalized Recommendations
The system features a modern, intuitive web interface built with:
- Frontend: Next.js, Tailwind CSS, shadcn/ui components
- Backend: Flask API server
- Deployment: Vercel (Frontend), Render (Backend)
- Interactive dashboards for data visualization
- Responsive design for all devices
- Real-time analysis updates
- PDF report generation and viewing
- Modern and clean user interface
- Secure data handling
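As a rough illustration of how a Flask backend like the one listed above could expose predictions to the Next.js frontend, here is a minimal sketch. The `/predict` route, the JSON payload shape, and the `predict_in_care_probability` placeholder are assumptions made for this example; they are not taken from the DocAssist backend.

```python
from flask import Flask, jsonify, request

# Feature names come from the dataset description; the route name and the
# request/response shapes are hypothetical.
FEATURES = ["HAEMATOCRIT", "HAEMOGLOBINS", "ERYTHROCYTE", "LEUCOCYTE",
            "THROMBOCYTE", "MCH", "MCHC", "MCV", "AGE", "SEX"]

app = Flask(__name__)


def predict_in_care_probability(record):
    """Placeholder for the real model call; always returns a fixed score."""
    return 0.5


@app.route("/predict", methods=["POST"])
def predict():
    # Parse the JSON body; treat a missing or invalid body as an empty record.
    record = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in record]
    if missing:
        return jsonify({"error": "missing fields", "fields": missing}), 400
    probability = predict_in_care_probability(record)
    # SOURCE follows the dataset encoding: 1 = in-care, 0 = out-care.
    return jsonify({"in_care_probability": probability,
                    "source": int(probability >= 0.5)})
```

A frontend would POST one patient record as JSON and read back the predicted care type; swapping the placeholder for a loaded model is the only change the sketch needs.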
The dataset is sourced from a private hospital in Indonesia and contains comprehensive patient laboratory test results used for treatment recommendations.
Feature Name | Data Type | Description |
---|---|---|
HAEMATOCRIT | Continuous | Proportion of blood volume occupied by red blood cells |
HAEMOGLOBINS | Continuous | Oxygen-carrying protein in red blood cells |
ERYTHROCYTE | Continuous | Red blood cell count per volume |
LEUCOCYTE | Continuous | White blood cell count per volume |
THROMBOCYTE | Continuous | Platelet count per volume |
MCH | Continuous | Mean Corpuscular Hemoglobin |
MCHC | Continuous | Mean Corpuscular Hemoglobin Concentration |
MCV | Continuous | Mean Corpuscular Volume |
AGE | Continuous | Patient age |
SEX | Nominal | Patient gender (M/F) |
SOURCE | Nominal | Patient care type (1 = In-care, 0 = Out-care) |
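The table above can double as a lightweight input contract. The sketch below checks one patient record against it; the function name `validate_record` and the rule that laboratory values and age must be non-negative are assumptions for illustration, not part of the project code.

```python
# Column names are taken from the feature table above.
CONTINUOUS = ("HAEMATOCRIT", "HAEMOGLOBINS", "ERYTHROCYTE", "LEUCOCYTE",
              "THROMBOCYTE", "MCH", "MCHC", "MCV", "AGE")


def validate_record(record):
    """Return a list of problems found in one patient record (empty if none)."""
    problems = []
    for name in CONTINUOUS:
        value = record.get(name)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: expected a number, got {value!r}")
        elif value < 0:
            # Assumption: counts, concentrations and ages cannot be negative.
            problems.append(f"{name}: negative value {value}")
    if record.get("SEX") not in ("M", "F"):
        problems.append(f"SEX: expected 'M' or 'F', got {record.get('SEX')!r}")
    # SOURCE is the label, so it is only checked when present.
    if "SOURCE" in record and record["SOURCE"] not in (0, 1):
        problems.append(f"SOURCE: expected 0 or 1, got {record['SOURCE']!r}")
    return problems
```

Calling `validate_record` before preprocessing keeps malformed rows out of both training and prediction.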
DOCASSIST-MODEL/
├── data/ # Dataset files
├── models/ # Model files
├── notebooks/ # Notebook
└── visuals/ # Project diagrams
- Data Cleaning
  - Handling missing values
  - Removing duplicate entries
  - Outlier detection and treatment
- Feature Engineering
  - Creation of derived features (e.g., thrombocyte-leucocyte ratio)
  - Encoding of categorical variables
  - Scaling numerical features using RobustScaler
- Exploratory Data Analysis
  - Distribution analysis of class labels
  - Gender and age demographics
  - Feature correlation analysis
  - Statistical visualization of numerical features
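The cleaning and feature-engineering steps listed above can be sketched with pandas and scikit-learn as follows. This is a minimal sketch under stated assumptions: the function name `prepare`, the median imputation, the `M -> 1, F -> 0` encoding, and the derived column name are choices made for this example, since the README does not specify them.

```python
import pandas as pd
from sklearn.preprocessing import RobustScaler

NUMERIC = ["HAEMATOCRIT", "HAEMOGLOBINS", "ERYTHROCYTE", "LEUCOCYTE",
           "THROMBOCYTE", "MCH", "MCHC", "MCV", "AGE"]


def prepare(df):
    """Clean, engineer, encode and scale a copy of the raw lab-results frame."""
    out = df.drop_duplicates().copy()
    # Assumption: fill missing numeric values with the column median.
    out[NUMERIC] = out[NUMERIC].fillna(out[NUMERIC].median())
    # Derived feature named in the README; a zero leucocyte count becomes NaN
    # instead of producing an infinite ratio.
    ratio = out["THROMBOCYTE"] / out["LEUCOCYTE"].replace(0, float("nan"))
    out["THROMBOCYTE_LEUCOCYTE_RATIO"] = ratio
    # Assumed binary encoding of the nominal SEX column.
    out["SEX"] = out["SEX"].map({"M": 1, "F": 0})
    # RobustScaler centres on the median and scales by the interquartile
    # range, which limits the influence of outliers.
    scaled_cols = NUMERIC + ["THROMBOCYTE_LEUCOCYTE_RATIO"]
    out[scaled_cols] = RobustScaler().fit_transform(out[scaled_cols])
    return out
```

In a real pipeline the scaler would be fitted on the training split only and reused for the test split and for new patients, so that no information leaks from held-out data.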
Model | Train Accuracy | Test Accuracy | ROC AUC | Precision |
---|---|---|---|---|
Random Forest | 100.00% | 75.88% | 0.80 | 0.74 |
CatBoost | 87.19% | 75.31% | 0.82 | 0.73 |
LightGBM | 92.94% | 74.52% | 0.81 | 0.71 |
XGBoost | 98.67% | 74.41% | 0.81 | 0.70 |
AdaBoost | 75.57% | 74.07% | 0.79 | 0.72 |
Support Vector Machine | 76.79% | 73.61% | 0.79 | 0.74 |
K-Nearest Neighbors | 80.31% | 72.03% | 0.75 | 0.67 |
Logistic Regression | 72.83% | 71.46% | 0.75 | 0.70 |
Model | Train Accuracy | Test Accuracy | ROC AUC | Precision |
---|---|---|---|---|
Tuned XGBoost | 96.57% | 77.12% | 0.81 | 0.76 |
Tuned Random Forest | 91.67% | 76.67% | 0.81 | 0.76 |
Tuned CatBoost | 91.47% | 76.67% | 0.81 | 0.74 |
Tuned LightGBM | 88.18% | 77.34% | 0.82 | 0.75 |
- Optimal Model Selection: LightGBM achieved the highest test accuracy (77.34%) after tuning
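- Reduced Overfitting: Training accuracy decreased and test accuracy increased for Random Forest, XGBoost, and LightGBM; CatBoost's test accuracy also rose, but its training accuracy increased from 87.19% to 91.47%
- Consistent Performance: The tuned models showed ROC AUC scores of 0.81 to 0.82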
- High Precision: XGBoost and Random Forest achieved 0.76 precision after tuning
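For readers who want to reproduce columns like those in the tables above, the following sketch computes the same four metrics with scikit-learn. It runs on synthetic data because the hospital dataset is not bundled here, and it assumes precision is measured on the positive class at the default 0.5 threshold; the README does not state which averaging was used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, roc_auc_score
from sklearn.model_selection import train_test_split


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit a classifier and return the four metrics reported in the tables."""
    model.fit(X_train, y_train)
    test_pred = model.predict(X_test)
    # ROC AUC needs scores, so use the positive-class probability.
    test_scores = model.predict_proba(X_test)[:, 1]
    return {
        "train_accuracy": accuracy_score(y_train, model.predict(X_train)),
        "test_accuracy": accuracy_score(y_test, test_pred),
        "roc_auc": roc_auc_score(y_test, test_scores),
        "precision": precision_score(y_test, test_pred),
    }


# Synthetic stand-in data, not the project's dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
report = evaluate(RandomForestClassifier(random_state=0),
                  X_train, X_test, y_train, y_test)
# A large train-test gap is the overfitting signal discussed above.
gap = report["train_accuracy"] - report["test_accuracy"]
```

Running `evaluate` over each candidate model and sorting by `test_accuracy` yields a comparison table of the same shape.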
After comprehensive evaluation and hyperparameter tuning, LightGBM was selected as the final production model for the following reasons:
- Test Accuracy: 77.34% (highest among all models)
- Train Accuracy: 88.18% (the smallest train-test gap among the tuned models, about 10.8 points)
- ROC AUC: 0.82 (strong classification capability)
- Precision: 0.75 (reliable positive predictions)
- Gradient Boosting Framework: LightGBM uses a highly efficient gradient boosting framework
- Leaf-wise Growth: Employs leaf-wise tree growth strategy for better accuracy
- Memory Efficient: Uses histogram-based algorithms that bucket continuous feature values into discrete bins, reducing memory usage
- Fast Training: Significantly faster training speed compared to traditional GBDT
- Handling Imbalanced Data: Offers class-weighting options (the `is_unbalance` and `scale_pos_weight` parameters) suited to slightly imbalanced medical datasets
lightgbm_params = {
    'objective': 'binary',
    'metric': 'binary_logloss',
    'boosting_type': 'gbdt',
    'num_leaves': 31,
    'learning_rate': 0.05,
    'feature_fraction': 0.9
}
The model is deployed with:
- Regular retraining pipeline for maintaining accuracy
- Model versioning for tracking performance
- Monitoring system for detecting drift
- Fallback mechanisms for reliable predictions
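The README does not say how drift is detected. One common option is the Population Stability Index (PSI), which compares how a feature is distributed in the training data and in recent production data. The sketch below is a swapped-in illustration of that technique, not the project's monitoring code; the function name and the bin count are assumptions.

```python
import numpy as np


def population_stability_index(reference, current, bins=10):
    """PSI between a reference sample and a current sample of one feature."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Bin edges come from reference quantiles, so each bin holds roughly
    # the same share of the reference data.
    edges = np.quantile(reference, np.linspace(0.0, 1.0, bins + 1))
    # Only interior edges are used, so values outside the reference range
    # fall into the first or last bin.
    ref_idx = np.searchsorted(edges[1:-1], reference, side="right")
    cur_idx = np.searchsorted(edges[1:-1], current, side="right")
    ref_frac = np.bincount(ref_idx, minlength=bins) / reference.size
    cur_frac = np.bincount(cur_idx, minlength=bins) / current.size
    # A small floor avoids log(0) and division by zero for empty bins.
    eps = 1e-6
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))
```

A widely quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a shift worth investigating, though suitable thresholds depend on the feature and sample size. A monitoring job could compute the index per feature on a schedule and trigger retraining when it stays high.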
- Blood test report analysis
- Real-time parameter visualization
- PDF report generation
- Treatment recommendations
- Historical data tracking
- Python 3.7+
- pip package manager
# Clone the repository
git clone https://github.com/realranjan/DOCASSIST-MODEL.git
# Navigate to project directory
cd DOCASSIST-MODEL
- Data Preparation:
# Import required libraries
import pandas as pd
from docassist import preprocess
# Load and preprocess data
data = pd.read_csv('path_to_data.csv')
processed_data = preprocess.prepare_data(data)
- Model Training:
# Import model trainer
from docassist import model
# Train model
trained_model = model.train(processed_data)
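- Making Predictions:
# Get predictions for new records; patient_data is assumed here to be a
# DataFrame already passed through preprocess.prepare_data
predictions = model.predict(patient_data)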
- Data Enhancement
  - Expand dataset diversity
  - Include additional medical parameters
  - Incorporate temporal patient data
- Technical Improvements
  - Implement deep learning models
  - Develop REST API for model serving
  - Create web-based user interface
  - Add real-time monitoring capabilities
- Clinical Integration
  - Integrate with Electronic Health Records (EHR)
  - Implement HIPAA compliance measures
  - Add support for multiple medical specialties
We welcome contributions to improve DocAssist. Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Ranjan Vernekar - Project Lead
- LinkedIn: Ranjan Vernekar
- GitHub: @realranjan
- Private hospital in Indonesia for providing the dataset
- Healthcare professionals who provided domain expertise
- Open-source community for machine learning tools and libraries
As the project lead for DocAssist, I spearheaded the end-to-end development and deployment of an AI-powered medical decision support system. My key contributions include:
- Full-stack Solution Design: Architected and implemented a robust, modular system integrating a Python-based machine learning backend (LightGBM, XGBoost, CatBoost, etc.) with a modern, responsive Next.js frontend, ensuring seamless user experience for healthcare professionals.
- Data Engineering & Preprocessing: Led the data pipeline design, including advanced feature engineering (e.g., thrombocyte-leucocyte ratio), robust handling of missing/duplicate values, and scaling/encoding strategies to optimize model performance on real-world medical datasets.
- Model Selection & Optimization: Conducted comprehensive benchmarking of multiple ML algorithms, culminating in the selection and fine-tuning of LightGBM as the final production model, achieving a test accuracy of 77.34%, ROC AUC of 0.82, and precision of 0.75.
- Production-Ready ML Deployment: Developed a retrainable, versioned model deployment pipeline with monitoring and fallback mechanisms, ensuring reliability and adaptability in clinical environments.
- UI/UX Innovation: Designed and integrated a user-friendly web interface with real-time dashboards, PDF report analysis, and secure data handling, leveraging shadcn/ui and Tailwind CSS for a modern look and feel.
- Open Source & Documentation: Authored comprehensive documentation and a visually rich README, including architecture diagrams, UI screenshots, and clear project structure, facilitating community contributions and transparency.
- Cross-functional Collaboration: Coordinated with healthcare professionals for domain expertise, and managed open-source contributions, fostering a collaborative and innovative project culture.
- End-to-End Deployment: Deployed the solution using Vercel (frontend) and Render (backend), and ensured accessibility via a live demo and public GitHub repositories (DOCASSIST-AI UI & Backend, DOCASSIST-MODEL).
- Led the design and deployment of DocAssist, an AI-driven medical decision support system, integrating LightGBM for 77.34% test accuracy and 0.82 ROC AUC.
- Architected a scalable, modular ML pipeline with robust data preprocessing, feature engineering, and model versioning for clinical reliability.
- Developed a modern, responsive web UI using Next.js and Tailwind CSS, enabling real-time blood test analysis and PDF report generation.
- Authored comprehensive project documentation, including technical architecture, UI/UX visuals, and open-source guidelines, driving community engagement.
- Deployed full-stack solution to production (Vercel/Render), providing a live demo and public repositories for global accessibility.
- Collaborated with healthcare professionals and open-source contributors to ensure clinical relevance and technical excellence.
Made with ❤️ by the DocAssist AI Team
© 2024 DocAssist AI. All rights reserved.