This project combines multiple prediction models to provide evidence-based antibiotic recommendations. The system uses machine learning to rank antibiotics by predicted efficacy, with cefiderocol reserved as the last-resort option.
The system follows a structured 4-step methodology:
- Step 1: Data Preparation and Exploration
- Step 2: Antibiotic Decision Tree Model Development
- Step 3: Phenotypic Signature Analysis and Clustering
- Step 4: Cefiderocol Use Prediction Model

    Vivli/
    ├── scripts/                 # Python and R scripts
    │   ├── antibiotic_decision_tree.py
    │   ├── step4_prediction.py
    │   ├── generate_english_antibiotic_report.py
    │   ├── convert_md_to_html.py
    │   ├── multiple_regression_plot.R
    │   ├── univariate_analysis_script.R
    │   └── ...
    ├── docs/                    # Documentation
    │   ├── vivli_complete_methodology.md
    │   ├── vivli_complete_methodology.html
    │   ├── antibiotic_recommendation_system_details_english.md
    │   ├── last_prediction_model_details.md
    │   └── ...
    ├── outputs/                 # Generated outputs
    │   ├── reports/             # PDF and HTML reports
    │   ├── plots/               # Visualizations
    │   └── models/              # Trained models
    ├── data/                    # Data files (not included in repo)
    │   ├── 1.xlsx               # SIDERO-WT Database
    │   └── 2.xlsx               # ATLAS Database
    └── README.md


    pip install pandas numpy scikit-learn matplotlib seaborn xgboost shap reportlab markdown

- Antibiotic Decision Tree Model:

      cd scripts
      python antibiotic_decision_tree.py

- Cefiderocol Prediction Model:

      cd scripts
      python step4_prediction.py

- Generate English Report:

      cd scripts
      python generate_english_antibiotic_report.py

- Algorithm: Decision Tree Classifier
- Features: 7 primary features (species, country, year, resistance patterns)
- Performance: 100% accuracy (reported)
- Output: Complete antibiotic sequence with cefiderocol as last resort
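
One way the decision tree's output could be turned into an ordered sequence is sketched below. This is an illustration under stated assumptions, not the project's actual code: the data is synthetic, the column names, the `best_antibiotic` label, and the `recommend_sequence` helper are hypothetical, and ranking by predicted class probability is only one plausible way to derive the sequence.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Synthetic, made-up isolates: NOT drawn from ATLAS or SIDERO-WT.
rng = np.random.default_rng(42)
n = 300
df = pd.DataFrame({
    "species": rng.choice(["Escherichia coli", "Klebsiella pneumoniae"], n),
    "country": rng.choice(["France", "Nigeria", "Japan"], n),
    "year": rng.integers(2018, 2024, n),
    "beta_lactam_resistance": rng.random(n),
    "quinolone_resistance": rng.random(n),
})
# Hypothetical label: the "best" first-line antibiotic for each isolate.
df["best_antibiotic"] = np.where(
    df["beta_lactam_resistance"] < 0.5, "ceftriaxone",
    np.where(df["quinolone_resistance"] < 0.5, "ciprofloxacin", "meropenem"),
)

# One-hot encode the categorical columns; numeric columns pass through.
X = pd.get_dummies(
    df.drop(columns="best_antibiotic"), columns=["species", "country"], dtype=float
)
y = df["best_antibiotic"]
tree = DecisionTreeClassifier(max_depth=4, random_state=42).fit(X, y)


def recommend_sequence(row: pd.DataFrame) -> list[str]:
    """Rank antibiotics by predicted class probability, with cefiderocol last."""
    proba = tree.predict_proba(row)[0]
    ranked = [tree.classes_[i] for i in np.argsort(-proba, kind="stable")]
    return ranked + ["cefiderocol"]


sequence = recommend_sequence(X.iloc[[0]])
print(sequence)
```

Appending cefiderocol unconditionally mirrors the "last resort" framing above; whether the real system does this through the tree itself or as a post-processing step is not stated here.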
- Algorithm: Random Forest Classifier
- Features: 20+ features including MIC values, resistance patterns, ratios
- Performance: AUC 1.000, Precision 1.000, Recall 1.000
- Output: Binary decision for cefiderocol use
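
The reported metrics can be computed as in the following minimal sketch. It assumes synthetic features and a hypothetical noisy binary label standing in for "cefiderocol needed"; the feature layout and hyperparameters are illustrative choices, not the project's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500
# Hypothetical features: two log2 MIC-like columns plus one pure-noise column.
X = rng.normal(size=(n, 3))
# Hypothetical label with added noise, so the task is not trivially separable.
y = ((X[:, 0] + X[:, 1] + rng.normal(scale=1.0, size=n)) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
forest = RandomForestClassifier(n_estimators=200, random_state=42)
forest.fit(X_train, y_train)

# AUC uses predicted probabilities; precision and recall use hard predictions.
proba = forest.predict_proba(X_test)[:, 1]
pred = forest.predict(X_test)
metrics = {
    "auc": roc_auc_score(y_test, proba),
    "precision": precision_score(y_test, pred),
    "recall": recall_score(y_test, pred),
}
print(metrics)
```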
- Method: Clustering analysis with PCA
- Purpose: Identify resistance patterns and signatures
- Output: Cluster assignments and phenotypic signatures
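
A minimal sketch of clustering on PCA-reduced resistance profiles follows. The clustering algorithm is not named above, so k-means is used here purely as an assumption; the profiles are synthetic, and defining a "signature" as the per-cluster mean log2 MIC is an illustrative choice.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# Synthetic log2 MIC profiles for 6 hypothetical antibiotics, two made-up phenotypes.
susceptible = rng.normal(loc=-2.0, scale=0.5, size=(100, 6))
resistant = rng.normal(loc=3.0, scale=0.5, size=(100, 6))
profiles = np.vstack([susceptible, resistant])

# Standardize, project onto two principal components, then cluster.
scaled = StandardScaler().fit_transform(profiles)
components = PCA(n_components=2, random_state=42).fit_transform(scaled)
labels = KMeans(n_clusters=2, n_init=10, random_state=42).fit_predict(components)

# Illustrative "signature": mean log2 MIC per antibiotic within each cluster.
signatures = np.array([profiles[labels == k].mean(axis=0) for k in range(2)])
print(signatures.round(2))
```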
- Accuracy: 100%
- Target: First antibiotic recommendation
- Coverage: 40+ antibiotics analyzed
- First-Line Treatment: Most effective antibiotic based on species, region, resistance
- Sequential Alternatives: Complete sequence of alternatives
- Phenotypic Analysis: Resistance patterns and clusters
- Last Resort Decision: Cefiderocol use determination

    # Get antibiotic recommendations
    recommendations = model.recommend_antibiotics(
        species="Escherichia coli",
        country="France",
        year=2023,
        resistance_profile={'beta_lactam': 0.3, 'quinolone': 0.7}
    )

- ATLAS Database (2.xlsx): Global antimicrobial susceptibility data
  - 966,805 isolates, 134 variables
  - Multiple countries and species
  - Temporal coverage
- SIDERO-WT Database (1.xlsx): Cefiderocol-specific susceptibility data
  - MIC values and resistance patterns
  - Species and geographic information
- No real treatment failure data available
- Targets based on theoretical resistance patterns
- High performance likely reflects simplified target definitions
- Geographic and temporal biases possible
- Do NOT use for clinical decisions without validation
- Validate on real treatment failure data
- Include clinical factors (comorbidities, previous exposure)
- Prospective validation required before implementation
- Use as research tool only
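
The point about simplified target definitions can be illustrated with a small synthetic experiment. In this sketch (all data and the 0.5 cut-off are made up), one target is defined directly by a threshold rule on a feature, while the other also depends on unobserved factors, as real treatment outcomes would. A model scored against the rule-derived target looks nearly perfect because it only has to rediscover the rule.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 2000
log2_mic = rng.normal(size=(n, 1))  # single synthetic feature

# Target A: defined directly from the feature by a made-up breakpoint rule.
rule_target = (log2_mic[:, 0] > 0.5).astype(int)
# Target B: same rule plus unobserved factors, standing in for real outcomes.
noisy_target = ((log2_mic[:, 0] + rng.normal(scale=1.0, size=n)) > 0.5).astype(int)


def holdout_accuracy(X, y):
    """Fit a shallow tree on 80% of the data and score it on the held-out 20%."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )
    model = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_tr, y_tr)
    return model.score(X_te, y_te)


rule_acc = holdout_accuracy(log2_mic, rule_target)
noisy_acc = holdout_accuracy(log2_mic, noisy_target)
print(f"rule-derived target: {rule_acc:.3f}, noisy target: {noisy_acc:.3f}")
```

A near-perfect score on the rule-derived target says little about how the model would perform against clinical outcomes, which is why validation on real treatment failure data is listed above.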
- MIC value standardization
- Resistance threshold application
- Categorical encoding
- Composite resistance scores
- MIC ratios for comparative analysis
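
The preprocessing and feature-engineering steps above might look roughly like the following sketch. The MIC readings, column names, and breakpoints are invented for illustration (the breakpoints are not EUCAST or CLSI values), and treating censored readings such as `>64` as the bound itself is a simplification.

```python
import numpy as np
import pandas as pd

# Made-up MIC readings (mg/L); censored values like "<=0.25" are common in raw exports.
raw = pd.DataFrame({
    "meropenem_mic": ["<=0.25", "8", ">64", "2"],
    "cefiderocol_mic": ["0.5", "1", "8", "<=0.03"],
})


def to_log2_mic(series: pd.Series) -> pd.Series:
    """Strip censoring symbols and convert MICs to the log2 scale."""
    numeric = series.str.replace(r"[<>=]", "", regex=True).astype(float)
    return np.log2(numeric)


log2_mics = raw.apply(to_log2_mic)

# Hypothetical breakpoints (mg/L): an isolate is flagged resistant above them.
breakpoints = {"meropenem_mic": 8.0, "cefiderocol_mic": 2.0}
resistant = pd.DataFrame(
    {col: log2_mics[col] > np.log2(bp) for col, bp in breakpoints.items()}
)

features = log2_mics.copy()
# Composite score: share of tested drugs flagged resistant.
features["resistance_score"] = resistant.mean(axis=1)
# Difference of log2 MICs, i.e. the log2 of the cefiderocol/meropenem MIC ratio.
features["mic_log_ratio"] = log2_mics["cefiderocol_mic"] - log2_mics["meropenem_mic"]
print(features)
```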
- Train/Test Split: 80/20
- Cross-validation: 5-fold StratifiedKFold
- Feature scaling: StandardScaler
- Random State: 42
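
The validation settings listed above can be combined as in this minimal sketch on synthetic data. The use of a scikit-learn pipeline, `shuffle=True` in the folds, and ROC AUC as the cross-validation score are assumptions; the project scripts may arrange these steps differently.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(400, 5))  # synthetic features
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)

# 80/20 split, stratified so both sets keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling lives inside the pipeline so each CV fold fits its own scaler.
pipeline = make_pipeline(
    StandardScaler(), RandomForestClassifier(n_estimators=100, random_state=42)
)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_auc = cross_val_score(pipeline, X_train, y_train, cv=cv, scoring="roc_auc")

# Final fit on the full training set, scored once on the held-out test set.
pipeline.fit(X_train, y_train)
test_accuracy = pipeline.score(X_test, y_test)
print(cv_auc.round(3), round(test_accuracy, 3))
```

Keeping the scaler inside the pipeline prevents statistics from the validation folds leaking into training, which matters more for scale-sensitive models than for tree ensembles.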
- Scikit-learn: Machine learning algorithms
- XGBoost: Gradient boosting
- SHAP: Feature importance analysis
- Pandas/NumPy: Data manipulation
- Matplotlib/Seaborn: Visualization
- ReportLab: PDF generation
- Methodology: docs/vivli_complete_methodology.html
- Antibiotic System: docs/antibiotic_recommendation_system_details_english.html
- Cefiderocol Model: docs/last_prediction_model_details.html
- English PDF Report: outputs/reports/antibiotic_recommendation_report_english.pdf
- Methodology HTML: docs/vivli_complete_methodology.html
- Include patient factors (age, comorbidities, allergies)
- Add pharmacokinetic considerations
- Integrate with local resistance patterns
- Include genomic resistance markers
- Add temporal resistance trends
- Develop species-specific models
- Prospective clinical studies
- Real-world implementation
- Outcome assessment
- Extend to other novel antibiotics
- Develop comprehensive antimicrobial decision support systems
- Integrate with precision medicine approaches
Developed by Adekemi Adepeju, Christian Ako, Abeeb Adeniyi, Oluwatobiloba Kazeem, and Oluwadamilare Olatunbosun, using Pfizer's ATLAS dataset and the SIDERO-WT dataset, as part of the 2025 Vivli AMR Data Challenge.