A Klebsiella pneumoniae discovery platform linking plate-to-fitness phenotypic data with established genomic typing (Kleborate) and predicted genetic associations (GWAS).
ColonyExplorer is an interactive Streamlit application for high-throughput chemical genomics screening of Klebsiella pneumoniae clinical isolates. It integrates five independent data streams into a single browser:
- Plate images — high-density agar plate images (1536-well format) captured with IRIS
- Phenotypic data — colony morphology metrics (size, circularity, opacity, biofilm) extracted per-colony by IRIS
- Absolute fitness — pre-computed fitness scores per isolate per condition
- Genomic data — AMR genes, virulence loci, and sequence types from Kleborate
- GWAS results — gene- and SNP-level associations from pan-genome and SNP GWAS analyses
Dataset: 1,462 isolates · 225 conditions · 214 conditions with fitness data · 2 GWAS modes
| Feature | Description |
|---|---|
| Colony Inspection | Browse plate images for any isolate and condition; view four replicate crops with quantitative IRIS morphology metrics |
| Fitness Distribution | Visualise absolute fitness density across all isolates for any condition; see where a strain sits relative to the population with percentile annotation |
| Genomic Profiling | Full Kleborate strain overview: ST, K-type, AMR genes, virulence score, resistance score — linked to each colony |
| Gene-level GWAS | Pan-genome associations between gene presence/absence and fitness phenotypes; interactive table with fitness boxplots |
| SNP-level GWAS | SNP associations across conditions; allele distributions, annotations, and isolate presence per significant SNP |
| Seamless Navigation | Jump from any GWAS hit to Colony Viewer for any isolate — and back — with condition and gene context preserved |
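The percentile annotation in the Fitness Distribution view can be illustrated with a small sketch. It assumes the percentile is the share of isolates whose fitness is less than or equal to the strain's score; the definition ColonyExplorer actually uses may differ, and `fitness_percentile` is a hypothetical helper name, not part of the application.

```python
from bisect import bisect_right


def fitness_percentile(population: list[float], value: float) -> float:
    """Percent of isolates with fitness <= value (assumed definition)."""
    if not population:
        raise ValueError("population must not be empty")
    ranked = sorted(population)
    # bisect_right on the sorted scores counts how many are <= value.
    return 100.0 * bisect_right(ranked, value) / len(ranked)


# Illustrative scores only, not real screening data.
scores = [0.2, 0.5, 0.8, 1.0, 1.1]
print(fitness_percentile(scores, 0.8))
```

Sorting once and using a binary search keeps the lookup cheap if the same condition is queried for many strains.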
ColonyExplorer/
├── app/
│   ├── main.py                            # Entry point; navigation and home page
│   ├── colony_picker.py                   # Colony Viewer page
│   ├── strain_overview.py                 # Genomic metadata panels
│   └── utils/
│       ├── data_loading.py                # CSV / Excel / IRIS file parsers
│       └── image_handling.py              # Plate image loading and colony extraction
├── config/
│   └── config.yaml                        # File and directory paths
├── data/
│   ├── plate_images/                      # *.JPG.grid.jpg plate images
│   ├── iris_measurements/                 # *.iris measurement files
│   ├── GWAS_files/                        # Pan-genome and SNP presence/absence files
│   ├── GWAS_results/                      # GWAS summary result files
│   ├── strain_names.csv                   # Strain ID → Row / Column / Plate mapping
│   ├── kleborate_all.tsv                  # Kleborate output (genomic metadata)
│   └── condition_clean_tags_mapping.csv   # Human-readable condition labels
├── requirements.txt
└── README.md
Plate images (data/plate_images/) use the filename convention {Condition}-{Plate}-{Batch}_A.JPG.grid.jpg
Example: Ceftazidime-1ugml-1-1_A.JPG.grid.jpg
IRIS measurement files (data/iris_measurements/) use the filename convention {Condition}-{Plate}-{Batch}_A.JPG.iris
Each file contains per-colony measurements and grid coordinates.
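The two filename conventions above can be parsed with a short sketch. It assumes that Plate and Batch never contain hyphens, so they are always the last two hyphen-separated fields before `_A`, while Condition may contain hyphens (as in the example). `PlateFile` and `parse_plate_filename` are hypothetical names, not functions from the ColonyExplorer code base.

```python
import re
from typing import NamedTuple


class PlateFile(NamedTuple):
    condition: str
    plate: str
    batch: str
    kind: str  # "image" for *.JPG.grid.jpg, "iris" for *.JPG.iris


# The greedy condition group absorbs any hyphens inside the condition name;
# plate and batch are assumed to be hyphen-free (an assumption of this sketch).
_PATTERN = re.compile(
    r"^(?P<condition>.+)-(?P<plate>[^-]+)-(?P<batch>[^-_]+)"
    r"_A\.JPG\.(?P<ext>grid\.jpg|iris)$"
)


def parse_plate_filename(name: str) -> PlateFile:
    """Split a plate image or IRIS filename into its components."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected filename: {name!r}")
    kind = "image" if match["ext"] == "grid.jpg" else "iris"
    return PlateFile(match["condition"], match["plate"], match["batch"], kind)


print(parse_plate_filename("Ceftazidime-1ugml-1-1_A.JPG.grid.jpg"))
```

Plate and batch are kept as strings so that non-numeric labels, if any exist in a dataset, are not rejected.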
The strain file (data/strain_names.csv) must contain at minimum the columns ID, Row, Column, Plate, GenBank_acc, and ENA_acc.
The Kleborate file (data/kleborate_all.tsv) is the standard Kleborate output TSV; rows are matched to strains via strain, GenBank_acc, or ENA_acc.
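One way to picture this matching is the sketch below. It assumes Kleborate records are indexed by their `strain` column and that each strain's ID, GenBank_acc, and ENA_acc are tried in that order; both the lookup order and the helper name `match_kleborate` are assumptions for illustration, and the application's own logic may differ.

```python
from typing import Optional

# Identifier columns from the strain file, tried in turn (order is assumed).
STRAIN_KEYS = ("ID", "GenBank_acc", "ENA_acc")


def match_kleborate(strain_row: dict, kleborate_by_strain: dict) -> Optional[dict]:
    """Return the Kleborate record for a strain row, or None if unmatched."""
    for key in STRAIN_KEYS:
        identifier = (strain_row.get(key) or "").strip()
        if identifier and identifier in kleborate_by_strain:
            return kleborate_by_strain[identifier]
    return None


# Placeholder records, not real isolates or accessions.
kleborate_rows = [{"strain": "EXAMPLE_ACC_1", "ST": "ST258"}]
kleborate_by_strain = {row["strain"]: row for row in kleborate_rows}

strain_row = {"ID": "S001", "GenBank_acc": "EXAMPLE_ACC_1", "ENA_acc": ""}
print(match_kleborate(strain_row, kleborate_by_strain))
```

Returning `None` for unmatched strains lets the caller decide whether to show an empty genomic panel or report the gap.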
The GWAS input files (data/GWAS_files/) are:
- gene_presence_absence_roary.csv: Roary pan-genome presence/absence matrix
- pan_genome_reference.fa: pan-genome reference sequences
- significant_snps_presence_absence.csv: SNP presence/absence matrix
Edit config/config.yaml to point to your data:
files:
  strain_file: "data/strain_names.csv"
  kleborate_file: "data/kleborate_all.tsv"
directories:
  image_directory: "data/plate_images/"
  iris_directory: "data/iris_measurements/"

A hosted instance is available at https://colonyexplorer.kaust.edu.sa
docker pull gzhoubioinf09/colonyexplorer:latest
docker run -p 8501:8501 gzhoubioinf09/colonyexplorer:latest
Open http://localhost:8501 in your browser.
Requirements: Python 3.11+
git clone https://github.com/gzhoubioinf/ColonyExplorer.git
cd ColonyExplorer
# Create virtual environment
python3.11 -m venv venv
source venv/bin/activate # macOS / Linux
# venv\Scripts\activate # Windows
# Install dependencies
pip install -r requirements.txt
Locally (after Option B install):
streamlit run app/main.py
Opens at http://localhost:8501.
On Render (or any server):
streamlit run app/main.py --server.port $PORT --server.address 0.0.0.0
- Navigate to Colony Viewer in the left panel.
- Select a Condition — the plate/batch list updates automatically.
- Select a Plate / Batch.
- Choose a lookup method:
  - Search by accession number: select a strain from the dropdown, shown as ID (Accession). Reference strains without an accession show ID (NA); if the same ID appears at multiple wells, the well position is appended as ID (NA, R#, C#).
  - Enter grid position: enter Row (1–32) and Column (1–48) directly.
- Optionally adjust which Metrics to display.
- Click Analyse to load the plate image and display results.
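The dropdown labelling and grid limits described in the steps above can be sketched as follows. The field names (`ID`, `accession`, `Row`, `Column`) and both helper functions are hypothetical; the sketch only mirrors the label formats and the 32 × 48 grid stated in this README, not the application's actual code.

```python
from collections import Counter

MAX_ROW, MAX_COLUMN = 32, 48  # 32 x 48 = 1536 positions per plate


def validate_grid_position(row: int, column: int) -> None:
    """Raise ValueError if the position falls outside the plate grid."""
    if not (1 <= row <= MAX_ROW and 1 <= column <= MAX_COLUMN):
        raise ValueError(f"position R{row}, C{column} is outside the grid")


def dropdown_labels(wells: list[dict]) -> list[str]:
    """Build 'ID (Accession)' labels, disambiguating repeated IDs by well."""
    id_counts = Counter(well["ID"] for well in wells)
    labels = []
    for well in wells:
        accession = well.get("accession") or "NA"
        if id_counts[well["ID"]] > 1:
            # The same ID sits at several wells: append the well position.
            labels.append(
                f'{well["ID"]} ({accession}, R{well["Row"]}, C{well["Column"]})'
            )
        else:
            labels.append(f'{well["ID"]} ({accession})')
    return labels


# Placeholder wells, not real strains.
wells = [
    {"ID": "REF1", "accession": None, "Row": 1, "Column": 1},
    {"ID": "REF1", "accession": None, "Row": 32, "Column": 48},
    {"ID": "S001", "accession": "ACC1", "Row": 2, "Column": 3},
]
print(dropdown_labels(wells))
```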
- Navigate to Gene-GWAS Explorer.
- Select a condition and sample from the dropdowns.
- Browse the GWAS results table; filter by significance threshold.
- Expand any gene to view the isolate presence/absence table and pan-genome sequence.
- Click any isolate row to jump directly to Colony Viewer.
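Filtering the results table by a significance threshold amounts to something like the sketch below. The column name `p_value`, the default threshold of 0.05, and the function name `filter_hits` are all assumptions for illustration; the actual GWAS result files may use different column names and the application may apply a different cutoff or a multiple-testing correction.

```python
def filter_hits(
    rows: list[dict], threshold: float = 0.05, p_column: str = "p_value"
) -> list[dict]:
    """Keep rows at or below the threshold, most significant first."""
    kept = [row for row in rows if float(row[p_column]) <= threshold]
    return sorted(kept, key=lambda row: float(row[p_column]))


# Placeholder results, not real associations.
results = [
    {"gene": "geneA", "p_value": "0.20"},
    {"gene": "geneB", "p_value": "0.001"},
    {"gene": "geneC", "p_value": "0.03"},
]
for hit in filter_hits(results):
    print(hit["gene"], hit["p_value"])
```

Values are converted with `float` because rows read from CSV files arrive as strings.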
- Navigate to SNP-GWAS Explorer.
- Select a condition and SNP from the results table.
- View isolates carrying the SNP, sorted by p-value or effect size.
- Click any isolate row to jump to Colony Viewer.
| Metric | Description |
|---|---|
| Colony size | Total colony area in pixels |
| Circularity | Roundness (1 = perfect circle) |
| Opacity | Optical density proxy for colony density |
| Colony color intensity | Mean pixel intensity of the colony |
| Biofilm area size | Area covered by biofilm |
| Biofilm color intensity | Mean intensity within the biofilm region |
| Biofilm area ratio | Fraction of colony area covered by biofilm |
| Size normalized color intensity | Color intensity corrected for colony size |
| Mean sampled color intensity | Sampled mean intensity |
| Average pixel saturation | Mean HSV saturation across colony |
| Max 10% opacity | 90th-percentile opacity value |
If you use ColonyExplorer in your research, please cite our work:
- (citation forthcoming)
This application relies on the following foundational workflows and tools. Please also consider citing them:
- High-Throughput Phenotypic Screening Pipeline: Williams G., Ahmad H., Sutherland S., et al. (2025). High-throughput chemical genomic screening: a step-by-step workflow from plate to phenotype. mSystems, 10(12), e00885-25. DOI: 10.1128/msystems.00885-25
- Kleborate (Genomic Profiling): Lam, M. M. C., et al. (2021). A genomic surveillance framework and genotyping tool for Klebsiella pneumoniae and its related species complex. Nature Communications, 12(1), 4188. DOI: 10.1038/s41467-021-24448-3
- IRIS (Phenotypic Image Analysis): Kritikos, G., Banzhaf, M., Herrera-Dominguez, L., et al. (2017). A tool named Iris for versatile high-throughput phenotyping in microorganisms. Nature Microbiology, 2(5), 17014. DOI: 10.1038/nmicrobiol.2017.14
- Scoary (Bacterial GWAS): Brynildsrud, O., Bohlin, J., Scheffer, L., & Eldholm, V. (2016). Rapid scoring of genes in microbial pan-genome-wide association studies with Scoary. Genome Biology, 17(1), 238. DOI: 10.1186/s13059-016-1108-8
- Panaroo (Pan-genome Pipeline): Tonkin-Hill, G., MacAlasdair, N., Ruis, C., et al. (2020). Producing polished prokaryotic pangenomes with the Panaroo pipeline. Genome Biology, 21(1), 180. DOI: 10.1186/s13059-020-02090-4
ColonyExplorer is a joint project developed by the Infectious Disease Epidemiology Lab (KAUST) and the Banzhaf Lab (Newcastle University).
Support and Inquiries:
| Name | Email |
|---|---|
| Ge Zhou (PhD student) | ge.zhou@kaust.edu.sa |
| Danesh Moradigaravand (PI) | danesh.moradigaravand@kaust.edu.sa |
| Manuel Banzhaf (PI) | manuel.banzhaf@newcastle.ac.uk |
See LICENSE.

