This repository explores statistical analysis of aromatase inhibitor compounds using Python and cheminformatics tools.
The project applies the Mann–Whitney U test to compare molecular properties between groups of compounds (for example, active vs inactive), focusing on features relevant to drug-likeness and bioactivity.
- aromatase.ipynb: Jupyter Notebook containing:
  - Dataset preparation (compounds retrieved from ChEMBL)
  - Calculation of molecular descriptors:
    - LogP
    - Molecular Weight (MW)
    - Number of Hydrogen Bond Acceptors
    - Number of Hydrogen Bond Donors
    - pIC50 values
  - Mann–Whitney U test applied to compare distributions of these features between compound groups
  - Visualization of statistical results
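The pIC50 values listed above are the negative base-10 logarithm of the IC50 expressed in molar units. As a minimal sketch, assuming the raw IC50 values are given in nanomolar (the function name `ic50_nm_to_pic50` is hypothetical and not taken from the notebook), the conversion could look like this:

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar).

    Hypothetical helper for illustration; the notebook may do this differently.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive number")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) equals 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)


# A lower IC50 (more potent compound) gives a higher pIC50.
pic50_potent = ic50_nm_to_pic50(10.0)      # 10 nM
pic50_weak = ic50_nm_to_pic50(10000.0)     # 10 µM
```

Working on the pIC50 scale keeps potency values on a compact, roughly linear range, which makes group comparisons and plots easier to read.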
- Python 3.8+
- RDKit → Molecular descriptors
- SciPy → Statistical tests (Mann–Whitney U)
- Pandas → Data handling
- Matplotlib / Seaborn → Visualization
- ChEMBL Database → Compound data
- Clone this repository:
git clone https://github.com/mike3119/https-github.com-mike3119-Aromatase.git
cd https-github.com-mike3119-Aromatase
- Install dependencies:
pip install rdkit pandas matplotlib seaborn scipy
- Open the notebook:
jupyter notebook aromatase.ipynb
📊 Example Analyses
- LogP Distribution: Compare lipophilicity of active vs inactive compounds
- MW Distribution: Assess molecular size trends
- H-Bond Acceptors/Donors: Compare hydrogen bonding capacity
- pIC50: Evaluate bioactivity and potency
- Mann–Whitney U Test Results: Statistical significance of property differences
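To make the comparisons above more concrete, here is a minimal pure-Python sketch of the U statistic that the Mann–Whitney test is built on. It is an illustration only: the function names and the LogP values are made up, no p-value is computed, and in practice the calculation would normally be delegated to `scipy.stats.mannwhitneyu`.

```python
def rank_with_ties(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values equal to values[order[i]]
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two samples, using average ranks for ties."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = rank_with_ties(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a  # the two statistics always sum to n_a * n_b
    return u_a, u_b


# Made-up LogP values, for illustration only (not from the ChEMBL dataset)
logp_active = [2.1, 3.4, 2.8]
logp_inactive = [4.0, 4.5, 3.9]
u_active, u_inactive = mann_whitney_u(logp_active, logp_inactive)
```

Because the test works on ranks rather than raw values, it does not assume the descriptors are normally distributed. A U value near 0 or near `n_a * n_b` indicates that one group's values are almost entirely below or above the other's.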
🎯 Project Goal
To investigate how key molecular descriptors differ between compound groups of aromatase inhibitors, using statistical hypothesis testing as a tool for cheminformatics-driven drug discovery insights.
👤 Author
Michael Hemen
B.Sc. Chemistry
PGD in Drug Analysis, Pharmaceutical Chemistry (University of Ibadan)
Open to internships and collaborations in bioinformatics, cheminformatics, and computational drug discovery.