This project develops a machine learning application for predicting molecular toxicity. It pairs an Artificial Neural Network (ANN) with Bayesian optimization for hyperparameter tuning, and in our comparisons it outperforms existing tools such as MolToxPred and ToxiM. We built a custom dataset from public databases such as T3DB and from research papers, then applied feature engineering to improve model performance.
- Custom Dataset: Compiled from public databases (T3DB) and research papers.
- Feature Engineering: Applied to extract meaningful insights and improve model performance.
- Model Evaluation: Compared multiple models including XGBoost, Logistic Regression, and ANN to identify the best-performing model.
- Sources: Public databases like T3DB and various research papers.
- Preprocessing: Data cleaning, normalization, and feature extraction were performed to prepare the dataset for model training.
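The cleaning and normalization steps can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the record layout, the field names (`smiles`, `mol_weight`, `logp`), and the sample values are hypothetical placeholders, and feature extraction is left out because the descriptors used are not listed here.

```python
import math

def clean(records, feature_names):
    """Drop records with missing feature values or a repeated SMILES string."""
    seen, kept = set(), []
    for rec in records:
        if any(rec.get(name) is None for name in feature_names):
            continue  # incomplete record
        if rec["smiles"] in seen:
            continue  # duplicate compound
        seen.add(rec["smiles"])
        kept.append(rec)
    return kept

def standardize(records, feature_names):
    """Return z-scored feature rows (zero mean, unit variance per column)."""
    rows = [[float(rec[name]) for name in feature_names] for rec in records]
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical placeholder records, not taken from the project's dataset.
FEATURES = ["mol_weight", "logp"]
raw = [
    {"smiles": "CCO", "mol_weight": 46.07, "logp": -0.31},
    {"smiles": "c1ccccc1", "mol_weight": 78.11, "logp": 2.13},
    {"smiles": "CCO", "mol_weight": 46.07, "logp": -0.31},   # duplicate
    {"smiles": "CC(=O)O", "mol_weight": None, "logp": -0.17},  # missing value
]
cleaned = clean(raw, FEATURES)
scaled = standardize(cleaned, FEATURES)
print(len(cleaned), scaled)
```

In a real pipeline the scaling statistics would be computed on the training split only and reused for validation and test data.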
- Artificial Neural Network (ANN): Deep learning model, chosen because it performed best among the models we compared.
- Implemented with Bayesian optimization for hyperparameter tuning.
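To make the idea behind Bayesian optimization concrete, here is a small self-contained sketch of the loop: fit a Gaussian-process surrogate to the scores seen so far, pick the next hyperparameter by expected improvement, evaluate it, and repeat. Everything in it is an assumption made for illustration. `toy_validation_loss` is a hypothetical stand-in for training the ANN and measuring validation loss, only one hyperparameter (log10 of the learning rate) is tuned, and the project itself may rely on a tuning library rather than hand-written code like this.

```python
import math
import random

def rbf(a, b, length=0.5):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(A[i][i] - s, 1e-12))
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def cho_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(xs, ys, noise=1e-4):
    """Fit a GP surrogate; return a function giving posterior (mean, std) at x."""
    y_mean = sum(ys) / len(ys)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    alpha = cho_solve(L, [y - y_mean for y in ys])

    def predict(x):
        k = [rbf(a, x) for a in xs]
        mean = y_mean + sum(ki * ai for ki, ai in zip(k, alpha))
        var = 1.0 - sum(ki * vi for ki, vi in zip(k, cho_solve(L, k)))
        return mean, math.sqrt(max(var, 1e-12))

    return predict

def expected_improvement(mean, std, best):
    """Expected improvement over the best (lowest) loss seen so far."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def bayes_opt(objective, low, high, n_init=3, n_iter=10, seed=0):
    """Minimize objective on [low, high] with a GP surrogate and EI acquisition."""
    rng = random.Random(seed)
    xs = [rng.uniform(low, high) for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    grid = [low + (high - low) * i / 200 for i in range(201)]
    for _ in range(n_iter):
        predict = fit_gp(xs, ys)
        best = min(ys)
        candidates = [x for x in grid if x not in xs]  # skip evaluated points
        x_next = max(candidates,
                     key=lambda x: expected_improvement(*predict(x), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i = min(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]

def toy_validation_loss(log10_lr):
    """Hypothetical stand-in for 'train the ANN, return validation loss'."""
    return 0.1 * (log10_lr + 3.0) ** 2 + 0.2

best_x, best_loss = bayes_opt(toy_validation_loss, low=-5.0, high=-1.0)
print(f"best log10(learning rate) = {best_x:.2f}, loss = {best_loss:.4f}")
```

The appeal of this approach over grid or random search is that each expensive training run informs where to look next, so fewer runs are needed to reach a good configuration.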
- Accuracy: Percentage of correctly predicted instances.
- Precision: Proportion of true positive predictions among all positive predictions.
- Recall: Proportion of true positive predictions among all actual positives.
- F1 Score: Harmonic mean of precision and recall, providing a balance between the two.
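The four metrics above can be computed directly from the confusion-matrix counts. The sketch below assumes binary labels where 1 means toxic and 0 means non-toxic; the function name and the example labels are illustrative, not part of the project code.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Illustrative labels only.
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 1, 0],
    y_pred=[1, 1, 0, 0, 0, 1, 1, 0],
)
print(metrics)
```

For toxicity screening, recall often deserves particular attention, since a missed toxic compound (false negative) is usually costlier than a false alarm.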
- ANN with Bayesian Optimization: Outperformed other models, showing the highest accuracy and best overall performance.
- Comparison: ANN > XGBoost > Logistic Regression in terms of accuracy and predictive capability.
- Clone the repository:
git clone https://github.com/ANGADJEET/MolToxInsight
- Navigate to the neuralTox directory:
cd neuralTox
- Navigate to the backend directory:
cd backend
- Run the server:
python app.py
- Wait for the server to start.
- Navigate to the frontend directory:
cd ..
cd frontend
- If running for the first time, or if the node_modules folder is missing, install the dependencies:
npm install
npm install axios
npm install react-router-dom
- Run the development server:
npm run dev
- Click on the provided link to access the application.
- T3DB: For providing comprehensive data on toxic compounds.
- Research Papers: For the foundational knowledge and additional data sources.
- Open Source Libraries: Including Scikit-learn, TensorFlow, and XGBoost, which made this project possible.
For any questions or issues, please open an issue on the repository or contact us at angadjeet22071@iiitd.ac.in or arav22091@iiitd.ac.in.