The iPhyloGeo++ tool is an advanced bioinformatics application specifically designed for the integration and analysis of phylogeographic data. By leveraging both genetic and climatic information, it offers comprehensive insights into the evolutionary processes and geographical distribution of various species.
- Cross-Platform Compatibility: Compatible with Windows, macOS, and Linux.
- Comprehensive Data Integration: Merges genetic sequences with climatic data for robust analysis.
- Intuitive User Interface: Developed with PyQt5 to ensure ease of use.
- Advanced Visualization Tools: Provides visualization of phylogenetic trees and climatic data on interactive maps.
- Enhanced Comparative Analysis: Facilitates the comparison of different phylogenetic trees.
1. Clone the repository
git clone https://github.com/tahiri-lab/iPhyloGeo_plus_plus.git
cd iPhyloGeo_plus_plus
2. Set Up a Virtual Environment
python3 -m venv iPhyloGeo++_env # Use `python` instead of `python3` if the `python3` command is not found
# Linux / macOS
source iPhyloGeo++_env/bin/activate
# Windows
iPhyloGeo++_env\Scripts\activate
3. Install Dependencies
pip install -r requirements.txt
4. Set Up Pre-commit Hooks (Optional)
pre-commit install
5. Run the Application
python3 scripts/main.py
For help, see the Wiki, which explains how to navigate the iPhyloGeo++ application.
- Navigate to File Browser on the Genetic Page:
- Access the genetic data interface through the File Browser tab.
- Select and Load Your Fasta File:
- Choose the FASTA file containing your genetic sequences (e.g., .fasta, .fa).
- Ensure the file adheres to the correct format and structure.
- Perform Sequence Alignment, Statistics, and Generate Genetic Trees:
- Sequence Alignment: Align the sequences with the built-in tools (e.g., MUSCLE, ClustalW).
- Statistics: Generate statistics such as nucleotide frequencies, sequence length distribution, and GC content.
- Genetic Trees: Construct phylogenetic trees using methods like Neighbor-Joining, Maximum Likelihood, or Bayesian inference. Visualize trees with options for customization (e.g., color-coding branches, annotating clades).
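To make the statistics step above more concrete, here is a minimal sketch of how nucleotide frequencies, sequence lengths, and GC content can be derived from a FASTA file using only the Python standard library. It does not reproduce the iPhyloGeo++ implementation: the function names (`read_fasta`, `nucleotide_frequencies`, `gc_content`) and the two-record sample input are hypothetical and exist only for illustration.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text into a dict mapping record ID to sequence."""
    records = {}
    current_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else f"record_{len(records) + 1}"
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def nucleotide_frequencies(sequence):
    """Return the relative frequency of each symbol found in the sequence."""
    total = len(sequence)
    if total == 0:
        return {}
    return {base: count / total for base, count in Counter(sequence).items()}


def gc_content(sequence):
    """Return the fraction of G and C among the unambiguous bases A, C, G, T."""
    counts = Counter(sequence)
    acgt = sum(counts[base] for base in "ACGT")
    return (counts["G"] + counts["C"]) / acgt if acgt else 0.0


# Hypothetical two-record input, used only for illustration.
example = StringIO(">seq1 sample A\nATGC\nGGCC\n>seq2 sample B\nATAT-N\n")
sequences = read_fasta(example)
for name, seq in sequences.items():
    print(name, len(seq), round(gc_content(seq), 3), nucleotide_frequencies(seq))
```

Gaps (`-`) and ambiguous symbols such as `N` are counted in the frequency table but excluded from the GC denominator. This is one common convention, and other tools may choose differently.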
- Navigate to File Browser on the Climatic Page:
- Access the climatic data interface through the File Browser tab.
- Select and Load Your CSV File Containing Climatic Data:
- Choose the CSV file containing your climatic information, and make sure it follows the structure the application expects.
- View the Generated Maps, Data Tables, Statistics, and Climatic Trees as Needed:
- Display and interact with the visual representations of the climatic data, including maps, tables, and statistical summaries.
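As a rough illustration of the statistical summaries mentioned above, the sketch below reads a climatic CSV and reports the mean, standard deviation, minimum, and maximum of each numeric column. The column names (`id`, `temperature`, `precipitation`) and the helper `summarize_climate` are assumptions made for this example. They are not taken from the application, whose expected CSV layout may differ.

```python
import csv
import statistics
from io import StringIO


def summarize_climate(handle, id_column):
    """Return {column: {"mean", "stdev", "min", "max"}} for numeric columns."""
    columns = {}
    for row in csv.DictReader(handle):
        for name, value in row.items():
            # Skip the identifier column, surplus fields, and missing cells.
            if name is None or name == id_column or value is None:
                continue
            try:
                number = float(value)
            except ValueError:
                # Ignore cells that are empty or not numeric.
                continue
            columns.setdefault(name, []).append(number)
    summary = {}
    for name, values in columns.items():
        summary[name] = {
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
            "min": min(values),
            "max": max(values),
        }
    return summary


# Hypothetical climatic table: one row per sample, one column per variable.
example_csv = (
    "id,temperature,precipitation\n"
    "A,10.0,1.5\n"
    "B,20.0,2.5\n"
    "C,30.0,\n"
)
summary = summarize_climate(StringIO(example_csv), id_column="id")
for column, stats in summary.items():
    print(column, stats)
```

Empty cells are skipped rather than treated as zero, so a column with missing values is summarized over fewer rows than the others.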
- Navigate to the Results Page:
- Access the results interface.
- Adjust the Parameters as Needed:
- Modify settings to refine the analysis.
- Click on Submit to View the Phylogenetic Results:
- Generate and display the results based on the input data and parameters.
- Navigate to the Stats Button for Phylogenetic Trees Visualization:
- Use the stats button to visualize the phylogenetic trees and related statistics.
The Tutorial section of the Wiki walks through all of these steps.
This project is organized into several key directories to help you navigate and understand the codebase.
- img/: Contains images used by the README and the application.
- datasets/: Includes sample data for testing purposes.
- scripts/: Houses the Python files for the project.
- requirements.txt: List of dependencies.
- scripts/main.py: Main application entry point.
We welcome contributions to iPhyloGeo++. Please follow these steps:
- Fork the repository.
- Create a new branch (`git checkout -b feature-branch`).
- Set up pre-commit hooks (`pre-commit install`).
- Commit your changes (`git commit -m 'Add new feature'`).
- Push to the branch (`git push origin feature-branch`).
- Create a new Pull Request.
1️⃣ Calculation of distance between phylogenetic trees: least-squares metric
- Cavalli-Sforza, L. L., & Edwards, A. W. (1967). Phylogenetic analysis. Models and estimation procedures. American Journal of Human Genetics, 19(3 Pt 1), 233.
- Felsenstein, J. (1997). An alternating least squares approach to inferring phylogenies from pairwise distances. Systematic Biology, 46(1), 101-111.
- Makarenkov, V., & Lapointe, F. J. (2004). A weighted least-squares approach for inferring phylogenies from incomplete distance matrices. Bioinformatics, 20(13), 2113-2121.
2️⃣ Calculation of distance between phylogenetic trees: Robinson-Foulds metric
3️⃣ Dataset full description: Analysis of genetic and climatic data of SARS-CoV-2
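For readers unfamiliar with the Robinson-Foulds metric listed above, the following sketch shows the underlying idea: each internal branch of an unrooted tree splits the leaves into two groups, and the distance counts the splits that appear in one tree but not the other. It is a simplified illustration, not the code used by iPhyloGeo++. Representing trees as nested tuples of leaf names, and the function names `nontrivial_splits` and `robinson_foulds`, are assumptions made for this example.

```python
def nontrivial_splits(tree):
    """Return (all_leaves, set of non-trivial bipartitions) for a nested-tuple tree."""
    clades = []

    def collect(node):
        # A leaf is a string; an internal node is a tuple of child nodes.
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(collect(child) for child in node))
        clades.append(leaves)
        return leaves

    all_leaves = collect(tree)
    reference = min(all_leaves)
    splits = set()
    for clade in clades:
        # Key each split by the side that excludes the reference leaf, so the
        # same bipartition gets the same key regardless of where the root is.
        side = all_leaves - clade if reference in clade else clade
        # Splits that isolate a single leaf occur in every tree and are skipped.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
    return all_leaves, splits


def robinson_foulds(tree_a, tree_b):
    """Count the bipartitions present in exactly one of the two trees."""
    leaves_a, splits_a = nontrivial_splits(tree_a)
    leaves_b, splits_b = nontrivial_splits(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("Both trees must be defined on the same set of leaves.")
    return len(splits_a ^ splits_b)


# Hypothetical five-leaf trees that differ in the placement of B and C.
tree_1 = ((("A", "B"), "C"), ("D", "E"))
tree_2 = ((("A", "C"), "B"), ("D", "E"))
print(robinson_foulds(tree_1, tree_2))
```

Because splits are compared on the unrooted tree, re-rooting a tree without changing its topology leaves the distance unchanged.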
Please email us at: Nadia.Tahiri@USherbrooke.ca for any questions or feedback.