MUSCLE
MAFFT
MSAProbs
HMMER3
EMBOSS + PHYLIPNEW (EMBASSY package)
FastME 2.07
Blast+
KaKs_Calculator 2.0
Python 2.7
- wget >= 3.2
- pyaml >= 3.12
- numpy >= 1.14.3
- pandas == 0.22.0
- matplotlib >= 2.2.3
- scipy >= 1.1.0
- DendroPy >= 4.4.0
- biopython >= 1.72
- redis >= 2.10.6
- psutil >= 5.6.1
docker pull loven7doo/eagle
The working directory in this image is '/EAGLE', so to run any command from the image, mount the host working directory there:
docker run -v </host/system/workdir/location/>:/EAGLE loven7doo/eagle <any command>
In the commands, remember to replace the path to the host system working directory with '/EAGLE'.
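The path replacement described above can be sketched in Python. This is only an illustration: the helper name `to_container_path` is hypothetical and not part of EAGLE, and it assumes POSIX-style paths on the host.

```python
import posixpath

# Working directory inside the image, as stated above.
CONTAINER_WORKDIR = "/EAGLE"


def to_container_path(host_path, host_workdir):
    """Map a path inside the host workdir to its location under /EAGLE.

    Hypothetical helper (not part of EAGLE); assumes POSIX-style paths.
    """
    host_path = posixpath.normpath(host_path)
    host_workdir = posixpath.normpath(host_workdir)
    rel = posixpath.relpath(host_path, host_workdir)
    # A relative path that climbs out of the workdir is not visible in the container.
    if rel == posixpath.pardir or rel.startswith(posixpath.pardir + "/"):
        raise ValueError("%s is not inside %s" % (host_path, host_workdir))
    return posixpath.normpath(posixpath.join(CONTAINER_WORKDIR, rel))
```

For example, with the host workdir '/home/user/work' mounted to '/EAGLE', the file '/home/user/work/input.fasta' would be passed to commands as '/EAGLE/input.fasta'.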
If Docker is not suitable, follow the steps below (installation of the requirements, then installation of the package).
Some requirements can be very difficult to install on Windows; Linux is recommended.
pip install git+https://github.com/loven-doo/EAGLE.git --upgrade
To install from the dev branch:
pip install git+https://github.com/loven-doo/EAGLE.git@dev --upgrade
The recommended way is to download the default database from here. The downloaded database should be placed into the workdir (this is inconvenient and will be improved in a future upgrade; to save disk space, a symlink to the EAGLEdb directory can be created in each workdir with the 'ln -s </path/to/extracted/EAGLEdb> </path/to/workdir/EAGLEdb>' command). Each database scheme is stored in a <db_name>_info.json file located in the archive root directory ('EAGLEdb').
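The symlink step can also be done from Python, for instance when several workdirs are prepared by a script. The sketch below is an assumption-laden illustration: `link_eagledb` is a hypothetical helper, not part of EAGLE, and it relies on `os.symlink`, which is mainly suited to Linux and other Unix-like systems.

```python
import os


def link_eagledb(extracted_db_dir, workdir):
    """Create <workdir>/EAGLEdb as a symlink to the extracted database.

    Hypothetical helper (not part of EAGLE); equivalent in spirit to
    'ln -s </path/to/extracted/EAGLEdb> </path/to/workdir/EAGLEdb>'.
    """
    link_path = os.path.join(workdir, "EAGLEdb")
    if os.path.islink(link_path):
        # Replace a stale link so the function can be re-run safely.
        os.remove(link_path)
    elif os.path.exists(link_path):
        # Never overwrite a real directory or file.
        raise OSError("%s exists and is not a symlink" % link_path)
    os.symlink(os.path.abspath(extracted_db_dir), link_path)
    return link_path
```

The helper refuses to touch an existing real 'EAGLEdb' directory, so a database that was copied into the workdir is not lost by accident.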
Another option is to build it from the prepared lists of NCBI genomes:
eagle_db -dbt bacteria
The created database scheme (db_info.json) will be located in the database directory (EAGLEdb/bacteria).
If you prefer not to use the default database or the prepared lists, a third option is to build a database from NCBI as follows:
- Download the assembly summary (here is the RefSeq assembly summary table for bacteria and here is the GenBank assembly summary table for bacteria).
- Prepare the genome lists:
eagle_db.prepare_ncbi_summary <downloaded/summary/path> <prepared/genomes/list/path>
- Build the database:
eagle_db -dbt bacteria -igenbank <prepared/genomes/list/path>
All these commands can also be run as Python functions; see the eagledb package reference below.
WARNING: sequence names longer than 10 characters in the input fasta file may produce errors.
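A quick pre-check of the input file can catch such names before the analysis starts. The following is a minimal sketch, not part of EAGLE: `find_long_names` is a hypothetical helper, and it assumes that the sequence name is the first whitespace-delimited token of each header line, which may differ from how EAGLE itself parses names.

```python
def find_long_names(fasta_path, max_len=10):
    """Return sequence names longer than max_len characters.

    Hypothetical helper (not part of EAGLE). Assumption: the name is the
    first whitespace-delimited token after '>' on each header line.
    """
    long_names = []
    with open(fasta_path) as fasta:
        for line in fasta:
            if not line.startswith(">"):
                continue  # sequence line, not a header
            header = line[1:].strip()
            name = header.split()[0] if header else ""
            if len(name) > max_len:
                long_names.append(name)
    return long_names
```

If the returned list is not empty, the listed sequences can be renamed before the file is passed to the analysis.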
Type the command below to start the analysis:
eagle -i <fasta/path> -db <EAGLEdb/scheme/json/path> -nt <threads_number> -o <out/dir/path>
For a detailed description of the parameters, type:
eagle -h
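For batch runs over several input files, the command line above can be assembled in a script. The sketch below only wraps the command-line tool with the four options shown above (-i, -db, -nt, -o); the function names and the one-output-directory-per-input layout are illustrative assumptions, not part of EAGLE.

```python
import os
import subprocess


def build_eagle_command(fasta_path, db_scheme_json, threads, out_dir):
    """Build the 'eagle' argument list using only the options shown above."""
    return [
        "eagle",
        "-i", fasta_path,
        "-db", db_scheme_json,
        "-nt", str(threads),
        "-o", out_dir,
    ]


def run_batch(fasta_paths, db_scheme_json, threads, out_root,
              runner=subprocess.check_call):
    """Run the analysis once per input file.

    Assumption: each input gets its own output directory named after the
    file, e.g. <out_root>/<file name without extension>.
    """
    for fasta_path in fasta_paths:
        stem = os.path.splitext(os.path.basename(fasta_path))[0]
        out_dir = os.path.join(out_root, stem)
        runner(build_eagle_command(fasta_path, db_scheme_json, threads, out_dir))
```

The `runner` argument defaults to `subprocess.check_call`, which raises an error if a run fails; passing another callable (for example `print`) gives a dry run that only shows the commands.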
The analysis can also be run from Python:
from eagle import explore_orfs
explore_orfs(...)