
Add new taxa to HaMStR

Vinh Tran edited this page Jul 1, 2020 · 15 revisions

To add a new taxon to HaMStR, name it according to the HaMStR naming schema ([Species acronym]@[NCBI ID]@[Proteome version]) and place the required files in the correct folders:

  • genome_dir (contains one sub-directory per species, holding that species' proteome FASTA files)
  • blast_dir (contains one sub-directory per species, holding the BLAST database built from its proteome with makeblastdb)
  • weight_dir (contains the feature annotation file for each proteome)
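The naming schema can be illustrated with a small Python sketch. This is not part of HaMStR: the helper names build_taxon_name and parse_taxon_name are hypothetical, and the acronym HOMSA with NCBI ID 9606 is only an illustrative value. The sketch assumes that none of the three fields contains an "@" and that the NCBI ID is an integer.

```python
from typing import NamedTuple


class TaxonName(NamedTuple):
    """The three fields of a name of the form acronym@ncbi_id@version."""
    acronym: str
    ncbi_id: int
    version: str


def build_taxon_name(acronym, ncbi_id, version="1"):
    """Join the fields into [Species acronym]@[NCBI ID]@[Proteome version].

    Hypothetical helper; HaMStR itself does not provide this function.
    """
    acronym = str(acronym)
    version = str(version)
    if not acronym or "@" in acronym or "@" in version:
        raise ValueError("fields must be non-empty and must not contain '@'")
    # int() rejects NCBI IDs that are not whole numbers.
    return f"{acronym}@{int(ncbi_id)}@{version}"


def parse_taxon_name(name):
    """Split a schema-conforming name back into its three fields."""
    parts = name.split("@")
    if len(parts) != 3:
        raise ValueError(f"expected acronym@ncbi_id@version, got {name!r}")
    acronym, ncbi_id, version = parts
    return TaxonName(acronym, int(ncbi_id), version)


if __name__ == "__main__":
    name = build_taxon_name("HOMSA", 9606)
    print(name)
    print(parse_taxon_name(name))
```

A check like this can catch malformed folder names before any files are copied into genome_dir, blast_dir, or weight_dir.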

We simplify this process by providing two Python scripts, bin/addTaxonHamstr.py and bin/addTaxaHamstr.py.

Add a single taxon

For this, you can use the bin/addTaxonHamstr.py script:

python3 addTaxonHamstr.py -f your_genome.fa -n abbr_tax_name -I tax_id -o /path/to/your/HaMStR -c

This adds a new folder named abbr_tax_name@tax_id@1, with the corresponding content, to both genome_dir and blast_dir, and an annotation file named abbr_tax_name@tax_id@1.json to weight_dir.
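To confirm that the script produced the expected output, a short check along the following lines may help. It is a sketch, not part of HaMStR: the function names are hypothetical, and it assumes that genome_dir, blast_dir and weight_dir sit directly under the path passed to -o. It only tests that the two folders and the JSON file exist, not what they contain.

```python
from pathlib import Path


def expected_outputs(hamstr_root, abbr_tax_name, tax_id, version="1"):
    """Return the paths the add-taxon step is described as creating.

    Assumption: the three data folders are direct children of hamstr_root.
    """
    root = Path(hamstr_root)
    taxon = f"{abbr_tax_name}@{tax_id}@{version}"
    return [
        root / "genome_dir" / taxon,            # folder with the proteome
        root / "blast_dir" / taxon,             # folder with the BLAST database
        root / "weight_dir" / f"{taxon}.json",  # feature annotation file
    ]


def missing_outputs(hamstr_root, abbr_tax_name, tax_id, version="1"):
    """List the expected paths that do not exist yet."""
    paths = expected_outputs(hamstr_root, abbr_tax_name, tax_id, version)
    return [path for path in paths if not path.exists()]


if __name__ == "__main__":
    # Placeholder arguments; replace them with your own values.
    for path in missing_outputs("/path/to/your/HaMStR", "abbr_tax_name", "tax_id"):
        print(f"missing: {path}")
```

If the check prints nothing, all three expected entries are present.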

Add a list of taxa
