althonos/pyrodigal-gv

🔥🦠 Pyrodigal-gv

A Pyrodigal extension to predict genes in giant viruses and viruses with alternative genetic code.

🗺️ Overview

Pyrodigal is a Python module that provides Cython bindings to Prodigal, an efficient gene finding method for genomes and metagenomes based on dynamic programming.

pyrodigal-gv is a small extension module for pyrodigal which distributes additional metagenomic models for giant viruses and viruses that use alternative genetic codes, first provided by Antônio Camargo in prodigal-gv. The new models are the following:

  • Acanthamoeba polyphaga mimivirus
  • Paramecium bursaria Chlorella virus
  • Acanthocystis turfacea Chlorella virus
  • VirSorter2's NCLDV gene model
  • Topaz (genetic code 15)
  • Agate (genetic code 15)
  • Gut phages (genetic code 15)
  • Gut phages (genetic code 11) × 5

🔧 Installing

pyrodigal-gv can be installed directly from PyPI as a universal wheel that contains all required data files:

$ pip install pyrodigal-gv

💡 Example

Just use the provided ViralGeneFinder class instead of the usual GeneFinder from pyrodigal, and the new viral models will be used automatically in meta mode:

import Bio.SeqIO
import pyrodigal_gv

record = Bio.SeqIO.read("sequence.gbk", "genbank")

orf_finder = pyrodigal_gv.ViralGeneFinder(meta=True)
for i, pred in enumerate(orf_finder.find_genes(bytes(record.seq))):
    print(f">{record.id}_{i+1}")
    print(pred.translate())

ViralGeneFinder has an additional keyword argument, viral_only, which can be set to True to run gene calling using only viral models.
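
As a minimal sketch of how viral_only might be used, the snippet below wraps the call in a small helper and formats the predictions as protein FASTA, following the example above. The helper names call_viral_genes and to_fasta are hypothetical and not part of pyrodigal-gv; the import is done inside the function only so that the snippet can be loaded without the package installed.

```python
def call_viral_genes(sequence: bytes, viral_only: bool = True):
    """Run meta-mode gene calling on one sequence (hypothetical helper)."""
    # Lazy import: an assumption of this sketch, not a library requirement.
    import pyrodigal_gv

    # meta and viral_only are the keyword arguments described in this README.
    finder = pyrodigal_gv.ViralGeneFinder(meta=True, viral_only=viral_only)
    return finder.find_genes(sequence)


def to_fasta(genes, seq_id: str) -> str:
    """Format predicted genes as protein FASTA, numbering genes from 1.

    `genes` is any iterable of objects exposing a translate() method,
    as the predictions in the example above do.
    """
    return "".join(
        f">{seq_id}_{i}\n{gene.translate()}\n"
        for i, gene in enumerate(genes, start=1)
    )
```

With these helpers, something like print(to_fasta(call_viral_genes(bytes(record.seq)), record.id)) should give the same kind of output as the loop above, restricted to the viral models.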

🔨 Command line

pyrodigal-gv comes with a very simple command line similar to Prodigal and pyrodigal:

$ pyrodigal-gv -i <input_file.fasta> -a <gene_translations.fasta> -d <gene_sequences.fasta>

Contrary to prodigal and pyrodigal, the pyrodigal-gv script runs in meta mode by default! Running in single mode can be done with pyrodigal-gv -p single, but the results will be exactly the same as with pyrodigal, so why would you ever do this ⁉️
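
For readers who prefer to stay in Python, here is a rough sketch of what the command above does, under stated assumptions: it reads a FASTA file, runs ViralGeneFinder in meta mode on each record, and writes translations and gene sequences to two files. The functions read_fasta and run_like_cli are hypothetical names introduced for illustration. The sketch assumes that the object returned by find_genes offers the write_translations and write_genes methods of pyrodigal's Genes class; it is not the actual implementation of the pyrodigal-gv script, and its output formatting may differ.

```python
def read_fasta(lines):
    """Yield (identifier, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the identifier.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def run_like_cli(input_path, translations_path, genes_path):
    """Approximate `pyrodigal-gv -i ... -a ... -d ...` (hypothetical helper)."""
    # Lazy import so the FASTA reader can be used without the package.
    import pyrodigal_gv

    finder = pyrodigal_gv.ViralGeneFinder(meta=True)
    with open(input_path) as src, \
            open(translations_path, "w") as faa, \
            open(genes_path, "w") as fna:
        for seq_id, sequence in read_fasta(src):
            genes = finder.find_genes(sequence.encode("ascii"))
            # Assumed to exist on pyrodigal's Genes object.
            genes.write_translations(faa, seq_id)
            genes.write_genes(fna, seq_id)
```

The FASTA reader is deliberately minimal; for real data, a maintained parser such as Bio.SeqIO (used in the example above) is likely the safer choice.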

🔖 Citation

If you use the features provided by pyrodigal-gv beyond the base Pyrodigal functionality, please cite the original manuscript detailing these extensions:

Camargo, A. P., Roux, S., Schulz, F., Babinski, M., Xu, Y., Hu, B., ... and Kyrpides, N. C. (2023). Identification of mobile genetic elements with geNomad. Nature Biotechnology, 1-10.

Pyrodigal is scientific software, with a published paper in the Journal of Open Source Software. Please cite both Pyrodigal and Prodigal if you are using it in an academic work, for instance as:

Pyrodigal (Larralde, 2022), a Python library binding to Prodigal (Hyatt et al., 2010).

Detailed references are available on the Publications page of the online documentation.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

📋 Changelog

This project adheres to Semantic Versioning and provides a changelog in the Keep a Changelog format.

⚖️ License

This library is provided under the GNU General Public License v3.0. The Prodigal code was written by Doug Hyatt and is distributed under the terms of the GPLv3 as well. See vendor/Prodigal/LICENSE for more information. The giant virus and alternative genetic code virus parameters were created by Antônio Camargo.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original Prodigal authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.
