Pynteny: a Python package to perform synteny-aware, profile HMM-based searches in sequence databases #65

Closed
@Robaina

Submitting Author: Semidán Robaina (@Robaina)
Package Name: Pynteny: a Python package to perform synteny-aware, profile HMM-based searches in sequence databases
One-Line Description of Package: Query sequence databases with HMMs arranged in a predefined synteny structure
Repository Link (if existing): https://github.com/Robaina/Pynteny


Description

  • Include a brief paragraph describing what your package does:

Pynteny is a Python tool that searches for synteny blocks in (prokaryotic) sequence data, using HMMER and profile HMMs of the ORFs of interest. By leveraging genomic context information, Pynteny can reduce the uncertainty that paralogs introduce into the functional annotation of unlabelled sequence data. Pynteny can be accessed (i) through the command line, (ii) as a Python module or (iii) as a (locally served) web application.
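
To illustrate the idea of a synteny-aware search, here is a minimal sketch of matching an ordered, strand-specific arrangement of HMM hits on a single contig. It is not Pynteny's API or its actual algorithm: the `Hit` tuple, the `find_synteny_blocks` function, the `max_gap` parameter and the gene names are all hypothetical, and the hits are assumed to have already been produced by an HMM search and filtered to the HMMs of interest.

```python
from typing import List, Sequence, Tuple

# Hypothetical record for one HMM hit: (ORF index on the contig, strand, HMM name).
Hit = Tuple[int, str, str]


def find_synteny_blocks(
    hits: Sequence[Hit],
    pattern: Sequence[Tuple[str, str]],
    max_gap: int = 0,
) -> List[List[Hit]]:
    """Return runs of consecutive hits that match `pattern`.

    `pattern` is an ordered list of (strand, hmm_name) pairs. Two neighbouring
    members of a block may be separated by at most `max_gap` intervening ORFs.
    Only hits that are adjacent in the position-sorted hit list are compared,
    which keeps the sketch simple.
    """
    ordered = sorted(hits)  # sort by ORF index along the contig
    size = len(pattern)
    blocks: List[List[Hit]] = []
    for start in range(len(ordered) - size + 1):
        window = ordered[start:start + size]
        # Every hit must carry the expected strand and HMM name, in order.
        labels_ok = all(
            hit[1] == strand and hit[2] == name
            for hit, (strand, name) in zip(window, pattern)
        )
        # Count the ORFs lying between each pair of neighbouring hits.
        gaps_ok = all(
            right[0] - left[0] - 1 <= max_gap
            for left, right in zip(window, window[1:])
        )
        if labels_ok and gaps_ok:
            blocks.append(list(window))
    return blocks


# Toy example: only the first leuA/leuB pair is adjacent and on the same strand.
example_hits = [(1, "+", "leuA"), (2, "+", "leuB"), (5, "-", "leuA"), (9, "+", "leuB")]
example_blocks = find_synteny_blocks(example_hits, [("+", "leuA"), ("+", "leuB")])
```

In this toy example the isolated `leuA` and `leuB` hits at positions 5 and 9 are discarded, which is how genomic context can help filter out paralogs that match an HMM but sit outside the expected arrangement.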

Scope

  • Please indicate which category or categories this package falls under:

    • Data retrieval
    • Data extraction
    • Data munging
    • Data deposition
    • Data visualization
    • Reproducibility
    • Geospatial
    • Education
    • Unsure/Other (explain below)
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:

Pynteny's main objective is to provide a means to query unannotated NGS sequence databases, such as metagenomic or metatranscriptomic datasets, using syntenic blocks (i.e. spatial arrangements of genes) rather than single target genes or protein domains. In this sense, I would classify Pynteny within Data Extraction.

Pynteny can also be used in microbiology and genetics courses. To this end, it provides a graphical web interface (a Streamlit app) to facilitate interaction. We have successfully used Pynteny in some of our microbiology courses at the University of La Laguna. Hence, I think tagging Pynteny within "Education" may also be appropriate.

  • Who is the target audience and what are the scientific applications of this package?

Pynteny was designed to be used by researchers working with large, unannotated sequence databases, such as those typically encountered in metagenomic analyses. It can be accessed through a command line interface or easily integrated into pipelines as a Python package. Pynteny can also be used through a graphical interface running locally in the browser, which is more suitable for educational purposes.

  • Are there other Python packages that accomplish similar things? If so, how does yours differ?

To the best of my knowledge, no other Python package provides the functionality offered by Pynteny.

  • Any other questions or issues we should be aware of:

I submitted this package for publication at JOSS a few days back. The submission is currently under consideration for scope.
