taxpasta: TAXonomic Profile Aggregation and STAndardisation #84

Closed
Submitting Author: Moritz E. Beber (@Midnighter)
All current maintainers: (@Midnighter, @sofstam, @jfy133)
Package Name: taxpasta
One-Line Description of Package: TAXonomic Profile Aggregation and STAndardisation
Repository Link: https://github.com/taxprofiler/taxpasta
Version submitted: 0.2.1
Editor: @ctb
Reviewer 1: @snacktavish
Reviewer 2: @bluegenes
Archive: https://github.com/taxprofiler/taxpasta/releases/tag/0.4.0
JOSS DOI: DOI
Version accepted: 0.4.0
Date accepted (month/day/year): 07/05/2023


Code of Conduct & Commitment to Maintain Package

Description

The main purpose of taxpasta is to standardise taxonomic profiles created by a
range of bioinformatics tools. We call those tools taxonomic profilers. They
each come with their own particular, tabular output format. Across the profilers,
relative abundances can be reported as read counts, fractions, or percentages,
and the output can include any number of additional columns with extra information. We therefore
decided to take the lessons learnt to heart and provide
our own solution to deal with this pasticcio. With taxpasta you can ingest all
of those formats and, at a minimum, output taxonomy identifiers and their
integer counts.
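
As a rough illustration of what "standardising" means here, the following minimal sketch reduces a made-up profile to taxonomy identifiers and integer counts. It is not taxpasta's implementation and does not use its API: the input layout, the column names (`percent`, `reads`, `taxid`, `name`), and the function `standardise_profile` are hypothetical and chosen only for this example.

```python
import csv
import io

# Hypothetical profiler output: tab-separated, with extra columns that a
# standardised profile does not need. This is NOT a real profiler format.
RAW_PROFILE = (
    "percent\treads\ttaxid\tname\n"
    "60.0\t120\t562\tEscherichia coli\n"
    "40.0\t80\t1280\tStaphylococcus aureus\n"
)


def standardise_profile(text):
    """Reduce a hypothetical profile to (taxonomy_id, count) pairs."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    # Keep only the taxonomy identifier and its integer read count.
    return [(row["taxid"], int(row["reads"])) for row in reader]


standardised = standardise_profile(RAW_PROFILE)
for taxonomy_id, count in standardised:
    print(f"{taxonomy_id}\t{count}")
```

Real profilers differ in which columns they provide and in how abundances are expressed, which is why a per-profiler reader is needed in practice.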

Taxpasta can not only standardise profiles but also merge them across samples
for the same profiler into a single table. In future, we also intend to offer
methods for forming a consensus for the same sample analyzed by different
profilers.
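
Merging across samples can be pictured as building one wide table with a row per taxonomy identifier and a column per sample. The sketch below assumes the profiles are already standardised; the function `merge_profiles`, the sample names, and the choice to fill missing taxa with 0 are assumptions made for this example, not a description of taxpasta's behaviour.

```python
def merge_profiles(profiles):
    """Merge {sample: {taxonomy_id: count}} into a wide table.

    Returns (header, rows). Taxa absent from a sample get a count of 0,
    which is an assumption of this sketch.
    """
    samples = sorted(profiles)
    # Union of all taxonomy identifiers seen in any sample.
    taxa = sorted({tax for counts in profiles.values() for tax in counts})
    header = ["taxonomy_id", *samples]
    rows = [[tax, *(profiles[s].get(tax, 0) for s in samples)] for tax in taxa]
    return header, rows


# Hypothetical standardised profiles from one profiler, two samples.
PROFILES = {
    "sample_a": {"562": 120, "1280": 80},
    "sample_b": {"562": 30, "287": 70},
}

header, rows = merge_profiles(PROFILES)
print("\t".join(header))
for row in rows:
    print("\t".join(str(value) for value in row))
```

Identifiers are sorted as strings here purely to keep the output deterministic.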

Scope

  • Please indicate which category or categories.
    Check out our package scope page to learn more about our
    scope. (If you are unsure of which category you fit, we suggest you make a pre-submission inquiry):

    • Data retrieval
    • Data extraction
    • Data processing/munging
    • Data deposition
    • Data validation and testing
    • Data visualization **
    • Workflow automation
    • Citation management and bibliometrics
    • Scientific software wrappers
    • Database interoperability

Domain Specific & Community Partnerships

- [ ] Geospatial
- [ ] Education
- [ ] Pangeo

Community Partnerships

If your package is associated with an
existing community please check below:

** Please fill out a pre-submission inquiry before submitting a data visualization package.

  • For all submissions, explain how and why the package falls under the categories you indicated above. In your explanation, please address the following points (briefly, 1-2 sentences for each):

    • Who is the target audience and what are scientific applications of this package?

      Taxpasta is a tool for anyone working with taxonomic profiles from metagenomic sequencing experiments. Mostly that means ecologists, bioinformaticians, and statisticians. Taxpasta's main application is to standardise profiles from a range of different tools. Having a single format facilitates downstream analyses. Taxpasta is used, for example, in the upcoming taxprofiler pipeline implemented in Nextflow. There, it also serves to combine the profiles of many samples into a single file.

    • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

      The BIOM format was created with the intention of standardizing a storage format for microbiome analyses. However, creating files in this format was left entirely to the user. Taxpasta conveniently knows how to read profiles from a range of tools and can also produce BIOM output.

      Some of the taxonomic profilers also come with scripts to convert their output into another format but none of them support such a wide range of tools as taxpasta does.

    • If you made a pre-submission enquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted:

Technical checks

For details about the pyOpenSci packaging requirements, see our packaging guide. Confirm each of the following by checking the box. This package:

  • does not violate the Terms of Service of any service it interacts with.
  • uses an OSI approved license.
  • contains a README with instructions for installing the development version (development version is described in CONTRIBUTING.rst).
  • includes documentation with examples for all functions.
  • contains a tutorial with examples of its essential functions and uses.
  • has a test suite.
  • has continuous integration set up, such as GitHub Actions, CircleCI, and/or others.

Publication Options

JOSS Checks
  • The package has an obvious research application according to JOSS's definition in their submission requirements. Be aware that completing the pyOpenSci review process does not guarantee acceptance to JOSS. Be sure to read their submission requirements (linked above) if you are interested in submitting to JOSS.
  • The package is not a "minor utility" as defined by JOSS's submission requirements: "Minor ‘utility’ packages, including ‘thin’ API clients, are not acceptable." pyOpenSci welcomes these packages under "Data Retrieval", but JOSS has slightly different criteria.
  • The package contains a paper.md matching JOSS's requirements with a high-level description in the package root or in inst/.
  • The package is deposited in a long-term repository with the DOI:

Note: JOSS accepts our review as theirs. You will NOT need to go through another full review. JOSS will only review your paper.md file. Be sure to link to this pyOpenSci issue when a JOSS issue is opened for your package. Also be sure to tell the JOSS editor that this is a pyOpenSci reviewed package once you reach this step.

Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

This option will allow reviewers to open smaller issues that can then be linked to PRs rather than submitting a more dense text-based review. It will also allow you to demonstrate addressing the issue via PR links.

  • Yes I am OK with reviewers submitting requested changes as issues to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following by checking the box.

  • I have read the author guide.
  • I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

Please fill out our survey

P.S. Have feedback/comments about our review process? Leave a comment here

Editor and Review Templates

The [editor template can be found here][Editor Template].

The [review template can be found here][Review Template].

Status

joss-accepted
