tximeta

Automatic metadata for RNA-seq

tximeta provides a set of functions for conveniently working with metadata for transcript quantification data in Bioconductor. The tximeta() function imports quantification data from Salmon or other quantifiers, and returns a SummarizedExperiment object. tximeta works natively with Salmon, alevin, or piscem-infer, but can easily be configured to work with any transcript quantification tool.

If tximeta() recognizes the reference transcripts used for quantification, it will automatically download relevant information about the location of the transcripts in the correct genome. These actions happen in the background without requiring any extra effort or information from the user.

This metadata is attached to the SummarizedExperiment in the metadata() and rowRanges() slots.

For a list of the reference transcriptomes supported by tximeta(), see the "Pre-computed digests" section of the vignette in the Get started tab. We call the computed identifier for the reference transcriptome a "digest" or sometimes a "checksum".

tximeta also facilitates downstream steps such as summarizeToGene() and addIds(). It can also retrieve the transcripts used for quantification with retrieveCDNA(), or the correct TxDb or EnsDb to match the quantification data with retrieveDb().

How it works

The key idea behind tximeta is that Salmon, alevin, and piscem-infer each propagate a hash value summarizing the reference transcripts into every quantification directory they output. tximeta can be used with other tools as long as the hash of the transcripts is also included in their output directories. See the customMetaInfo argument of tximeta() for more details.
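The lookup described above can be illustrated with a short, language-neutral sketch. It is written in Python purely for illustration and is not tximeta's own code (tximeta is an R package). The file location `aux_info/meta_info.json`, the key `index_seq_hash`, the hashing scheme, and the `ToyDB` table are all assumptions made for this example; the helper names are hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def digest_of_sequences(sequences):
    """Return a SHA-256 hex digest over the concatenated sequences.

    Assumption: a simple concatenation scheme, chosen only for illustration.
    """
    h = hashlib.sha256()
    for seq in sequences:
        h.update(seq.encode("ascii"))
    return h.hexdigest()


def read_digest(quant_dir, meta_path="aux_info/meta_info.json",
                key="index_seq_hash"):
    """Read the stored digest from a quantification directory.

    Assumption: the quantifier wrote a JSON file at `meta_path` holding `key`.
    """
    with open(Path(quant_dir) / meta_path) as fh:
        return json.load(fh)[key]


def identify_reference(quant_dir, known_digests):
    """Look the stored digest up in a table of known reference transcriptomes.

    Returns the matching metadata, or None if the digest is not recognized.
    """
    return known_digests.get(read_digest(quant_dir))


# Hypothetical toy reference and a one-entry table of known digests.
toy_transcripts = ["ACGTACGT", "GGCATT"]
toy_digest = digest_of_sequences(toy_transcripts)
known_digests = {toy_digest: {"source": "ToyDB", "release": "1"}}

with tempfile.TemporaryDirectory() as tmp:
    # Mimic a quantification directory that carries the digest in its metadata.
    quant_dir = Path(tmp) / "sample1"
    (quant_dir / "aux_info").mkdir(parents=True)
    meta_file = quant_dir / "aux_info" / "meta_info.json"
    meta_file.write_text(json.dumps({"index_seq_hash": toy_digest}))

    match = identify_reference(quant_dir, known_digests)  # recognized digest
    miss = identify_reference(quant_dir, {})              # empty lookup table

print(match)
print(miss)
```

Because the digest travels with each output directory, the reference can be identified from the quantification output alone, with no extra input from the user. An unrecognized digest simply yields no match.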

Reference

The reference for the tximeta package is:

Michael I. Love, Charlotte Soneson, Peter F. Hickey, Lisa K. Johnson, N. Tessa Pierce, Lori Shepherd, Martin Morgan, Rob Patro. "Tximeta: reference sequence checksums for provenance identification in RNA-seq" PLOS Computational Biology (2020) doi: 10.1371/journal.pcbi.1007664

Feedback

We would love to hear your feedback. For help with software usage, please post to the Bioconductor support site. For software development questions, please open an Issue on GitHub.

Funding

tximeta was developed as part of NIH NHGRI R01-HG009937.

tximeta was also supported by the Chan Zuckerberg Initiative as part of the EOSS grants.