Sargasso disambiguates mixed-species high-throughput sequencing data.

sidbdri/Sargasso

Sargasso

Sargasso is a Python tool to disambiguate mixed-species high-throughput sequencing reads according to their species of origin. Given a set of samples containing sequencing data from multiple species, mapped, disambiguated reads are written to per-sample and species-specific output BAM files.

The latest Sargasso documentation can be found here.

Installation

Note: as Sargasso has a number of dependencies on other Python packages, it is strongly recommended to install in an isolated environment using the virtualenv tool. The virtualenvwrapper tool makes managing multiple virtual environments easier.

After setting up virtualenv and virtualenvwrapper, create and work in a virtual environment for Sargasso using the virtualenvwrapper tool:

mkproject sargasso

Then install the Sargasso package and its Python package dependencies into the virtual environment by running:

pip install git+https://github.com/statbio/sargasso.git

Note that Sargasso v2.0 should work with Python 2.6 or later, including Python 3. Versions before v1.2.2 work only with Python 2.
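
As a rough illustration of the compatibility note, the sketch below checks whether a given Sargasso release is expected to run under Python 3. The helper names `parse_version` and `supports_python3` are hypothetical and are not part of Sargasso. The v1.2.2 cut-off is taken from the note and the changelog entry for that release.

```python
import sys

# First release the changelog describes as compatible with Python 3.
# (Assumption: every later release keeps that compatibility.)
FIRST_PY3_RELEASE = (1, 2, 2)


def parse_version(version):
    """Turn a version string such as 'v2.0' into a tuple such as (2, 0, 0)."""
    parts = [int(p) for p in version.lstrip("v").split(".")]
    # Pad to three components so that '2.0' compares as (2, 0, 0).
    return tuple(parts + [0] * (3 - len(parts)))


def supports_python3(sargasso_version):
    """Return True if this Sargasso release is expected to run on Python 3."""
    return parse_version(sargasso_version) >= FIRST_PY3_RELEASE


if __name__ == "__main__":
    running_py3 = sys.version_info[0] >= 3
    for release in ("1.2.1", "1.2.2", "2.0.2"):
        ok = supports_python3(release) or not running_py3
        print(release, "expected to work here" if ok else "needs Python 2")
```

Run under the interpreter you plan to install into. Any release reported as needing Python 2 would have to go in a Python 2 virtual environment.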

Citation

If you make use of Sargasso please cite our protocol paper:

  • Qiu et al., "Mixed-species RNA-seq for elucidation of non-cell-autonomous control of gene transcription", Nature Protocols 13, 2176–2199 (2018).

Changelog

  • 2.0.2 (19/08/2019): Bugfix release to correctly handle single-end reads with Bowtie2.
  • 2.0.1 (06/06/2019): Bugfix release to correctly handle Bowtie2 mismatch count.
  • 2.0 (16/01/2019): Sargasso now separates reads derived from DNA-based sequencing technologies (for example, ChIP-seq and ATAC-seq), in addition to RNA-seq reads.
  • 1.2.2 (11/10/2018): Bugfix release for compatibility with Python 3.
  • 1.2.1 (02/10/2018): Bugfix release for incompatibilities between Mac OS and Linux.
  • 1.2 (16/02/2018):
    • Improvements to species read assignment logic give better precision and recall.
    • Added --delete-intermediate option to delete intermediate BAM files.
    • Added --star-executable option to allow different versions of STAR to be used.
    • Added --sambamba-sort-tmp-dir option to specify a different temporary directory for 'sambamba sort'.
  • 1.1.2 (14/12/2017): Minor improvements to interpretability of results.
  • 1.1.1 (02/03/2017): Add "permissive" filtering strategy.
  • 1.1 (26/01/2017): Filtering of RNA-seq data from more than two species.
  • 1.0 (16/12/2016): First full release.
