coelsch: Crossover estimation, localisation and single cell haplotyping

(formerly known as snco)

coelsch is a set of tools for identifying recombination events at the single nucleus/cell level, using read alignments and/or SNP matrices from single nucleus sequencing experiments of recombinant haploid gametes. Read alignments or SNPs are summarised into a per-cell-barcode "marker" dataset, which can then be used to perform crossover analysis with a hidden Markov model. coelsch also provides commands for generating per-barcode summary statistics, plotting marker and crossover profiles, identifying segregation distortions, and simulating ground truth datasets using marker distributions from real data.
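To give a feel for how a hidden Markov model can turn per-barcode marker counts into crossover calls, here is a minimal sketch of a two-state Viterbi decoder written with numpy only. It is an illustration under simplified assumptions, not the model coelsch actually uses: the function name `viterbi_haplotypes`, the fixed `error_rate` and `switch_prob` values, and the toy counts are all hypothetical.

```python
import numpy as np

def viterbi_haplotypes(hap1_counts, hap2_counts, error_rate=0.05, switch_prob=0.01):
    """Most likely haplotype state (0 = haplotype 1, 1 = haplotype 2) per bin.

    Illustrative two-state HMM for one haploid nucleus; parameters are assumptions.
    """
    hap1 = np.asarray(hap1_counts, dtype=float)
    hap2 = np.asarray(hap2_counts, dtype=float)
    n_bins = len(hap1)

    # Emission log-likelihoods per bin and state. The binomial coefficient is
    # identical for both states, so it is dropped.
    log_emit = np.stack([
        hap1 * np.log(1 - error_rate) + hap2 * np.log(error_rate),  # state 0
        hap1 * np.log(error_rate) + hap2 * np.log(1 - error_rate),  # state 1
    ], axis=1)
    # log_trans[prev, cur]: a state change between adjacent bins is a crossover
    log_trans = np.log([[1 - switch_prob, switch_prob],
                        [switch_prob, 1 - switch_prob]])

    score = np.empty((n_bins, 2))
    backptr = np.zeros((n_bins, 2), dtype=int)
    score[0] = np.log(0.5) + log_emit[0]
    for i in range(1, n_bins):
        cand = score[i - 1][:, None] + log_trans  # cand[prev, cur]
        backptr[i] = cand.argmax(axis=0)
        score[i] = cand.max(axis=0) + log_emit[i]

    # Trace back the best path from the last bin
    path = np.empty(n_bins, dtype=int)
    path[-1] = score[-1].argmax()
    for i in range(n_bins - 1, 0, -1):
        path[i - 1] = backptr[i, path[i]]
    return path

# Toy marker counts for one barcode along one chromosome (bin 3 has no reads)
hap1 = [5, 4, 6, 0, 0, 1, 0, 0]
hap2 = [0, 0, 1, 0, 5, 4, 6, 3]
path = viterbi_haplotypes(hap1, hap2)
n_crossovers = int(np.count_nonzero(np.diff(path)))
print(path, n_crossovers)
```

In this toy example the decoder tolerates the occasional mismatching read and calls a single haplotype switch; the uninformative bin in the middle is why real crossover positions are usually reported as intervals rather than points.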

coelsch is still under active development, so expect some bugs and changes! If in doubt about the behaviour of coelsch or how it might change, feel free to reach out by opening an issue!

Installation:

The easiest way to install coelsch is using the conda yaml provided:

git clone https://github.com/schneebergerlab/coelsch.git
cd coelsch
conda env create -f coelsch.yml

Alternatively, you can install it using pip:

pip install git+https://github.com/schneebergerlab/coelsch.git

coelsch requires torch and pomegranate>=1.0 for performing crossover detection; pysam and joblib for bam file manipulation; and numpy, pandas, scipy and matplotlib for general marker analysis and plotting. The command line interface is built using click. For some of the provided helper scripts, parasail is also required.
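If an installation misbehaves, a quick way to see which of the dependencies listed above are visible to the current Python environment is to look them up with the standard library. This is only a convenience sketch: the helper `missing_modules` is hypothetical, and the module names are assumed to match the package names given above.

```python
from importlib.util import find_spec

# Import names assumed to match the packages named in the text above
REQUIRED = ["torch", "pomegranate", "pysam", "joblib",
            "numpy", "pandas", "scipy", "matplotlib", "click"]
OPTIONAL = ["parasail"]  # only needed for some helper scripts

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]

print("missing required:", missing_modules(REQUIRED))
print("missing optional:", missing_modules(OPTIONAL))
```

Note that this only checks that each module can be located; it does not verify versions such as pomegranate>=1.0.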

Quickstart:

coelsch divides the analysis into several subcommands for modular and flexible workflows. There are currently two different methods provided for loading markers into the format used for crossover prediction. These are:

coelsch loadbam: read a bam file whose alignments carry cell barcode and haplotype tags. The best way to generate this is using the coelsch_mapping_pipeline, a haplotype and single-cell aware alignment pipeline using STAR consensus.
coelsch loadcsl: read a matrix market + vcf file describing biallelic SNP counts for individual cells, generated by cellsnp-lite. The reference allele is assumed to derive from haplotype 1, and the alternative allele from haplotype 2.
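The reference/alternative convention used for cellsnp-lite input can be illustrated with a small sketch. It assumes the usual cellsnp-lite layout, in which the AD matrix holds alternative allele depths and the DP matrix holds reference plus alternative depths, both as SNPs x cells; the function `haplotype_counts` and the toy matrices are hypothetical and are not part of coelsch.

```python
import numpy as np
from scipy import sparse

def haplotype_counts(ad, dp):
    """Split SNPs x cells count matrices into per-haplotype counts.

    ad: ALT allele depth, dp: REF + ALT depth (assumed cellsnp-lite convention).
    """
    ad = sparse.csr_matrix(ad)
    dp = sparse.csr_matrix(dp)
    hap1 = dp - ad  # reference allele reads are attributed to haplotype 1
    hap2 = ad       # alternative allele reads are attributed to haplotype 2
    return hap1, hap2

# Toy data: 4 SNPs x 2 cells. Real matrices could be read with scipy.io.mmread.
dp = np.array([[3, 0], [2, 1], [0, 4], [5, 2]])
ad = np.array([[0, 0], [0, 1], [0, 4], [1, 2]])
hap1, hap2 = haplotype_counts(ad, dp)

# Fraction of each cell's informative reads that support haplotype 2
hap2_per_cell = np.asarray(hap2.sum(axis=0)).ravel()
depth_per_cell = dp.sum(axis=0)
hap2_fraction = hap2_per_cell / depth_per_cell
print(hap1.toarray())
print(hap2_fraction)
```

Here the first cell is mostly haplotype 1 and the second is entirely haplotype 2 across these four SNPs. If the reference genome does not correspond to haplotype 1 in a given cross, the two count matrices would need to be swapped.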

The output of both load commands is a json file containing marker information, which can be fed into the downstream clean, predict, stats and plot commands.

If you would like to perform the full analysis in a single step, then the core analysis pipeline, load(+clean)+predict, can be run together as a single command using coelsch bam2pred or coelsch csl2pred.

Python API

coelsch also provides an API with some helpful classes for working with the output marker and prediction datasets in Python. These are the MarkerRecords and PredictionRecords classes:

from coelsch import MarkerRecords, PredictionRecords

co_markers = MarkerRecords.read_json('markers.json')
co_preds = PredictionRecords.read_json('pred.json')

co_markers.barcodes[:5]

    ['TGGTTAGGTAGATTGA',
     'ACGTAGTTCATCAGTG',
     'GACCAATCAACAAAGT',
     'CTATAGGGTTACCTTT',
     'AGAGAATCAGACAATA']

Plotting functions can be accessed from the command line using coelsch plot, or in Python using the coelsch.plot module or the built-in methods of MarkerRecords/PredictionRecords:

co_markers.plot_barcode('TGGTTAGGTAGATTGA', co_preds=co_preds, max_yheight=10);
