Using Deepbinner with Albacore

Ryan Wick edited this page Aug 28, 2018 · 3 revisions

Sometimes you might want to demultiplex with both Albacore and Deepbinner, keeping only the reads where both tools agree on the bin. This approach should give very high-precision demultiplexing, but at the cost of recall: you won't bin as many reads as with Deepbinner alone, but the misclassification rate will be very low.

This Bash code outlines one way to do this – first demultiplexing fast5s with Deepbinner, then basecalling each bin separately with Albacore:

# Run Deepbinner classification (possibly in real time during sequencing)
deepbinner realtime --in_dir fast5s --out_dir demultiplexed_fast5s --native

mkdir -p demultiplexed_fastqs  # -p: don't fail if the directory already exists

# Loop through each of Deepbinner's bin directories of fast5s.
for dir in demultiplexed_fast5s/*/; do
    b=$(basename "$dir")
    [ "$b" = "unclassified" ] && continue  # skip the 'unclassified' bin

    # Basecall with Albacore (change settings as appropriate to suit your needs).
    albacore_in=demultiplexed_fast5s/"$b"
    albacore_out=albacore_"$b"
    read_fast5_basecaller.py -f FLO-MIN106 -k SQK-LSK108 -i "$albacore_in" -t 16 --barcoding -s "$albacore_out" -o fastq --disable_filtering

    # Gzip the reads which Albacore put in the matching bin.
    cat "$albacore_out"/workspace/"$b"/*.fastq | gzip > demultiplexed_fastqs/"$b".fastq.gz

    # Delete Albacore's uncompressed FASTQ files (to save space).
    rm "$albacore_out"/workspace/*/*.fastq
done
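
Since this approach trades recall for precision, it can be useful to see how many reads each bin lost. The following is a minimal sketch, not part of Deepbinner or Albacore: the helper names `count_fastq_reads` and `count_fast5_files` are hypothetical, and it assumes four-line FASTQ records and single-read fast5 files (one file per read). For each bin it prints the bin name, the number of reads Deepbinner placed there, and the number of reads on which both tools agreed:

```shell
# Hypothetical helper: count reads in a gzipped FASTQ file.
# Assumes the standard four-line FASTQ record layout.
count_fastq_reads() {
    echo $(( $(gzip -cd "$1" | wc -l) / 4 ))
}

# Hypothetical helper: count fast5 files under a directory.
# Assumes single-read fast5 files, i.e. one file per read.
count_fast5_files() {
    echo $(( $(find "$1" -type f -name '*.fast5' | wc -l) ))
}

# For each bin, print: bin name, reads binned by Deepbinner, reads kept by both tools.
for f in demultiplexed_fastqs/*.fastq.gz; do
    [ -e "$f" ] || continue  # nothing to report if no output files exist
    b=$(basename "$f" .fastq.gz)
    printf '%s\t%s\t%s\n' "$b" \
        "$(count_fast5_files demultiplexed_fast5s/"$b")" \
        "$(count_fastq_reads "$f")"
done
```

The third column should never exceed the second. A large gap between the two for a given bin suggests that Albacore and Deepbinner disagreed on many of that bin's reads.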

For other scripts which use Deepbinner, including its use with Guppy, you might want to check out this repo:
https://github.com/rrwick/MinION-desktop