Commit 25738f6

More reads.bam info
1 parent 69b9c62 commit 25738f6

4 files changed (+78 -40 lines changed)

docs/faq/mode-all.md

+1 -37
@@ -5,43 +5,7 @@ title: Mode --all
 ---
 
 # Process _all_ reads
-Have you ever run _ccs_ with different cutoffs, e.g. tuning `--min-rq` , because
-out of the fear of missing out on yield?
-Similar to the CLR instrument mode, in which subreads are accompanied by
-a scraps file, _ccs_ offers a new mode to never lose a single read due to
-filtering, without massive run time increase by polishing low-pass productive ZMWs.
-
-Starting with SMRT Link v10.0 and Sequel IIe, _ccs_ v5.0 or newer is able to generate
-one representative sequence per productive ZMW, irrespective of quality and passes.
-This ensures no yield loss due to filtering and enables users to have maximum
-control over their data. Never fear again that SMRT Link or the Sequel IIe
-HiFi mode filtered precious data.
-
-The default command-line behavior has not changed;
-it still generates only HiFi quality reads by default.
-But the new `--all` mode has been set as default when running the
-_Circular Consensensus Sequencing_ SMRT Link application or
-selecting the on-instrument Sequel IIe capabilities:
-<p align="left"><img width="500px" src="../img/run-design-oiccs.png"/></p>
-
-Output will be a `reads.bam` that contains:
-
-- HiFi Reads (≥Q20)
-- Lower-quality but still polished consensus reads (<Q20)
-- Unpolished consensus reads (`rq = -1`)
-- 0- or 1-pass subreads unaltered (`rq = -1`)
-
-If you want to only use HiFi reads, SMRT Link automatically generates additional
-files for your convenience that only contain HiFi reads:
-
-- hifi_reads.**fastq**.gz
-- hifi_reads.**fasta**.gz
-- hifi_reads.**bam**
-
-If you work with the `reads.bam` file directly, be aware that CCS reads of all
-qualities are present. This file needs to be understood before piping
-into your typical HiFi application.
-
+Output of `--all` is a `reads.bam` file, please see the [`reads.bam` FAQ](/faq/reads-bam) for more info!
 ## How does `--all` work?
 With the special option `--all`, _ccs_ generates one representative
 sequence per ZMW, irrespective of quality and passes.

docs/faq/reads-bam.md

+73
@@ -0,0 +1,73 @@
+---
+layout: default
+parent: FAQ
+title: reads.bam
+---
+
+# What is the `reads.bam`?
+Have you ever run _ccs_ with different cutoffs, e.g. tuning `--min-rq`, for
+fear of missing out on yield?
+Similar to the CLR instrument mode, in which subreads are accompanied by
+a scraps file, _ccs_ offers a new mode to never lose a single read due to
+filtering, without the massive run time increase of polishing low-pass productive ZMWs.
+
+Starting with SMRT Link v10.0 and Sequel IIe, _ccs_ v5.0 or newer is able to generate
+one representative sequence per productive ZMW, irrespective of quality and passes.
+This ensures no yield loss due to filtering and enables users to have maximum
+control over their data. Never fear again that SMRT Link or the Sequel IIe
+HiFi mode filtered precious data.
+
+**Attention:** If you work with the `reads.bam` file directly, be aware that CCS reads of all
+qualities are present. This file needs to be understood before piping
+into your typical HiFi application.
+
+## How to generate `reads.bam`?
+
+The default command-line behavior has not changed;
+it still generates only HiFi quality reads by default.
+But the new `--all` mode has been set as default when running the
+_Circular Consensus Sequencing_ SMRT Link application or
+selecting the on-instrument Sequel IIe capabilities:
+<p align="left"><img width="500px" src="../img/run-design-oiccs.png"/></p>
+
+## What is in the `reads.bam`?
+
+- HiFi Reads with predicted accuracy ≥Q20 (`rq ≥ 0.99`)
+- Lower-quality but still polished consensus reads with predicted accuracy <Q20 (`rq < 0.99`)
+- Unpolished consensus reads (`rq = -1`)
+- Partial or single full-length subreads unaltered (`rq = -1`)
+
+## How to get HiFi reads
+
+### SMRT Link
+
+If you want to only use HiFi reads, SMRT Link automatically generates additional
+files for your convenience that only contain HiFi reads:
+
+- hifi_reads.**fastq**.gz
+- hifi_reads.**fasta**.gz
+- hifi_reads.**bam**
+
+### Command line
+
+The following tools can be installed with
+
+    conda install -c bioconda tool_name
+
+#### extracthifi
+We provide a simple tool, called `extracthifi`, to generate a HiFi-only BAM from a `reads.bam` file. Usage is:
+
+    extracthifi reads.bam extracthifi.bam
+
+#### bamtools
+Alternatively, use `bamtools`:
+
+    bamtools filter -in reads.bam -out hifi_reads.bam -tag "rq":">=0.99"
+
+## FAQ: How can I filter by number of passes?
+
+We **strongly** advise against filtering by anything other than predicted accuracy,
+BAM tag `rq`. The `rq` tag is the best predictor for read quality. Number of
+passes is not reliable enough and you might discard too much data. This `np`
+tag is an implementation detail that is not guaranteed to be present in future
+_ccs_ versions.
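
The read categories in the new `reads.bam` FAQ hinge on the `rq` tag, where `rq ≥ 0.99` corresponds to Q20. As a minimal sketch of that arithmetic (not part of this commit), the mapping could look like this in Python. The function names and category labels are hypothetical, and the standard Phred relation Q = -10 * log10(1 - accuracy) is assumed.

```python
import math

# Q20 threshold quoted in the FAQ (rq >= 0.99).
HIFI_MIN_RQ = 0.99


def rq_to_phred(rq: float) -> float:
    """Convert a predicted accuracy (0 <= rq < 1) to a Phred-scaled Q value."""
    if not 0.0 <= rq < 1.0:
        raise ValueError(f"rq must be in [0, 1) for a Phred conversion, got {rq}")
    return -10.0 * math.log10(1.0 - rq)


def classify_by_rq(rq: float) -> str:
    """Map an rq value to one of the read categories listed in the FAQ.

    The labels returned here are illustrative, not identifiers used by ccs.
    """
    if rq == -1:
        # Unpolished consensus reads and unaltered subreads share rq = -1.
        return "unpolished_or_subread"
    if rq >= HIFI_MIN_RQ:
        return "hifi"
    if rq >= 0.0:
        return "polished_below_q20"
    raise ValueError(f"unexpected rq value: {rq}")


if __name__ == "__main__":
    for value in (0.999, 0.99, 0.95, -1):
        print(value, classify_by_rq(value))
```

Both `rq = -1` categories share the same tag value, so `rq` alone cannot tell an unpolished consensus read from an unaltered subread.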

docs/faq/sqiie.md

+3 -2
@@ -59,7 +59,7 @@ core [auxilliary files](/faq/reports-aux-files):
 The on-instrument _ccs_ version and also SMRT Link ≥v10 run in the `--all` mode
 by default. In this mode, _ccs_ outputs one representative sequence per
 productive ZMW, irrespective of quality and passes. More information
-[in the `--all` mode FAQ](/faq/mode-all).
+[in the `--all` mode FAQ](/faq/mode-all) and [in the `reads.bam` FAQ](/faq/reads-bam).
 
 ## Can you go back to `subreads.bam` from `reads.bam`?
 Not when operating the instrument in CCS mode. See next question.
@@ -115,4 +115,5 @@ the `reads.bam` files with your own tools.
 Yes, the SMRT Analysis pipeline "Export Reads" in SMRT Link v10.0 or newer can
 export HiFi reads to BAM/FASTA/FASTQ format; when adjusting minimum CCS
 predicted accuracy, you can include CCS reads <Q20. On the command line, tools
-can be used to filter the BAM file for the read quality `rq` tag.
+can be used to filter the BAM file for the read quality `rq` tag, please see
+the [`reads.bam` FAQ](/faq/reads-bam).
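
The command-line `rq` filtering mentioned in the `sqiie.md` change can also be done from Python. The sketch below is one possible approach, not something this commit provides. It assumes the third-party `pysam` package is installed. The helper names `is_hifi` and `write_hifi_bam` are hypothetical, and the 0.99 cutoff is the Q20 threshold from the `reads.bam` FAQ. `check_sq=False` is passed on the assumption that `reads.bam` is unaligned and therefore has no reference sequences in its header.

```python
from typing import Any

# Q20 threshold from the reads.bam FAQ (rq >= 0.99).
HIFI_MIN_RQ = 0.99


def is_hifi(read: Any, min_rq: float = HIFI_MIN_RQ) -> bool:
    """Return True if the read carries an rq tag at or above min_rq.

    `read` is expected to behave like pysam.AlignedSegment, i.e. to offer
    has_tag() and get_tag(). Reads with rq = -1 fail the comparison.
    """
    return read.has_tag("rq") and read.get_tag("rq") >= min_rq


def write_hifi_bam(in_path: str, out_path: str, min_rq: float = HIFI_MIN_RQ) -> int:
    """Copy reads passing is_hifi() from in_path to out_path; return the count kept."""
    import pysam  # assumption: pysam is installed; imported lazily on purpose

    kept = 0
    with pysam.AlignmentFile(in_path, "rb", check_sq=False) as bam_in, \
            pysam.AlignmentFile(out_path, "wb", template=bam_in) as bam_out:
        # until_eof=True reads the file sequentially, so no index is required.
        for read in bam_in.fetch(until_eof=True):
            if is_hifi(read, min_rq):
                bam_out.write(read)
                kept += 1
    return kept
```

A call such as `write_hifi_bam("reads.bam", "hifi_reads.bam")` would then mirror the `bamtools filter` command in the `reads.bam` FAQ. Keeping the predicate separate from the I/O makes the cutoff easy to check without a BAM file at hand.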

docs/index.md

+1 -1
@@ -36,7 +36,7 @@ Version **6.0.0**: [Full changelog here](/changelog)
 ## What's new!
 _ccs_ is now running on the Sequel IIe instrument, transferring HiFi reads
 directly off the instrument.\
-Read how _ccs_ works on [Sequel IIe](/faq/sqiie)!
+Read how _ccs_ works on [Sequel IIe](/faq/sqiie) and what is in the [`reads.bam` file](/faq/reads-bam)!
 
 ## Schematic Workflow
 <p align="center"><img width="1000px" src="img/generate-hifi.png"/></p>
