
Commit

docs: Better documentation on reclustering
adrientaudiere committed Nov 4, 2024
1 parent a29cd5c commit a974360
Showing 1 changed file with 5 additions and 3 deletions.
8 changes: 5 additions & 3 deletions vignettes/articles/Reclustering.Rmd
@@ -23,11 +23,13 @@ library(MiscMetabar)

# Re-clustering ASVs

-**ASV** (stands for *Amplicon Sequence Variant*; also called **ESV** for Exact Sequence Variant) is a DNA sequence obtained from high-throughput analysis of marker genes. **OTU** are a group of closely related individuals created by clustering sequences based on a threshold of similarity. An ASV is a special case of an OTU with a similarity threshold of 100%. A third concept is the zero-radius OTU **zOTU** [@edgar2016], which is the same concept as ASV but computed with software other than [dada2](https://github.com/benjjneb/dada2) (e.g. [vsearch](https://github.com/torognes/vsearch)).
+**ASV** (stands for *Amplicon Sequence Variant*; also called **ESV** for Exact Sequence Variant) is a DNA sequence obtained from high-throughput analysis of marker genes. **OTU** (stands for *Operational Taxonomic Unit*) is a group of closely related individuals created by clustering sequences based on a threshold of similarity. An ASV is a special case of an OTU with a similarity threshold of 100%. A third concept is the zero-radius OTU **zOTU** [@edgar2016], which is the same concept as ASV but computed with software other than [dada2](https://github.com/benjjneb/dada2) (e.g. [vsearch](https://github.com/torognes/vsearch)).

-The choice between ASV and OTU is important because they lead to different results (@joos2020, Box 2 in @tedersoo2022, @chiarello2022). Most articles recommend making a choice depending on the question [@mclaren2018]. For example, ASV may be better than OTU for describing a group of very closely related species. In addition, ASV are comparable across different datasets (obtained using identical marker genes). On the other hand, [@tedersoo2022] report that ASV approaches overestimate the richness of common fungal species (due to haplotype variation), but underestimate the richness of rare species. They therefore recommend the use of OTUs in metabarcoding analyses of fungal communities. Finally, [@kauserud2023] argues that the ASV term falls within the original OTU term and recommends adopting only the OTU term, but with a concise and clear report on how the OTUs were generated.
+The choice between ASV and OTU is important because they lead to different results (@joos2020, Box 2 in @tedersoo2022, @chiarello2022). Most articles recommend making a choice depending on the question [@mclaren2018]. For example, ASV may be better than OTU for describing a group of very closely related species. In addition, ASV are comparable across different datasets (obtained using identical marker genes). [@fasolo2024] showed that OTU clustering of 16S rDNA proportionally led to a marked underestimation of the ecological indicator values for species diversity and to a distorted behaviour of the dominance and evenness indexes with respect to the direct use of the ASV data. On the other hand, [@tedersoo2022] report that ASV approaches overestimate the richness of common fungal species (due to haplotype variation), but underestimate the richness of rare species. They recommend the OTU approach in metabarcoding analyses of fungal communities. Finally, [@kauserud2023] argues that the ASV term falls within the original OTU term and recommends adopting only the OTU term, but with a concise and clear report on how the OTUs were generated.

-Recent articles [@forster2019; @antich2021; @brandt2021] propose to use both approach together. They recommend (i) using ASV to denoise the dataset and (ii) for some questions, clustering the ASV sequences into OTUs. [@garcia2019] used both concepts to demonstrate that ecotypes (ASV within OTUs) are adapted to different values of environmental factors, favoring the persistence of OTU across changing environmental conditions. This is the goal of the function `asv2otu()`, using either the `DECIPHER::Clusterize` function from R or the [vsearch](https://github.com/torognes/vsearch) software.
+Recent articles [@forster2019; @antich2021; @brandt2021] propose to use both approaches together. They recommend (i) using ASV to denoise the dataset and (ii) for some questions, clustering the ASV sequences into OTUs. [@garcia2019] used both concepts to demonstrate that ecotypes (ASV within OTUs) are adapted to different values of environmental factors, favoring the persistence of OTU across changing environmental conditions.
+
+The goal of the function `asv2otu()` is to facilitate the reclustering of ASV into OTU, using either the `DECIPHER::Clusterize` function from R or the [vsearch](https://github.com/torognes/vsearch) software.

## Using the DECIPHER or vsearch algorithm
```{r}