microbiome · antagomir · Jul 30, 2023 · Feb 19, 2023 · Mar 8, 2023 · Mar 10, 2023
diff --git a/01_intro.Rmd b/01_intro.Rmd
@@ -7,8 +7,7 @@ library(rebook)
 chapterPreamble()
 ```
 
-This work - [**Orchestrating Microbiome Analysis with R and
-Bioconductor**](https://microbiome.github.io/OMA/) [@OMA] - contributes novel
+This work - [**Orchestrating Microbiome Analysis with Bioconductor**](https://microbiome.github.io/OMA/) [@OMA] - contributes novel
 methods and educational resources for microbiome data science.  It
 aims to teach the grammar of Bioconductor workflows in the context of
 microbiome data science. We show through concrete examples how to use

diff --git a/04_containers.Rmd b/04_containers.Rmd
@@ -33,16 +33,16 @@ sequencing.
 information (such as phylogenetic trees and sample hierarchies) and
 reference sequences.
 
-[`MultiAssayExperiment`] (`MAE`) [@Ramos2017] provides an organized way to bind several different data
-structures together in a single object. For example, we can bind
-microbiome data (in `TreeSE` format) with metabolomic profiling data
-(in `SE`) format, with shared sample metadata. This is convenient and
-robust for instance in subsetting and other data manipulation
-tasks. Microbiome data can be part of multiomics experiments and
-analysis strategies and we want to outline the understanding in which
-we think the packages explained and used in this book relate to these
-experiment layouts using the `TreeSummarizedExperiment` and classes
-beyond.
+[`MultiAssayExperiment`] (`MAE`) [@Ramos2017] provides an organized
+way to bind several different data containers together in a single
+object. For example, we can bind microbiome data (in a `TreeSE`
+container) with metabolomic profiling data (in an `SE` container), with
+(partially) shared sample metadata. This is convenient and robust,
+for instance, in subsetting and other data manipulation tasks. Microbiome
+data can be part of multiomics experiments and analysis strategies. We
+highlight how the methods used throughout this book relate to this
+data framework by using the `TreeSummarizedExperiment`,
+`MultiAssayExperiment`, and classes beyond.
 
 This section provides an introductions to these data containers. In
 microbiome data science, these containers link taxonomic abundance
@@ -282,7 +282,7 @@ GlobalPatterns
 [HintikkaXOData](https://microbiome.github.io/microbiomeDataSets/reference/HintikkaXOData.html)
 is derived from a study about the effects of fat diet and prebiotics on the
 microbiome of rat models [@Hintikka2021]. It is available in the MAE data
-container for R. The dataset is briefly presented in
+container for R. The dataset is briefly summarized in
 [these slides](https://microbiome.github.io/outreach/hintikkaxo_presentation.html).
 
 
@@ -654,7 +654,7 @@ abundance table is named as "counts".  Let us inspect only the first
 cols and rows.
 
 ```{r}
-assays(se)$counts[1:3, 1:3]
+assay(se, "counts")[1:3, 1:3]
 ```
 
 The `rowdata` includes taxonomic information from the biom file. The `head()` command

diff --git a/06_packages.Rmd b/06_packages.Rmd
@@ -11,12 +11,11 @@ chapterPreamble()
 The Bioconductor microbiome data science framework consists of:
 
 - **data containers**, designed to organize multi-assay microbiome data
-- **R packages** that provide dedicated methods for analysing such data
+- **R/Bioconductor packages** that provide dedicated methods 
 - **community** of users and developers 
 
 <img src="general/figures/ecosystem.png" width="100" alt="mia logo" align="right" style="margin: 0 1em 0 1em" />
 
-
 This section provides an overview of the package ecosystem. Section
 \@ref(example-data) links to various open microbiome data resources
 that support this framework.
@@ -63,50 +62,47 @@ devtools::install_github("microbiome/mia")
 
 ## Package ecosystem {#ecosystem}
 
-Methods for the analysis and manipulation of
-`(Tree)SummarizedExperiment` and `MultiAssayExperiment` data
-containers are available through a number of R packages. Some of these
-are listed below. If you know more tips on such packages, data
-sources, or other resources, kindly [let us
-know](https://microbiome.github.io) through the issues, pull requests,
-or online channels.
+Methods for `(Tree)SummarizedExperiment` and `MultiAssayExperiment`
+data containers are provided by multiple independent developers
+through R/Bioconductor packages. Some of these are listed below (tips
+on new packages are [welcome](https://microbiome.github.io)).
+
 
+### mia package family
 
-### mia family of methods
+The mia package family provides general methods for microbiome data wrangling, analysis and visualization. 
 
 - [mia](https://microbiome.github.io/mia/): Microbiome analysis tools [@R_mia]
 - [miaViz](https://microbiome.github.io/miaViz/): Microbiome analysis specific visualization [@Ernst2022]
 - [miaSim](https://microbiome.github.io/miaSim/): Microbiome data simulations [@Simsek2021]
 - [miaTime](https://microbiome.github.io/miaTime/): Microbiome time series analysis [@Lahti2021]
 
 
-### Tree-based methods {#sub-tree-methods}
-
-- [philr](http://bioconductor.org/packages/devel/bioc/html/philr.html) (@Silverman2017)
-
-
 ### Differential abundance {#sub-diff-abund}
 
+The following differential abundance (DA) methods support `(Tree)SummarizedExperiment`.
+
 - [ANCOMBC](https://bioconductor.org/packages/devel/bioc/html/ANCOMBC.html) for differential abundance analysis
 - [benchdamic](https://bioconductor.org/packages/release/bioc/vignettes/benchdamic/inst/doc/intro.html) for benchmarking differential abundance methods
-- [LinDA](https://cran.r-project.org/web/packages/MicrobiomeStat/) for differential abundance analysis
-- [ZicoSeq](https://cran.r-project.org/web/packages/GUniFrac/) for differential abundance analysis
 - [ALDEx2](https://www.bioconductor.org/packages/release/bioc/html/ALDEx2.html) for differential abundance analysis
-- [phyloseq](https://www.bioconductor.org/packages/release/bioc/html/phyloseq.html) for data preparation into phyloseq format for differential abundance analysis, such as ANCOMBC requires the input data is phyloseq format
-
 
 
-### Manipulation {#sub-manipulation}
-
-- [MicrobiotaProcess](https://bioconductor.org/packages/release/bioc/html/MicrobiotaProcess.html) for analyzing microbiome and other ecological data within the tidy framework
-
-
-### Further options
+### Other packages
 
+- [philr](http://bioconductor.org/packages/devel/bioc/html/philr.html) (@Silverman2017) for the phylogeny-aware phILR transformation
+- [MicrobiotaProcess](https://bioconductor.org/packages/release/bioc/html/MicrobiotaProcess.html) for "tidy" analysis of microbiome and other ecological data
 - [Tools for Microbiome
   Analysis](https://microsud.github.io/Tools-Microbiome-Analysis/)
   site listed over 130 R packages for microbiome data science in
   2023. Many of these are not in Bioconductor, or do not directly
-  support the data containers used in this book but can be used with
-  minor modifications.
+  support the data containers used in this book but can often be used
+  with minor modifications. The phyloseq-based tools can be used by
+  converting the TreeSE data into a phyloseq object with
+  `makePhyloseqFromTreeSummarizedExperiment`.
+
+
+### Open microbiome data 
+
+Hundreds of published microbiome data sets are readily available in
+these data containers (see Section \@ref(example-data)).
 
diff --git a/11_taxonomic_information.Rmd b/11_taxonomic_information.Rmd
@@ -199,6 +199,7 @@ Here is an example that does a CLR transformation followed by the hierarchical
 clustering algorithm. 
 
 First, we import the library `bluster` that simplifies the clustering.
+
 ```{r bluster_dependence}
 library(bluster)
 ```
@@ -218,21 +219,23 @@ tse <- transformAssay(tse, assay.type = "clr", method = "z",
                       MARGIN = "features")
 
 # Cluster (with euclidean distance) on the features of the z assay
-tse <- cluster(tse, assay.type = "z",
-               clust.col = "hclustEuclidean", MARGIN = "features",
-               HclustParam(dist.fun = stats::dist, metric = "euclidean",
-                           method = "ward.D2"))
+tse <- cluster(tse,
+               assay.type = "z",
+               clust.col = "hclustEuclidean",
+               MARGIN = "features",
+               HclustParam(dist.fun = stats::dist, method = "ward.D2"))
 
 # Declare the Kendall dissimilarity computation function
 kendall_dissimilarity <- function(x) {
     as.dist(1 - cor(t(x), method = "kendall"))
 }
 
 # Cluster (with Kendall dissimilarity) on the features of the z assay
-tse <- cluster(tse, assay.type = "z", MARGIN = "features", 
+tse <- cluster(tse,
+               assay.type = "z",
                clust.col = "hclustKendall",
-               HclustParam(method = "ward.D2", 
-                           dist.fun = kendall_dissimilarity))
+               MARGIN = "features",
+               HclustParam(dist.fun = kendall_dissimilarity, method = "ward.D2"))
 ```
 
 Let us store the resulting cluster indices in the `rowData` column specified 
@@ -312,26 +315,3 @@ head(assay(tse, "pa"))
 assays(tse)
 ```
 
-## Pick specific {#pick-specific}
-
-Retrieving of specific elements that are required for specific analysis. For
-instance, extracting abundances for a specific taxa in all samples or all taxa 
-in one sample.  
-
-### Abundances of all taxa in specific sample 
-```{r}
-taxa.abund.cc1 <- getAbundanceSample(tse,
-                                     sample_id = "CC1",
-                                     assay.type = "counts")
-taxa.abund.cc1[1:10]
-```
-
-### Abundances of specific taxa in all samples   
-
-```{r}
-taxa.abundances <- getAbundanceFeature(tse,
-                                       feature_id = "Phylum:Bacteroidetes",
-                                       assay.type = "counts")
-taxa.abundances[1:10]
-```
-
diff --git a/30_differential_abundance.Rmd b/30_differential_abundance.Rmd
@@ -339,13 +339,15 @@ that we specify.
 
 
 ```{r ancombc2, warning = FALSE, eval=TRUE}
+
 # Agglomerate data to genus level and add this new abundance table to the altExp slot
 altExp(tse, "genus") <- agglomerateByRank(tse, "genus")
 
 # Identify prevalent genera
 prevalent.genera <- getPrevalentFeatures(altExp(tse, "genus"), detection = 0, prevalence = 30/100)
 
 # Run ANCOM-BC at the genus level and only including the prevalent genera
+
 out <- ancombc2(
   data = altExp(tse, "genus")[prevalent.genera, ],
   assay_name = "counts", 

diff --git a/80_training.Rmd b/80_training.Rmd
@@ -51,7 +51,7 @@ We encourage to familiarize with the material and test examples in advance:
 
  * [Other outreach material](https://github.com/microbiome/outreach)
 
- * [Orchestrating Microbiome Analysis with R/Bioconductor (OMA)](https://microbiome.github.io/OMA/) (this book)
+ * [Orchestrating Microbiome Analysis with Bioconductor (OMA)](https://microbiome.github.io/OMA/) (this book)
 
  * [Exercises](#exercises) for self-study
 

diff --git a/90_acknowledgments.Rmd b/90_acknowledgments.Rmd
@@ -1,38 +1,56 @@
-# Authors and contributors {-}
+# Developers {-}
 
 ```{r setup, echo=FALSE, results="asis"}
 library(rebook)
 chapterPreamble()
 ```
 
-
-### *Leo Lahti, DSc* {-}
-
-Leo Lahti is professor in Data Science at the [Department of Computing, University of Turku, Finland](https://datascience.utu.fi/), with a focus on computational microbiome analysis. Lahti obtained doctoral degree (DSc) from Aalto University in Finland (2010), developing probabilistic machine learning and data integration methods for high-throughput life science data. Since 2011 he has carried out microbiome research and developed, among other things, the _phyloseq_-based [microbiome R package](https://bioconductor.org/packages/release/bioc/html/microbiome.html) before starting to develop the mia libraries and _TreeSummarizedExperiment_ / _MultiAssayExperiment_ framework for microbiome data science introduced in this gitbook. In addition to carrying out computational microbiome research, Lahti is in the editorial board of _ISME_ and _Microbiome_ journals, work group leader in the European COST action network [ML4microbiome](https://ml4microbiome.eu/), national delegate in the International Science Council Committee on Data ([CODATA](https://codata.org/)), and has led the development of [national policy on open access to research methods in Finland](https://avointiede.fi/en/policies-materials/policies-open-science-and-research-finland/policy-open-research-data-and-methods). He is current member in the [Bioconductor Community Advisory Board](https://bioconductor.org/about/community-advisory-board/) and runs regular training workshops in microbiome data science.
-
-
-### *Tuomas Borman* {-}
-
-Tuomas Borman is a PhD researcher at the Department of Computing, University of Turku, and one of the key developers of the microbiome data science framework presented in this gitbook. He has helped to set up the base ecosystem of R/Bioconductor packages and other online resources.
-
-
-### *Giulio Benedetti* {-}
-
-Giulio Benedetti is a scientific programmer at the Department of Computing, University of Turku. His research interest is mostly related to Data Science. He has also helped to expand the SummarizedExperiment-based microbiome analysis framework to the Julia language, implementing [MicrobiomeAnalysis.jl](https://github.com/JuliaTurkuDataScience/MicrobiomeAnalysis.jl).
-
-
-### *Felix Ernst, PhD* {-}
-
-Felix Ernst is among the first developers of R/Bioc methods for microbiome research based on the _SummarizedExperiment_ class and its derivatives.
-
+### Core team {-}
+
+Contributions to this Gitbook from the various developers are
+coordinated by:
+
+- *Leo Lahti, DSc*, professor in Data Science at the [Department of
+   Computing, University of Turku,
+   Finland](https://datascience.utu.fi/), with a focus on
+   computational microbiome analysis. Lahti obtained a doctoral degree
+   (DSc) from Aalto University in Finland (2010), developing
+   probabilistic machine learning with applications to high-throughput
+   life science data integration. Since then he has focused on
+   microbiome research and developed, for instance, the
+   _phyloseq_-based [microbiome R
+   package](https://bioconductor.org/packages/release/bioc/html/microbiome.html)
+   before starting to develop the _TreeSummarizedExperiment_ /
+   _MultiAssayExperiment_ framework and the mia family of Bioconductor
+   packages for microbiome data science introduced in this
+   gitbook. Lahti led the development of the [national policy on open
+   access to research methods in
+   Finland](https://avointiede.fi/en/policies-materials/policies-open-science-and-research-finland/policy-open-research-data-and-methods).
+   He is a current member of the [Bioconductor Community Advisory
+   Board](https://bioconductor.org/about/community-advisory-board/)
+   and runs regular training workshops in microbiome data science.
+
+- *Tuomas Borman*, PhD researcher and the lead developer of OMA/mia at
+   the Department of Computing, University of Turku. 
 
 
 ### Contributors {-}
 
-This work is a remarkably collaborative effort over the years. The
-full list of contributors is available via
+This work is a remarkably collaborative effort. The full list of
+contributors is available via
 [Github](https://github.com/microbiome/OMA/graphs/contributors). Some
-of the key contributors include:
+key authors/contributors include:
+
+- *Felix Ernst, PhD*, among the first developers of R/Bioc methods for
+   microbiome research based on the _SummarizedExperiment_ class and
+   its derivatives.
+
+- *Giulio Benedetti*, scientific programmer at the Department of
+   Computing, University of Turku. His research interest is mostly
+   related to Data Science. He has also helped to expand the
+   SummarizedExperiment-based microbiome analysis framework to the
+   Julia language, implementing
+   [MicrobiomeAnalysis.jl](https://github.com/JuliaTurkuDataScience/MicrobiomeAnalysis.jl).
 
 - *Sudarshan Shetty, PhD* has supported the establishment of the
    framework and associated tools. He also maintains a list of

diff --git a/98_exercises.Rmd b/98_exercises.Rmd
@@ -251,8 +251,9 @@ got stuck, you can refer to chapter \@ref(assay-slot) of this book.
 6. **Extra**: Create a taxonomy tree based on the taxonomy mappings with
    `addTaxonomyTree` and display its content with `taxonomyTree` and `ggtree`.
 
-If you got stuck, you can look up chapters \@ref(pick-specific) and \@ref(fly-tree)
-on how to pick specific abundances and generate row trees, respectively.
+If you got stuck, you can look up chapters \@ref(datamanipulation)
+and \@ref(fly-tree) on how to pick specific abundances and generate
+row trees, respectively.
 
 
 ### Other elements

diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,25 +1,22 @@
 Package: OMA
-Title: Orchestrating Microbiome Analysis
-Version: 0.98.15
-Date: 2023-07-13
+Title: Orchestrating Microbiome Analysis with Bioconductor
+Version: 0.98.16
+Date: 2023-07-29
 Authors@R: 
     c(person("Leo", "Lahti", role = c("aut"),
              comment = c(ORCID = "0000-0001-5537-637X")),
       person(given = "Tuomas", family = "Borman", role = c("aut", "cre"),
              email = "tuomas.v.borman@utu.fi",
              comment = c(ORCID = "0000-0002-8563-8884")),
-      person(given = "Henrik", family = "Eckermann", role = c("ctb"),
-             comment = c(ORCID = "0000-0001-8725-7770")),	     	     
-      person("Sudarshan", "Shetty", email = "sudarshanshetty9@gmail.com",
-             role = c("aut"),
-             comment = c(ORCID = "0000-0001-7280-9915")),
       person("Felix GM", "Ernst", email = "felix.gm.ernst@outlook.com",
              role = c("aut"),
-             comment = c(ORCID = "0000-0001-5064-0928"))
+             comment = c(ORCID = "0000-0001-5064-0928")),
+      person("and others", "(see the full list of contributors)", 
+             role = c("ctb"))
 	     )
 Description:
-    This is a reference cookbook for performing **Microbiome Analysis** with 
-    Bioconductor in R.
+    This is a reference cookbook for **Microbiome Data Science** with 
+    R and Bioconductor.
 License: CC BY-NC-SA 3.0 US
 Encoding: UTF-8
 URL: https://github.com/microbiome/OMA