A comprehensive book of step-by-step instructions for data analysis in R, written for researchers and students in the natural sciences. It guides readers through fundamental statistical concepts and practical data analysis techniques, with a focus on ecological, environmental, and life sciences applications.
Online Version: https://jm0535.github.io/dains/
The book covers:
| Part | Topics |
|---|---|
| Getting Started | Introduction to R, data analysis fundamentals, data basics |
| Data Analysis Fundamentals | Exploratory data analysis, hypothesis testing, statistical tests |
| Data Visualization | Visualization techniques, advanced graphics with ggplot2 |
| Advanced Topics | Regression analysis, conservation applications |
- Introduction to Data Analysis - R basics and analytical thinking
- Data Basics - Data structures, importing, and cleaning
- Exploratory Data Analysis - Descriptive statistics and pattern discovery
- Hypothesis Testing - Statistical inference fundamentals
- Statistical Tests - Common parametric and non-parametric tests
- Data Visualization - Creating effective scientific graphics
- Advanced Visualization - Interactive and publication-quality figures
- Regression Analysis - Linear models and tidymodels framework
- Conservation Applications - Real-world ecological case studies
All datasets are located in the data/ directory, organized by scientific discipline:
| Directory | Description | Source |
|---|---|---|
| `agriculture/` | Crop yield data | Our World in Data |
| `botany/` | Plant traits data | Break Free From Plastic |
| `ecology/` | Plant biodiversity data | IUCN Red List |
| `economics/` | Coffee economics data | Coffee Quality Institute |
| `entomology/` | Animal data | Austin Animal Center |
| `environmental/` | Climate data | Palmer penguins dataset |
| `epidemiology/` | Disease/health data | Various sources |
| `forestry/` | Forest inventory data | Field collections |
| `geography/` | Spatial data | UN Office on Drugs and Crime |
| `marine/` | Ocean/fishing data | Great Lakes Fishery Commission |
Each dataset directory contains a CITATION.txt file with source information and proper citation for academic use.
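The citation convention above can be checked mechanically. The following is a minimal sketch, not part of the repository: the function name `check_citations` is hypothetical, and it assumes that every immediate subdirectory of the data directory is one dataset directory.

```shell
#!/bin/sh
# Sketch: report dataset directories that lack a CITATION.txt file.
# Assumption: each immediate subdirectory of the data directory is one
# dataset directory. check_citations is a hypothetical helper name.

# check_citations DIR
# Prints each subdirectory of DIR that has no CITATION.txt.
# Returns 0 when every subdirectory has one, 1 otherwise.
check_citations() {
    cc_data_dir=${1:-data}
    cc_status=0
    for cc_dir in "$cc_data_dir"/*/; do
        # Skip the unexpanded pattern when DIR has no subdirectories.
        [ -d "$cc_dir" ] || continue
        if [ ! -f "${cc_dir}CITATION.txt" ]; then
            echo "missing CITATION.txt: ${cc_dir%/}"
            cc_status=1
        fi
    done
    return "$cc_status"
}
```

After sourcing the function, running `check_citations data` from the repository root lists any dataset directory without a citation file.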
- R (version 4.0.0 or higher)
- RStudio (recommended IDE)
- Quarto (for building the book)
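The prerequisites above can be verified from a terminal before installing anything else. This is a rough sketch under stated assumptions: `Rscript` and `quarto` are expected on `PATH`, `sort -V` (version sort) is available, and the helper names `version_ge` and `check_prerequisites` are hypothetical rather than part of the repository. RStudio is only recommended, so it is not checked.

```shell
#!/bin/sh
# Sketch: check the prerequisites listed above before building the book.
# Assumptions: Rscript and quarto are expected on PATH, and `sort -V`
# (version sort) is available. Both function names are hypothetical.

# version_ge A B: succeeds when version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_prerequisites: prints one status line per tool and
# returns 1 if any check fails.
check_prerequisites() {
    cp_status=0
    if command -v Rscript >/dev/null 2>&1; then
        # getRversion() reports the running R version, e.g. 4.3.1.
        cp_r_version=$(Rscript -e 'cat(as.character(getRversion()))')
        if version_ge "$cp_r_version" "4.0.0"; then
            echo "R $cp_r_version: ok"
        else
            echo "R $cp_r_version: older than 4.0.0"
            cp_status=1
        fi
    else
        echo "R: not found"
        cp_status=1
    fi
    if command -v quarto >/dev/null 2>&1; then
        echo "quarto: ok"
    else
        echo "quarto: not found"
        cp_status=1
    fi
    return "$cp_status"
}
```

Calling `check_prerequisites` prints one line per tool, which makes it easy to see which requirement is missing.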
- Clone the repository: `git clone https://github.com/jm0535/dains.git`, then `cd dains`
- Install required R packages: `source("install_packages.R")`. Or manually install core packages: `install.packages(c("tidyverse", "tidymodels", "ggplot2", "rstatix", "knitr", "rmarkdown", "performance", "see"))`
- Download datasets (if needed): `source("download_datasets.R")`
To build the HTML version of the book locally:
- Install Quarto from quarto.org
- Render the book: `quarto render`
- Preview locally: `quarto preview`
The rendered book will be available in the docs/ directory.
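A quick way to confirm that a render actually produced output is to look for the landing page in the output directory. The sketch below assumes that `index.qmd` is rendered to `index.html` inside `docs/`; the function name `check_rendered` is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Sketch: confirm that a render produced output in the docs/ directory.
# Assumption: the landing page index.qmd is rendered to index.html.
# check_rendered is a hypothetical helper name.

# check_rendered DIR: succeeds when DIR contains index.html.
check_rendered() {
    cr_dir=${1:-docs}
    if [ -f "$cr_dir/index.html" ]; then
        echo "found $cr_dir/index.html"
    else
        echo "no index.html in $cr_dir; did quarto render succeed?" >&2
        return 1
    fi
}
```

A typical use would be `quarto render && check_rendered docs`, so the check only runs when the render itself exits successfully.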

    dains/
    ├── _quarto.yml       # Quarto configuration
    ├── index.qmd         # Book landing page
    ├── preface.qmd       # Preface chapter
    ├── references.qmd    # References chapter
    ├── chapters/         # Book chapters (01-09)
    ├── data/             # Datasets by discipline
    ├── docs/             # Rendered HTML output
    ├── images/           # Book images and cover
    ├── R/                # Helper R functions
    ├── scripts/          # Utility scripts
    ├── styles.css        # Custom CSS styling
    ├── references.bib    # Bibliography
    └── apa.csl           # Citation style

Contributions to improve the book are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-improvement`)
- Make your changes
- Run `quarto render` to ensure everything builds correctly
- Commit your changes (`git commit -m 'Add some amazing improvement'`)
- Push to the branch (`git push origin feature/amazing-improvement`)
- Open a Pull Request
Please read CONTRIBUTING.md for detailed guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
Jimmy Moses, School of Forestry, Faculty of Natural Resources, Papua New Guinea University of Technology, PMB 411, Lae, Morobe Province, Papua New Guinea
- The R Core Team for developing R
- The tidyverse team for revolutionizing R programming
- The Quarto team for the publishing system
- All data providers who make their datasets openly available
- Students and colleagues who provided feedback
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Last updated: December 2025