
Data Analysis in Natural Sciences: An R-Based Approach


A step-by-step guide to data analysis in R for researchers and students in the natural sciences. It covers fundamental statistical concepts and practical analysis techniques, with a focus on ecological, environmental, and life sciences applications.

📖 Read the Book

Online Version: https://jm0535.github.io/dains/

📚 Contents

The book covers:

| Part | Topics |
|------|--------|
| Getting Started | Introduction to R, data analysis fundamentals, data basics |
| Data Analysis Fundamentals | Exploratory data analysis, hypothesis testing, statistical tests |
| Data Visualization | Visualization techniques, advanced graphics with ggplot2 |
| Advanced Topics | Regression analysis, conservation applications |

Chapter Overview

  1. Introduction to Data Analysis - R basics and analytical thinking
  2. Data Basics - Data structures, importing, and cleaning
  3. Exploratory Data Analysis - Descriptive statistics and pattern discovery
  4. Hypothesis Testing - Statistical inference fundamentals
  5. Statistical Tests - Common parametric and non-parametric tests
  6. Data Visualization - Creating effective scientific graphics
  7. Advanced Visualization - Interactive and publication-quality figures
  8. Regression Analysis - Linear models and tidymodels framework
  9. Conservation Applications - Real-world ecological case studies

📊 Datasets

All datasets are located in the data/ directory, organized by scientific discipline:

| Directory | Description | Source |
|-----------|-------------|--------|
| agriculture/ | Crop yield data | Our World in Data |
| botany/ | Plant traits data | Break Free From Plastic |
| ecology/ | Plant biodiversity data | IUCN Red List |
| economics/ | Coffee economics data | Coffee Quality Institute |
| entomology/ | Animal data | Austin Animal Center |
| environmental/ | Climate data | Palmer penguins dataset |
| epidemiology/ | Disease/health data | Various sources |
| forestry/ | Forest inventory data | Field collections |
| geography/ | Spatial data | UN Office on Drugs and Crime |
| marine/ | Ocean/fishing data | Great Lakes Fishery Commission |
Each dataset directory contains a CITATION.txt file with source information and proper citation for academic use.
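As a quick way to confirm that every dataset directory carries its citation file, something like the following sketch may help. The function name `missing_citations` is hypothetical and not part of the repository; the only facts taken from this README are the `data/` directory and the `CITATION.txt` file name.

```shell
#!/bin/sh
# Sketch: list dataset directories that lack a CITATION.txt file.
# missing_citations is a hypothetical helper, not shipped with the repo.

# missing_citations DIR: print each immediate subdirectory of DIR
# (default: data) that does not contain a CITATION.txt file.
missing_citations() {
  data_dir=${1:-data}
  for dir in "$data_dir"/*/; do
    # If the glob matched nothing, $dir is the literal pattern; skip it.
    [ -d "$dir" ] || continue
    [ -f "${dir}CITATION.txt" ] || printf '%s\n' "${dir%/}"
  done
}

# Run from the repository root; prints nothing when all citations exist.
missing_citations data
```

An empty result means every directory under `data/` has a citation file to draw on when citing a dataset.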

🚀 Getting Started

Prerequisites

  • R (version 4.0.0 or higher)
  • RStudio (recommended IDE)
  • Quarto (for building the book)
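The prerequisites above can be checked from a terminal before installing anything. The sketch below is one possible approach, not a script shipped with the repository: `version_ge` is a hypothetical helper, and the check assumes `git`, `Rscript`, and `quarto` are the command names on your PATH. RStudio is a desktop application and is not checked here.

```shell
#!/bin/sh
# Sketch: verify the command-line prerequisites listed above.
# version_ge is a hypothetical helper, not part of the repository.

# version_ge A B: exit status 0 when dotted version A >= B,
# comparing each dot-separated field numerically.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      xi = (i <= na) ? x[i] + 0 : 0
      yi = (i <= nb) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

# Report which tools are available on PATH.
for tool in git Rscript quarto; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done

# Compare the installed R version with the documented minimum (4.0.0).
if command -v Rscript >/dev/null 2>&1; then
  r_version=$(Rscript -e 'cat(as.character(getRversion()))')
  if version_ge "$r_version" 4.0.0; then
    echo "R $r_version meets the 4.0.0 minimum"
  else
    echo "R $r_version is older than 4.0.0"
  fi
fi
```

Comparing fields numerically rather than as strings matters for versions such as 4.10.0, which a plain string comparison would sort before 4.9.0.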

Installation

  1. Clone the repository:

    git clone https://github.com/jm0535/dains.git
    cd dains
  2. Install required R packages:

    source("install_packages.R")

    Or manually install core packages:

    install.packages(c(
      "tidyverse",
      "tidymodels",
      "ggplot2",
      "rstatix",
      "knitr",
      "rmarkdown",
      "performance",
      "see"
    ))
  3. Download datasets (if needed):

    source("download_datasets.R")

🔨 Building the Book

To build the HTML version of the book locally:

  1. Install Quarto from quarto.org

  2. Render the book:

    quarto render
  3. Preview locally:

    quarto preview

The rendered book will be available in the docs/ directory.
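To confirm that a render actually produced output, a small check along these lines could be run from the repository root. This is a sketch under assumptions: `build_ok` is a hypothetical helper, and `index.html` is assumed to be the landing page that Quarto writes into `docs/` for `index.qmd`.

```shell
#!/bin/sh
# Sketch: render the book and confirm the output directory was populated.
# build_ok is a hypothetical helper; index.html is an assumed file name.

# build_ok DIR: succeeds when DIR (default: docs) contains index.html.
build_ok() {
  out_dir=${1:-docs}
  [ -f "$out_dir/index.html" ]
}

# Only attempt a render when Quarto and the project config are present.
if command -v quarto >/dev/null 2>&1 && [ -f _quarto.yml ]; then
  if quarto render && build_ok docs; then
    echo "render finished: docs/index.html exists"
  else
    echo "render failed or docs/index.html is missing"
  fi
else
  echo "quarto or _quarto.yml not found; skipping render"
fi
```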

πŸ“ Project Structure

dains/
├── _quarto.yml          # Quarto configuration
├── index.qmd            # Book landing page
├── preface.qmd          # Preface chapter
├── references.qmd       # References chapter
├── chapters/            # Book chapters (01-09)
├── data/                # Datasets by discipline
├── docs/                # Rendered HTML output
├── images/              # Book images and cover
├── R/                   # Helper R functions
├── scripts/             # Utility scripts
├── styles.css           # Custom CSS styling
├── references.bib       # Bibliography
└── apa.csl              # Citation style

🤝 Contributing

Contributions to improve the book are welcome! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-improvement)
  3. Make your changes
  4. Run quarto render to ensure everything builds correctly
  5. Commit your changes (git commit -m 'Add some amazing improvement')
  6. Push to the branch (git push origin feature/amazing-improvement)
  7. Open a Pull Request

Please read CONTRIBUTING.md for detailed guidelines.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

✍️ Author

Jimmy Moses
School of Forestry, Faculty of Natural Resources
Papua New Guinea University of Technology
PMB 411, Lae, Morobe Province, Papua New Guinea

πŸ™ Acknowledgments

  • The R Core Team for developing R
  • The tidyverse team for revolutionizing R programming
  • The Quarto team for the publishing system
  • All data providers who make their datasets openly available
  • Students and colleagues who provided feedback

📬 Contact


Last updated: December 2025
