peskas.kenya.data.pipeline

The goal of peskas.kenya.data.pipeline is to implement, deploy, and execute the data and modelling pipelines that underpin Peskas in Kenya, a partnership between WorldFish and the Wildlife Conservation Society.

The pipeline is an R package

peskas.kenya.data.pipeline is structured as an R package because this makes it easier to write production-grade software. Specifically, structuring the code as an R package:

helps us handle system and package dependencies,
forces us to split the code into functions,
makes it easier to document the code, and
makes it easier to test the code.
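
As a minimal sketch of what these points look like in practice, the block below shows one small function with roxygen2-style documentation and a check on its behaviour. The function grams_to_kg() is hypothetical and is not part of this package; it only illustrates the pattern of one documented, testable function per task.

```r
#' Convert catch weights from grams to kilograms
#'
#' Hypothetical helper, used only to illustrate the package layout.
#'
#' @param weight_g Numeric vector of weights in grams.
#' @return Numeric vector of weights in kilograms.
#' @export
grams_to_kg <- function(weight_g) {
  if (!is.numeric(weight_g)) {
    stop("`weight_g` must be numeric.", call. = FALSE)
  }
  weight_g / 1000
}

# In a package this check would normally live under tests/testthat/;
# base R's stopifnot() is used here to keep the sketch dependency-free.
stopifnot(isTRUE(all.equal(grams_to_kg(c(500, 1500)), c(0.5, 1.5))))
```

Because the function is small and has no side effects, the same check can be rerun every time the code changes.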

We make heavy use of tidyverse style conventions and the usethis package to automate tasks during project setup and deployment.

For more information about the rationale for structuring the pipeline as a package, see Chapter 3 of Engineering Production-Grade Shiny Apps. The book focuses on Shiny applications, but the rationale also applies to data pipelines and production-ready code in general.

How the pipeline works

The pipeline is composed of different modules:

Data collection: On-site fishing landing surveys and continuous, solar-powered GPS vessel trackers collect and send data in near real time, alongside fishery metadata, for thorough data gathering.
Pre-processing: Data formatting, shaping, and standardisation to prepare the raw data for analysis.
Validation: Outlier detection and error identification, with an alert system to maintain data quality.
Analytics: Modelling of fisheries indicators, nutritional characterisation, and data mining to extract valuable insights.
Data export: Automated dissemination of processed and analysed fisheries data to ensure accessibility and comprehension. This involves restructuring data for dashboard integration and open publication.
Visualisation: Tools for reporting data and sharing insights through a dedicated web app dashboard (not hosted in this repository).

See Peskas: Automated analytics for small-scale, data-deficient fisheries for further details.
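
The pre-processing, validation, and export modules described above can be pictured as a chain of small functions, each taking a data frame and returning a new one. The sketch below is illustrative only: the function names, column names, fixed upper bound, and toy records are all assumptions, and the package's actual validation rules are not shown here.

```r
# Toy landing records; all values are invented for illustration.
raw <- data.frame(
  landing_id = c("a1", "a2", "a3", "a4"),
  Catch_KG = c("12.5", "8", "950", NA),
  stringsAsFactors = FALSE
)

# Pre-processing: standardise column names and column types.
preprocess_landings <- function(df) {
  names(df) <- tolower(names(df))
  df$catch_kg <- as.numeric(df$catch_kg)
  df
}

# Validation: raise an alert for missing catches and for catches above a
# fixed upper bound (a deliberately simple stand-in for outlier detection).
validate_landings <- function(df, max_catch_kg = 500) {
  df$alert <- NA_character_
  df$alert[is.na(df$catch_kg)] <- "missing catch"
  too_high <- !is.na(df$catch_kg) & df$catch_kg > max_catch_kg
  df$alert[too_high] <- "catch above bound"
  df
}

# Export: keep only the records that raised no alert.
export_landings <- function(df) {
  df[is.na(df$alert), c("landing_id", "catch_kg")]
}

validated <- validate_landings(preprocess_landings(raw))
exported <- export_landings(validated)
```

Keeping each stage as a separate function means a stage can be rerun or tested on its own without touching the others.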

Getting Started

This package uses a configuration file config.yml to manage environment-specific settings and connections. To get started, familiarize yourself with the package structure, particularly the R directory where the main functions are located.
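
To make the idea of environment-specific settings concrete, here is a minimal sketch that uses the config package's config::get() to read a toy config.yml. It assumes the config package is installed. The keys, values, and the read_settings() helper are invented for illustration; the package's real config.yml layout, and how its own functions read it, may differ.

```r
# Write a toy config.yml to a temporary location. All keys and values are
# invented for illustration and are not the package's real settings.
cfg_path <- file.path(tempdir(), "config.yml")
writeLines(c(
  "default:",
  "  log_level: debug",
  "  storage:",
  "    bucket: example-dev-bucket",
  "production:",
  "  storage:",
  "    bucket: example-prod-bucket"
), cfg_path)

# Hypothetical helper: read the settings for one named environment.
read_settings <- function(path, environment = "default") {
  config::get(config = environment, file = path)
}

dev <- read_settings(cfg_path)
prod <- read_settings(cfg_path, "production")

dev$storage$bucket   # value from the default section
prod$storage$bucket  # value overridden in the production section
prod$log_level       # inherited from the default section
```

In the config package, named environments inherit from the default section, so only the values that differ need to be repeated.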

Each function typically reads the configuration using read_config() to access necessary parameters. To work on this package locally, you’ll need to set up the required authentication files in the auth/ directory and ensure your environment variables are properly set. Remember to run devtools::load_all() when testing changes locally. If you’re new to R package development, consider reviewing the R Packages book by Hadley Wickham and Jenny Bryan.
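
Before running devtools::load_all(), it can help to confirm that the local setup is complete. The following is a hypothetical pre-flight check in base R. The function check_local_setup() and the variable names passed to it are assumptions, since the environment variables this package actually requires are not listed here.

```r
# Hypothetical pre-flight check: report which environment variables are
# unset or empty and whether the authentication directory exists.
check_local_setup <- function(env_vars, auth_dir = "auth") {
  values <- Sys.getenv(env_vars, unset = NA_character_)
  missing_vars <- unname(env_vars[is.na(values) | values == ""])
  auth_ok <- dir.exists(auth_dir)
  list(
    missing_vars = missing_vars,
    auth_dir_exists = auth_ok,
    ready = length(missing_vars) == 0 && auth_ok
  )
}

# Usage with placeholder variable names; tempdir() stands in for auth/.
Sys.setenv(EXAMPLE_TOKEN = "dummy")
Sys.unsetenv("EXAMPLE_UNSET_VAR")
status <- check_local_setup(
  c("EXAMPLE_TOKEN", "EXAMPLE_UNSET_VAR"),
  auth_dir = tempdir()
)
```

Here status$missing_vars names the variable that still needs to be set, and status$ready stays FALSE until nothing is missing.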

Quick Guide for Contributors

To keep our repository clean and efficient, please keep these guidelines in mind:

Always work on a new branch, not directly on main.
Write clear, concise commit messages.
Avoid storing intermediate and garbage files, especially in the root folder.
Strive for soft-coded solutions.
Maintain consistent code style throughout the project.
Document your code well - future you (and others) will thank you.
Test your changes thoroughly before submitting a pull request.
Keep your fork synced with the main repository.

These practices help us maintain a clean, efficient codebase that’s easier for everyone to work with. For more detailed guidelines, check out our CONTRIBUTING.md file.

