Merge pull request galaxyproject#4168 from ELIXIR-UK-DaSH/ELIXIR-UK-DaSH
Expand the FAIR Training Topic
hexylena authored May 30, 2023
2 parents ee4fc15 + 4f54e98 commit c7c213b
Showing 15 changed files with 389 additions and 11 deletions.
8 changes: 5 additions & 3 deletions CONTRIBUTORS.yaml
Original file line number Diff line number Diff line change
@@ -913,11 +913,13 @@ kilpert:
name: Fabian Kilpert
joined: 2017-09

kpbioteam:
name: Katarzyna Murat
kkamieniecka:
name: Katarzyna Kamieniecka
orcid: 0009-0004-2454-5950
joined: 2018-08
elixir_node: uk

kpoterlowicz:
poterlowicz-lab:
name: Krzysztof Poterlowicz
twitter: bioinfbrad
orcid: 0000-0001-6173-5674
4 changes: 2 additions & 2 deletions topics/epigenetics/tutorials/ewas-suite/slides.html
@@ -3,8 +3,8 @@
logo: "GTN"
title: EWAS Epigenome-Wide Association Studies Introduction
contributors:
- kpoterlowicz
- kpbioteam
- kkamieniecka
- poterlowicz-lab
---


6 changes: 3 additions & 3 deletions topics/epigenetics/tutorials/ewas-suite/tutorial.md
@@ -14,8 +14,8 @@ key_points:
- "Infinium Human Methylation BeadChip is an array based technology to generate DNA methylation profiling at individual CpG loci in the human genome based on Illumina’s bead technology."
- "Time and cost efficiency followed by high sample output, and overall quantitative accuracy and reproducibility made Infinium Human Methylation BeadChip one of the most widely used arrays on the market."
contributors:
- kpbioteam
- kpoterlowicz
- kkamieniecka
- poterlowicz-lab
---
> <agenda-title></agenda-title>
> In this tutorial we will do:
@@ -26,7 +26,7 @@ contributors:
>
{: .agenda}

This tutorial is based on [Hugo W, Shi H, Sun L, Piva M et al.: Non-genomic and Immune Evolution of Melanoma Acquiring MAPKi Resistance.](https://doi.org/10.1016/j.cell.2015.07.061).
This tutorial is based on Hugo W, Shi H, Sun L, Piva M et al.: Non-genomic and Immune Evolution of Melanoma Acquiring MAPKi Resistance {% cite Hugo2015 %}.

The data we use in this tutorial are available at [Zenodo](https://zenodo.org/record/1251211).

Binary file added topics/fair/images/fair_gtn.png
Binary file added topics/fair/images/fair_open.png
7 changes: 6 additions & 1 deletion topics/fair/metadata.yaml
@@ -5,12 +5,17 @@ title: "FAIR Data, Workflows, and Research"
summary: "These lessons will teach you how to make your research objects more FAIR with practical, hands-on advice."

subtopics:
- id: fair-data
title: "FAIR Data Management"
description: "The FAIR (Findable, Accessible, Interoperable, Reusable) data stewardship created the foundation for sharing and publishing digital assets. These lessons will apply to machine accessibility and emphasize that all digital assets should share data in a way that will enable maximum use and reuse."
- id: ro-crate
title: "Make workflows fair with RO-Crate"
title: "Make workflows FAIR with RO-Crate"
description: "This section is brought to you by the BY-COVID project, and will teach you how to make your research objects more FAIR with practical, hands-on advice."

maintainers:
- simleo
- ilveroluca
- stain
- pauldg
- kkamieniecka
- poterlowicz-lab
3 changes: 3 additions & 0 deletions topics/fair/tutorials/data-management/faqs/index.md
@@ -0,0 +1,3 @@
---
layout: faq-page
---
97 changes: 97 additions & 0 deletions topics/fair/tutorials/data-management/tutorial.md
@@ -0,0 +1,97 @@
---
layout: tutorial_hands_on
title: FAIR data management solutions
abbreviations:
FAIR: Findable, Accessible, Interoperable, Reusable
DMP: Data Management Plan
PID: Persistent Identifier

zenodo_link: ''
questions:
- Is there a reproducibility crisis?
- What can go wrong with data analysis?
objectives:
- Learn best practices in data management
- Learn how to introduce computational reproducibility in your research
time_estimation: "10M"
key_points:
- FAIR data management allows machines to automatically find and use the data accordingly.
tags:
- fair
- dmp
- data stewardship
priority: 2
contributions:
authorship:
- kkamieniecka
- poterlowicz-lab
editing:
- hexylena
subtopic: fair-data

requirements:
- type: "internal"
topic_name: fair
tutorials:
- fair-intro

---


# Introduction

The FAIR (Findable, Accessible, Interoperable, Reusable) data stewardship principles created the foundation for sharing and publishing digital assets, especially data. The principles apply to machine accessibility and emphasize that all digital assets should be shared in a way that enables maximum use and reuse.

This tutorial is a short introduction to the FAIR data management framework. You can find out more at the [FAIR Pointers](https://elixir-uk-dash.github.io/FAIR-Pointers/ep1/index.html) online course.

> <agenda-title></agenda-title>
>
> In this tutorial, we will cover:
>
> 1. TOC
> {:toc}
>
{: .agenda}

# Data management planning
In recent years we have noticed a data explosion. The number of sequence records in each release of [GenBank](https://www.ncbi.nlm.nih.gov/genbank/statistics/), from 1982 to the present, has doubled approximately every 18 months. The large amount of data available, combined with an expanding range of tools and computational solutions, has contributed to a reproducibility crisis. Having a **data management plan (DMP)** in place is essential to achieve FAIR data management. DMPs are often described as living documents and should be updated as circumstances change.

There are several ways to set up FAIR (Findable, Accessible, Interoperable, Reusable) data management plans (DMPs):
- Findable (F): Data description and collection or reuse of existing data
- Accessible (A): Standardised communication protocols (e.g. HTTP, HTTPS), with authentication or authorisation where required
- Interoperable (I): Documentation and data quality
- Reusable (R): Storage and backup supported by legal and ethical requirements

## Data description and collection or reuse of existing data
Legacy datasets from institutional repositories or digital library data collections can be FAIRified retrospectively so that they can be reused. Support for data collection and development can be provided throughout the life cycle, followed by change management and capacity improvement.

Multi-part FAIR research outputs need to be wrapped up, described and shared in a way that promotes the reuse of data. **Data sharing agreements** define the purpose of data sharing, reference roles and responsibilities, and specify legal requirements, e.g. for data security.

An institutional aim should be to create an integrated view and context over fragmented resources using their **persistent identifiers (PIDs)** and **metadata**. To make datasets findable, these metadata need to be as widely available as possible.
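
As a small illustration of what makes an identifier checkable by machines, the sketch below validates ORCID iDs, the persistent identifiers for researchers that appear in `CONTRIBUTORS.yaml` in this commit. It assumes the publicly documented ORCID check-digit scheme (ISO 7064 MOD 11-2); the function names are hypothetical and not part of the tutorial or the GTN codebase.

```python
import re

# Four groups of four characters; the final character may be the check value "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string has the ORCID shape and a matching check digit."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


if __name__ == "__main__":
    # The two ORCID iDs listed in CONTRIBUTORS.yaml in this commit.
    for candidate in ("0000-0001-6173-5674", "0009-0004-2454-5950"):
        print(candidate, is_valid_orcid(candidate))
```

A check like this only confirms that an identifier is well formed; it does not confirm that the identifier resolves to the intended person or record.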

Enhancing reproducibility, quality and transparency by ensuring information flow and showcasing secondary use is also a part of data management. Promoting hands-on data experience and events builds a collaborative environment for reproducible science.


## Documentation and data quality
Having access to local knowledge and encouraging best practices at the departmental level is a smart way to offer direction on a variety of standards and methods. Implementing FAIR data practices within an institution requires resources and infrastructure. To increase the possibility of data reuse, several FAIR requirements can be satisfied using freely available guidelines, e.g. [RDMkit](https://rdmkit.elixir-europe.org/), [FAIR Cookbook](https://faircookbook.elixir-europe.org/content/home.html) and the [ELIXIR-UK DaSH Fellowship](https://sites.google.com/view/navigation-portal-fellowship/home?authuser=0) initiative, and repositories, e.g. [Zenodo](https://zenodo.org/), [Harvard Dataverse](https://dataverse.harvard.edu/) and [figshare](https://figshare.com/).

## Storage and backup
Systems for storage, backup and collaboration depend upon technical infrastructure. The **'3-2-1 rule'** is a standard backup strategy for research data: keep three copies of the data, two locally and one off-site. Data, metadata, and other research artefacts, such as ontologies, software, documentation, and papers, must all be kept in locations where they are adequately safeguarded, backed up, and accessible to maximise their potential for reuse. Appropriate access management is essential, in addition to backup and restoration services that protect researchers against data loss, theft, malfunctioning computers or storage media, and accidental deletion or inadvertent alterations to the data.
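
The rule as stated here (three copies, two local and one off-site) can be expressed as a simple check over a backup plan. The following is a minimal sketch only; the `DataCopy` class, its fields and the example locations are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DataCopy:
    """One stored copy of a dataset (hypothetical structure for illustration)."""
    location: str   # free-text label, e.g. "lab workstation"
    off_site: bool  # True if the copy is kept away from the primary site


def satisfies_3_2_1(copies: Iterable[DataCopy]) -> bool:
    """Check for at least three copies: two or more local, one or more off-site."""
    copies = list(copies)
    local = sum(1 for c in copies if not c.off_site)
    remote = sum(1 for c in copies if c.off_site)
    return len(copies) >= 3 and local >= 2 and remote >= 1


if __name__ == "__main__":
    plan = [
        DataCopy("lab workstation", off_site=False),
        DataCopy("departmental file server", off_site=False),
        DataCopy("institutional cloud storage", off_site=True),
    ]
    print(satisfies_3_2_1(plan))
```

A check of this kind could sit alongside the storage section of a DMP, so that the plan is re-verified whenever storage arrangements change.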

Repository services are the fundamental component of the infrastructure required for the FAIR research data lifecycle. They provide access to the data, a persistent identifier, and the descriptive metadata that support interoperability. Repository services can include basic data storage, resource discovery, managing access to and use of private information, facilitating peer review of information related to publications or services requiring digital preservation, and more.

The [OpenAIRE](https://www.openaire.eu/opendatapilot-repository-guide) repository guide advises users to check the availability of a suitable repository in the following order:

1. The most effective option (if available) is to maintain the data in accordance with acknowledged discipline-specific criteria using an established, dedicated (external) data archive or repository that caters specifically to the study domain.
2. Making use of institutional data repositories is the second-best option.
3. If none of those options is practical, a free data repository should be used.
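
The order of preference above amounts to a simple fallback chain. The sketch below illustrates it under stated assumptions: the function name, its parameters and the example repository labels are hypothetical and do not come from the OpenAIRE guide.

```python
from typing import Optional


def choose_repository(
    discipline_specific: Optional[str],
    institutional: Optional[str],
    free_general: str,
) -> str:
    """Return the first available option, following the order of preference above.

    Pass None (or an empty string) for an option that is not available.
    """
    for option in (discipline_specific, institutional):
        if option:
            return option
    # Fall back to a free data repository when neither option is practical.
    return free_general


if __name__ == "__main__":
    # No discipline-specific archive available, so the institutional one is chosen.
    print(choose_repository(None, "university data repository", "free data repository"))
```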

Up-to-date lists of available registered data repositories can be found at [re3data](https://www.re3data.org/) and [FAIRsharing](https://fairsharing.org/search?fairsharingRegistry=Database).

## Legal and ethical requirements
The institutional support network (data stewards, ethics boards, IP, legal and financial offices) needs to guide researchers in safeguarding data management responsibilities and resources.

# Conclusion
You will save time and resources by planning how to FAIRify your data in the early phases of your research. To put this into action, a data management plan (DMP) must be written. A DMP is also where you outline your data collection, storage, processing, sharing, and disposal procedures. Planning the management and FAIRification of your data reduces the possibility of issues down the road, whether they be practical, legal, or technical.

Keep in mind that creating FAIR data is a complex process. Consider how you can make your data FAIR one step at a time at each stage of the creation, collection, documentation, storage, sharing, archiving, and preservation processes. Incorporating your data documentation lays the framework for the rest of your study planning. Imagine you want to use a dataset that was generated by another researcher, and consider how you would like it to be found. We hope this quick introduction to FAIR data management solutions will help you improve your own experience with data and also help others through your guidance and FAIRified resources.
3 changes: 3 additions & 0 deletions topics/fair/tutorials/fair-gtn/faqs/index.md
@@ -0,0 +1,3 @@
---
layout: faq-page
---
69 changes: 69 additions & 0 deletions topics/fair/tutorials/fair-gtn/tutorial.bib
@@ -0,0 +1,69 @@
@misc{wiegers_luc_2019_3593258,
author = {Wiegers, Luc and
van Gelder, Celia W. G.},
title = {{Illustration for "Ten simple rules for making
training materials FAIR"}},
month = dec,
year = 2019,
publisher = {Zenodo},
version = {1.0},
doi = {10.5281/zenodo.3593258},
url = {https://doi.org/10.5281/zenodo.3593258}
}

@article{chevron2014metacognitive,
title={A metacognitive tool: Theoretical and operational analysis of skills exercised in structured concept maps},
author={Chevron, Marie-Pierre},
journal={Perspectives in Science},
doi = {10.1016/j.pisc.2014.07.001},
volume={2},
number={1-4},
pages={46--54},
year={2014},
publisher={Elsevier}
}

@article{mcmurry2017identifiers,
title={Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data},
author={McMurry, Julie A and Juty, Nick and Blomberg, Niklas and Burdett, Tony and Conlin, Tom and Conte, Nathalie and Courtot, M{\'e}lanie and Deck, John and Dumontier, Michel and Fellows, Donal K and others},
journal={PLoS biology},
doi = {10.1371/journal.pbio.2001414},
volume={15},
number={6},
pages={e2001414},
year={2017},
publisher={Public Library of Science San Francisco, CA USA}
}

@article{hiltemann2023galaxy,
title={Galaxy Training: A powerful framework for teaching!},
author={Hiltemann, Saskia and Rasche, Helena and Gladman, Simon and Hotz, Hans-Rudolf and Larivi{\`e}re, Delphine and Blankenberg, Daniel and Jagtap, Pratik D and Wollmann, Thomas and Bretaudeau, Anthony and Gou{\'e}, Nadia and others},
journal={PLOS Computational Biology},
doi = {10.1371/journal.pcbi.1010752},
volume={19},
number={1},
pages={e1010752},
year={2023},
publisher={Public Library of Science San Francisco, CA USA}
}

@misc{fair-training-materials,
url = {https://training.galaxyproject.org/training-material/faqs/gtn/fair_training.html},
note = {Accessed 2023-05-23},
title = {How does the GTN ensure our training materials are FAIR?}
}

@article{Garcia2020,
doi = {10.1371/journal.pcbi.1007854},
url = {https://doi.org/10.1371/journal.pcbi.1007854},
year = {2020},
month = may,
publisher = {Public Library of Science ({PLoS})},
volume = {16},
number = {5},
pages = {e1007854},
author = {Leyla Garcia and B{\'{e}}r{\'{e}}nice Batut and Melissa L. Burke and Mateusz Kuzak and Fotis Psomopoulos and Ricardo Arcila and Teresa K. Attwood and Niall Beard and Denise Carvalho-Silva and Alexandros C. Dimopoulos and Victoria Dominguez del Angel and Michel Dumontier and Kim T. Gurwitz and Roland Krause and Peter McQuilton and Loredana Le Pera and Sarah L. Morgan and P\"{a}ivi Rauste and Allegra Via and Pascal Kahlem and Gabriella Rustici and Celia W. G. van Gelder and Patricia M. Palagi},
editor = {Scott Markel},
title = {Ten simple rules for making training materials {FAIR}},
journal = {{PLOS} Computational Biology}
}