Merge pull request galaxyproject#4168 from ELIXIR-UK-DaSH/ELIXIR-UK-DaSH

Expand the FAIR Training Topic
workflow4metabolomics · May 30, 2023 · c7c213b
2 parents ee4fc15 + 4f54e98
commit c7c213b
Showing 15 changed files with 389 additions and 11 deletions.
diff --git a/CONTRIBUTORS.yaml b/CONTRIBUTORS.yaml
@@ -913,11 +913,13 @@ kilpert:
     name: Fabian Kilpert
     joined: 2017-09
 
-kpbioteam:
-    name: Katarzyna Murat
+kkamieniecka:
+    name: Katarzyna Kamieniecka
+    orcid: 0009-0004-2454-5950
     joined: 2018-08
+    elixir_node: uk
 
-kpoterlowicz:
+poterlowicz-lab:
     name: Krzysztof Poterlowicz
     twitter: bioinfbrad
     orcid: 0000-0001-6173-5674

diff --git a/topics/epigenetics/tutorials/ewas-suite/slides.html b/topics/epigenetics/tutorials/ewas-suite/slides.html
@@ -3,8 +3,8 @@
 logo: "GTN"
 title: EWAS Epigenome-Wide Association Studies Introduction
 contributors:
- - kpoterlowicz
- - kpbioteam
+  - kkamieniecka
+  - poterlowicz-lab
 ---
 
 

diff --git a/topics/epigenetics/tutorials/ewas-suite/tutorial.md b/topics/epigenetics/tutorials/ewas-suite/tutorial.md
@@ -14,8 +14,8 @@ key_points:
 - "Infinium Human Methylation BeadChip is an array based technology to generate DNA methylation profiling at individual CpG loci in the human genome based on Illumina’s bead technology."
 - "Time and cost efficiency followed by high sample output, and overall quantitative accuracy and reproducibility made Infinium Human Methylation BeadChip one of the most widely used arrays on the market."
 contributors:
-  - kpbioteam
-  - kpoterlowicz
+  - kkamieniecka
+  - poterlowicz-lab
 ---
 > <agenda-title></agenda-title>
 > In this tutorial we will do:
@@ -26,7 +26,7 @@ contributors:
 >
 {: .agenda}
 
-This tutorial is based on [Hugo W, Shi H, Sun L, Piva M et al.: Non-genomic and Immune Evolution of Melanoma Acquiring MAPKi Resistance.](https://doi.org/10.1016/j.cell.2015.07.061).
+This tutorial is based on Hugo W, Shi H, Sun L, Piva M et al.: Non-genomic and Immune Evolution of Melanoma Acquiring MAPKi Resistance {% cite Hugo2015 %}.
 
 The data we use in this tutorial are available at [Zenodo](https://zenodo.org/record/1251211).
 

diff --git a/topics/fair/images/fair_gtn.png b/topics/fair/images/fair_gtn.png
diff --git a/topics/fair/images/fair_open.png b/topics/fair/images/fair_open.png
diff --git a/topics/fair/metadata.yaml b/topics/fair/metadata.yaml
@@ -5,12 +5,17 @@ title: "FAIR Data, Workflows, and Research"
 summary: "These lessons will teach you how to make your research objects more FAIR with practical, hands-on advice."
 
 subtopics:
+  - id: fair-data
+    title: "FAIR Data Management"
+    description: "The FAIR (Findable, Accessible, Interoperable, Reusable) data stewardship principles created the foundation for sharing and publishing digital assets. These lessons focus on machine accessibility and emphasize that all digital assets should be shared in a way that enables maximum use and reuse."
   - id: ro-crate
-    title: "Make workflows fair with RO-Crate"
+    title: "Make workflows FAIR with RO-Crate"
     description: "This section brought to you by the BY-COVID project, and will teach you how to make your research objects more FAIR with practical, hands-on advice."
 
 maintainers:
   - simleo
   - ilveroluca
   - stain
   - pauldg
+  - kkamieniecka
+  - poterlowicz-lab
diff --git a/topics/fair/tutorials/data-management/faqs/index.md b/topics/fair/tutorials/data-management/faqs/index.md
@@ -0,0 +1,3 @@
+---
+layout: faq-page
+---
diff --git a/topics/fair/tutorials/data-management/tutorial.md b/topics/fair/tutorials/data-management/tutorial.md
@@ -0,0 +1,97 @@
+---
+layout: tutorial_hands_on
+title: FAIR data management solutions
+abbreviations:
+  FAIR: Findable, Accessible, Interoperable, Reusable
+  DMP: Data Management Plan
+  PID: Persistent Identifier
+
+zenodo_link: ''
+questions:
+- Is there a reproducibility crisis?
+- What can go wrong with data analysis?
+objectives:
+- Learn best practices in data management
+- Learn how to introduce computational reproducibility in your research
+time_estimation: "10M"
+key_points:
+- FAIR data management allows machines to automatically find and use the data accordingly.
+tags:
+- fair
+- dmp
+- data stewardship
+priority: 2
+contributions:
+  authorship:
+    - kkamieniecka
+    - poterlowicz-lab
+  editing:
+    - hexylena
+subtopic: fair-data
+
+requirements:
+  - type: "internal"
+    topic_name: fair
+    tutorials:
+      - fair-intro
+
+---
+
+
+# Introduction
+
+The FAIR (Findable, Accessible, Interoperable, Reusable) data stewardship principles created the foundation for sharing and publishing digital assets, especially data. They apply to machine accessibility and emphasize that all digital assets should be shared in a way that enables maximum use and reuse.
+
+This tutorial is a short introduction to the FAIR data management framework. You can find out more at the [FAIR Pointers](https://elixir-uk-dash.github.io/FAIR-Pointers/ep1/index.html) online course.
+
+> <agenda-title></agenda-title>
+>
+> In this tutorial, we will cover:
+>
+> 1. TOC
+> {:toc}
+>
+{: .agenda}
+
+# Data management planning
+In recent years we have noticed a data explosion. The number of sequence records in each release of [GenBank](https://www.ncbi.nlm.nih.gov/genbank/statistics/), from 1982 to the present, has doubled approximately every 18 months. The large amount of available data, together with an expanding range of tools and computational solutions, has contributed to a reproducibility crisis. Having a **data management plan (DMP)** in place is essential to achieve FAIR data management. DMPs are often described as living documents and should be updated according to changing circumstances.
+
+There are several ways to set up FAIR data management plans (DMPs). One is to map each part of the plan to a FAIR principle:
+  - Findable (F): Data description and collection or reuse of existing data
+  - Accessible (A): Standardised communication protocols (e.g. HTTP, HTTPS), with authentication or authorisation where necessary
+  - Interoperable (I): Documentation and data quality
+  - Reusable (R): Storage and backup supported by legal and ethical requirements
+
+## Data description and collection or reuse of existing data
+Legacy datasets from institutional repositories or digital library data collections can be FAIRified retrospectively for reuse. Support for data collection and development can be provided throughout the life cycle, followed by change management and capacity improvement.
+
+Multi-part FAIR research needs a way of being wrapped up, described and shared to promote the reuse of data. **Data sharing agreements** define the purpose of data sharing, reference roles and responsibilities, and specify legal requirements, e.g. for data security.
+
+An institutional aim should be to create an integrated view and context over fragmented resources using their **persistent identifiers (PIDs)** and **metadata**. To make datasets findable, these metadata need to be as widely available as possible.
+
+Enhancing reproducibility, quality and transparency by ensuring information flow and showcasing secondary use is also a part of data management. Promoting hands-on data experience and events builds a collaborative environment for reproducible science.
+
+
+## Documentation and data quality
+Having access to local knowledge and encouraging best practices at the departmental level is a smart way to offer direction on a variety of standards and methods. In order to implement FAIR data practices within an institution, resources and infrastructure are needed. To increase the possibility of data reuse, several FAIR requirements can be satisfied using freely available guidelines, e.g. [RDMkit](https://rdmkit.elixir-europe.org/), [FAIR Cookbook](https://faircookbook.elixir-europe.org/content/home.html) and the [ELIXIR-UK DaSH Fellowship](https://sites.google.com/view/navigation-portal-fellowship/home?authuser=0) initiative, and repositories, e.g. [Zenodo](https://zenodo.org/), [Harvard Dataverse](https://dataverse.harvard.edu/) and [figshare](https://figshare.com/).
+
+## Storage and backup
+Systems for storage, backup and collaboration depend upon technical infrastructure. The **'3-2-1 rule'**, a recommendation for saving three copies of the research data (two locally and one off-site), is a standard backup strategy for research data. Data, metadata, and other research artefacts, such as ontologies, software, documentation, and papers, must all be kept in locations where they are adequately safeguarded, backed up, and accessible to maximise their potential for reuse. Appropriate access management is essential, in addition to backup and restoration services that protect researchers against data loss, theft, malfunctioning computers or storage media, and accidental deletion or inadvertent alterations to the data.
+
+Repository services are the fundamental component of the infrastructure required for the FAIR research data lifecycle. They provide access to the data, a persistent identifier, and the descriptive metadata that supports interoperability. Repository services can include basic data storage, resource finding, managing access to and use of private information, facilitating peer review of information related to publications or services requiring digital preservation, and more.
+
+The [OpenAIRE](https://www.openaire.eu/opendatapilot-repository-guide) repository guide advises users to check the availability of a suitable repository in the following order:
+
+1. The most effective option (if available) is to maintain the data in accordance with acknowledged discipline-specific criteria using an established, dedicated (external) data archive or repository that caters specifically to the study domain.
+2. Making use of institutional data repositories is the second-best option.
+3. If none of those options is practical, a free data repository should be used.
+
+Up-to-date lists of available registered data repositories can be found at [re3data](https://www.re3data.org/) and [FAIRsharing](https://fairsharing.org/search?fairsharingRegistry=Database).
+
+## Legal and ethical requirements
+Institutional support networks (data stewards, ethics boards, IP, legal and financial offices) need to guide researchers in safeguarding data management responsibilities and resources.
+
+# Conclusion
+You will save time and resources by planning how to FAIRify your data in the early phases of your research endeavour. To put this into action, a data management plan (DMP) must be written. A DMP is also where you outline your data collection, storage, processing, sharing, and disposal procedures. Planning the management and FAIRification of your data reduces the possibility of issues down the road, whether they be practical, legal, or technical.
+
+Keep in mind that creating FAIR data is a complex process. Consider how you can make your data FAIR one step at a time at each stage of the creation, collection, documentation, storage, sharing, archiving, and preservation processes. Incorporating your data documentation lays the framework for the rest of your study planning. Imagine that you want to use a dataset generated by another researcher, and consider how you would like it to be found. We hope this quick introduction to FAIR data management solutions will help you improve not only your own experience with data but also that of others, through your guidance and FAIRified resources.
diff --git a/topics/fair/tutorials/fair-gtn/faqs/index.md b/topics/fair/tutorials/fair-gtn/faqs/index.md
@@ -0,0 +1,3 @@
+---
+layout: faq-page
+---
diff --git a/topics/fair/tutorials/fair-gtn/tutorial.bib b/topics/fair/tutorials/fair-gtn/tutorial.bib
@@ -0,0 +1,69 @@
+@misc{wiegers_luc_2019_3593258,
+  author       = {Wiegers, Luc and
+                  van Gelder, Celia W. G.},
+  title        = {{Illustration for "Ten simple rules for making
+                   training materials FAIR"}},
+  month        = dec,
+  year         = 2019,
+  publisher    = {Zenodo},
+  version      = {1.0},
+  doi          = {10.5281/zenodo.3593258},
+  url          = {https://doi.org/10.5281/zenodo.3593258}
+}
+
+@article{chevron2014metacognitive,
+  title={A metacognitive tool: Theoretical and operational analysis of skills exercised in structured concept maps},
+  author={Chevron, Marie-Pierre},
+  journal={Perspectives in Science},
+  doi = {10.1016/j.pisc.2014.07.001},
+  volume={2},
+  number={1-4},
+  pages={46--54},
+  year={2014},
+  publisher={Elsevier}
+}
+
+@article{mcmurry2017identifiers,
+  title={Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data},
+  author={McMurry, Julie A and Juty, Nick and Blomberg, Niklas and Burdett, Tony and Conlin, Tom and Conte, Nathalie and Courtot, M{\'e}lanie and Deck, John and Dumontier, Michel and Fellows, Donal K and others},
+  journal={PLoS biology},
+  doi = {10.1371/journal.pbio.2001414},
+  volume={15},
+  number={6},
+  pages={e2001414},
+  year={2017},
+  publisher={Public Library of Science San Francisco, CA USA}
+}
+
+@article{hiltemann2023galaxy,
+  title={Galaxy Training: A powerful framework for teaching!},
+  author={Hiltemann, Saskia and Rasche, Helena and Gladman, Simon and Hotz, Hans-Rudolf and Larivi{\`e}re, Delphine and Blankenberg, Daniel and Jagtap, Pratik D and Wollmann, Thomas and Bretaudeau, Anthony and Gou{\'e}, Nadia and others},
+  journal={PLOS Computational Biology},
+  doi = {10.1371/journal.pcbi.1010752},
+  volume={19},
+  number={1},
+  pages={e1010752},
+  year={2023},
+  publisher={Public Library of Science San Francisco, CA USA}
+}
+
+@misc{fair-training-materials,
+  url = {https://training.galaxyproject.org/training-material/faqs/gtn/fair_training.html},
+  note = {Accessed 2023-05-23},
+  title = {How does the GTN ensure our training materials are FAIR?}
+}
+
+@article{Garcia2020,
+  doi = {10.1371/journal.pcbi.1007854},
+  url = {https://doi.org/10.1371/journal.pcbi.1007854},
+  year = {2020},
+  month = may,
+  publisher = {Public Library of Science ({PLoS})},
+  volume = {16},
+  number = {5},
+  pages = {e1007854},
+  author = {Leyla Garcia and B{\'{e}}r{\'{e}}nice Batut and Melissa L. Burke and Mateusz Kuzak and Fotis Psomopoulos and Ricardo Arcila and Teresa K. Attwood and Niall Beard and Denise Carvalho-Silva and Alexandros C. Dimopoulos and Victoria Dominguez del Angel and Michel Dumontier and Kim T. Gurwitz and Roland Krause and Peter McQuilton and Loredana Le Pera and Sarah L. Morgan and P\"{a}ivi Rauste and Allegra Via and Pascal Kahlem and Gabriella Rustici and Celia W. G. van Gelder and Patricia M. Palagi},
+  editor = {Scott Markel},
+  title = {Ten simple rules for making training materials {FAIR}},
+  journal = {{PLOS} Computational Biology}
+}