Merge branch 'release-automation' into release-automation-workflow
rvyuha authored Jun 16, 2023
2 parents 0ff3015 + d5840ca commit e48ca22
Showing 1 changed file with 67 additions and 37 deletions.
104 changes: 67 additions & 37 deletions specs/release-automation/release-automation-specs.qmd
@@ -1,3 +1,9 @@
---
editor:
markdown:
wrap: sentence
---

# Release Automation

This document will go over the specifications for a software system to automate the release process for the Ottawa Data Model (ODM).
@@ -8,11 +14,17 @@ The primary audience for this document are software engineers who will be respon

## Context

Wastewater surveillance enables public health departments to monitor communities for possible outbreaks of different infectious diseases using wastewater samples, most notably the different variants of the COVID-19 virus. The ODM dictionary is an open source data model used to represent wastewater surveillance data with all its documentation available [online](https://github.com/Big-Life-Lab/PHES-ODM).
Wastewater surveillance enables public health departments to monitor communities for possible outbreaks of different infectious diseases using wastewater samples, most notably the different variants of the COVID-19 virus.
The ODM dictionary is an open source data model used to represent wastewater surveillance data with all its documentation available [online](https://github.com/Big-Life-Lab/PHES-ODM).

Practically, the dictionary is implemented as an Excel document. Although the main purpose of the excel sheet is to document the data model details in a machine actionable way, it also contains other sheets, for example data templates that make it easy for users to input their wastewater data.
Practically, the dictionary is implemented as an Excel document.
Although the main purpose of the Excel sheet is to document the data model details in a machine-actionable way, it also contains other sheets, for example data templates that make it easy for users to input their wastewater data.

Releasing a new version of the dictionary is a laborious process that requires converting the excel document to multiple output formats. In addition, the different release files need to be uploaded to multiple release locations. Details about the release process are available [online](https://odm.discourse.group/t/generation-of-tables-and-lists-from-the-odm-working-excel-file/99/7). The current manual process of implementing a release takes time away from the dictionary developers as well as is susceptible to errors. Automating this process would increase the quality of each release as well as give back time to the dictionary staff.
Releasing a new version of the dictionary is a laborious process that requires converting the Excel document to multiple output formats.
In addition, the different release files are uploaded to multiple release locations.
Details about the release process are available [online](https://odm.discourse.group/t/generation-of-tables-and-lists-from-the-odm-working-excel-file/99/7).
The current manual process of implementing a release takes time away from the dictionary developers and is susceptible to errors.
Automating this process would increase the release's quality, as well as give back time to the dictionary staff.

## User interactions

@@ -23,38 +35,51 @@ The user will interact with the software system in two ways:

## Software Constraints

* The software system will need to use GitHub actions as its continuous integration tool.
* The software system will need to be written in R or Python.
- The software system will use GitHub Actions as its continuous integration tool.
- The software system will be written in R or Python.

## Features

### RA-1: Triggering the Process

A user will need to manually triger the release process from the [Github Actions tab](https://github.com/Big-Life-Lab/PHES-ODM/actions) in the [PHES-ODM repo](https://github.com/Big-Life-Lab/PHES-ODM). The following inputs will need to be provided by the user:
A user will manually trigger the release process from the [GitHub Actions tab](https://github.com/Big-Life-Lab/PHES-ODM/actions) in the [PHES-ODM repo](https://github.com/Big-Life-Lab/PHES-ODM).
The user must provide the following inputs:

1. Link to the excel dictionary to use for the release. Currently, only links to an OSF repo are allowed.
2. The OSF personal access token to use. The system will need this to download the ODM dictionary file user to generate the release files.
1. Link to the Excel dictionary to use for the release. Currently, only links to an OSF repo are allowed.

### RA-2: Creating the Release Files
The Excel dictionary used for the release is in the OSF.io `Developer dictionaries/New version` folder (<https://osf.io/sxuaf/>). The developer's version of the Excel dictionary is used, i.e. `ODM_dev-dictionary-2.0.0.xlsx`.

The first step in each release is the creation of the different files that form that release. The files are created from the dictionary Excel document whose link is provided by the user an as input. In addition, the files tab in the document contains all the metadata needed for this step.
2. The OSF personal access token to use. The system will need this to gain access to the repo and perform operations on it.

The structure of the files tab is shown [below](./#files-sheet). Each row in the files sheet represents a file that has to be created in the release.
### RA-2: Creating the Release Files

The file name can be constructed using the [`name`](./#name) and [`type`](./#type) columns in the files sheet. The [`type`](./#type) column decides what the file extension should be, **.csv** for CSV files and **.xlsx** for excel files.
The first step in each release is the creation of the different files that form the develop copy of the dictionary.
The original copy of the Excel file is on the OHRI SharePoint.
The dictionary staff will manually copy the dictionary from SharePoint and upload the copy to the OSF.io `Developer dictionaries/New version` folder.
The files are created from this dictionary Excel document, whose link is provided by the user as an input.
In addition, the files tab in the document contains all the metadata needed for this step.

The [`part`](./#part) column determines where the contents of the file comes from, in other words what to fill the file with. The column should contain a reference to a part in the parts sheet or a set in the sets sheet. Concretly, it should match up with either a value in the `setID` column in the sets sheet, or a value in the `partID` column in the parts sheet.
The structure of the files tab is shown [below](./release-automation.qmd#files-sheet).
Each row in the files sheet represents a file to be created in the release.

When the column contains a reference to a part, the content of the file should be filled with the sheet in the dictionary that has the same name as that part.
When the column contains a reference to a set, the sheets in the dictionary with the same name as each part in the set should be added as a sheet in the file. The name of the sheets should match the name of the part it represents.
The file name can be constructed using the [`name`](./release-automation.qmd#name) and [`type`](./release-automation.qmd#type) columns in the files sheet.
The [`type`](./release-automation.qmd#type) column decides what the file extension should be, **.csv** for CSV files and **.xlsx** for Excel files.
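
The naming rule above can be sketched in Python as follows. This is a minimal illustration, not the implementation: the function name `build_file_name` and the `EXTENSIONS` mapping are hypothetical, and the sketch assumes the `name` value has already had any placeholders filled in.

```python
# Hypothetical mapping from the `type` column categories to file extensions.
EXTENSIONS = {"csv": ".csv", "excel": ".xlsx"}


def build_file_name(name: str, file_type: str) -> str:
    """Join a file's `name` value with the extension implied by its `type`."""
    key = file_type.strip().lower()
    if key not in EXTENSIONS:
        # Only the two categories listed for the `type` column are accepted.
        raise ValueError(f"Unsupported file type: {file_type!r}")
    return f"{name}{EXTENSIONS[key]}"
```

For instance, `build_file_name("parts", "csv")` would produce `parts.csv` under these assumptions.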

The [`addHeader`](./#addheaders) column allows the user to add a string as the first line in the file. Reasons for doing this are explained [in a discourse post](https://odm.discourse.group/t/generation-of-tables-and-lists-from-the-odm-working-excel-file/99/9). Each header should be added as a cell in the first row of the sheet.
The [`part`](./release-automation.qmd#part) column determines where the contents of the file come from, in other words what to fill the file with.
The column can contain an ID for a part or a set which should match up with a row in the parts sheet or sets sheet respectively.
When the column contains a reference to a part, the content of the file should be filled with the sheet in the dictionary that has the same name as that part.
When the column contains a reference to a set, the sheets in the dictionary with the same name as each part in the set should be added as a sheet in the file.
The name of each sheet should match the name of the part it represents.
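
One way to picture the part/set lookup is the sketch below. It rests on assumptions that this document does not confirm: that the parts sheet has been reduced to a collection of `partID` values, that the sets sheet has been reduced to a mapping from each `setID` to the part IDs it contains, and that a sheet's name equals its part ID. The function name `resolve_sheet_names` is hypothetical.

```python
from typing import Dict, Iterable, List


def resolve_sheet_names(
    part_ref: str,
    part_ids: Iterable[str],
    set_members: Dict[str, List[str]],
) -> List[str]:
    """Return the dictionary sheet names a release file should contain."""
    if part_ref in set_members:
        # A set reference expands to one sheet per part in the set.
        return list(set_members[part_ref])
    if part_ref in set(part_ids):
        # A part reference maps to the single sheet with the same name.
        return [part_ref]
    raise ValueError(f"{part_ref!r} matches neither a partID nor a setID")
```

Raising an error for an unmatched reference is a design choice of this sketch; the specification itself only states that the value should match a row in the parts or sets sheet.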

The [`addHeader`](./release-automation.qmd#addheaders) column allows the user to add a string as the first line in the file.
Reasons for doing this are explained [here](https://odm.discourse.group/t/generation-of-tables-and-lists-from-the-odm-working-excel-file/99/9).
Each header should be added as a cell in the first row of the sheet.

For example, consider the following release file,

| A | B |
|---|---|
| 1 | 2 |
| A | B |
|-----|-----|
| 1 | 2 |

If the value of the `addHeader` column is `version;1.1.0;name;John Doe`, then the release file would be modified as below,

@@ -84,23 +109,21 @@ Finally, for every new release any existing release branches need to be deleted

### RA-4: Deploying the files to OSF

Similar to deplying files to OSF, files whose [`destinations`](./#destinations) column contains the `osf` keyword need to be uploaded to OSF. The [`osfLocation`](./#osfLocation) folder identifies the path where the file should be uploaded.
Similar to deploying files to GitHub, files whose [`destinations`](./release-automation.qmd#destinations) column contains the `osf` keyword need to be uploaded to OSF.
The `osfLocation` column identifies the path where the file should be uploaded.

The deployment to OSF should take place only when the release branch on GitHub has been merged to `main`.

There are three states that need to be handled when deploying the files to OSF:

1. When there are no release files on OSF. This means that this is the first release of the dictionary and all the files should be created and put in their correct location.
2. When there is a previous release on OSF whose version is not the same as the new release.
2.1. If the previous release is newer than the new release, then an error should be thrown and the entire process should stop.
2.2: Otherwise, all the old files need to be moved to a sub folder within an archive folder. The name of the sub folder should be the previous release version. Within the sub folder, the previous release files should be placed in their old paths. From there, the new files should created and put in their correct location.
3. When there is a previous release on OSF whose version is the same as the new release.
All the old files should be deleted.
The new files should be created and put in their correct location.
1. When there are no release files on OSF. This means that this is the first release of the dictionary and all the files should be created and put in their correct location.
2. When there is a previous release on OSF whose version is not the same as the new release.
    2.1. If the previous release is newer than the new release, then an error should be thrown and the entire process should stop.
    2.2. Otherwise, all the old files need to be moved to a sub folder within an archive folder. The name of the sub folder should be the previous release version. Within the sub folder, the previous release files should be placed in their old paths. From there, the new files should be created and put in their correct location.
3. When there is a previous release on OSF whose version is the same as the new release. All the old files should be deleted. The new files should be created and put in their correct location.
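
The three states can be expressed as a small decision function. The sketch below is illustrative only: the function names and the returned action labels are hypothetical, and it assumes versions are dot-separated integers such as `2.0.0`, which this document does not state explicitly.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a version like '2.0.0' into (2, 0, 0) so versions compare numerically."""
    return tuple(int(piece) for piece in version.strip().split("."))


def decide_osf_action(previous: Optional[str], new: str) -> str:
    """Pick the deployment action for the OSF states described above."""
    if previous is None:
        # State 1: nothing on OSF yet, so this is the first release.
        return "create"
    previous_version, new_version = parse_version(previous), parse_version(new)
    if previous_version == new_version:
        # State 3: same version, so delete the old files and recreate them.
        return "replace"
    if previous_version > new_version:
        # State 2.1: the existing release is newer, so stop the process.
        raise ValueError(f"Existing release {previous} is newer than {new}")
    # State 2.2: archive the old files under their version, then create the new ones.
    return "archive_then_create"
```

Comparing tuples of integers rather than raw strings avoids ordering `1.10.0` before `1.9.0`.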

### RA-5: Trigger a PR in the PHES-ODM-Doc repo

Once the PR has been created in the PHES-ODM repo, a workflow should be trigged in the [PHES-ODM-Doc](https://github.com/Big-Life-Lab/PHES-ODM-Doc). This will allow the documentation repo to update itself with the new files.
Once the upload has been completed to all relevant destinations, a workflow should be triggered in the [PHES-ODM-Doc](https://github.com/Big-Life-Lab/PHES-ODM-Doc) repo.
This will allow the documentation repo to update itself with the new files.

### RA-6: Trigger a PR in the PHES-ODM-Validation repo

@@ -112,15 +135,19 @@ This section contains reference material used throughout the document.

### Sheet Data Types

This section goes over the data types that each column in a sheet can be encoded as. Although all sheet files, for example CSV and Excel, are read in as a string, these data types build on top of that encoding to simulate other data types. The data types are:
This section goes over the data types that each column in a sheet can be encoded as.
Although all sheet files, for example CSV and Excel, are read in as a string, these data types build on top of that encoding to simulate other data types.
The data types are:

#### string

#### templateString

A string with placeholders for data that will need to be filled in by a program. The placeholders are identified by opening and closing curly braces.
A string with placeholders for data that will need to be filled in by a program.
The placeholders are identified by opening and closing curly braces.

For example, consider the template string "The file version is {version}". It has only one variable, `version`, which will need to be filled in.
For example, consider the template string "The file version is {version}".
It has only one variable, `version`, which will need to be filled in.

The full list of allowed variables is documented in the [template variables section](./#template-variables).
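
Filling a template string could look like the following sketch. The function name `fill_template` is hypothetical, and the sketch assumes placeholder names consist only of letters, digits, and underscores. Treating an unknown placeholder as an error is also an assumption rather than something this document specifies.

```python
import re
from typing import Mapping

# Matches a placeholder such as {version}; the allowed name characters are an assumption.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def fill_template(template: str, variables: Mapping[str, object]) -> str:
    """Replace every {name} placeholder with its value from `variables`."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"Unknown template variable: {name}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)
```

With the example above, `fill_template("The file version is {version}", {"version": "2.0.0"})` yields `The file version is 2.0.0`.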

@@ -156,13 +183,15 @@ This variable should be set to the latest version in the version column

### Files Sheet

This section documents details about the different columns in the files sheet in the dictionary. This is the sheet that contains metadata used to build and deploy the release files.
This section documents details about the different columns in the files sheet in the dictionary.
This is the sheet that contains metadata used to build and deploy the release files.

Unless otherwise stated, all columns are required.

#### ID

The unique identifier for this file. Mainly used as the primary key for the sheet.
The unique identifier for this file.
Mainly used as the primary key for the sheet.

type: [string](./#string)

@@ -186,8 +215,8 @@ type: [categorical](./#categorical)

categories:

* excel
* csv
- excel
- csv

#### part

@@ -202,7 +231,8 @@ Validations:

#### addHeaders

The contents of an optional header row to add as the first line in the file. Each header should added as a cell in the first row.
The contents of an optional header row to add as the first line in the file.
Each header should be added as a cell in the first row.

type: [list](./#list) of [templateString](./#templatestring), [nullable](./#nullable)
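
As a rough sketch of how the header row might be applied, the function below prepends the header cells to a sheet held as a list of rows. It assumes list values are separated by semicolons, as the `version;1.1.0;name;John Doe` example in RA-2 suggests, and that any placeholders have already been filled in. The function name `add_header_row` is hypothetical.

```python
from typing import List, Optional


def add_header_row(rows: List[List[str]], add_headers: Optional[str]) -> List[List[str]]:
    """Return a copy of `rows` with the header cells added as a new first row."""
    if not add_headers:
        # The column is nullable, so an empty value leaves the sheet unchanged.
        return list(rows)
    header_cells = [cell.strip() for cell in add_headers.split(";")]
    return [header_cells] + list(rows)
```

Returning a new list instead of modifying `rows` in place keeps the original sheet contents available for other release files.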

@@ -239,4 +269,4 @@ type: [string](./#string), [nullable](./#nullable)

Validations:

* Required if one of the destinations is github
* Required if one of the destinations is github
