---
editor:
markdown:
wrap: sentence
---
# Release Automation
This document will go over the specifications for a software system to automate the release process for the Ottawa Data Model (ODM).
## Audience
The primary audience for this document is the software engineers who will be responsible for developing the system.
## Context
Wastewater surveillance enables public health departments to monitor communities for possible outbreaks of different infectious diseases using wastewater samples, most notably the different variants of the COVID-19 virus.
The ODM dictionary is an open source data model used to represent wastewater surveillance data with all its documentation available [online](https://github.com/Big-Life-Lab/PHES-ODM).
Practically, the dictionary is implemented as an Excel document.
Although the main purpose of the Excel document is to present the data model details in a machine-actionable format, it also contains other sheets, for example data templates that make it easy for users to input their wastewater data.
Releasing a new version of the dictionary is a laborious process that requires converting the Excel document to multiple output formats.
In addition, the different release files are uploaded to multiple release locations.
Details about the release process are available [online](https://odm.discourse.group/t/generation-of-tables-and-lists-from-the-odm-working-excel-file/99/7).
The current manual process of implementing a release takes time away from the dictionary developers and is susceptible to errors.
Automating this process would increase the release's quality, as well as give back time to the dictionary staff.
## User interactions
The user will interact with the software system in two ways:
1. **Triggering a release**: The user will use the GitHub Actions tab to start a new release. The steps are outlined in [this diagram](./trigerring-a-release.puml); and
2. **Merging a release**: Once the user is happy with the release changes, they can merge their release by merging the release PR. The steps are outlined in [this diagram](./merging-a-release.puml).
## Software Constraints
- The software system will use GitHub Actions as its continuous integration tool.
- The software system will be written in R or Python.
## Features
### RA-1: Triggering the Process
A user will manually trigger the release process from the [GitHub Actions tab](https://github.com/Big-Life-Lab/PHES-ODM/actions) in the [PHES-ODM repo](https://github.com/Big-Life-Lab/PHES-ODM).
The following inputs will need to be provided by the user:
1. Link to the Excel dictionary to use for the release. Currently, only links to an OSF repo are allowed.
The Excel dictionary used for the release is in the OSF.io `Developer dictionaries/New version` folder (https://osf.io/sxuaf/). The developer's version of the Excel dictionary is used, for example `ODM_dev-dictionary-2.0.0.xlsx`.
2. The OSF personal access token to use. The system will need this to gain access to the repo and perform operations on it.
### RA-2: Creating the Release Files
The first step in each release is the creation of the different files that form the develop copy of the dictionary.
The original copy of the Excel file is on the OHRI SharePoint.
The dictionary staff will manually copy the dictionary from SharePoint and upload the copy to the OSF.io `Developer dictionaries/New version` folder.
The files are created from this dictionary Excel document, whose link is provided by the user as an input.
In addition, the files tab in the document contains all the metadata needed for this step.
The structure of the files tab is shown [below](./release-automation.qmd#files-sheet).
Each row in the files sheet represents a file to be created in the release.
The file name can be constructed using the [`name`](./release-automation.qmd#name) and [`type`](./release-automation.qmd#type) columns in the files sheet.
The [`type`](./release-automation.qmd#type) column decides what the file extension should be, **.csv** for CSV files and **.xlsx** for Excel files.
The [`part`](./release-automation.qmd#part) column determines where the contents of the file come from, that is, what to fill the file with.
The column can contain an ID for a part or a set which should match up with a row in the parts sheet or sets sheet respectively.
When the column contains a reference to a part, the content of the file should be filled with the sheet in the dictionary that has the same name as that part.
When the column contains a reference to a set, the sheets in the dictionary with the same name as each part in the set should be added as a sheet in the file.
The name of each sheet should match the name of the part it represents.
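
As a rough illustration of the rules above, the following Python sketch derives the file name and the list of dictionary sheets for one row of the files sheet. The function name `plan_release_file`, the `set_members` mapping (set ID to the part IDs it contains), the example row values, and the assumption that the `name` column holds no file extension are all hypothetical and are not taken from the dictionary itself.

```python
# Hypothetical sketch: names and data shapes are assumptions, not ODM tooling.
EXTENSIONS = {"csv": ".csv", "excel": ".xlsx"}


def plan_release_file(row, set_members, version):
    """Return (file_name, sheet_names) for one row of the files sheet.

    row: dict holding the "name", "type" and "part" columns of the row.
    set_members: assumed mapping of set ID -> list of part IDs in that set.
    version: value substituted for the {version} placeholder in the name.
    """
    extension = EXTENSIONS[row["type"]]  # KeyError for an unknown file type
    file_name = row["name"].replace("{version}", version) + extension

    part = row["part"]
    if part in set_members:
        # A set becomes one sheet per part, which only an Excel file can hold.
        if row["type"] != "excel":
            raise ValueError(f"set '{part}' requires an excel file")
        sheet_names = list(set_members[part])
    else:
        # A single part maps to the dictionary sheet with the same name.
        sheet_names = [part]
    return file_name, sheet_names


row = {"name": "ODM_dictionary_{version}", "type": "excel", "part": "exampleSet"}
print(plan_release_file(row, {"exampleSet": ["parts", "sets"]}, "2.0.0"))
```

A real implementation would likely look the set members up in the sets sheet rather than receive them as a ready-made mapping.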
The [`addHeaders`](./release-automation.qmd#addheaders) column allows the user to add a header row as the first line in the file.
Reasons for doing this are explained [here](https://odm.discourse.group/t/generation-of-tables-and-lists-from-the-odm-working-excel-file/99/9).
Each header should be added as a cell in the first row of the sheet.
For example, consider the following release file,
| A | B |
|-----|-----|
| 1 | 2 |
If the value of the `addHeaders` column is `version;1.1.0;name;John Doe`, then the release file would be modified as below,
| version | 1.1.0 | name | John Doe |
|---------|-------|------|----------|
| A | B | | |
| 1 | 2 | | |
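
The header example above can be sketched in Python as follows. This is a minimal illustration, assuming the file contents are held as a list of rows (each a list of cell strings); the function name `add_header_row` and the padding of shorter rows with empty cells are assumptions made for the sketch.

```python
def add_header_row(rows, add_headers):
    """Prepend a header row built from a semicolon-separated addHeaders value.

    rows: list of rows, each a list of cell strings.
    add_headers: raw cell value such as "version;1.1.0;name;John Doe",
        or None / "N/A" when no header should be added.
    """
    if add_headers is None or add_headers == "N/A":
        return [list(row) for row in rows]

    header = add_headers.split(";")
    # Pad every row to the widest row so the result stays rectangular.
    width = max([len(header)] + [len(row) for row in rows])

    def pad(cells):
        return list(cells) + [""] * (width - len(cells))

    return [pad(header)] + [pad(row) for row in rows]


table = [["A", "B"], ["1", "2"]]
for line in add_header_row(table, "version;1.1.0;name;John Doe"):
    print(line)
```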
### RA-3: Deploying the files to GitHub
Once the release files have been built, they will need to be uploaded to their release destinations. The destinations for each file are encoded in the [`destinations`](./#destinations) column in the [`files`](./#files-sheet) sheet.
Files whose [`destinations`](./#destinations) column contains the `github` keyword will need to be uploaded to the [PHES-ODM repo](https://github.com/Big-Life-Lab/PHES-ODM). The [`githubLocation`](./#githubLocation) column identifies the path where the file should be uploaded.
The following two states will need to be handled:
1. When there are no release files on GitHub:
    - The files should be created and put in their correct locations.
    - A branch named `release-{version}` should be created from `main` and the files put in there.
    - A commit with the new files should be made, with the message `[BOT] release-{version}`.
    - A PR should be made from the new branch into `main`. The PR should be called `[BOT] Release {version}`.
2. When there is a release version on GitHub:
    1. If the previous release is newer than the new release, then an error should be thrown and the entire process should stop.
    2. Otherwise, all the old files need to be deleted, and then the same steps as in the first state need to be followed.
Finally, for every new release any existing release branches need to be deleted and their PRs need to be closed.
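
The decision logic for the GitHub states can be sketched in Python as below. This is only an illustration: the function names and the returned dictionary are hypothetical, and the sketch assumes versions are dot-separated integers such as `2.0.0`. It does not call the GitHub API; it only works out what would need to happen.

```python
def parse_version(version):
    """Turn "2.0.0" into (2, 0, 0) so versions compare numerically."""
    return tuple(int(piece) for piece in version.split("."))


def plan_github_release(new_version, existing_version=None):
    """Describe the GitHub steps for a release.

    existing_version is None when there are no release files on GitHub yet.
    Raises ValueError when the existing release is newer than the new one.
    """
    if existing_version is not None:
        if parse_version(existing_version) > parse_version(new_version):
            raise ValueError(
                f"existing release {existing_version} is newer than {new_version}"
            )
    return {
        # Old files only exist, and need deleting, when a release is present.
        "delete_old_files": existing_version is not None,
        "branch": f"release-{new_version}",
        "commit_message": f"[BOT] release-{new_version}",
        "pr_title": f"[BOT] Release {new_version}",
    }


print(plan_github_release("2.1.0", existing_version="2.0.0"))
```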
### RA-4: Deploying the files to OSF
Similar to deploying files to GitHub, files whose [`destinations`](./release-automation.qmd#destinations) column contains the `osf` keyword need to be uploaded to OSF.
The `osfLocation` column identifies the path where the file should be uploaded.
The deployment to OSF should take place only when the release branch on GitHub has been merged to `main`.
There are three states that need to be handled when deploying the files to OSF:
1. When there are no release files on OSF. This means that this is the first release of the dictionary and all the files should be created and put in their correct location.
2. When there is a previous release on OSF whose version is not the same as the new release:
    1. If the previous release is newer than the new release, then an error should be thrown and the entire process should stop.
    2. Otherwise, all the old files need to be moved to a subfolder within an archive folder. The name of the subfolder should be the previous release version. Within the subfolder, the previous release files should be placed in their old paths. From there, the new files should be created and put in their correct location.
3. When there is a previous release on OSF whose version is the same as the new release. All the old files should be deleted. The new files should be created and put in their correct location.
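
The three OSF states can be expressed as a small decision function. The Python sketch below is illustrative only: the function name, the action labels (`create`, `archive`, `replace`), and the archive folder name `archive` are assumptions, since the specification does not name the archive folder. Versions are assumed to be dot-separated integers.

```python
def plan_osf_release(new_version, existing_version=None, archive_root="archive"):
    """Decide how to deploy to OSF given the release already stored there.

    Returns a dict with the action to take and, when archiving, the folder
    the previous release files should be moved into.
    """
    def parse(version):
        return tuple(int(piece) for piece in version.split("."))

    if existing_version is None:
        # State 1: first release, just create the files.
        return {"action": "create", "archive_folder": None}

    old, new = parse(existing_version), parse(new_version)
    if old == new:
        # State 3: same version, delete the old files and recreate them.
        return {"action": "replace", "archive_folder": None}
    if old > new:
        # State 2.1: refuse to release over a newer version.
        raise ValueError(
            f"existing release {existing_version} is newer than {new_version}"
        )
    # State 2.2: archive the old files under the previous version, then create.
    return {
        "action": "archive",
        "archive_folder": f"{archive_root}/{existing_version}",
    }


print(plan_osf_release("2.1.0", existing_version="2.0.0"))
```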
### RA-5: Trigger a PR in the PHES-ODM-Doc repo
Once the upload has been completed to all relevant destinations, a workflow should be triggered in the [PHES-ODM-Doc](https://github.com/Big-Life-Lab/PHES-ODM-Doc) repo.
This will allow the documentation repo to update itself with the new files.
### RA-6: Trigger a PR in the PHES-ODM-Validation repo
Once the PR has been created in the PHES-ODM repo, a workflow will need to be triggered in the [PHES-ODM-Validation](https://github.com/Big-Life-Lab/PHES-ODM-Validation) repo to allow it to update to the new dictionary files.
## Reference
This section contains reference material used throughout the document.
### Sheet Data Types
This section goes over the data types that each column in a sheet can be encoded as.
Although the values in all sheet files, for example CSV and Excel, are read in as strings, these data types build on top of that encoding to simulate other data types.
The data types are:
#### string
#### templateString
A string with placeholders for data that will need to be filled in by a program.
The placeholders are identified by opening and closing curly braces.
For example, consider the template string "The file version is {version}".
It has only one variable, `version`, which will need to be filled in.
The full list of allowed variables is documented in the [template variables section](./#template-variables).
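
One possible way to fill in a template string is sketched below in Python. The function name `render_template` is hypothetical, and rejecting unknown variables with an error is an assumption of the sketch, not something the specification requires.

```python
import re

# Matches placeholders such as {version}: a name wrapped in curly braces.
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_template(template, variables):
    """Replace every {name} placeholder with the matching value.

    Raises KeyError when the template uses a variable that is not provided.
    """
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"unknown template variable: {name}")
        return str(variables[name])

    return PLACEHOLDER.sub(substitute, template)


print(render_template("The file version is {version}", {"version": "2.0.0"}))
```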
#### categorical
A column with only a certain number of allowed values.
For example, a categorical column that encodes the type of a pet could have the categories "dog" and "cat"
#### list
A column that encodes multiple values.
The multiple values are separated by a semicolon (;).
For example, a column that encodes the names of a person's pets could have the value "Roscoe;Amy".
#### nullable
An addon type that allows a column to have null values.
Null values are encoded as `N/A`
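
The `list` and `nullable` encodings could be decoded along the following lines. This Python sketch uses a hypothetical helper, `parse_cell`, and assumes that whitespace around list items should be trimmed and that `N/A` in a non-nullable column is an error; neither behaviour is stated in the specification.

```python
def parse_cell(raw, is_list=False, nullable=False):
    """Decode one raw cell string according to its declared data type.

    Returns None for a null cell, a list of strings for a list column,
    and the unchanged string otherwise.
    """
    if raw == "N/A":
        if not nullable:
            raise ValueError("N/A found in a non-nullable column")
        return None
    if is_list:
        # Assumption: surrounding whitespace is not part of a list item.
        return [item.strip() for item in raw.split(";")]
    return raw


print(parse_cell("Roscoe;Amy", is_list=True))
print(parse_cell("N/A", nullable=True))
```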
### Template Variables
#### version
The current release version
Can be obtained from the `version` column in the `summary` sheet in the dictionary
This variable should be set to the latest version in the version column
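
Picking the latest version needs a numeric comparison, because a plain string comparison would rank `1.9.0` above `1.10.0`. The Python sketch below assumes the `version` column has already been read into a list of dot-separated version strings; the function name `latest_version` is hypothetical.

```python
def latest_version(versions):
    """Return the highest version from a list of strings like "2.0.0"."""
    def as_numbers(version):
        # Compare piece by piece as integers rather than as text.
        return tuple(int(piece) for piece in version.split("."))

    return max(versions, key=as_numbers)


print(latest_version(["1.9.0", "1.10.0", "1.2.0"]))
```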
### Files Sheet
This section documents details about the different columns in the files sheet in the dictionary.
This is the sheet that contains metadata used to build and deploy the release files.
Unless otherwise stated, all columns are required
#### ID
The unique identifier for this file.
Mainly used as the primary key for the sheet.
type: [string](./#string)
#### label
Human readable description for the file
type: [string](./#string), [nullable](./#nullable)
#### name
The name of the file in the release
type: [templateString](./#templatestring)
#### type
The file type
type: [categorical](./#categorical)
categories:
- excel
- csv
#### part
The name of the part that identifies what sheet(s) from the dictionary should be included in the file
type: [string](./#string), [non-nullable](./#nullable)
Validations:
* The value should be set to a `set` or a `part`
* The value can be set to a `set` only if the [type](./#type) column is `excel`
#### addHeaders
The contents of an optional header row to add as the first line in the file.
Each header should be added as a cell in the first row.
type: [list](./#list) of [templateString](./#templatestring), [nullable](./#nullable)
#### destinations
Where the file will be uploaded to
type: [list](./#list) of [categorical](./#categorical), [non-nullable](./#nullable)
Categories
* osf
* github
Validations
* Has to have at least one destination
#### osfLocation
The path for the file on OSF
type: [string](./#string), [nullable](./#nullable)
Validations:
* Required if one of the destinations is osf
#### githubLocation
The path for the file on GitHub
type: [string](./#string), [nullable](./#nullable)
Validations:
* Required if one of the destinations is github
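
The validations listed for the files sheet can be gathered into a single check per row. The Python sketch below is a minimal illustration: the function name `validate_file_row` and the error messages are hypothetical, and it assumes each row has already been decoded, so that `destinations` is a list of strings and a null location is `None`.

```python
ALLOWED_TYPES = ("excel", "csv")
LOCATION_COLUMNS = {"osf": "osfLocation", "github": "githubLocation"}


def validate_file_row(row, part_ids, set_ids):
    """Return a list of validation error messages for one files-sheet row.

    part_ids and set_ids hold the IDs found in the parts and sets sheets.
    An empty list means the row passed every check.
    """
    errors = []
    if row["type"] not in ALLOWED_TYPES:
        errors.append("type must be excel or csv")

    part = row["part"]
    if part not in part_ids and part not in set_ids:
        errors.append("part must reference a part or a set")
    elif part in set_ids and row["type"] != "excel":
        errors.append("a set can only be used with the excel type")

    destinations = row["destinations"]
    if not destinations:
        errors.append("at least one destination is required")
    for destination in destinations:
        if destination not in LOCATION_COLUMNS:
            errors.append(f"unknown destination: {destination}")
        elif row.get(LOCATION_COLUMNS[destination]) is None:
            column = LOCATION_COLUMNS[destination]
            errors.append(f"{column} is required for destination {destination}")
    return errors


example = {"type": "csv", "part": "exampleSet", "destinations": ["osf"], "osfLocation": None}
print(validate_file_row(example, part_ids={"parts"}, set_ids={"exampleSet"}))
```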