Feedback from @AliHarp
Model inputs > Input modelling
- “Identify possible distributions. This is based on knowledge of the process being modelled, and by inspecting the data using time series plots and histograms.
Fit distributions to your data and compare goodness-of-fit. You can do this using a:
a. Targeted approach. Just test the distributions from step 1.
b. Comprehensive approach. Test a wide range of distributions.”
This is a little confusing: you say (b) is an option, but then say it's still important to do step 1. What about saying: Step 1: Identify candidate distributions (important because………….). Step 2: (a) Targeted approach – test only the distributions from step 1; (b) Comprehensive approach – test many distributions and validate against step 1 knowledge.
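For reference, a minimal sketch of what the targeted approach (2a) could look like in Python (the data, candidate shortlist, and use of scipy are assumptions here, not taken from the reviewed page):

```python
# Hypothetical sketch of the "targeted" approach: fit a shortlist of
# candidate distributions (from step 1) and compare goodness-of-fit.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
samples = rng.exponential(scale=5.0, size=500)  # stand-in for real data

# Candidates identified in step 1 (illustrative choices only)
candidates = {"exponential": stats.expon,
              "gamma": stats.gamma,
              "lognormal": stats.lognorm}

for name, dist in candidates.items():
    params = dist.fit(samples)                             # maximum likelihood fit
    result = stats.kstest(samples, dist.cdf, args=params)  # KS goodness-of-fit
    print(f"{name}: KS statistic={result.statistic:.3f}, p={result.pvalue:.3f}")
```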
- As you talk about situations when there is not enough data, can you provide practical guidance on ‘what is enough’, e.g. a subsection on ‘working with limited data’ with some examples, or a link to alternative guidance? This is common in healthcare.
- Where you discuss Poisson/exponential – maybe you need a sentence stating that these are mathematically equivalent, otherwise it may seem arbitrary, e.g. if arrivals follow a Poisson distribution, the time between arrivals follows an exponential distribution.
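A hypothetical numerical check of that duality (illustrative only, not from the reviewed text): if inter-arrival times are exponential with rate lam, counts per unit time are Poisson with mean lam.

```python
# Simulate exponential inter-arrival times and check that counts per
# unit time behave like a Poisson variable (mean and variance near lam).
import numpy as np

rng = np.random.default_rng(0)
lam = 4.0                                    # arrivals per time unit
gaps = rng.exponential(scale=1 / lam, size=200_000)
arrival_times = np.cumsum(gaps)

n_units = int(arrival_times[-1])             # whole time units covered
counts, _ = np.histogram(arrival_times, bins=np.arange(n_units + 1))

print(f"mean count per unit: {counts.mean():.2f} (expect {lam})")
print(f"variance of counts:  {counts.var():.2f} (Poisson implies {lam} too)")
```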
Model inputs > Input data management
- The FAIR code link doesn’t work.
- What is included in RAP? The first sentence is a bit confusing – maybe ‘should begin with the first step in your data processing workflow’.
- Input modelling code – as you talk about structuring code etc., should you mention where in the repo structure this should live?
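For example, one purely hypothetical layout (not the reviewed repo’s actual structure):

```
project/
├── inputs/              # processed parameter files (e.g. params.json)
├── input_modelling/     # distribution-fitting scripts or notebooks
├── src/
│   └── model/           # simulation code
└── tests/
```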
- Parameters – you say ‘you must share some parameters with your model….’ but hint that even they may be sensitive. Could you say:
WHY PARAMETERS MATTER FOR REPRODUCIBILITY
For others to run your model, they need parameter values. These could be real (ideal for full reproducibility) or synthetic (allows code testing, but with different results). Parameters are often less sensitive than patient-level data because they are aggregated values; however, in some cases they may be sensitive, in which case….
- A short flow map for Scenarios 1 and 2 (public/private repos) would help a lot. Also, where you say ‘you can’t split a repo into public and private sections’ – you can use a .gitignore to exclude sensitive files from a repo.
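For instance, a minimal .gitignore sketch (the paths here are hypothetical, just to show the idea):

```gitignore
# Keep sensitive files out of version control
inputs/raw/
data/patient_level/
*_identifiable.csv
```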
-
Second test question is a bit confusingly worded, maybe a typo.
Model inputs > Parameters from script
- Do we need to link to parameter validation? --> Linked for Python; removed R6, so N/A for R.
- Should we mention where they would live in the project structure? --> Added a link referring back to the package page, which explained the structure.
Model inputs > Parameters from file
- Maybe you could clarify when JSON is needed? E.g. use JSON when parameters have hierarchical relationships (such as distributions with multiple parameters); use CSV for simple key-value pairs.
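For illustration (hypothetical file contents, not from the reviewed page), hierarchical parameters suit JSON:

```json
{
  "arrivals": {"distribution": "exponential", "mean_iat": 5.0},
  "length_of_stay": {"distribution": "lognormal", "mu": 1.2, "sigma": 0.4}
}
```

versus flat key-value pairs in CSV:

```csv
parameter,value
n_beds,20
mean_iat,5.0
```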
Otherwise great!
Model inputs > Parameter validation
- This is missing the motivation/context that is really nice in the other sections, i.e. when and why validation matters – e.g. silent errors, plausible-but-wrong results, life/death decisions – i.e. risk management.
- Can you connect to the JSON parameters? --> The example runs on a parameter function that imports from JSON, so it is connected. Unless this is referring to the motivation? Could add that JSON and CSV are easy to edit and make errors in – but that is not a problem unique to them.
- What about a small callout on ‘helpful error messages’ – e.g. state what is wrong, suggest corrections? --> Have switched many checks to checkmate, which often generates error messages for you – have explained in the text that it is good for that purpose.
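For Python readers, a minimal sketch of the same idea (the parameter names and checks are hypothetical; checkmate is the R equivalent mentioned above):

```python
# Hypothetical validator: a "helpful" error states what is wrong, shows
# the offending value, and suggests a correction.
def validate_params(params: dict) -> None:
    n_beds = params.get("n_beds")
    if not isinstance(n_beds, int) or n_beds <= 0:
        raise ValueError(
            f"n_beds must be a positive integer, got {n_beds!r}. "
            "Check for decimals or typos in your parameter file."
        )
    mean_iat = params.get("mean_iat")
    if not isinstance(mean_iat, (int, float)) or mean_iat <= 0:
        raise ValueError(
            f"mean_iat must be a number greater than 0, got {mean_iat!r}. "
            "Inter-arrival times of zero or less are not meaningful."
        )

validate_params({"n_beds": 20, "mean_iat": 5.0})  # passes silently
```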
- The testing task is a bit open-ended – maybe provide a starter template? --> Have amended to tell them to use the examples on that page, or prior pages, as base templates.
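One possible starter template (hypothetical, using pytest and the validate_params sketch above; the reviewed page may use a different framework):

```python
# Starter template: each test feeds one bad parameter and checks that
# validation fails with an informative message.
import pytest

def test_rejects_negative_beds():
    with pytest.raises(ValueError, match="n_beds"):
        validate_params({"n_beds": -1, "mean_iat": 5.0})

def test_rejects_zero_iat():
    with pytest.raises(ValueError, match="mean_iat"):
        validate_params({"n_beds": 20, "mean_iat": 0})
```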