Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add integration-testing blog post #378

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Conversation

joshwlambert
Copy link
Member

@joshwlambert joshwlambert commented Mar 20, 2025

  • The post specifies a license if you don't want to use the default CC BY
  • All authors have an ORCID iD
  • Relevant keywords / tags has been added. In particular, if you want your post to be shared on R-bloggers, you must tag it with R
  • Images or other external resources have been committed and pushed
  • The post uses pure quarto syntax, rather than HTML or R code, unless necessary
    R code is included in the post but is not executed.

Right before merging:

  • The date field has been updated
  • All reviewers have been acknowledged in a short paragraph
  • A PR has been opened in the blueprints to link to this post
  • The post has been re-rendered and content of the _freeze/ folder is up-to-date

This PR adds a blog post on integration testing in Epiverse-TRACE. It provides an overview of why we write these tests, how we write them and an example of what the tests look like.

As we are only starting out writing these types of tests, and in the hope that this will help team members internally, please add any information that you think will help set a good standard for integration testing in Epiverse.

This is just an early draft of the blog post, so please give feedback on how it can be improved. @Bisaloo and @Karim-Mane please feel free to add yourselves as authors if you'd like to contribute as an author rather than a reviewer.

A couple of questions:

  1. @Bisaloo do you think it is worth including the figure on slide 10 of your interoperability strategy in this post? I'm not sure if just stating the 3 principles is clear enough.
  2. @Karim-Mane are you happy with the description of {cleanepi} in the post, and the example integration test code?

/schedule 2025-04-14T08:01

Copy link

netlify bot commented Mar 20, 2025

Deploy Preview for tourmaline-marshmallow-241b40 ready!

Name Link
🔨 Latest commit d7bdad0
🔍 Latest deploy log https://app.netlify.com/sites/tourmaline-marshmallow-241b40/deploys/67e3da67be9a350008a174c7
😎 Deploy Preview https://deploy-preview-378--tourmaline-marshmallow-241b40.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@joshwlambert joshwlambert requested a review from Karim-Mane March 20, 2025 14:26
@joshwlambert joshwlambert marked this pull request as draft March 20, 2025 14:30
@joshwlambert
Copy link
Member Author

Converting to draft until getting some early feedback on the post to make clear this is still a work in progress.

In comparison to commonly used _unit testing_, which looks to isolate and test specific parts of a software package, e.g. a function; _integration testing_ is the testing of several components of software, both within and between packages. Therefore, integration testing can be used to ensure interoperability is maintained while one or multiple components in pipelines are being developed. [Continuous integration](https://en.wikipedia.org/wiki/Continuous_integration) provides a way to run these tests before merging, releasing, or deploying code.
:::

## How we setup integration testing in Epiverse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @joshwlambert for this impressive blog. It looks good to me. Below are some of my comments or ideas:

  • Might be good to define an analysis pipeline you may have in mind, then define its steps (as a directed acyclic graph for example). You will then specify that this blog post only covers how to achieve integration testing between step 1 and N. That way, you could easily expand this topic to the other parts of your pipeline in this document or another one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a mention to pipelines as directed acyclic graphs in 6066440. Let me know if there is any other information on this point you'd like adding.


## Case study of interoperable functionality using {simulist} and {cleanepi}

The aim of {simulist} is to simulate outbreak data, such as line lists or contact tracing data. By default it generates complete and accurate data, but can also augment this data to emulate empirical data via post-processing functionality. One such post-processing function is `simulist::messy_linelist()`, which introduces a range of irregularities, missingness, and type coercions to simulated line list data. Complementary to this, the {cleanepi} package has a set of cleaning functions that standardised tabular epidemiological data, recording the set of cleaning operations run by compiling a [report and appending it to the cleaned data](https://epiverse-trace.github.io/cleanepi/articles/design_principle.html#output).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be good to specify what integration testing approach you are using in this example. My understanding is that you are using a big-bang approach according to this. But will let you define this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure whether the approach shown in the blog post is big-bang integration testing. My understanding is that that is a testing approach for all components in one go. Whereas our testing blocks each test a different aspect of functionality. But to be honest, reading about this on various websites, their examples are so abstract that it's hard to map each integration testing concept/approach to what we are doing.

If you can are sure which integration testing approach we're using and where to add it to the text, please add a suggested change on this PR and I will commit it.

Copy link
Member

@chartgerink chartgerink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks 🙏 Great idea, and definitely worth a blog. 😊

Is the epiverse package also taking these contributions as tests already? I couldn't find this test there for example, even though the blog mentions it is a good place to put those.

author:
- name: "Joshua W. Lambert"
orcid: "0000-0001-5218-3046"
date: "2025-03-20"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pick any monday in the future and add it in the original PR post to ensure it gets merged then.

/schedule 2025-??-??

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a scheduled merge for 14th April, but the merge_schedule GitHub actions workflow has failed. Potentially from my PAT not having the required credentials. Can you take a quick look, it might work if you add the schedule keyword?

Copy link
Member

@Bisaloo Bisaloo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great blog post: important topic and well-written. Thanks!

My only potential comment is that the example here (simulist & cleanepi) is not the strongest & clearest example you could use to talk about integration testing.

In this type of tests, I would expect things like "does simulist accept inputs from epiparameter?". This would allow detecting when potential breaking changes break a previously working complex pipeline.

Here, it feels (at least to me) slightly more like you are testing cleanepi functionality rather than ensuring that it can take inputs from simulist.

On the other hand, it's the only example we have at the moment so feel free to decide for yourself if you want to go with this or create another one.

@joshwlambert
Copy link
Member Author

Is the epiverse package also taking these contributions as tests already? I couldn't find this test there for example, even though the blog mentions it is a good place to put those.

@chartgerink I hadn't opened a PR on {epiverse} when opening this PR, but it is now open at epiverse-trace/epiverse#7, apologies for the confusion.

@joshwlambert
Copy link
Member Author

In this type of tests, I would expect things like "does simulist accept inputs from epiparameter?". This would allow detecting when potential breaking changes break a previously working complex pipeline.

@Bisaloo I can write a couple of integration tests for {epiparameter} & {simulist} and then we can see which is a better example to include in the blog post before publishing.

Here, it feels (at least to me) slightly more like you are testing cleanepi functionality rather than ensuring that it can take inputs from simulist.

I agree, but this might be a result of the close design compatibility between the packages, a sort of the "mark of good design is that you don’t notice it" effect (or at least that's what I'm telling myself 😆)

@joshwlambert joshwlambert marked this pull request as ready for review March 26, 2025 10:54
@joshwlambert
Copy link
Member Author

Thanks everyone for your comments, I've moved this PR to ready for review.

Next steps are:

  1. I will draft some integration tests for {epiparameter} and {simulist} to see if they better showcase integration testing for this blog post.
  2. Proof read the post and improve wording
  3. Incorporate any further comments

I've set the merge scheduler to 14th April, so we have a couple of weeks to work through this before merging. It will also give us time to merge epiverse-trace/epiverse#7 which is a blocker for this post. If you could approve this PR prior to 14th April that would be great just to make sure you're happy for it to be published.

@joshwlambert
Copy link
Member Author

If we can address epiverse-trace/epiverse#8 it would be good to mention it in this blog post to show how we're maintaining interoperability.

@joshwlambert
Copy link
Member Author

Some initial integration tests for {epiparameter} and {simulist} are added on the epiparameter-simulist-tests branch of {epiverse}.

Again these are simple two step integration tests. They are black box tests as all they are currently testing is whether the output of simulist::sim_*() has the correct properties.

@Bisaloo
Copy link
Member

Bisaloo commented Mar 26, 2025

Some initial integration tests for {epiparameter} and {simulist} are added on the epiparameter-simulist-tests branch of {epiverse}.

Again these are simple two step integration tests. They are black box tests as all they are currently testing is whether the output of simulist::sim_*() has the correct properties.

Seems good to me. I would also add an expect_no_condition() around the simulist calls.

@chartgerink
Copy link
Member

chartgerink commented Apr 1, 2025

@joshwlambert - RE the merge schedule - I think the SUDO_GITHUB_TOKEN used in the workflow may have expired there are some issues - I will figure it out and manually merge on the 14th if necessary 👍

@Bisaloo - did you add this one? If so, did it expire? My tokens are still valid and named otherwise

@chartgerink
Copy link
Member

Merge Schedule
Scheduled to be merged on 2025-04-14 08:01:00 (UTC)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants