Experiments Proposal

Stephen R. Aylward edited this page Aug 5, 2020 · 2 revisions

The MONAI Working Groups on I/O, Data, and Transforms have devised a high-level approach to address the customization, reproducibility, and extensibility of MONAI.

Background

This proposal expands the concept of an Experiment so that a user can specify alternative implementations of MONAI methods to be used for that specific experiment. In particular, this experiment-by-experiment specification is meant to address the software aspect of customization, reproducibility, and extensibility. Thus, for any experiment, a user should be able to specify the exact version of ITK to be used for I/O, the exact revision of MONAI to be used, and/or alternative implementations of specific transforms. Such detailed specifications should not be required, but they should be feasible. Additionally, after an experiment has been run, it should be possible to generate a comprehensive specification of the software and versions used to run that experiment.

This proposal somewhat parallels and greatly extends the draft I/O proposal, #856. Perhaps the same object factory mechanism used to extend the image I/O capabilities of MONAI on an experiment-by-experiment basis can be used to extend other aspects of MONAI (e.g., its transforms and dataset loaders).

Proposal Goals and Anticipated Benefits

This proposal has two goals:

  1. Specify an experiment: Enable a user to define customizations of the MONAI default behavior without having to directly modify code within the MONAI repository.
  2. Record an experiment: Enable a user to record the specific Python environment in which an experiment was conducted.
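
As a rough illustration of the second goal, the sketch below records the interpreter version, the platform, and the installed versions of selected packages using only the Python standard library. The function name `record_environment` and the layout of the returned dictionary are assumptions made for this example; they are not part of MONAI or of this proposal.

```python
import json
import platform
import sys
from importlib import metadata


def record_environment(packages=None):
    """Return a JSON-serializable snapshot of the current Python environment.

    Hypothetical helper, not a MONAI API. If `packages` is None, every
    installed distribution is recorded; otherwise only the named ones are,
    and a package that is not installed is recorded as None.
    """
    versions = {}
    if packages is None:
        for dist in metadata.distributions():
            name = dist.metadata.get("Name")
            if name:
                versions[name] = dist.version
    else:
        for name in packages:
            try:
                versions[name] = metadata.version(name)
            except metadata.PackageNotFoundError:
                versions[name] = None
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": dict(sorted(versions.items())),
    }


if __name__ == "__main__":
    # Record only the libraries this page mentions, plus PyTorch.
    snapshot = record_environment(["monai", "torch", "itk"])
    print(json.dumps(snapshot, indent=2))
```

Saving such a snapshot next to an experiment's results would capture the software side of the record. Hardware details and code that is not installed as a package would need separate handling.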

Allowing an unmodified MONAI checkout to be maintained, while still allowing experiment-specific customizations, will have multiple benefits:

  • It should be possible to update the local copy of MONAI without having to re-apply customizations needed to execute an experiment.
  • It should be possible to share a local copy of MONAI across multiple experiments, even if those experiments use different implementations of certain transforms (e.g., one experiment uses a GPU-accelerated data augmentation transform).
  • It should be possible to share an experiment with others in a manner that lets them replicate that experiment more consistently.
  • It should be possible to more accurately document an experiment for publications, FDA applications, and teaching.
  • It should be possible to systematically explore experiment alternatives.

Feedback from the Community

Please give us your feedback on this proposal via the MONAI issue tracker: #857

Proposal Overview

The slides used to introduce and discuss this idea at the combined meeting of the I/O, Data, and Transform Working Groups are available here:

Experiments Proposal Slides

Proposal Details

1) Three driving concepts

The benefits of this proposal roughly translate into three broad categories:

  • Provides a layer of extensibility
  • Separates dataset from experiments
  • Promotes reproducibility (consider both software and hardware details)

This work is also a step in the direction of maintaining data provenance. It may not be a comprehensive solution to every use case of data provenance, but when combined with a specific dataset, an experiment should address many data provenance concerns.

2) Learn from others

We should carefully consider related works done by others. For example,

3) Keep it simple

It should be possible to use MONAI without defining an experiment.

  • We should not change the PyTorch/MONAI default way of doing things
  • An experiment specification (JSON file?) should be as simple or as complex as necessary to define an experiment, but never more complex than necessary.
  • An experiment specification should be "conversational" - it should be intuitive and concise
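
To make the "as simple or as complex as necessary" idea concrete, here is a minimal sketch of what a JSON experiment specification and its loader could look like. The proposal does not define a schema: the keys (`name`, `versions`, `overrides`), the version strings, and the class path `my_lab.transforms.GpuRandRotate` are all hypothetical placeholders chosen for illustration.

```python
import copy
import json

# Hypothetical defaults: an empty specification means "use stock MONAI".
DEFAULTS = {"name": "unnamed", "versions": {}, "overrides": {}}

# The simplest possible specification customizes nothing.
MINIMAL_SPEC = "{}"

# A more detailed specification pins versions and swaps one transform.
# All values here are illustrative placeholders only.
DETAILED_SPEC = """
{
  "name": "example-experiment",
  "versions": {"monai": "0.2.0", "itk": "5.1.0"},
  "overrides": {
    "transforms": {"RandRotate": "my_lab.transforms.GpuRandRotate"}
  }
}
"""


def load_experiment_spec(text):
    """Parse a JSON specification and fill in defaults for omitted keys."""
    spec = json.loads(text)
    if not isinstance(spec, dict):
        raise ValueError("An experiment specification must be a JSON object")
    unknown = sorted(set(spec) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown keys in experiment specification: {unknown}")
    merged = copy.deepcopy(DEFAULTS)
    merged.update(spec)
    return merged


if __name__ == "__main__":
    print(load_experiment_spec(MINIMAL_SPEC))
    print(load_experiment_spec(DETAILED_SPEC)["overrides"])
```

Because every key is optional, a user who wants only the default behavior never has to write more than an empty object, or no specification at all.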

4) Our primary customers (at this time) are medical AI researchers

  • There is a continuum of potential MONAI users, ranging from medical AI researchers to clinical researchers. This Experiments proposal is focused primarily on the needs of medical AI researchers.
    • Medical AI researchers conduct research into data augmentation methods, network adaptation to new problems/data, and alternative network architectures.
    • Alternative applications to consider include methods for pathology, video (surgical guidance), and bioinformatics.

5) Versioning

A major consideration is the versions of the libraries used.

  • An open question is what to do about libraries/code that are not under version control and yet are used in an experiment.

6) Implementation Considerations

It may be possible to build upon the object factory mechanism proposed for Image I/O to support the specification of alternative dataset and transform methods. This would involve using an object factory when transforms are composed and/or when transforms are called.
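
The idea of consulting an object factory when transforms are composed can be sketched as follows. This is not the mechanism from the I/O proposal, whose details are not given here. `TransformFactory`, `compose`, and the toy `Scale` and `ClampedScale` transforms are hypothetical names invented for this example, and the transforms operate on plain lists so that the sketch runs without MONAI installed.

```python
class TransformFactory:
    """Maps a transform name to the callable that constructs it.

    Hypothetical sketch: experiment-specific overrides take precedence
    over the defaults that a library would register.
    """

    def __init__(self):
        self._defaults = {}
        self._overrides = {}

    def register_default(self, name, constructor):
        self._defaults[name] = constructor

    def override(self, name, constructor):
        self._overrides[name] = constructor

    def create(self, name, **kwargs):
        constructor = self._overrides.get(name, self._defaults.get(name))
        if constructor is None:
            raise KeyError(f"No transform registered under {name!r}")
        return constructor(**kwargs)


def compose(factory, steps):
    """Build a pipeline from (name, kwargs) pairs, resolving names via the factory."""
    transforms = [factory.create(name, **kwargs) for name, kwargs in steps]

    def pipeline(data):
        for transform in transforms:
            data = transform(data)
        return data

    return pipeline


class Scale:
    """Toy default transform: multiply every value by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, data):
        return [x * self.factor for x in data]


class ClampedScale(Scale):
    """Toy alternative implementation that an experiment might swap in."""

    def __call__(self, data):
        return [min(x * self.factor, 1.0) for x in data]


if __name__ == "__main__":
    factory = TransformFactory()
    factory.register_default("Scale", Scale)
    steps = [("Scale", {"factor": 2.0})]
    print(compose(factory, steps)([0.25, 0.75]))  # uses the default Scale

    factory.override("Scale", ClampedScale)  # experiment-specific choice
    print(compose(factory, steps)([0.25, 0.75]))  # same steps, new implementation
```

In this sketch the pipeline description (`steps`) stays the same across experiments, and only the factory's overrides change. That is one way an unmodified MONAI checkout could be shared while each experiment selects its own implementations.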
