Evaluation, Reproducibility, Benchmarks Meeting 2
Date: 1st July 2020
Present: Lena Maier-Hein (Lead), Kevin Zhou (Lead), Carole Sudre, Dan Tudosiu, David Zimmerer, Jens Petersen, M. Jorge Cardoso, Michela Antonelli, Nicola Rieke, Paul Jäger, Annika Reinke
MONAI Evaluation / Reproducibility / Benchmarks
Every year, hundreds of new algorithms are published in the field of biomedical image analysis. While the validation of new methods was long based on private data, publicly available data sets and international competitions (‘challenges’) now allow algorithms to be benchmarked in a transparent and comparative manner. Recent research, however, has revealed several flaws in common validation practice. A core goal of the program is therefore to provide the infrastructure and tools for quality-controlled validation and benchmarking of medical image analysis methods. In collaboration with the international biomedical image analysis challenges (BIAS) initiative, open technical challenges and research questions on a variety of topics will be addressed, ranging from best practices (e.g. How to make a trained model public? How to enable reproducibility of a training process? Which metric to use for which application?) and political aspects (e.g. How to create incentives for sharing code and data?) to performance aspects (e.g. How to report memory/compute requirements? How to support the identification of performance bottlenecks in pipelines?) and implementation efficiency (e.g. How to provide baseline methods for comparative assessment?).
- General: Work on tools first, provide recommendations afterwards
- Priority 1:
  - Easy access to medical data sets
    - Taskforce: Michela (lead), Kevin, David
    - Contact challenge organizers from past years and explain what we are trying to achieve; ask whether they are willing to contribute their data sets and what they would need
  - Metric implementation
    - Taskforce: Carole (lead), Dan, Jens
    - Generate a list of metrics
    - Provide the equation and/or a reference for each metric (see the Dice sketch after this list)
    - Annika: Send the list of metrics used in biomedical challenges
  - Benchmarking
    - Generation of rankings, plots, etc. (see the ranking sketch after this list)
    - Link to the BIAS initiative (Lena, Annika)
- Priority 2:
  - Compute baseline methods
  - Model sharing
  - Definition of a MONAI challenge needed (data set, tasks, task-related recommended metrics)
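To make the metric implementation item above concrete, here is a minimal sketch of one such metric together with its equation and reference: the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|). The function name, mask conventions, and smoothing term are illustrative assumptions, not a settled MONAI API.

```python
# Minimal sketch of a Dice similarity coefficient (DSC) implementation.
# Equation: DSC = 2|A ∩ B| / (|A| + |B|)
# Reference: Dice, L. R. (1945), "Measures of the amount of ecologic
# association between species", Ecology 26(3).
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, smooth: float = 1e-8) -> float:
    """Dice score between two binary masks of equal shape (hypothetical helper)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    # smooth avoids division by zero when both masks are empty
    return (2.0 * intersection + smooth) / (pred.sum() + target.sum() + smooth)

# Example: two 2x2 squares overlapping in exactly one voxel
a = np.zeros((4, 4), dtype=bool); a[1:3, 1:3] = True
b = np.zeros((4, 4), dtype=bool); b[2:4, 2:4] = True
print(round(dice_coefficient(a, b), 3))  # 0.25 = 2*1 / (4 + 4)
```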
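Similarly, for the benchmarking item, a minimal "rank per case, then aggregate" sketch of ranking generation is shown below; the score table is made-up illustrative data, and mean-rank aggregation is just one of several schemes discussed in the challenge literature.

```python
# Sketch of "rank per case, then aggregate" ranking generation.
# The score table is made-up illustrative data, not real challenge results.
import pandas as pd

# rows = test cases, columns = algorithms, values = per-case Dice scores
scores = pd.DataFrame(
    {
        "algo_A": [0.91, 0.85, 0.88],
        "algo_B": [0.89, 0.90, 0.84],
        "algo_C": [0.80, 0.79, 0.82],
    },
    index=["case_1", "case_2", "case_3"],
)

# Rank algorithms within each case (higher Dice = better = rank 1),
# then aggregate across cases by mean rank.
per_case_ranks = scores.rank(axis=1, ascending=False)
mean_ranks = per_case_ranks.mean(axis=0).sort_values()
print(mean_ranks)  # algo_A 1.33, algo_B 1.67, algo_C 3.00
```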
Hi Lena, Jorge, Andy, Stephen,
Hope you are all doing well! Have you heard about fastMRI, a Facebook AI Research (FAIR) initiative in collaboration with NYU? My initial sense is that they are positioning it as a benchmark. I think MONAI can provide reference implementations for data loading, transforms, and a baseline network architecture; this is well aligned with our Benchmarking & Challenges working group. I also think it would be worthwhile to make them aware of the MONAI initiative and to learn about their codebase roadmap: going beyond an open dataset and benchmark, MONAI could be the domain-specialized training engine for their research.
If this seems well aligned, I can loop you into the meeting with their Research Engineer. Thoughts?
https://fastmri.org · GitHub repository · arXiv paper
- Pro: We would reach out to the CVPR research community
- Decision: Include them if they are willing to participate
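As a rough illustration of the "reference implementations for data loading, transforms, and a baseline network architecture" idea from the email above, a minimal MONAI pipeline sketch follows. The transform names and UNet arguments follow the current MONAI API, but the file list, channel/label conventions, and network configuration are placeholder assumptions that would need to be adapted to the actual fastMRI data format.

```python
# Sketch of a MONAI baseline pipeline (data loading + transforms + network)
# of the kind MONAI could contribute to a fastMRI-style benchmark.
from monai.data import DataLoader, Dataset
from monai.networks.nets import UNet
from monai.transforms import (
    Compose,
    EnsureChannelFirstd,
    LoadImaged,
    ScaleIntensityd,
)

# Placeholder file list; real benchmark data would come through the
# data set access layer discussed above.
files = [{"image": "case_000.nii.gz", "label": "case_000_seg.nii.gz"}]

transforms = Compose([
    LoadImaged(keys=["image", "label"]),           # read image + label from disk
    EnsureChannelFirstd(keys=["image", "label"]),  # add/verify channel dimension
    ScaleIntensityd(keys=["image"]),               # normalize intensities to [0, 1]
])

loader = DataLoader(Dataset(data=files, transform=transforms), batch_size=1)

# A small 3D UNet as a placeholder baseline architecture.
model = UNet(
    spatial_dims=3,
    in_channels=1,
    out_channels=2,
    channels=(16, 32, 64, 128),
    strides=(2, 2, 2),
    num_res_units=2,
)
```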
- How to standardize data?
- How to standardize a challenge (dataset)? --> Link to BIAS checklist: set up a call
- Challenge level taskforce: Annika, Lena, Carole
- Taskforces will report results in three weeks (22nd July 2020)
- Goal: Proposal for MONAI leads by end of July
- Move to MONAI slack workspace: Annika will send an email to everyone