
Evaluations, Reproducibility, Benchmarks Working Group Meeting Notes


Every year, hundreds of new algorithms are published in the field of biomedical image analysis. While validation of new methods has long relied on private data, publicly available data sets and international competitions ('challenges') now allow algorithms to be benchmarked in a transparent and comparative manner. Recent research, however, has revealed several flaws in common validation practice. A core goal of the program is therefore to provide the infrastructure and tools for quality-controlled validation and benchmarking of medical image analysis methods. In collaboration with the international biomedical image analysis challenges (BIAS) initiative, open technical challenges and research questions related to a variety of topics will be addressed, ranging from best practices (e.g. How to make a trained model public? How to enable reproducibility of a training process? Which metric to use for which application?) and political aspects (e.g. How to create incentives for sharing code and data?) to performance aspects (e.g. How to report memory/compute requirements? How to support identification of performance bottlenecks in pipelines?) and implementation efficiency (e.g. How to provide baseline methods for comparative assessment?).

2020


Group Leads

Lena Maier-Hein

Kevin Zhou

Task forces

Metrics task force (Lead: Carole Sudre)

Quick data access task force (Lead: Michela Antonelli)

Benchmarking task force (Lead: Annika Reinke)
