Evaluations, Reproducibility, Benchmarks Working Group Meeting Notes
Every year, hundreds of new algorithms are published in the field of biomedical image analysis. While the validation of new methods was long based on private data, publicly available data sets and international competitions (‘challenges’) now allow algorithms to be benchmarked in a transparent and comparative manner. Recent research, however, has revealed several flaws in common validation practice. A core goal of the program is therefore to provide the infrastructure and tools for quality-controlled validation and benchmarking of medical image analysis methods. In collaboration with the international biomedical image analysis challenges (BIAS) initiative, open technical challenges and research questions will be addressed across a range of topics:

- Best practices: How to make a trained model public? How to enable reproducibility of a training process (see the sketch after this list)? Which metric to use for which application?
- Policy aspects: How to create incentives for sharing code and data?
- Performance aspects: How to report memory/compute requirements? How to support the identification of performance bottlenecks in pipelines?
- Implementation efficiency: How to provide baseline methods for comparative assessment?
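As a concrete illustration of the reproducibility question above, the sketch below shows one common way to pin the sources of randomness in a PyTorch-based training pipeline. This is a minimal, generic example and not an artifact of the working group; the function name `set_determinism` and the default seed are illustrative choices, and fully deterministic execution may additionally depend on library versions and hardware.

```python
import os
import random

import numpy as np
import torch


def set_determinism(seed: int = 0) -> None:
    """Seed the common sources of randomness for a PyTorch training run."""
    random.seed(seed)        # Python's built-in RNG (e.g. data shuffling)
    np.random.seed(seed)     # NumPy RNG (e.g. data augmentation, sampling)
    torch.manual_seed(seed)  # PyTorch CPU and CUDA RNGs

    # Prefer deterministic cuDNN kernels; disables the autotuner, which may
    # trade some speed for run-to-run reproducibility.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

    # Required by some CUDA operations to behave deterministically.
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
    # Warn (rather than fail) on ops without a deterministic implementation.
    torch.use_deterministic_algorithms(True, warn_only=True)
```

Calling `set_determinism(seed=42)` once at the start of a training script, and recording the seed alongside the code and data versions, is one widely used step toward making a training process reproducible.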