Dermatology



This is an image data repository. It complements the augmentation and derma repositories: the augmentation repository augments the images of this repository, whilst the derma repository holds models that use the augmentations.


The Original Data

The data is courtesy of the International Skin Imaging Collaboration (ISIC). It is a set of dermoscopic images of skin lesions: specifically, the images of the ISIC 2019 Challenge, i.e.,


| file | description | size |
| --- | --- | --- |
| ISIC_2019_Training_Input.zip | 25,331 JPEG images of skin lesions | ~9 GB |
| ISIC_2019_Training_Metadata.csv | 25,331 metadata entries of age, sex, general anatomic site, and common lesion identifier | 1.15 MB |
| ISIC_2019_Training_GroundTruth.csv | 25,331 entries of gold standard lesion diagnoses | 1.23 MB |

The images are either the same as those hosted by the ISIC Archive API or down-sampled versions. The data set outlined below might be used if the ground truths are released.


To ensure availability, the contents of ISIC_2019_Training_Input.zip are in the directory data/images, whilst copies of ISIC_2019_Training_Metadata.csv and ISIC_2019_Training_GroundTruth.csv are stored in the directory data.
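The layout above lends itself to a simple join of the two CSV files with the image paths. The following is a minimal sketch, not part of this repository: the function name `load_labelled_records` is hypothetical, and it assumes that both CSV files key their rows on a column named `image` and that each image is stored as `data/images/<image>.jpg`. Verify both assumptions against the files before relying on it.

```python
import csv
from pathlib import Path


def load_labelled_records(data_dir):
    """Join metadata and ground truth rows on the shared image identifier."""
    data_dir = Path(data_dir)

    # Assumption: both files identify an image via a column named "image".
    with open(data_dir / "ISIC_2019_Training_Metadata.csv", newline="") as handle:
        metadata = {row["image"]: row for row in csv.DictReader(handle)}

    records = []
    with open(data_dir / "ISIC_2019_Training_GroundTruth.csv", newline="") as handle:
        for truth in csv.DictReader(handle):
            name = truth["image"]
            # Start from the metadata row, if any, then add the diagnosis columns.
            record = dict(metadata.get(name, {}))
            record.update(truth)
            # Assumption: images are stored as data/images/<image>.jpg
            record["path"] = str(data_dir / "images" / f"{name}.jpg")
            records.append(record)
    return records
```

Calling `load_labelled_records("data")` from the repository root would then return one dictionary per ground truth entry. All values are strings, because the `csv` module does not convert types.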



Augmentations

Augmented versions of the images in ISIC_2019_Training_Input.zip are created via the augmentation package. The package

  • ensures that all images are of the same size; the size is determined by the models
  • creates rotated forms of most images
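The two steps above can be illustrated with Pillow. This is a rough sketch under stated assumptions, not the augmentation package's actual code: the target size `(224, 224)`, the angles, and the function name `augment` are placeholders chosen for illustration, since the source only says that the models determine the size and that most images are rotated.

```python
from PIL import Image

# Assumptions: the size and angles are illustrative placeholders only.
TARGET_SIZE = (224, 224)
ANGLES = (90, 180, 270)


def augment(image, size=TARGET_SIZE, angles=ANGLES):
    """Return the resized image followed by its rotated forms."""
    # A plain resize ignores the aspect ratio; a real pipeline might crop or pad first.
    base = image.convert("RGB").resize(size)
    # Rotating a square image by a multiple of 90 degrees keeps its dimensions.
    return [base] + [base.rotate(angle) for angle in angles]
```

For example, `augment(Image.open("data/images/<image>.jpg"))` would yield four images of identical size: the resized original and three rotations.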

The augmentations are stored in augmentations/images. The images are zipped, and their metadata is summarised in augmentations/inventory.csv.
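One way to work with the zipped augmentations is sketched below, using only the Python standard library. The helper names `read_inventory` and `extract_archives` are hypothetical. The sketch assumes nothing about the inventory's columns or the archive names: it reads whatever header the CSV file provides and extracts every `.zip` file it finds.

```python
import csv
import zipfile
from pathlib import Path


def read_inventory(path):
    """Return the inventory rows as dictionaries keyed by the file's own header."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def extract_archives(zip_dir, target_dir):
    """Extract every .zip archive in zip_dir into target_dir; return member names."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(zip_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as bundle:
            bundle.extractall(target_dir)
            extracted.extend(bundle.namelist())
    return extracted
```

A typical call might be `extract_archives("augmentations/images", "unzipped")` followed by `read_inventory("augmentations/inventory.csv")`, where `unzipped` is an arbitrary output directory.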



Copyright and Attribution

Details: https://challenge2019.isic-archive.com/data.html

The images and metadata of the "ISIC 2019: Training" data used herein are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC-BY-NC). The copyright holders are:


References

  1. P. Tschandl, C. Rosendahl, H. Kittler: The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions, Scientific Data, Volume 5, Article Number: 180161, 2018, doi:10.1038/sdata.2018.161
  2. Noel C. F. Codella, David Gutman, M. Emre Celebi, Brian Helba, Michael A. Marchetti, Stephen W. Dusza, Aadi Kalloo, Konstantinos Liopyris, Nabin Mishra, Harald Kittler, Allan Halpern: Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC), 2018, arXiv:1710.05006
  3. Noel Codella, Veronica Rotemberg, Philipp Tschandl, M. Emre Celebi, Stephen Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael A. Marchetti, Harald Kittler, Allan Halpern: Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC), 2019, arXiv:1902.03368
  4. Marc Combalia, Noel C. F. Codella, Veronica Rotemberg, Brian Helba, Veronica Vilaplana, Ofer Reiter, Cristina Carrera, Alicia Barreiro, Allan C. Halpern, Susana Puig, Josep Malvehy: BCN20000: Dermoscopic Lesions in the Wild, 2019, arXiv:1908.02288