Metric learning for worms nuclei.
Call: ./src/scripts/consolidate_worms_dataset -c default.toml -i path_to_30WormsImagesGroundTruthSeg
Creates .hdf datasets for each worm. Gets rid of worm names and mismatched label numbering by unifying these with universe.txt and worm_names.txt. Most settings are set in the .toml config file (check default.toml), including the worms dataset path, which is used here as the output path but as the input path throughout the rest of the project.
HDF keys:
The raw input is stored as [140x140x1166] uint8, without any modification (e.g. no normalization).
volumes/nuclei_seghyp: instance labeling, [140x140x1166] uint16; the labels carry no meaning, they are just numbers that distinguish between instances
matrix/con_seghyp: centers of nuclei, [max(nuclei_instances), 3] float32; each row corresponds to a label in volumes/nuclei_seghyp
volumes/gt_nuclei_labels: ground truth labels, [140x140x1166] uint16; the segmentations are the same as nuclei_instances, but the invalid segmentations are removed and the rest are relabeled according to universe.txt
matrix/gt_con_labels: same as con_instances but for gt_nuclei_labels; always of fixed size [559x3] float32, with missing labels stored as np.array([0.0, 0.0, 0.0])
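The relationship between a label volume and its center matrix can be sketched as follows. This is a minimal illustration, not the project's own code: the helper names `nuclei_centers`, `present_labels` and `load_worm` are hypothetical, and the mapping "row i holds label i + 1" is an assumption inferred from the [max(nuclei_instances), 3] shape. `load_worm` only reads the keys listed above and needs h5py installed.

```python
import numpy as np


def nuclei_centers(labels: np.ndarray) -> np.ndarray:
    """Return a [max_label, 3] float32 array of per-instance centroids.

    Assumption: row i holds the center of label i + 1 (label 0 is background).
    Labels that do not occur keep a row of zeros.
    """
    n = int(labels.max())
    centers = np.zeros((n, 3), dtype=np.float32)
    for lab in range(1, n + 1):
        coords = np.argwhere(labels == lab)  # voxel indices of this instance
        if coords.size:
            centers[lab - 1] = coords.mean(axis=0)
    return centers


def present_labels(gt_centers: np.ndarray) -> np.ndarray:
    """Row indices whose center is not the all-zero 'missing' marker."""
    return np.flatnonzero(np.any(gt_centers != 0.0, axis=1))


def load_worm(path: str) -> dict:
    """Read the documented keys from one worm file (requires h5py)."""
    import h5py  # imported lazily so the helpers above work without it

    keys = [
        "volumes/nuclei_seghyp",
        "matrix/con_seghyp",
        "volumes/gt_nuclei_labels",
        "matrix/gt_con_labels",
    ]
    with h5py.File(path, "r") as f:
        return {k: f[k][...] for k in keys}


# Tiny synthetic volume standing in for a real [140x140x1166] one.
toy = np.zeros((4, 4, 4), dtype=np.uint16)
toy[1, 1, 1] = 1
toy[2:4, 2:4, 2:4] = 3  # label 2 is deliberately absent
toy_centers = nuclei_centers(toy)
```

With the fixed-size [559x3] ground truth matrix, `present_labels` gives the rows that actually carry a nucleus, so downstream code can skip the zero-filled entries.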
Call: ./src/scripts/consolidate_cpm_dataset -c default.toml -i path_to_root_kolmogorov_sol_format_both_directions -i2 path_to_nucleinames_corresponding_to_QAP_sols_labeling -i3 path_to_nuclei_name_labels_in_30WormsImageGroundTruthInstanceSeg
Creates the cpm dataset, by default in ./data/processed (defined in default.toml). The output is a .pkl file containing a dictionary whose keys are '{w1id}-{w2id}' with w1id<w2id and whose values are dicts of consistent pairwise matchings.
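Because the keys are ordered (w1id<w2id), a lookup for an arbitrary pair of worms has to sort the two ids first. The sketch below illustrates this under stated assumptions: worm ids are taken to be integers, the helper `pair_key` is hypothetical, and the inner dict layout (label in the first worm mapped to label in the second) is an invented stand-in, since the actual structure of a matching is not described here.

```python
import pickle


def pair_key(a: int, b: int) -> str:
    """Build the '{w1id}-{w2id}' key with the smaller id first."""
    if a == b:
        raise ValueError("a pair needs two different worms")
    w1, w2 = sorted((a, b))
    return f"{w1}-{w2}"


# Hypothetical stand-in for the real .pkl contents.
toy_dataset = {
    "3-7": {12: 15, 40: 38},
    "3-9": {12: 11},
}

# Round-trip through pickle; the real file would be read with pickle.load.
restored = pickle.loads(pickle.dumps(toy_dataset))
matching = restored[pair_key(7, 3)]  # order of the arguments does not matter
```

If the ids turn out to be strings, sorting would be lexicographic rather than numeric, so the key construction would need to match whatever the consolidation script does.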
convnet_models: (not in use for now) a conventional VGG-style network with some conv layers followed by some fc layers, for extracting embeddings from patches.
unet: U-Net model for pixel-wise embeddings.
├── <- The top-level README for developers using this project.
├── data
│ ├── external <- Data from third party sources.
│ ├── interim <- Intermediate data that has been transformed.
│ ├── processed <- The final, canonical data sets for modeling.
│ └── raw <- The original, immutable data dump.
├── models <- Trained and serialized models, model predictions, or model summaries
├── experiments <- keep experiment results
├── experiments_cfg <- config files to reproduce experiments
├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
│ the creator's initials, and a short `-` delimited description, e.g.
│ `1.0-jqp-initial-data-exploration`.
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
│ generated with `pip freeze > requirements.txt`
├── <- makes project pip installable (pip install -e .) so src can be imported
├── src <- Source code for use in this project.
│ ├── lib <- useful code for the project
│ │ │
│ │ ├── data <- data related code, creating dataset code chunks, or typical DL datasets
│ │ │
│ │ ├── modules <- model definition, together with train, fit, evaluate methods if it requires
│ │ │ specialized code
│ │ ├── utils <- general utility functions, summary writing, plotting and visualizations
│ │
│ └── scripts <- end-point scripts for different tasks
│ │ └──
│ │ └──
│ │ └──
│ │
│ └── test <- scripts that help with writing and testing code; not typical unit tests, but something similar
Project based on the cookiecutter data science project template. #cookiecutterdatascience