
Welcome to Cookiecutter Deep Learning Template


Description

You are only looking at an example of the final CC-DL template. The branch prepare_for_CC is used to generate the template from this example. Please use https://github.com/Ivo-B/CC-DD-template to create your own project based on this example!

Requirements to use this project

  • python 3.9
  • poetry

You can use your favorite method to provide Python 3.9 for Poetry. We recommend https://github.com/pyenv/pyenv#installation, but you can also use https://docs.conda.io/en/latest/miniconda.html or similar.
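Whichever provider you use, a quick sanity check confirms that the interpreter Poetry will pick up is indeed Python 3.9. This is just a suggestion (plain Python, no extra dependencies); the file name is made up:

# check_python.py -- run with the interpreter you intend Poetry to use
import sys

expected = (3, 9)
actual = sys.version_info[:2]
assert actual == expected, f"Expected Python {expected}, got {actual} at {sys.executable}"
print(f"OK: Python {sys.version.split()[0]} at {sys.executable}")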

Install poetry:

# Poetry for Ubuntu
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/install-poetry.py | python -

How to use

Cloning is only needed for this example

# clone project
git clone https://github.com/Ivo-B/CC-DL-template-example
cd CC-DL-template-example

If you use conda to provide the correct Python version, installation and usage are slightly different:

# creates a conda environment
make environment
# run to activate env, install packages
source ./bash/finalize_environment.sh

When you use pyenv to provide Python, the virtual environment is installed in the project folder .venv:

poetry install
# activate the virtualenv (the path is .venv/bin/activate on Linux/macOS)
source ./.venv/Scripts/activate

The template contains examples for MNIST classification and Oxford-IIIT Pet segmentation.

The first step is only needed when cloning this example!

  1. Edit .env.example, set your PROJECT_DIR, and rename the file to .env.
  2. To download and prepare the data, use make data_mnist or make data_oxford.
  3. To run the classification example, use python run_training.py mode=exp name=exp_test.
  4. To run the segmentation example, use python run_training.py --config-name config_seg mode=exp name=exp_test. (A programmatic equivalent of these calls is sketched after this list.)
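If you prefer to drive the same runs from Python (e.g. for debugging or notebooks), the configs can also be composed with Hydra's compose API. This is only a sketch: it assumes the configs/ layout shown below and that the overrides accepted on the command line are equally valid here.

# Hydra >= 1.1; older versions expose compose/initialize under hydra.experimental
from hydra import compose, initialize

# initialize() points Hydra at the configs/ folder, relative to this file
with initialize(config_path="configs"):
    cls_cfg = compose(config_name="config", overrides=["mode=exp", "name=exp_test"])
    seg_cfg = compose(config_name="config_seg", overrides=["mode=exp", "name=exp_test"])

print(list(cls_cfg.keys()))
print(list(seg_cfg.keys()))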

Project Organization

Show project structure
├──.venv                        <- Local poetry environment
├── bash                        <- Helper shell scripts
│   ├── finalize_environment.sh     <- Activates the conda env and installs packages
│   └── init_development.sh         <- Initializes the development setup
├── cctest                      <- Source code used in this example.
│   ├── data                        <- Scripts to download or generate data
│   ├── dataloaders                 <- Scripts to handle and load the preprocessed data
│   ├── evaluation                  <- Scripts to do evaluation of the results
│   ├── executor                    <- Scripts to train, eval and test models
│   ├── models                      <- Scripts to define model architecture
│   ├── utils                       <- Utility scripts for callbacks, losses, and metrics
│   ├── visualization               <- Scripts to create exploratory and results-oriented visualizations
│   └── __init__.py                 <- Makes cctest a Python module
│
├── configs                     <- Hydra configuration files
│   ├── callback                    <- Callbacks configs
│   ├── datamodule                  <- Datamodule configs
│   │   └── data_aug                    <- Data augmentation pipeline configs
│   ├── experiment                  <- Experiment configs
│   ├── hparams_search              <- Hyperparameter search configs
│   ├── logger                      <- Logger configs
│   ├── mode                        <- Running mode configs
│   ├── model                       <- Model configs
│   ├── trainer                     <- Trainer configs
│   │   ├── loss                        <- Loss function configs
│   │   ├── lr_scheduler                <- Learning rate scheduler configs
│   │   ├── metric                      <- Evaluation metric configs  
│   │   └── optimizer                   <- Optimizer configs
│   ├── config.yaml                 <- Main project configuration file for classification
│   └── config_seg.yaml             <- Main project configuration file for segmentation
├── data                        <- Project data, organized by processing stage.
│   ├── external                    <- Data from third party sources.
│   ├── interim                     <- Intermediate data that has been transformed.
│   ├── processed                   <- The final, canonical data sets for modeling.
│   └── raw                         <- The original, immutable data dump.
├── docs                        <- A default Sphinx project; see sphinx-doc.org for details
│   ├── documents                   <- Project documents (models, references, reports).
│   │   ├── models                      <- Trained and serialized models, model predictions, 
│   │   │                                   or model summaries
│   │   ├── references                  <- Data dictionaries, manuals, and all other 
│   │   │                                   explanatory materials.
│   │   └── reports                     <- Generated analysis as HTML, PDF, LaTeX, etc.
│   │       └── figures                     <- Generated graphics and figures to be used in reporting
│   └── pages                       <- Sphinx documentation pages.
├── notebooks                   <- Jupyter notebooks. Naming convention is a number (for ordering),
│                                  the creator's initials, and a short `-` delimited description, e.g.
│                                  `1.0-jqp-initial-data-exploration`.
├── tests                       <- Tests for this project
│   ├── test_hydra                  <- Tests for the Hydra configs
│   └── test_datamodule             <- Tests for the datamodules
│
├── .env.example                <- Example environment file; copy to .env and set PROJECT_DIR
├── .editorconfig               <- file with format specification. You need to install
│                                   the required plugin for your IDE in order to enable it.
├── .gitignore                  <- file that specifies what should and should not be
│                                   committed to the repository.
├── .pre-commit-config.yaml     <- Configuration of the pre-commit hooks
├── CITATION.cff                <- Citation metadata for this repository
├── LICENSE                     <- License of this project
├── Makefile                    <- Makefile with commands like `make data_mnist`
├── poetry.toml                 <- poetry config file to install the environment locally
├── poetry.lock                 <- lock file for dependencies. It is used to install exactly
│                                   the same versions of dependencies on each build
├── pyproject.toml              <- The project's dependencies for reproducing the
│                                   analysis environment
├── README.md                   <- The top-level README for developers using this project.
├── run_training.py             <- Entry point that starts every training run
└── setup.cfg                   <- configuration file that is used by most tools in this project

Guide

How To Get Started

How it works

By design, every run is initialized by the run_training.py file. All modules are dynamically instantiated from the module paths specified in the config. Example model config:

_target_: cctest.model.modules.simple_dense_net.SimpleDenseNet
input_shape: [28,28,1]
lin1_size: 256
lin2_size: 256
lin3_size: 256
output_size: 10

Using this config we can instantiate the object with the following line:

model = hydra.utils.instantiate(config.model)

This allows you to easily iterate over new models!
Every time you create a new one, just specify its module path and parameters in the appropriate config file.
The whole pipeline managing the instantiation logic is placed in cctest/executor/training.py.
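To make the mechanism concrete, here is a minimal, self-contained sketch of what the instantiation does with such a config. DemoDenseNet and the inline config are illustrative stand-ins, not the template's actual code:

# demo_instantiate.py -- illustrative only; the real model classes live in cctest
from hydra.utils import instantiate
from omegaconf import OmegaConf


class DemoDenseNet:
    """Stand-in model; its constructor arguments mirror the config keys."""

    def __init__(self, input_shape, lin1_size, lin2_size, lin3_size, output_size):
        self.input_shape = input_shape
        self.output_size = output_size


# In the template this section would come from configs/model/*.yaml
config = OmegaConf.create({
    "model": {
        "_target_": "__main__.DemoDenseNet",  # module path of the class to build
        "input_shape": [28, 28, 1],
        "lin1_size": 256,
        "lin2_size": 256,
        "lin3_size": 256,
        "output_size": 10,
    }
})

# Hydra resolves `_target_` and passes the remaining keys as keyword arguments
model = instantiate(config.model)
print(type(model).__name__)  # -> DemoDenseNet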


Main Project Configuration

Location: configs/config.yaml
The main project config contains the default training configuration.
It determines how the config is composed when simply executing the command python run_training.py.
It also specifies everything that shouldn't be managed by experiment configurations.

Show main project configuration
# specify here default training configuration
defaults:
  - trainer: default.yaml
  - model: mnist_model.yaml
  - datamodule: mnist_datamodule.yaml
  - callback: default.yaml # set this to null if you don't want to use callback
  - logger: null # set logger here or use command line (e.g. `python run_training.py logger=wandb`)

  - mode: default.yaml

  - experiment: null
  - hparams_search: null

# path to original working directory
# hydra hijacks working directory by changing it to the current log directory,
# so it's useful to have this path as a special variable
# learn more here: https://hydra.cc/docs/next/tutorials/basic/running_your_app/working_directory
work_dir: ${hydra:runtime.cwd}

# path to folder with data
data_dir: ${work_dir}/data/

# pretty print config at the start of the run using Rich library
print_config: True

# disable python warnings if they annoy you
ignore_warnings: True
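For orientation, the composed config is consumed by an entry point decorated with @hydra.main. The following is a hypothetical sketch of such an entry point, not the template's actual run_training.py; it also shows how to recover the launch directory, since Hydra changes the working directory to the run's log folder (hence work_dir above):

# sketch of an entry point; see run_training.py for the real implementation
import hydra
from hydra.utils import get_original_cwd
from omegaconf import DictConfig, OmegaConf


@hydra.main(config_path="configs", config_name="config")
def main(config: DictConfig) -> None:
    # `config` is config.yaml composed with the defaults list and any CLI overrides,
    # e.g. `python run_training.py mode=exp name=exp_test logger=wandb`
    if config.get("print_config"):
        print(OmegaConf.to_yaml(config))

    # Hydra has changed the cwd to the run's log directory; the project
    # directory is still reachable:
    print("launched from:", get_original_cwd())

    # the actual training logic is delegated to cctest/executor/training.py


if __name__ == "__main__":
    main()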

Experiment Configuration

Location: configs/experiment
You should store all your experiment configurations in this folder.
Experiment configurations allow you to overwrite parameters from the main project configuration.

Simple example

# @package _global_

# to execute this experiment run:
# python run_training.py experiment=mnist_example_simple.yaml

defaults:
  - override /trainer: default.yaml
  - override /model: mnist_model_conv.yaml
  - override /datamodule: mnist_datamodule.yaml
  - override /callback: default.yaml
  - override /logger: null

# all parameters below will be merged with parameters from default configurations set above
# this allows you to overwrite only specified parameters

seed: "OxCAFFEE"

trainer:
  epochs: 5

model:
  conv1_size: 32
  conv2_size: 64
  conv3_size: 128
  conv4_size: 256
  output_size: 10

datamodule:
  batch_size: 64
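The merge behaviour can be illustrated with a few lines of OmegaConf. The default values below are made up for the example; Hydra performs this composition automatically when you pass experiment=...:

from omegaconf import OmegaConf

# hypothetical defaults, standing in for the trainer and datamodule default configs
defaults = OmegaConf.create({
    "trainer": {"epochs": 10, "learning_rate": 1e-3},
    "datamodule": {"batch_size": 32},
})
# the overrides from the experiment config above
experiment = OmegaConf.create({
    "trainer": {"epochs": 5},
    "datamodule": {"batch_size": 64},
})

merged = OmegaConf.merge(defaults, experiment)
print(merged.trainer.epochs)         # 5      (overridden by the experiment)
print(merged.trainer.learning_rate)  # 0.001  (kept from the defaults)
print(merged.datamodule.batch_size)  # 64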

Workflow

  1. Write your model (see simple_conv_net.py for an example; a minimal skeleton is sketched after this list)
  2. Write your datamodule (see mnist_datamodule.py for an example)
  3. Write your experiment config, containing the paths to your model and datamodule
  4. Run training with the chosen experiment config: python run_training.py mode=exp experiment=[your_config_name]
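For step 1, a new model only needs to expose its hyperparameters as constructor or factory arguments so that they can be set from a config file. A hypothetical TensorFlow sketch (file name, function name, and layer choices are illustrative, not the template's conventions):

# e.g. cctest/models/my_conv_net.py (hypothetical) -- the argument names become config keys
import tensorflow as tf


def build_my_conv_net(input_shape, conv1_size: int, output_size: int) -> tf.keras.Model:
    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.layers.Conv2D(conv1_size, 3, activation="relu")(inputs)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    outputs = tf.keras.layers.Dense(output_size, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

The matching model config would then set _target_ to the function's module path and list input_shape, conv1_size, and output_size as keys.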

Logs

Hydra creates a new working directory for every executed run.
By default, the logs have the following structure:

│
├── logs
│   ├── experiments                         # Folder for logs generated from single runs
│   │   ├── exp_test                        # Name of your experiment
│   │   │   ├── 2021-02-15_16-50-49         # Date and time of the run
│   │   │   │   ├── hydra_training          # Hydra logs
│   │   │   │   ├── checkpoints             # Training checkpoints
│   │   │   │   └── ...                     # Any other thing saved during training
│   │   │   ├── ...
│   │   │   └── ...
│   │   ├── ...
│   │   └── multiruns                       # Folder for logs generated from multiruns (sweeps)
│   │       ├── exp_test                    # Name of your experiment
│   │       │   ├── 2021-02-15_16-50-49     # Date and time of the run
│   │       │   │   ├── 0                   # Job number
│   │       │   │   │   ├── hydra_training  # Hydra logs
│   │       │   │   │   ├── checkpoints     # Training checkpoints
│   │       │   │   │   └── ...             # Any other thing saved during training
│   │       │   │   ├── 1
│   │       │   │   ├── 2
│   │       │   │   └── ...
│   │       │   ├── ...
│   │       │   └── ...
│   │       ├── ...
│   │       └── ...
│   ├── debug                               # Folder for logs generated from debug runs

Based on:

  • cookiecutter-deep-learning-template
  • cookiecutter-data-science
  • wemake-django-template
  • lightning-hydra-template
