A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
- Installation
- Building from source
- Usage example
- Development Environment Setup
- Release process
- Release history
- License
- Research paper
- Contributing
Habitat is a tool that predicts a deep neural network's training iteration execution time on a given GPU. It currently supports PyTorch. To learn more about how Habitat works, please see our research paper.
To run Habitat, you need:
- Python 3.6+
- PyTorch 1.1.0+
- A system equipped with an NVIDIA GPU
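The requirements above can be checked before going further. The following is a minimal sketch, not part of Habitat: the function name `check_habitat_prerequisites` and the report keys are hypothetical, and the PyTorch version is only reported so that you can compare it against 1.1.0 yourself.

```python
import importlib.util
import sys


def check_habitat_prerequisites():
    """Return a dict describing whether each listed prerequisite appears to be met.

    Hypothetical helper for illustration; it is not provided by Habitat.
    """
    report = {
        "python_ok": sys.version_info >= (3, 6),
        "torch_installed": False,
        "torch_version": None,
        "cuda_gpu_visible": False,
    }

    # Look for PyTorch without importing it, so the check also runs where it is absent.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_installed"] = True
        report["torch_version"] = torch.__version__
        # True only if PyTorch can see at least one CUDA-capable GPU.
        report["cuda_gpu_visible"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, value in check_habitat_prerequisites().items():
        print(f"{name}: {value}")
```

A `False` for `cuda_gpu_visible` usually points at a driver or CUDA configuration problem rather than at Habitat itself.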
Currently, we have predictors for the following NVIDIA GPUs:
| GPU | Generation | Memory | Mem. Type | SMs |
|---|---|---|---|---|
| P4000 | Pascal | 8 GB | GDDR5 | 14 |
| P100 | Pascal | 16 GB | HBM2 | 56 |
| V100 | Volta | 16 GB | HBM2 | 80 |
| 2070 | Turing | 8 GB | GDDR6 | 36 |
| 2080Ti | Turing | 11 GB | GDDR6 | 68 |
| T4 | Turing | 16 GB | GDDR6 | 40 |
| 3090 | Ampere | 24 GB | GDDR6X | 82 |
NOTE: Installation with pip is not implemented yet.
python3 -m pip install habitat
python3 -c "import habitat"
Prerequisites:
- A system equipped with an NVIDIA GPU and a properly configured CUDA installation
- CUDA Toolkit
- CMake v3.17+
  - Habitat does not build properly with CMake v3.24.0 or v3.24.1 because of a bug in CMake. The bug was fixed by a change merged in v3.24.2.
- Git Large File Storage (Git LFS), which is used to store the pre-trained Habitat models
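The CMake constraint in the list above (3.17 or newer, but not 3.24.0 or 3.24.1) can be written as a small check. This is a sketch under assumptions: the helper names `parse_cmake_version` and `is_supported_cmake` are hypothetical, and the input is assumed to look like the first line printed by `cmake --version`.

```python
import re

# Version bounds taken from the prerequisites list above.
MIN_CMAKE = (3, 17, 0)
BROKEN_CMAKE = {(3, 24, 0), (3, 24, 1)}


def parse_cmake_version(text):
    """Extract (major, minor, patch) from text such as 'cmake version 3.24.2'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return int(major), int(minor), int(patch or 0)


def is_supported_cmake(text):
    """Return True if the version is at least 3.17 and not a known-broken release."""
    version = parse_cmake_version(text)
    return version >= MIN_CMAKE and version not in BROKEN_CMAKE


if __name__ == "__main__":
    for line in ("cmake version 3.16.3", "cmake version 3.24.1", "cmake version 3.24.2"):
        print(line, "->", is_supported_cmake(line))
```

Comparing tuples of integers avoids the pitfall of comparing version strings, where "3.9" would sort after "3.17".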
git clone https://github.com/CentML/habitat.git && cd habitat
git submodule init && git submodule update
Note: Habitat needs access to your GPU's performance counters, which requires special permissions if you are running with a recent driver (418.43 or later). If you encounter a CUPTI_ERROR_INSUFFICIENT_PRIVILEGES error when running Habitat, please follow the instructions here and in issue #5.
Habitat has been tested to work on the latest version of NVIDIA NGC PyTorch containers.
- To build Habitat with Docker, first run the NGC container.
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:22.08-py3
- Inside the container, clone the repository, then build and install the Habitat Python package:
git clone --recursive https://github.com/centml/habitat
./habitat/analyzer/install-dev.sh
- Install CUPTI
CUPTI is a profiling interface required by Habitat. Select the correct version of CUDA here and follow the instructions to add NVIDIA's repository. Then, install CUPTI with:
sudo apt-get install cuda-cupti-11-x
where 11-x represents the version of CUDA you have installed.
- Install CMake 3.17+.
Follow these steps to download and install a precompiled version of CMake:
wget https://github.com/Kitware/CMake/releases/download/v3.24.2/cmake-3.24.2-linux-x86_64.sh
chmod +x cmake-3.24.2-linux-x86_64.sh
mkdir /opt/cmake
sh cmake-3.24.2-linux-x86_64.sh --prefix=/opt/cmake --skip-license
ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake
You can verify the version of CMake you installed with the following:
cmake --version
- Build and install the Habitat Python package:
git clone https://github.com/centml/habitat
./habitat/analyzer/install-dev.sh
You can verify your Habitat installation by running the simple usage example:
# example.py
import habitat
import torch
import torchvision.models as models

# Define model and sample inputs
model = models.resnet50().cuda()
image = torch.rand(8, 3, 224, 224).cuda()

# Measure a single inference on the source GPU (an RTX 2080Ti in this example)
tracker = habitat.OperationTracker(device=habitat.Device.RTX2080Ti)
with tracker.track():
    out = model(image)

trace = tracker.get_tracked_trace()
print("Run time on source:", trace.run_time_ms)

# Perform prediction to a single target device
pred = trace.to_device(habitat.Device.V100)
print("Predicted time on V100:", pred.run_time_ms)
python3 example.py
See experiments/run_experiment.py for other examples of Habitat usage.
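The single-target prediction above extends naturally to several target GPUs. The sketch below is not part of Habitat: `predict_run_times` and `fastest_target` are hypothetical helpers that rely only on the two calls shown in the example, `trace.to_device(...)` and `.run_time_ms`.

```python
def predict_run_times(trace, targets):
    """Map each target label to the run time (ms) predicted for that device.

    `trace` is assumed to behave like the object returned by
    tracker.get_tracked_trace() in the usage example; `targets` maps a
    label of your choice to a habitat.Device member.
    """
    return {
        label: trace.to_device(device).run_time_ms
        for label, device in targets.items()
    }


def fastest_target(predictions):
    """Return the label with the smallest predicted run time."""
    return min(predictions, key=predictions.get)
```

With Habitat installed, this could be called as `predict_run_times(trace, {"V100": habitat.Device.V100})`. Only `habitat.Device.V100` and `habitat.Device.RTX2080Ti` appear in this README, so check the package for the exact names of the other devices before adding them.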
See Releases for the release history.
The code in this repository is licensed under the Apache 2.0 license (see
LICENSE and NOTICE), with the exception of the files mentioned below.
This software contains source code provided by NVIDIA Corporation. These files are:
- The code under cpp/external/cupti_profilerhost_util/ (CUPTI sample code)
- cpp/src/cuda/cuda_occupancy.h
The code mentioned above is licensed under the NVIDIA Software Development Kit End User License Agreement.
We include the implementations of several deep neural networks under
experiments/ for our evaluation. These implementations are copyrighted by
their original authors and carry their original licenses. Please see the
corresponding README files and license files inside the subdirectories for
more information.
Habitat began as a research project in the EcoSystem Group at the University of Toronto. The accompanying research paper appeared in the proceedings of USENIX ATC'21. If you are interested, you can read a preprint of the paper here.
If you use Habitat in your research, please consider citing our paper:
@inproceedings{habitat-yu21,
  author = {Yu, Geoffrey X. and Gao, Yubo and Golikov, Pavel and Pekhimenko, Gennady},
  title = {{Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training}},
  booktitle = {{Proceedings of the 2021 USENIX Annual Technical Conference (USENIX ATC'21)}},
  year = {2021},
}
Check out CONTRIBUTING.md for more information on how to help with Habitat.