Update branch #18

Merged
merged 27 commits into from
Jan 3, 2023
Commits
c3868c8
Ship kernels.sqlite that was previously excluded
jimgao1 Nov 5, 2022
41b3235
Use LFS to track kernels.sqlite
jimgao1 Nov 5, 2022
1b96ed7
Merge pull request #9 from CentML/nov04-fix-kernels-sqlite
jimgao1 Nov 7, 2022
47d0c61
Updated cmake version in docker and readme
michaelshin Nov 17, 2022
6a02dd3
Add more details on cmake bug
michaelshin Nov 17, 2022
8eeace6
Fix Git LFS filters
anurlybayev Nov 18, 2022
3cb380c
analyzer/habitat/data/lstm/model.pth: convert to Git LFS
anurlybayev Nov 18, 2022
1ad504a
Merge pull request #11 from CentML/fix-git-lfs-pointers
anurlybayev Nov 18, 2022
156a4b4
Merge pull request #10 from CentML/update-cmake-version
anurlybayev Nov 18, 2022
3925719
Add kernels.sqlite to Git LFS
anurlybayev Nov 18, 2022
9ca8111
Upload habitat wheels to CentML servers
michaelshin Nov 22, 2022
2cf9565
Add build for cuda11.7
michaelshin Nov 22, 2022
87dba1d
Merge pull request #12 from CentML/update-github-workflows
michaelshin Nov 22, 2022
4abb214
Updated secrets to use Habitat specific secrets
michaelshin Nov 22, 2022
0638cba
Merge pull request #13 from CentML/update-github-workflows
michaelshin Nov 22, 2022
bab75bf
Update aws cp command to proper syntax
michaelshin Nov 23, 2022
99c7b8c
Changed habitat upload directory + add index.html
michaelshin Nov 24, 2022
27b15d0
Fixed syntax error
michaelshin Nov 24, 2022
d06b1c7
Fixed location of create_index script
michaelshin Nov 24, 2022
6e9baa8
Update folder location for S3 instance
michaelshin Nov 24, 2022
222b7d6
Removed redundant steps
michaelshin Nov 24, 2022
533c4ec
Fix typo for python executable
michaelshin Nov 24, 2022
42061d7
Get the actual wheel location when creating index
michaelshin Nov 24, 2022
24c066e
Update readme with correct install instructions
michaelshin Nov 28, 2022
1480962
Merge pull request #14 from CentML/update-github-workflows
michaelshin Nov 28, 2022
5b026a6
Add alternative CUPTI/NVPerf paths
jimgao1 Dec 18, 2022
421248c
Merge pull request #15 from CentML/dec18-cmake-path
jimgao1 Dec 19, 2022
167 changes: 151 additions & 16 deletions .github/workflows/whl-build-all.yaml
@@ -4,9 +4,35 @@ on:
  workflow_dispatch:

jobs:
  create-habitat-index-html:
    needs: [whl-build-cu102, whl-build-cu111, whl-build-cu113, whl-build-cu116, whl-build-cu117]
    runs-on: [self-hosted]
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.HABITAT_AWS_ACCESS_KEY }}
          aws-region: ${{ secrets.HABITAT_AWS_REGION }}
          aws-secret-access-key: ${{ secrets.HABITAT_AWS_SECRET_ACCESS_KEY }}

      - name: Create index.html
        run: |
          python3 /home/builduser/create_index.py habitat -f wheels-cu102 wheels-cu111 wheels-cu113 wheels-cu116 wheels-cu117

      - name: Upload to S3
        run: |
          aws s3 cp index.html s3://centml-habitat/habitat/

  whl-build-cu102:
    runs-on: [self-hosted, cu102]
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.HABITAT_AWS_ACCESS_KEY }}
          aws-region: ${{ secrets.HABITAT_AWS_REGION }}
          aws-secret-access-key: ${{ secrets.HABITAT_AWS_SECRET_ACCESS_KEY }}

      - name: Display host information
        run: |
          uname -a
@@ -33,17 +59,31 @@ jobs:
      - name: Build Python3.10 wheel
        run: |
          /home/builduser/build_habitat.sh python3.10 py310

      - name: Upload wheels to S3
        run: |
          aws s3 cp analyzer/dist/ s3://centml-habitat/habitat/wheels-cu102/ --recursive --exclude "*" --include "*.whl"

      - name: Create index.html
        run: |
          file_list=(analyzer/dist/*.whl)
          formatted_file_list=$(basename -a ${file_list[@]})
          python3 /home/builduser/create_index.py habitat_cu102 -f $formatted_file_list

      - name: Upload wheel artifact
        uses: actions/upload-artifact@v3
        with:
          name: wheels-cu102
          path: analyzer/dist/*.whl

      - name: Upload index.html to S3
        run: |
          aws s3 cp index.html s3://centml-habitat/habitat/wheels-cu102/

  whl-build-cu111:
    runs-on: [self-hosted, cu111]
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.HABITAT_AWS_ACCESS_KEY }}
          aws-region: ${{ secrets.HABITAT_AWS_REGION }}
          aws-secret-access-key: ${{ secrets.HABITAT_AWS_SECRET_ACCESS_KEY }}

      - name: Display host information
        run: |
          uname -a
@@ -71,15 +111,30 @@ jobs:
        run: |
          /home/builduser/build_habitat.sh python3.10 py310

      - name: Upload wheel artifact
        uses: actions/upload-artifact@v3
        with:
          name: wheels-cu111
          path: analyzer/dist/*.whl
      - name: Upload wheels to S3
        run: |
          aws s3 cp analyzer/dist/ s3://centml-habitat/habitat/wheels-cu111/ --recursive --exclude "*" --include "*.whl"

      - name: Create index.html
        run: |
          file_list=(analyzer/dist/*.whl)
          formatted_file_list=$(basename -a ${file_list[@]})
          python3 /home/builduser/create_index.py habitat_cu111 -f $formatted_file_list

      - name: Upload index.html to S3
        run: |
          aws s3 cp index.html s3://centml-habitat/habitat/wheels-cu111/

  whl-build-cu113:
    runs-on: [self-hosted, cu113]
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.HABITAT_AWS_ACCESS_KEY }}
          aws-region: ${{ secrets.HABITAT_AWS_REGION }}
          aws-secret-access-key: ${{ secrets.HABITAT_AWS_SECRET_ACCESS_KEY }}

      - name: Display host information
        run: |
          uname -a
@@ -106,16 +161,31 @@ jobs:
      - name: Build Python3.10 wheel
        run: |
          /home/builduser/build_habitat.sh python3.10 py310

      - name: Upload wheels to S3
        run: |
          aws s3 cp analyzer/dist/ s3://centml-habitat/habitat/wheels-cu113/ --recursive --exclude "*" --include "*.whl"

      - name: Create index.html
        run: |
          file_list=(analyzer/dist/*.whl)
          formatted_file_list=$(basename -a ${file_list[@]})
          python3 /home/builduser/create_index.py habitat_cu113 -f $formatted_file_list

      - name: Upload wheel artifact
        uses: actions/upload-artifact@v3
        with:
          name: wheels-cu113
          path: analyzer/dist/*.whl
      - name: Upload index.html to S3
        run: |
          aws s3 cp index.html s3://centml-habitat/habitat/wheels-cu113/

  whl-build-cu116:
    runs-on: [self-hosted, cu116]
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.HABITAT_AWS_ACCESS_KEY }}
          aws-region: ${{ secrets.HABITAT_AWS_REGION }}
          aws-secret-access-key: ${{ secrets.HABITAT_AWS_SECRET_ACCESS_KEY }}

      - name: Display host information
        run: |
          uname -a
@@ -148,3 +218,68 @@ jobs:
        with:
          name: wheels-cu116
          path: analyzer/dist/*.whl

      - name: Upload wheels to S3
        run: |
          aws s3 cp analyzer/dist/ s3://centml-habitat/habitat/wheels-cu116/ --recursive --exclude "*" --include "*.whl"

      - name: Create index.html
        run: |
          file_list=(analyzer/dist/*.whl)
          formatted_file_list=$(basename -a ${file_list[@]})
          python3 /home/builduser/create_index.py habitat_cu116 -f $formatted_file_list

      - name: Upload index.html to S3
        run: |
          aws s3 cp index.html s3://centml-habitat/habitat/wheels-cu116/

  whl-build-cu117:
    runs-on: [self-hosted, cu117]
    steps:
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.HABITAT_AWS_ACCESS_KEY }}
          aws-region: ${{ secrets.HABITAT_AWS_REGION }}
          aws-secret-access-key: ${{ secrets.HABITAT_AWS_SECRET_ACCESS_KEY }}

      - name: Display host information
        run: |
          uname -a
          hostname
          pwd
          id
          ls -la

      - name: Fetch repository
        uses: actions/checkout@v3

      - name: Build Python3.7 wheel
        run: |
          /home/builduser/build_habitat.sh python3.7 py37

      - name: Build Python3.8 wheel
        run: |
          /home/builduser/build_habitat.sh python3.8 py38

      - name: Build Python3.9 wheel
        run: |
          /home/builduser/build_habitat.sh python3.9 py39

      - name: Build Python3.10 wheel
        run: |
          /home/builduser/build_habitat.sh python3.10 py310

      - name: Upload wheels to S3
        run: |
          aws s3 cp analyzer/dist/ s3://centml-habitat/habitat/wheels-cu117/ --recursive --exclude "*" --include "*.whl"

      - name: Create index.html
        run: |
          file_list=(analyzer/dist/*.whl)
          formatted_file_list=$(basename -a ${file_list[@]})
          python3 /home/builduser/create_index.py habitat_cu117 -f $formatted_file_list

      - name: Upload index.html to S3
        run: |
          aws s3 cp index.html s3://centml-habitat/habitat/wheels-cu117/
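
The `create_index.py` script that the workflow calls lives on the build host and is not part of this diff, so its real behaviour is unknown here. A minimal sketch of what such a script could look like, assuming it only has to write an `index.html` containing one link per name passed to `-f`. The function names, the `-o` option, and the HTML layout are hypothetical:

```python
import argparse
import html
from pathlib import Path
from typing import Optional, Sequence


def render_index(title: str, names: Sequence[str]) -> str:
    """Return a small HTML page with one relative link per entry.

    Hypothetical helper: the real create_index.py is not shown in the diff.
    """
    links = "\n".join(
        f'    <a href="{html.escape(name, quote=True)}">{html.escape(name)}</a><br/>'
        for name in names
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"  <head><title>{html.escape(title)}</title></head>\n"
        "  <body>\n"
        f"{links}\n"
        "  </body>\n"
        "</html>\n"
    )


def main(argv: Optional[Sequence[str]] = None) -> None:
    # Mirrors the call shape used in the workflow: <title> -f <name> [<name> ...]
    parser = argparse.ArgumentParser(description="Write a simple link index.")
    parser.add_argument("title")
    parser.add_argument("-f", "--files", nargs="+", required=True)
    # The -o option is an assumption; the workflow relies on ./index.html.
    parser.add_argument("-o", "--output", default="index.html")
    args = parser.parse_args(argv)
    Path(args.output).write_text(
        render_index(args.title, args.files), encoding="utf-8"
    )
```

Under these assumptions, `main(["habitat_cu117", "-f", "a.whl", "b.whl"])` would write an `index.html` in the working directory, which is the file the subsequent `aws s3 cp index.html ...` step uploads.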
125 changes: 75 additions & 50 deletions README.md
@@ -23,7 +23,7 @@ Habitat is a tool that predicts a deep neural network's training iteration execu
To run Habitat, you need:
- [Python 3.6+](https://www.python.org/)
- [Pytorch 1.1.0+](https://pytorch.org/)
- A system equipped with an Nvidia GPU.
- A system equipped with an Nvidia GPU with properly configured CUDA

Currently, we have predictors for the following Nvidia GPUs:

@@ -37,72 +37,101 @@ Currently, we have predictors for the following Nvidia GPUs:
| T4 | Turing | 16 GB | GDDR6 | 40 |
| 3090 | Ampere | 24 GB | GDDR6X | 82 |

**NOTE:** Not implemented yet
```zsh
python3 -m pip install habitat
python3 -c "import habitat"
```
<h2 id="building-locally">Building locally</h2>

<h2 id="build">Building from source</h2>
### 1. Install CUPTI

Prerequisites:
- A system equipped with an Nvidia GPU with properly configured CUDA
- [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit-archive)
- [cmake v3.17+](https://github.com/Kitware/CMake/releases)
- [Git Large File Storage](https://git-lfs.github.com/) - which contains pre-trained habitat models
- CUPTI is a profiling interface required by Habitat. Select your version of CUDA [here](https://developer.nvidia.com/cuda-toolkit-archive) and follow the instructions to add NVIDIA's repository. Then, install CUPTI with:

```bash
sudo apt-get install cuda-cupti-xx-x
```

where `xx-x` represents the version of CUDA you have installed.

```zsh
git clone https://github.com/CentML/habitat.git && cd habitat
git submodule init && git submodule update
```
### 2. Install Habitat

**Note:** Habitat needs access to your GPU's performance counters, which requires special permissions if you are running with a recent driver (418.43 or later). If you encounter a `CUPTI_ERROR_INSUFFICIENT_PRIVILEGES` error when running Habitat, please follow the instructions [here](https://developer.nvidia.com/ERR_NVGPUCTRPERM) and in [issue #5](https://github.com/geoffxy/habitat/issues/5).
You can install via pip if you have one of the following versions of CUDA and Python:

### Building with Docker
- CUDA: 10.2, 11.1, 11.3, 11.6, 11.7
- Python: 3.7 - 3.10

Habitat has been tested to work on the latest version of [NVIDIA NGC PyTorch containers](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch).
### Installing from pip

Install via pip with the following command:

1. To build Habitat with Docker, first run the NGC container.
```bash
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:22.08-py3
```
2. Inside the container, clone the repository then build and install the Habitat Python package:
```bash
git clone --recursive https://github.com/centml/habitat
./habitat/analyzer/install-dev.sh
pip install https://centml.ai/habitat/wheels/habitat_predict-1.0.0-20221123+cuYYY-pyZZ-none-any.whl
```

### Building without Docker
where YYY is your CUDA version and ZZ is your Python version.

1. Install CUPTI
For example, if you are using CUDA 10.2 and Python 3.7:

CUPTI is a profiling interface required by Habitat. Select the correct version of CUDA [here](https://developer.nvidia.com/cuda-toolkit-archive) and follow the instructions to add NVIDIA's repository. Then, install CUPTI with:
```bash
sudo apt-get install cuda-cupti-11-x
pip install https://centml.ai/habitat/wheels/habitat_predict-1.0.0-20221123+cu102-py37-none-any.whl
```
where `11-x` represents the version of CUDA you have installed.

2. Install `CMake` 3.17+.
If you do not find a matching version of CUDA and Python above, you need to build Habitat from source with the following instructions.
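
The wheel URL pattern described above can be assembled programmatically. A minimal sketch, assuming the base URL, the `1.0.0-20221123` version string, and the supported CUDA and Python versions are exactly those listed in this README; the function and constant names are hypothetical, and nothing here checks that the file actually exists on the server:

```python
import sys
from typing import Optional, Tuple

# Values copied from the README text; they are hard-coded assumptions here.
BASE_URL = "https://centml.ai/habitat/wheels"
WHEEL_VERSION = "1.0.0-20221123"
SUPPORTED_CUDA = ("10.2", "11.1", "11.3", "11.6", "11.7")
SUPPORTED_PYTHON = ((3, 7), (3, 8), (3, 9), (3, 10))


def wheel_url(cuda_version: str,
              python_version: Optional[Tuple[int, int]] = None) -> str:
    """Build the pip-installable wheel URL for a CUDA/Python pair."""
    if python_version is None:
        # Default to the interpreter running this script.
        python_version = (sys.version_info[0], sys.version_info[1])
    py = (python_version[0], python_version[1])
    if cuda_version not in SUPPORTED_CUDA:
        raise ValueError(f"no prebuilt wheel for CUDA {cuda_version}")
    if py not in SUPPORTED_PYTHON:
        raise ValueError(f"no prebuilt wheel for Python {py[0]}.{py[1]}")
    cu_tag = "cu" + cuda_version.replace(".", "")  # "10.2" -> "cu102"
    py_tag = f"py{py[0]}{py[1]}"                   # (3, 7)  -> "py37"
    return (
        f"{BASE_URL}/habitat_predict-{WHEEL_VERSION}"
        f"+{cu_tag}-{py_tag}-none-any.whl"
    )


print(wheel_url("10.2", (3, 7)))
```

A `ValueError` from this helper corresponds to the case where no prebuilt wheel is listed and a source build is needed.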

### Installing from source

1. Install CMake 3.17+.
- Note that CMake 3.24.0 and 3.24.1 have a bug that prevents the CUPTI directory from being found, which breaks the Habitat build. Do not use those versions.
- [https://gitlab.kitware.com/cmake/cmake/-/merge_requests/7608/diffs](https://gitlab.kitware.com/cmake/cmake/-/merge_requests/7608/diffs)
- Run the following commands to download and install a precompiled version of CMake 3.24.2:

```bash
wget https://github.com/Kitware/CMake/releases/download/v3.24.2/cmake-3.24.2-linux-x86_64.sh
chmod +x cmake-3.24.2-linux-x86_64.sh
mkdir /opt/cmake
sh cmake-3.24.2-linux-x86_64.sh --prefix=/opt/cmake --skip-license
ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake
```

- You can verify the version of CMake you installed with the following command:

```bash
cmake --version
```

2. Install [Git Large File Storage](https://git-lfs.github.com/)
3. Clone the Habitat package

```bash
git clone https://github.com/centml/habitat
```

4. Get the pre-trained models used by Habitat

```bash
git submodule init && git submodule update
git lfs pull
```

5. Finally, build Habitat with the following command:

```bash
./analyzer/install-dev.sh
```
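
The CMake constraints from step 1 (version 3.17 or newer, excluding 3.24.0 and 3.24.1) can be checked before starting the build. A minimal sketch, assuming `cmake --version` prints a first line of the form `cmake version X.Y.Z`; the helper names are hypothetical and not part of Habitat:

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple

Version = Tuple[int, int, int]

MIN_VERSION: Version = (3, 17, 0)
# Versions the README says cannot locate the CUPTI directory.
BROKEN_VERSIONS = {(3, 24, 0), (3, 24, 1)}


def parse_cmake_version(output: str) -> Version:
    """Extract (major, minor, patch) from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("could not find a version in the cmake output")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_usable(version: Version) -> bool:
    """True if the version satisfies the README's stated constraints."""
    return version >= MIN_VERSION and version not in BROKEN_VERSIONS


def installed_cmake_version() -> Optional[Version]:
    """Return the version of the cmake on PATH, or None if it is missing."""
    if shutil.which("cmake") is None:
        return None
    result = subprocess.run(
        ["cmake", "--version"], capture_output=True, text=True, check=True
    )
    return parse_cmake_version(result.stdout)
```

For example, `is_usable(parse_cmake_version("cmake version 3.24.2"))` is true, while the same check on `3.24.1` or `3.16.9` is false.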

<h2 id="building-with-docker">Building with Docker</h2>

Follow these steps to download and install a precompiled version of CMake:
```bash
wget https://github.com/Kitware/CMake/releases/download/v3.24.0/cmake-3.24.0-linux-x86_64.sh
chmod +x cmake-3.24.0-linux-x86_64.sh
mkdir /opt/cmake
sh cmake-3.24.0-linux-x86_64.sh --prefix=/opt/cmake --skip-license
ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake
```
You can verify the version of CMake you installed with the following:
Habitat has been tested to work on the latest version of [NVIDIA NGC PyTorch containers](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch).

1. To build Habitat with Docker, first run the NGC container, where `XX.XX` is the container version:
```bash
cmake --version
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:XX.XX-py3
```
3. Build and install the Habitat Python package:
2. Inside the container, clone the repository then build and install the Habitat Python package:
```bash
git clone https://github.com/centml/habitat
git clone --recursive https://github.com/centml/habitat
./habitat/analyzer/install-dev.sh
```

<h2 id="getting-started">Usage example</h2>
**Note:** Habitat needs access to your GPU's performance counters, which requires special permissions if you are running with a recent driver (418.43 or later). If you encounter a `CUPTI_ERROR_INSUFFICIENT_PRIVILEGES` error when running Habitat, please follow the instructions [here](https://developer.nvidia.com/ERR_NVGPUCTRPERM) and in [issue #5](https://github.com/geoffxy/habitat/issues/5).

<h2 id="usage-example">Usage example</h2>

You can verify your Habitat installation by running the simple usage example:
```python
@@ -128,16 +157,12 @@ pred = trace.to_device(habitat.Device.V100)
print("Predicted time on V100:", pred.run_time_ms)
```

```zsh
```bash
python3 example.py
```

See [experiments/run_experiment.py](https://github.com/CentML/habitat/tree/main/experiments) for other examples of Habitat usage.

<h2 id="dev-setup">Development Environment Setup</h2>

<h2 id="release-process">Release Process</h2>

<h2 id="release-history">Release History</h2>

See [Releases](https://github.com/UofT-EcoSystem/habitat/releases)
3 changes: 2 additions & 1 deletion analyzer/habitat/data/.gitattributes
@@ -1 +1,2 @@
*.pth filter=lfs diff=lfs merge=lfs -text
**/*.pth filter=lfs diff=lfs merge=lfs -text
**/kernels.sqlite filter=lfs diff=lfs merge=lfs -text
3 changes: 3 additions & 0 deletions analyzer/habitat/data/kernels.sqlite
Git LFS file not shown
Binary file modified analyzer/habitat/data/lstm/model.pth
Binary file not shown.
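
Because `kernels.sqlite` and the `.pth` models are now tracked with Git LFS, a checkout made without `git lfs pull` contains small text pointer files instead of the real data. A minimal sketch for detecting that situation, relying on the fact that Git LFS pointer files begin with the line `version https://git-lfs.github.com/spec/v1`; the function name is hypothetical:

```python
from pathlib import Path
from typing import Union

# First bytes of every Git LFS pointer file.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Union[str, Path]) -> bool:
    """Return True if the file is an un-fetched Git LFS pointer."""
    with open(path, "rb") as handle:
        # Only the leading bytes are needed, so large files stay cheap to check.
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX
```

Running `is_lfs_pointer("analyzer/habitat/data/kernels.sqlite")` from the repository root and getting `True` would suggest that `git lfs pull` still needs to be run.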