Skip to content

Update the README with latest instructions on how to build Habitat #5

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Sep 6, 2022

Conversation

anurlybayev
Copy link
Contributor

Removed the INSTALL.md, since most of it has been incorporated into the main README.md. Some sections need further updates as we refine how we build and release the habitat.


# Download the pre-trained models
cd analyzer
curl -O https://zenodo.org/record/4876277/files/habitat-models.tar.gz\?download\=1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use the newer trained models here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, I had the new link. However, given that you're hosting it under University resources, that could potentially cause legal problems for us. So I'll update the repo with Git LFS maybe after this change is approved.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Sounds good.

1. Once inside the container, run `install-dev.sh` under `analyzer/` to build and install the Habitat package.
1. In your scripts, `import habitat` to get access to Habitat. See `experiments/run_experiment.py` for an example showing how to use Habitat.

**Note:** Habitat needs access to your GPU's performance counters, which requires special permissions if you are running with a recent driver (418.43 or later). If you encounter a `CUPTI_ERROR_INSUFFICIENT_PRIVILEGES` error when running Habitat, please follow the instructions [here](https://developer.nvidia.com/ERR_NVGPUCTRPERM) and in [issue #5](https://github.com/geoffxy/habitat/issues/5).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not only a problem when building with Docker. Would suggest moving this to the top.

Copy link

@wangshangsam wangshangsam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The README itself LGTM, but I hope that some of the installation steps could be made a little simpler.

curl -O https://zenodo.org/record/4876277/files/habitat-models.tar.gz\?download\=1

# Install the models
./extract-models.sh

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to be addressed in this PR, but something to think about - is it possible to wrap these installation steps inside a setup.py so that the user can automate the installation via pip or conda?

README.md Outdated

### Building with Docker

1. Run `setup.sh` under `docker/` to build the Habitat container image.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, something to think about - I took a look at the Dockerfile under docker/. Would it be easier to use NGC PyTorch containers as your base image? This way you don't have to manually align the version (of CUDA and any other NVIDIA tools) in the image with the version that PyTorch needs going forward.
NGC PyTorch containers have a fair amount of enterprise users. It'd be nice if we have a list in the readme that says "Habitat is tested on NGC PyTorch containers from version X to version Y".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. I'll try building Habitat using these containers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried Habitat with the NGC containers and it works out of the box. The user should either:

  • Clone the repository, initialize the submodules, run install-dev.sh then import the saved models, or
  • Install the correct wheel file (in the case of the latest NGC container, CUDA 11.7 + Python 3.8)

With that said, we should still use nvidia/cuda images for automated builds since would allow us to specify the version of CUDA that's linked to habitat_cuda.

README.md Outdated

1. Run `setup.sh` under `docker/` to build the Habitat container image.
1. Run `start.sh` to start a new container. By default, your home directory will be mounted inside the container under `~/home`.
1. Once inside the container, run `install-dev.sh` under `analyzer/` to build and install the Habitat package.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Include the actual commands of how you run setup.sh or start.sh and what bash arguments they expect. E.g.,

bash docker/setup.sh <X> <Y> <Z>

where <X> is a string that represents XXX, and its options are {a, b, c}.

Something like that. The same goes for the other places that says "Run xxx".

README.md Outdated
chmod +x cuda_11.7.1_515.65.01_linux.run

# Make sure to only install the toolkit
./cuda_11.7.1_515.65.01_linux.run

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have to install CUPTI from the runfile? Is CUPTI not available from rmp or deb? Similarly for cmake?
I just feel like the users could get lost easily with such a long installation instruction. From a user experience point of view, it'd be better if we have a list of scripts and a multiplexer to select between them depending on the user's distro. E.g., for Ubuntu 22.02, run this command. For RedHat, run that command, etc. And we don't need to support that many distros. I personally feel like Ubuntu is 90% of the Linux for DL market right now.

@anurlybayev anurlybayev merged commit ad4e2e2 into main Sep 6, 2022
@anurlybayev anurlybayev deleted the update-readme branch September 6, 2022 03:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants