From 012ffd38bc9e0390986d634592878eeddbd1f3a1 Mon Sep 17 00:00:00 2001 From: Yuxin Wu Date: Wed, 25 Dec 2019 17:46:59 -0800 Subject: [PATCH] update docs Summary: Pull Request resolved: https://github.com/fairinternal/detectron2/pull/352 Differential Revision: D19229170 Pulled By: ppwwyyxx fbshipit-source-id: 976adb38ea1939ce15070b68b046bb711bfe5690 --- .../unexpected-problems-bugs.md | 10 +- INSTALL.md | 124 +++++++++++++----- detectron2/layers/csrc/vision.cpp | 5 + detectron2/modeling/roi_heads/__init__.py | 1 + docs/notes/compatibility.md | 6 +- projects/TridentNet/train_net.py | 5 +- tests/README.md | 8 ++ 7 files changed, 115 insertions(+), 44 deletions(-) create mode 100644 tests/README.md diff --git a/.github/ISSUE_TEMPLATE/unexpected-problems-bugs.md b/.github/ISSUE_TEMPLATE/unexpected-problems-bugs.md index b554bd2dec..f1a3b4d5bd 100644 --- a/.github/ISSUE_TEMPLATE/unexpected-problems-bugs.md +++ b/.github/ISSUE_TEMPLATE/unexpected-problems-bugs.md @@ -7,7 +7,7 @@ about: Report unexpected problems or bugs in detectron2 If you do not know the root cause of the problem / bug, and wish someone to help you, please post according to this template: -## Instructions To Reproduce the Issue +## Instructions To Reproduce the Issue: 1. what changes you made (`git diff`) or what code you wrote ``` @@ -21,7 +21,7 @@ post according to this template: 4. please also simplify the steps as much as possible so they do not require additional resources to run, such as a private dataset. -## Expected behavior +## Expected behavior: If there are no obvious error in "what you observed" provided above, please tell us the expected behavior. @@ -32,7 +32,11 @@ Only in one of the two conditions we will help with it: (1) You're unable to reproduce the results in detectron2 model zoo. (2) It indicates a detectron2 bug. -## Environment +## Environment: Please paste the output of `python -m detectron2.utils.collect_env`. 
If detectron2 hasn't been successfully installed, use `python detectron2/utils/collect_env.py`. + +If your issue looks like an installation issue / environment issue, +please first try to solve it yourself with the instructions in +https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues diff --git a/INSTALL.md b/INSTALL.md index 412b1ac339..7af4c2c1d4 100644 --- a/INSTALL.md +++ b/INSTALL.md @@ -7,17 +7,17 @@ also installs detectron2 with a few simple commands. ### Requirements - Linux or macOS -- Python >= 3.6 -- PyTorch 1.3 +- Python ≥ 3.6 +- PyTorch ≥ 1.3 - [torchvision](https://github.com/pytorch/vision/) that matches the PyTorch installation. You can install them together at [pytorch.org](https://pytorch.org) to make sure of this. - OpenCV, needed by demo and visualization - [fvcore](https://github.com/facebookresearch/fvcore/): `pip install -U 'git+https://github.com/facebookresearch/fvcore'` - pycocotools: `pip install cython; pip install 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'` -- GCC >= 4.9 +- GCC ≥ 4.9 -### Build Detectron2 +### Build and Install Detectron2 After having the above dependencies, run: ``` @@ -35,43 +35,99 @@ Note: you often need to rebuild detectron2 after reinstalling PyTorch. ### Common Installation Issues -+ Undefined torch/aten symbols, or segmentation fault immediately when running the library. - This may be caused by the following reasons: +Click each issue for its solutions: + +
+ +Undefined torch/aten symbols, or segmentation fault immediately when running the library. + + +This can happen if detectron2 or torchvision is not +compiled with the version of PyTorch you're running. + +If you use a pre-built torchvision, uninstall torchvision & pytorch, and reinstall them +following [pytorch.org](http://pytorch.org). +If you manually build detectron2 or torchvision, remove the files you built (`build/`, `**/*.so`) +and rebuild them. + +If you cannot resolve the problem, please include the output of `gdb -ex "r" -ex "bt" -ex "quit" --args python -m detectron2.utils.collect_env` +in your issue. +
+ +
+ +Undefined C++ symbols in `detectron2/_C*.so`. + +This usually happens because the library is compiled with a newer C++ compiler but run with an older C++ runtime. +This can happen with an old anaconda. + +Try `conda update libgcc`. Then remove the files you built (`build/`, `**/*.so`) and rebuild them. +
+ +
+ +"Not compiled with GPU support" or "Detectron2 CUDA Compiler: not available". + +CUDA was not found when building detectron2. +Make sure that +``` +python -c 'import torch; from torch.utils.cpp_extension import CUDA_HOME; print(torch.cuda.is_available(), CUDA_HOME)' +``` +prints valid output at the time you build detectron2. +
- * detectron2 or torchvision is not compiled with the version of PyTorch you're running. +
+ +"invalid device function" or "no kernel image is available for execution". + - If you use a pre-built torchvision, uninstall torchvision & pytorch, and reinstall them - following [pytorch.org](http://pytorch.org). - If you manually build detectron2 or torchvision, remove the files you built (`build/`, `**/*.so`) - and rebuild them. +Two possibilities: - * detectron2 or torchvision is not compiled using gcc >= 4.9. +* You build detectron2 with one version of CUDA but run it with a different version. - You'll see a warning message during compilation in this case. Please remove the files you built, - and rebuild them with a supported compiler. - Technically, you need the identical compiler that's used to build pytorch to guarantee - compatibility. But in practice, gcc >= 4.9 should work OK. + To check whether it is the case, + use `python -m detectron2.utils.collect_env` to find out inconsistent CUDA versions. + In the output of this command, you should expect "Detectron2 CUDA Compiler", "CUDA_HOME", "PyTorch built with - CUDA" + to contain cuda libraries of the same version. -+ Undefined C++ symbols in `detectron2/_C*.so`: + When they are inconsistent, + you need to either install a different build of PyTorch (or build by yourself) + to match your local CUDA installation, or install a different version of CUDA to match PyTorch. - * This can happen with old anaconda. Try `conda update libgcc`. Then remove the files you built and rebuild them. +* Detectron2 or PyTorch/torchvision is not built with the correct compute capability for the GPU model. -+ Undefined CUDA symbols. The version of NVCC you use to build detectron2 or torchvision does - not match the version of CUDA you are running with. - This often happens when using anaconda's CUDA runtime. + The compute capability for PyTorch is available in `python -m detectron2.utils.collect_env`. 
-+ "Not compiled with GPU support" or "Detectron2 CUDA Compiler: not available": make sure - ``` - python -c 'import torch; from torch.utils.cpp_extension import CUDA_HOME; print(torch.cuda.is_available(), CUDA_HOME)' - ``` - print valid outputs at the time you build detectron2. + The compute capability of detectron2/torchvision defaults to match the GPU found on the machine + during building, and can be controlled by `TORCH_CUDA_ARCH_LIST` environment variable during building. -+ "invalid device function" or "no kernel image is available for execution": two possibilities: - * You build detectron2 with one version of CUDA but run it with a different version. - * Detectron2 is not built with the correct compute compability for the GPU model. - The compute compability defaults to match the GPU found on the machine during building, - and can be controlled by `TORCH_CUDA_ARCH_LIST` environment variable during building. + Visit [developer.nvidia.com/cuda-gpus](https://developer.nvidia.com/cuda-gpus) to find out + the correct compute capability for your device. - You can use `python -m detectron2.utils.collect_env` to find out inconsistent CUDA versions. - In its output, you should expect "Detectron2 CUDA Compiler", "CUDA_HOME", "PyTorch built with - CUDA" - to contain cuda libraries of the same version. +
+ +
+ +Undefined CUDA symbols. + + +The version of NVCC you use to build detectron2 or torchvision does +not match the version of CUDA you are running with. +This often happens when using anaconda's CUDA runtime. + +Use `python -m detectron2.utils.collect_env` to find out inconsistent CUDA versions. +In the output of this command, you should expect "Detectron2 CUDA Compiler", "CUDA_HOME", "PyTorch built with - CUDA" +to contain cuda libraries of the same version. + +When they are inconsistent, +you need to either install a different build of PyTorch (or build by yourself) +to match your local CUDA installation, or install a different version of CUDA to match PyTorch. +
+ + +
+ +"ImportError: cannot import name '_C'". + +Please build and install detectron2 following the instructions above. +
diff --git a/detectron2/layers/csrc/vision.cpp b/detectron2/layers/csrc/vision.cpp index 83ea78f006..fa7942e881 100644 --- a/detectron2/layers/csrc/vision.cpp +++ b/detectron2/layers/csrc/vision.cpp @@ -38,6 +38,11 @@ std::string get_compiler_version() { std::ostringstream ss; #if defined(__GNUC__) #ifndef __clang__ + +#if ((__GNUC__ <= 4) && (__GNUC_MINOR__ <= 8)) +#error "GCC >= 4.9 is required!" +#endif + { ss << "GCC " << __GNUC__ << "." << __GNUC_MINOR__; } #endif #endif diff --git a/detectron2/modeling/roi_heads/__init__.py b/detectron2/modeling/roi_heads/__init__.py index 42ff4e72cc..645d0411d8 100644 --- a/detectron2/modeling/roi_heads/__init__.py +++ b/detectron2/modeling/roi_heads/__init__.py @@ -5,6 +5,7 @@ from .roi_heads import ( ROI_HEADS_REGISTRY, ROIHeads, + Res5ROIHeads, StandardROIHeads, build_roi_heads, select_foreground_proposals, diff --git a/docs/notes/compatibility.md b/docs/notes/compatibility.md index 894a7a549c..f5f879f9f4 100644 --- a/docs/notes/compatibility.md +++ b/docs/notes/compatibility.md @@ -2,7 +2,7 @@ ## Compatibility with Detectron (and maskrcnn-benchmark) -Detectron2 addresses some legacy issues left in Detectron, as a result, their models +Detectron2 addresses some legacy issues left in Detectron. As a result, their models are not compatible: running inference with the same model weights will produce different results in the two code bases. @@ -59,8 +59,8 @@ model-level compatibility. The major ones are: We have observed that this tends to slightly decrease box AP50 while improving box AP for higher overlap thresholds (and leading to a slight overall improvement in box AP). - We interpret the coordinates in COCO bounding box and segmentation annotations - as coordinates in range `[0, width]` or `[0, height]`, and the coordinates in - COCO keypoint annotations are pixel indices in range `[0, width - 1]` or `[0, height - 1]`. + as coordinates in range `[0, width]` or `[0, height]`. 
The coordinates in + COCO keypoint annotations are interpreted as pixel indices in range `[0, width - 1]` or `[0, height - 1]`. We will later share more details and rationale behind the above mentioned issues diff --git a/projects/TridentNet/train_net.py b/projects/TridentNet/train_net.py index 34cde48971..1a4ea9b174 100644 --- a/projects/TridentNet/train_net.py +++ b/projects/TridentNet/train_net.py @@ -7,11 +7,10 @@ import os -import detectron2.utils.comm as comm from detectron2.checkpoint import DetectionCheckpointer from detectron2.config import get_cfg from detectron2.engine import DefaultTrainer, default_argument_parser, default_setup, launch -from detectron2.evaluation import COCOEvaluator, verify_results +from detectron2.evaluation import COCOEvaluator from tridentnet import add_tridentnet_config @@ -46,8 +45,6 @@ def main(args): cfg.MODEL.WEIGHTS, resume=args.resume ) res = Trainer.test(cfg, model) - if comm.is_main_process(): - verify_results(cfg, res) return res trainer = Trainer(cfg) diff --git a/tests/README.md b/tests/README.md new file mode 100644 index 0000000000..02c267d1f6 --- /dev/null +++ b/tests/README.md @@ -0,0 +1,8 @@ +## Unit Tests + +To run the unittests, do: +``` +python -m unittest discover -v -s tests +``` + +There are also end-to-end inference & training tests, in [dev/run_*_tests.sh](../dev).