
Eval refactor step2 #53

Merged (26 commits) Mar 14, 2019

Conversation

@oscar-nutonomy (Contributor) commented Mar 6, 2019

This pull request re-organizes the detection eval code. There are no changes to the algorithm or the output. Changes include:

  • All evaluation configs are captured in a configuration file.
  • Added dataclasses for EvalBoxes and EvalConfigs.
  • The main class is much leaner. Loading happens in loaders.py and the main algorithms are in algo.py.
  • Misc clean-ups through using the power of the data-classes.
  • Renamed the main variables to `gt_boxes` and `pred_boxes` (from `all_annotations` and `all_results`).
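The new data-classes themselves are not shown in this excerpt, so as a rough illustration only, a minimal `EvalBox`-style dataclass might look like this (field names and defaults are assumptions, not the devkit's actual API):

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class EvalBox:
    """Illustrative sketch only; the real devkit data-class may differ."""
    sample_token: str = ""
    translation: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    size: Tuple[float, float, float] = (0.0, 0.0, 0.0)
    rotation: Tuple[float, float, float, float] = (1.0, 0.0, 0.0, 0.0)
    velocity: Tuple[float, float] = (0.0, 0.0)
    detection_name: str = ""
    detection_score: float = -1.0  # placeholder for GT boxes, which carry no score

gt_box = EvalBox(sample_token="abc123", detection_name="car")
```

A dataclass like this gives free `__init__`, `__repr__`, and `__eq__`, which is the "power of the data-classes" the clean-ups lean on.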

There is a bunch of stuff remaining, but I think this is a fine chunk to review. Next steps include:

  • Add some more methods on the data-classes.
  • Add data-classes for `all_metrics` and `raw_metrics`.
  • Store raw_metrics to disk.
  • Implement a rendering method that operates on the `raw_metrics` data-class.

I commented out the CLI to main for now, until we know what it will look like. I also commented out the in-line plotting in average_precision and calc_tp_metrics. As mentioned above, once we can store raw_metrics to disk, we can build new renderers.

@oscar-nutonomy oscar-nutonomy self-assigned this Mar 6, 2019
@oscar-nutonomy oscar-nutonomy changed the base branch from eval_refactor to release_v0.2 March 7, 2019 05:45
@oscar-nutonomy oscar-nutonomy changed the title [WIP] Eval refactor step2 Eval refactor step2 Mar 9, 2019
python-sdk/nuscenes/eval/detection/data_classes.py (outdated)
rotation=sample_annotation['rotation'],
velocity=list(nusc.box_velocity(sample_annotation['token'])),
detection_name=detection_name,
detection_score=0,

Perhaps nan or None for GT boxes?
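As a sketch of this reviewer suggestion (not the merged code), a `nan` sentinel avoids any ambiguity with a genuine score of 0:

```python
import math

# Suggestion sketch: use nan (rather than 0) as the detection_score
# sentinel for ground-truth boxes, so a real score of 0.0 cannot be
# confused with "no score".
GT_SCORE = float("nan")

def is_gt_box(detection_score: float) -> bool:
    # nan compares unequal to everything, including itself,
    # so test it with math.isnan rather than ==.
    return math.isnan(detection_score)
```

`None` would work too, but then the field type becomes `Optional[float]` and every consumer must handle both cases.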

python-sdk/nuscenes/eval/detection/utils.py
holger-motional and others added 3 commits March 11, 2019 10:37
* Replaced attribute_scores by a single attribute_name

* Fixed attribute bug

* Updated weighted sum score

* Added a back-port for test fixture to make test reproducible.

@varun-nutonomy left a comment


Looks good to me, left some very minor comments inline.

README.md (outdated)
@@ -56,6 +56,10 @@ Finally, set NUSCENES env. variable that points to your data folder
export NUSCENES="/data/sets/nuscenes"
```

### Verify install
To verify your environment run `python -m unittest` in the `python-sdk` folder.
You can also run `assert_download.py` in the `nuscenes/scrips` folder.


Typo: scripts.

"""

# Count the positives.
npos = len([1 for gt_box in gt_boxes.all if gt_box.detection_name == class_name])


Can be replaced with
npos = sum(gt_box.detection_name == class_name for gt_box in gt_boxes.all)
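The two formulations are equivalent; a quick demonstration with stand-in objects (the `GTBox` namedtuple is a placeholder for the devkit's box class):

```python
from collections import namedtuple

# Stand-in for the devkit's box class; only the field we count on is modeled.
GTBox = namedtuple("GTBox", ["detection_name"])
all_boxes = [GTBox("car"), GTBox("truck"), GTBox("car")]

# Original formulation: build a throwaway list and take its length.
npos_list = len([1 for b in all_boxes if b.detection_name == "car"])

# Suggested formulation: sum the booleans directly, no intermediate list.
npos_sum = sum(b.detection_name == "car" for b in all_boxes)
```

The generator version reads as "count the matches" and never materializes a list.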

pred_confs = [box.detection_score for box in pred_boxes_list]

# Sort by confidence.
sortind = [i for (v, i) in sorted((v, i) for (i, v) in enumerate(pred_confs))][::-1]


Can be replaced with

# Sort by confidence in descending order.
sortind = np.argsort(pred_confs)[::-1]
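For distinct confidence values the two idioms produce identical orderings (with tied scores, the non-stable default `argsort` may order the tied indices differently). A small check:

```python
import numpy as np

pred_confs = [0.2, 0.9, 0.5]

# Original idiom: sort (value, index) pairs ascending, extract indices, reverse.
sortind_orig = [i for (v, i) in sorted((v, i) for (i, v) in enumerate(pred_confs))][::-1]

# Suggested replacement: argsort ascending, then reverse.
sortind_np = list(np.argsort(pred_confs)[::-1])
```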

if len(raw_metrics) == 0:
return tp_metrics

for metric_name in {key: [] for key in cfg.metric_names}:


I don't understand why it is written like this. Why not just `for metric_name in cfg.metric_names:`?
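Iterating over a dict yields its keys, so the dict comprehension is a no-op detour; both loops visit the same names:

```python
# Illustrative metric names; the actual config values may differ.
metric_names = ["trans_err", "scale_err", "orient_err"]

# Original: build a dict just to iterate over its keys.
via_dict = [m for m in {key: [] for key in metric_names}]

# Suggested: iterate the list directly.
direct = list(metric_names)
```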

@@ -58,7 +58,7 @@ def is_on_mask(self, x, y, dilation=0) -> np.array:
Determine whether the given coordinates are on the (optionally dilated) map mask.
:param x: Global x coordinates. Can be a scalar, list or a numpy array of x coordinates.
:param y: Global y coordinates. Can be a scalar, list or a numpy array of y coordinates.
:param dilation: <float>. Optional dilation of map mask.
:param dilation: Optional dilation of map mask.


Missing typing for this method and the next one.
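A hypothetical fully-typed version of the signature (the exact types the maintainers chose may differ, and the body here is a placeholder rather than the real mask lookup):

```python
from typing import List, Union
import numpy as np

def is_on_mask(x: Union[float, List[float], np.ndarray],
               y: Union[float, List[float], np.ndarray],
               dilation: float = 0.0) -> np.ndarray:
    """Placeholder body: a real implementation would index the map mask."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    y = np.atleast_1d(np.asarray(y, dtype=float))
    return np.zeros(x.shape, dtype=bool)
```

The `Union` mirrors the docstring's "scalar, list or numpy array", and `np.ndarray` is more precise as a return annotation than `np.array` (which is a function, not a type).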

@holger-motional (Contributor) left a comment


Looks good.

@oscar-nutonomy oscar-nutonomy merged commit 54edf61 into release_v0.2 Mar 14, 2019
@oscar-nutonomy oscar-nutonomy deleted the eval_refactor_step2 branch March 14, 2019 02:28
Alex-nutonomy pushed a commit that referenced this pull request Mar 25, 2019
* Added setup.py for pip installation

* Added speedup to query many points with single function call in map mask

* Added missing methods for improved map mask

* Changes for v0.2

* Updated tutorial path

* Changed logic such that the map table points to logs, not 1-to-1

* Fix indexing bug

* Reset paths to default

* Removed print statement from setup.py

* Fixed a bug for large map files

* Not excluding any logs from map rendering

* Fixed bug with missing "self" keyword in create_splits_logs(nusc), added dataroot parameter

* Address bug for classes without attributes

* Fixed misspelled motorcycle

* Added missing keyword self

* Added checks for nan values

* Removed teaser splits as they are not compatible with the new logs

* Added (preliminary) splits for v0.2

* Workaround to allow evaluating a subset of the data

* Splits hardcoded in create_splits_logs

* Fix bugs if a results file has zero TPs

* Simpler MapMask (#31)

* deleted distance_mask, binary_mask. Created dilate_mask().
* cleanup docstring

* Added missing Dict import (#35)

* Typing fix (#36)

* Added missing Dict import
* Fixed typing to be compatible where future annotations package cannot be loaded

* Updated code to new folder structure (#37)

* Devkit now accepts versions 0.3, 0.4 and 1.0

* Fix TPs analog to commit 94544b5 (#41)

* Fix yaw_diff calculation (#46)

* Added unittest to catch previous error in yaw_diff
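The usual pitfall in a yaw difference is wrap-around at ±π; a wrap-aware sketch (illustrative only, not necessarily the devkit's actual fix):

```python
import math

def yaw_diff(yaw1: float, yaw2: float) -> float:
    """Smallest absolute difference between two yaw angles, in [0, pi].
    Naive abs(yaw1 - yaw2) over-reports when the angles straddle +/-pi."""
    diff = (yaw1 - yaw2) % (2.0 * math.pi)
    return min(diff, 2.0 * math.pi - diff)
```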

* Updated dataset to v0.4, updated train/val/test splits

* Added assertion

* Cleaned up map API.

* Evaluation Fixes (#51)

* Correct recall edge case
* Apply distance filtering to xy only
* Return max error if recall is too low

* Added dataset version as command line argument to nuscenes_eval.py

* Function to get scenes in each split, script to check dataset completeness

* Added command line arguments to launch scripts

* Hotfix for call to create_splits_logs

* Structure re-org towards more tests. (#52)

- Moves files around to enable easier injection of tests.
- Enables running unit-tests via the usual `python -m unittest`.
- Requires env. variable $NUSCENES for test.
- Added trivial unit-tests for NuScenes & NuScenesEval. Currently uses v0.2. We should switch to mini when data is ready.
- Moved detection eval code to eval/detection to future-proof for next task.

This PR is only the first step.
The next PR will break out the detection eval code to enable testing.

* Updated dataset splits to v1.0

* Updated version to 1.0 and database folder to /data/sets/nuscenes

* Added explanation that visibility token may be empty

* Style update (#59)

* Changed format of data_classes.py and avoided default parameters in RadarPointCloud

* Typing and line breaks in long method signatures

* Reset dynprop_states to previous settings

* Spacing and line breaks

* Spacing

* Break circular import dependencies

* More flexible setup.py

* Cleanup setup.py

* Updated radar comments

* fix setup to include root folder

* Backporting bugfix from 873b30b until eval pr is merged

* new pip package

* Workaround to allow version string to include the split, e.g. v1.0-trainval

* Mini splits and modified split logic (#63)

* Added mini splits and updated overall logic, note: changed method signatures

* Added an assertion

* Docstrings

* Fixed typo

* Updated eval code to new splits function

* Eval refactor step2 (#53)

* reorganized repo to allow for easier injection of tests. Moved detection eval code to eval/detection subfolder

* Added verify and test instructions to main README

* Added a more thorough regression-type test. This seems to have triggered an error in the evaluation code

* Standardized dataroot to /data/sets/nuscenes

* Fixed random seed. Passes unit-tests now

* Added data-classes for the EvalBoxes. Cleaned up loading. Same results verified on test_delta()

* Updated all tests to use the EvalBox data class. All tests passing now.

* Changes to calculating distance from ego_vehicle (#57)

* Changed result format to accept at most one attribute (#60)

* Replaced attribute_scores by a single attribute_name

* v0.2 CI preparation. (#65)

* updated to 1.0-mini split for all tests. Skipped those that require DB

* Misc Reorg (#68)

* reorganize geometry code

* add shortcuts to common classes

* move quaternion yaw back

* change default nuscenes to mini

* Integrate mini_split into continuous integration (#71)

* Enabled v1.0-mini to be used in CI for unit-tests.

* Updated detection metrics code (#66)

* Reorganized eval code to allow for easier injection of tests.

* Make nuScenes compatible to Python3.5/Python3.6 (#70)

* Removed python 3.6 and 3.7 only features to make compatible with 3.5

* nuscenes-devkit pip package v0.3.4

* Added num_lidar_points and num_radar_points to schema

* Added instructions to include lidar and radar points

* update create_splits_logs to new database versions (#74)

* Some unittests for algo.py and util.py (#76)

* Added unittest for utils.py and algo.py

* Repeat unit tests for Python 3.6, 3.7 + bug fix for false negative (#77)

* Fixed false negative issue. Added support for 3.6 and 3.7.

* Create a configuration factory (#78)

* create a configuration factory

* add unittests

* fix the unittests

* added token field to Box class (#79)

* added token field to Box class

* fixed line-break

* Update README installation instructions. (#75)

* Cleaned up README. Changed to support for python 3.6 and 3.7.

* added change log message for release

* * Moved continuous integration and Docker dependencies to 'setup' folder. (#80)

* Updated readme files (#81)

* Added setup/instructions.md, updated Debris class definition, tuned main README

* update setup.py to new repo structure

* Update tutorial.ipynb

* Simple Unittests for AP calculation (#82)

* Clean up old unit tests and added tests for AP calculation

* Revert setup changes to get CI working (#84)

* Revert "* Moved continuous integration and Docker dependencies to 'setup' folder. (#80)"

This reverts commit fa55433.

* revert setup.py changes

* Bikerack filtering (#83)

* spellcheck

* add points_in_box() method to check whether points are inside the box

* test points_in_box() method

* perform bike rack filtering

* docstring for filter_eval_boxes

* added unittests for filter_eval_boxes()

* update comment

* fix docstring

* can't use Box type because of a circular import. So switched to 'Box'

* can't use Box type because of a circular import. So switched to 'Box'

* change threshold after bikerack filtering

* get max_dist from eval_detection_configs
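As an illustration of the idea behind the `points_in_box()` commit above, here is an axis-aligned simplification (the devkit's version also accounts for the box's rotation, which is omitted here):

```python
import numpy as np

def points_in_box_axis_aligned(center, wlh, points):
    """Boolean mask of which points fall inside an axis-aligned box.
    center: (3,) box center; wlh: (width, length, height); points: (3, N).
    Simplified sketch: a rotated box would first transform points into
    the box frame before this containment test."""
    center = np.asarray(center, dtype=float).reshape(3, 1)
    half = np.asarray(wlh, dtype=float).reshape(3, 1) / 2.0
    return np.all(np.abs(points - center) <= half, axis=0)

# Two test points: the origin (inside) and a point 5 m away on x (outside).
pts = np.array([[0.0, 5.0],
                [0.0, 0.0],
                [0.0, 0.0]])
mask = points_in_box_axis_aligned((0.0, 0.0, 0.0), (2.0, 2.0, 2.0), pts)
```

A mask like this is what makes the bike-rack filtering cheap: ground-truth bike boxes inside a rack region can be dropped with a single vectorized test.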

* Update eval renderers (#85)

* Updated renderers for detection results. 
* Fixed bugs in detection evaluation.

* Added color options for render_egoposes_on_map (#86)

* add color option for render maps

* Summary AP plot update (#87)

* updated the plots

* better legend text alignment

* updated legend positions and pretty names for attributes and TP metrics

* Tex table (#91)

* added render tex table

* cleaned up tex render method

* minor changes

* add credits

* misc cleanup

* Instruction updates and misc (#89)

* Set default version for export egoposes, better outputs

* Added v1.0 to legal datasets

* Corrected visibility level

* Set face color to black, better status messages

* Updated face color

* Set set to trainval in misc scripts

* Bugfix to preserve black bg color

* Made close_dist a parameter

* Merged truck classes in the instructions

* Removed train from instructions

* Merged police officers into single class

* Reset changes to tutorial

* Updated path to requirements.txt

* Removed v1.0 which was used for setup purposes

* Cleanup

* Fix visibility levels in instructions

* Also print empty categories

* Fixed bug

* Reversed previous changes

* Updated schema comments for map

* Reverted previous changes

* misc cleanup

* next pip package

* added unittests for TP metrics

* Readme update (#93)

* Overhauled eval detection page

* Overhauled main readme

* Removed deprecated links

* Rewording

* Addressed review comments

* fixed the visualize sample, need to test it

* added comments to unit tests

* [WIP] Jupyter notebook tutorial (#90)

* Updated tutorial

* Unittests for true positive error metrics calculation (#92)

* added unittests for TP metrics

* fixed visualize sample

* added comments to unit tests

* added option to save sample rendering to disk

* added back showing plot if savepath is not given

* Update README.md

Clarifications

* Update schema.md

added an example for subcategories

* update dates

* cleanup imports, fix readme typos, cleaner scripts

* cleanup evaluate

* fix splits

* update unittest to split logic

* update test delta