Metric Engine V2 (-> feat/no-disk) #565

Merged
james-encord merged 155 commits into feat/no-disk from feat/metric-restructure-3
Aug 4, 2023

Conversation

james-encord
Contributor

@james-encord james-encord commented Aug 2, 2023

Initial WIP Metric Engine V2 (with non-functional metrics commented out).

Remaining cleanup before a merge into main is viable (all issues should be annotated with FIXME in the code):

  • Sequential metrics (video): may require changing the window logic from (n-1, n, n+1) to (n, n+1, n+2) in the engine, and possibly an extra stage for metrics derived over the video frame domain.
  • K-means metrics (Image Difficulty, Shape Similarity)
  • Masks are broken for polygons
  • Bitmap support needs to be added to some metrics & parse functions (most metrics support it trivially)
  • Metric results need testing
  • Sharpness metrics, plus a few others marked with FIXMEs, have changed logic: we need to decide on the new metric definition and test that it makes sense.
  • CLIP embedding currently just clamps to the bounding box. Should we apply the mask? If so, how, and how would that change the embedding's usefulness?
  • Mask generation needs to be checked, because we sometimes emit a fully empty mask, which can crash.
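To make the sequential-metrics item concrete, here is a minimal sketch of the two frame-window layouts. It is illustrative only and not the engine's code: the helper names `centered_windows` and `lookahead_windows` are hypothetical, and padding missing neighbours with `None` is an assumption.

```python
from typing import Iterator, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def centered_windows(
    frames: Sequence[T],
) -> Iterator[Tuple[Optional[T], T, Optional[T]]]:
    """Yield (n-1, n, n+1) for every frame n; missing neighbours are None."""
    last = len(frames) - 1
    for i, cur in enumerate(frames):
        prev = frames[i - 1] if i > 0 else None
        nxt = frames[i + 1] if i < last else None
        yield prev, cur, nxt


def lookahead_windows(
    frames: Sequence[T],
) -> Iterator[Tuple[T, Optional[T], Optional[T]]]:
    """Yield (n, n+1, n+2) for every frame n; missing neighbours are None."""
    count = len(frames)
    for i, cur in enumerate(frames):
        nxt = frames[i + 1] if i + 1 < count else None
        nxt2 = frames[i + 2] if i + 2 < count else None
        yield cur, nxt, nxt2


if __name__ == "__main__":
    frames = ["f0", "f1", "f2"]
    print(list(centered_windows(frames)))
    print(list(lookahead_windows(frames)))
```

With the centered layout, a metric for frame n needs a frame that precedes it; with the look-ahead layout, every window only reads forward from n.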

Note: the batch eval functions are stubs for planning out future changes, so the actual batch implementations can be ignored. Just keep whatever functions we end up using in mind when editing the non-batch metric functions.

For testing: the migration script currently calls this, but discards and suppresses the result. Correct integration will be added in follow-up PRs.
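One way to picture the relationship between the non-batch functions and the batch stubs is sketched below. This is an assumption-laden illustration, not the PR's API: `SketchMetric`, `calculate`, `calculate_batched` and `AreaMetric` are hypothetical names, and the batch path simply falls back to the per-item path.

```python
from typing import List, Sequence, Tuple


class SketchMetric:
    """Hypothetical base class: one per-item function plus a batch stub."""

    def calculate(self, item: Tuple[int, int]) -> float:
        raise NotImplementedError

    def calculate_batched(self, items: Sequence[Tuple[int, int]]) -> List[float]:
        # Stub: fall back to the per-item implementation. A real batch
        # version could replace this later without changing callers.
        return [self.calculate(item) for item in items]


class AreaMetric(SketchMetric):
    """Toy metric over (width, height) pairs, used only for illustration."""

    def calculate(self, item: Tuple[int, int]) -> float:
        width, height = item
        return float(width * height)


if __name__ == "__main__":
    print(AreaMetric().calculate_batched([(2, 3), (4, 5)]))
```

Keeping the per-item function as the source of truth means a later batch implementation can be checked against it item by item.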

@james-encord james-encord requested a review from a team as a code owner August 2, 2023 13:01
@james-encord james-encord merged commit b623250 into feat/no-disk Aug 4, 2023
1 check passed
@james-encord james-encord deleted the feat/metric-restructure-3 branch August 4, 2023 14:50
james-encord added a commit that referenced this pull request Aug 16, 2023
* Start

* Model definition, testing db creation

* New API

* SQL queries

* Initial migration script

* Partially working metric migration script

* Add partial embedding support to global db storage.

* Extra search terms

* Database migrations support.

* Bug fixes

* Fixes from testing

* Misc

* Migrate tags

* Start of new api endpoint.

* Work on storing predictions and more data quality assertions in migration script.

* Process predictions into db, main metadata.

* Add some constraints to the database

* Cleanup of metric definitions

* Improve queries

* Bugfixes

* Bugfixes

* Minor prediction changes.

* Calculate prediction matches

* More prediction calculation support.

* Start of prediction metrics support.

* Add metric correlations support.

* Rename & unify object, classification.

* Mutual info regression, WIP

* Embedding search & more prediction support

* Split new router into 3.

* Fix all linting.

* Post-rebase regen poetry lock

* Fix some bugs with predictions from rebase & others

* Bug fixes and start of prediction explorer implementation.

* poetry lock fix post rebase

* Support metric dissimilarity.

* Move db migration script out as it is needed to execute dynamically for mixed mode.

* wip - extra formatting improvements and start of extra backend apis needed.

* Rework stages

* Re-validate predictions, now fully correct handling of FP/FN/TP split.

* Optimize metric performance query.

* Summary query filter support, part 1.

* Summary query filter support, part 2.

* Summary query filter support, part 3. Prediction implementation is partial and broken for the filtered use-case.

* Bug fixes for GET filter query parameters.

* Change some defaults.

* wip: project2 actions

* Re-enable reduced embedding extension storage

* Improve queries

* feat: introduce new frontend

feat: init new frontend

feat: convert the project selection page to req res

feat: render old react explorer instead of search

feat: adds the project selection menu

delete unneeded file

remove node modules

* feat: adds project selector to the top right

* fix: scoped queries unique keys

* Backend implementation of synced create subset & upload project actions.

Missing attempt at front-end implementation.

* Update FE & Fix communication issues with legacy codebase from update. Almost working subset action, need filters to properly test.

* Add upload to encord action, still have df issues with subset creation.

* Pandas bug-fixes

* feat: introduce filters UI

feat: enable data filters

chore: add proper types for the filters sent to BE

refactor: unite data and annotations explorer

refactor: adds new filters to the predictions page

chore: post rebase fixes

feat: adds tags filter

feat: adds prediction type select

* fix: project selection page

* fix: remove log

* fix: project card show image

* fix: add project hash to query keys

* Post rebase frontend updates & misc fixes. Partially fix migration script.

* Update migration script to work on new formatting changes for certain metrics.

* feat: auth on new FE

* feat: alembic migration versions.

* Tag sync, alembic debugging, & re-enable proper project migration behaviour.

* fix: id splitting in migration

* Fix db unique constraint & label row hash uniqueness for local subset creation.

* wip: serve FE build assets on server & migrate on startup.

* WIP: adds download sandbox and disables it

* fix: query invalidation & improved local fs serve logic.

* fix: proper subset creation logic.

* fix: disable upload to encord when filters applied

* upload project & create subset: misc improvements / fixes.

* WIP: fe cleanup

* fix: predictions explorer set selected project

* fix: display raw images for non-sandbox projects.

* wip - feat: filter impl improvement & first attempt at label class filter support.

* fix: use relative url only when built.

* fix: null api context edge case (rare)

* fix: build errors & upload to encord file path.

* fix: bring back startup

* fix: filter data by label class

* fix fe: style & misc bugs

* fix: encord upload, invalidate caches before reading updated data.

* chore: rename visualize to start

* chore: change default port to 8000

* fix: partially working encord-active upload, issue with local data uri only.

* fix: remove scroll jitter while explorer is loading.

* feat: favicon.ico

* fix: debounce filters for perf.

* chore: dev mode

* fix: prefer relative uri - fixes upload to encord & remote subset.

* chore: delete app folder

* chore: replace encord-active-components with new implementation

* chore: kill streamlit

* fix: misc db schema fixes, more correct metric normalisation & misc bugfixes.

* WIP: fix linting

* fix: all mypy + other lint issues.

* fix: ea-components dev mode & error on missing components on non-dev mode.

* fix: project level statistics & ttl cache

* fix: black and isort

* fix: project description accessor

* fix: prediction migration bug.

* fix: project comparison project selection

* fix: classification predictions

* fix: misc style fixes

* fix: misc styling & logging on migration

* fix: ok style for modals

* fix: prediction metrics design & buckets

* fix: prediction selection improvements & rename comparison domain to scope.

* fix: hide prediction selector while only 1 prediction is present.

* fix: upload to encord error boundaries and state updates.

* fix: project name and folder mismatch

* fix: ts error

* fix: tagging query invalidation keys

* FE-side fix: generate feature hash mapping for recursive attributes.

* fix: FE quartile line overlay & naming of similarity.

* fix: bounding box prediction

* fix: old predictions folder structure

* wip-fix: classification prediction handling.

* fix: style project comparison.

* fix: implement legacy classification hash handling.

* fix: minor fe tweaks

* fix: better scaling for large prediction v-bars

* fix: react key errors

* fix: duplicate metrics on predictions page

* delete streamlit file

* fix: subset migration version and reduced embeddings

* fix: hide metric_random, handle classification annotation_quality, do not move the folder when uploading a project to encord.

* WIP: Rework of embeddings and metrics to more abstract and computationally friendly format.

* WIP: More proposed cleanup for metric definitions

* More changes

* wip: metric refactor

* wip

* james: clean-up evaluation logic for new metric logic.

* james: Partly working metric executor engine.

* misc fixes

* Many fixes, stage 1 is now partly working (mps should be commented out).

* feat: finish MVP metric computation engine. Works - now need to define all metrics and maybe change db schema if needed.

* fix: misc fixes.

* wip: start of batch api for metrics.

* Add more metrics, 4 left to implement.

* Add alembic migration.

* WIP: Cleanup, bug-fixes & extra verification.

* fix: lint-fixes

* fix: add FIXMEs

* nb: add more fixme

* fix: mypy collision

* fix: rename executor project_dir to database_dir

* tuning: use same exhaustive search threshold used internally by umap module.

* fix: separate WIP migration script for metric engine.

* fix: pylint

---------

Co-authored-by: David Sapiro <david.sapiro@encord.com>
Co-authored-by: Frederik Hvilshøj <frederik@cord.tech>
Encord-davids added a commit that referenced this pull request Sep 18, 2023

3 participants