Metric Engine V2 (-> feat/no-disk) #565

james-encord · 2023-08-02T13:01:34Z

Initial WIP Metric Engine V2 (with non-functional metrics commented out).

Remaining cleanup before a merge into main is viable (all issues should be annotated with FIXME in the code):

Sequential metrics (video): may require changing the engine's frame window from (n-1, n, n+1) to (n, n+1, n+2), and maybe an extra stage for metrics derived over the video frame domain?
K-means metrics (Image Difficulty, Shape Similarity).
Masks are broken for polygons.
Bitmap support needs to be added to some metrics and parse functions (most metrics support it trivially).
Metric results need testing.
Sharpness metrics, plus a few others marked with FIXMEs, have changed logic: the decision on the new metric needs verification, and testing is needed to check that it makes sense.
CLIP embedding currently just clamps to the bounding box. Should we apply the mask? If so, how, and how will it change the embedding's usefulness?
Mask generation needs to be checked, because it sometimes emits a fully empty mask, which can crash.
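
To make the sequential-metrics question concrete, here is a minimal sketch of the two window shapes mentioned in the list. The function names `centered_windows` and `forward_windows` are hypothetical and do not come from this PR; the padding of missing neighbours with `None` is also an assumption, since the PR does not say how the engine treats the first and last frames.

```python
from typing import Iterator, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def centered_windows(
    frames: Sequence[T],
) -> Iterator[Tuple[Optional[T], T, Optional[T]]]:
    """Yield (n-1, n, n+1) for every frame n; missing neighbours are None."""
    count = len(frames)
    for n in range(count):
        prev_frame = frames[n - 1] if n > 0 else None
        next_frame = frames[n + 1] if n + 1 < count else None
        yield prev_frame, frames[n], next_frame


def forward_windows(
    frames: Sequence[T],
) -> Iterator[Tuple[T, Optional[T], Optional[T]]]:
    """Yield (n, n+1, n+2) for every frame n; missing followers are None."""
    count = len(frames)
    for n in range(count):
        next_1 = frames[n + 1] if n + 1 < count else None
        next_2 = frames[n + 2] if n + 2 < count else None
        yield frames[n], next_1, next_2


if __name__ == "__main__":
    frames = ["f0", "f1", "f2", "f3"]
    print(list(centered_windows(frames)))
    print(list(forward_windows(frames)))
```

In this sketch both shapes visit the same interior triples. What changes is which frame a result is attributed to and where the padding falls: the centered window pads at both ends of the video, while the forward window pads only at the end.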

Note: the batch eval functions are stubs for planning future changes, so the actual batch implementations can be ignored. Just keep in mind whichever functions we end up using when editing the non-batch metric functions.
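
As an illustration of the stub pattern described in the note, the sketch below pairs a per-item metric function with a placeholder batch function that simply loops over it. The class and method names (`OneImageMetric`, `calculate`, `calculate_batched`, `MeanBrightness`) are hypothetical and are not taken from this PR's code; the real engine may be structured quite differently.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class OneImageMetric(ABC):
    """Hypothetical base class: one value per image."""

    @abstractmethod
    def calculate(self, pixels: Sequence[float]) -> float:
        """Non-batch metric function: the implementation that matters today."""

    def calculate_batched(self, batch: Sequence[Sequence[float]]) -> List[float]:
        """Stub batch function: falls back to the per-item implementation.

        A real batched version could later replace this loop without
        changing callers, as long as the two keep returning the same values.
        """
        return [self.calculate(pixels) for pixels in batch]


class MeanBrightness(OneImageMetric):
    """Toy metric used only for this example: mean of pixel intensities."""

    def calculate(self, pixels: Sequence[float]) -> float:
        # Guard against empty input so the metric never divides by zero.
        return sum(pixels) / len(pixels) if pixels else 0.0


if __name__ == "__main__":
    metric = MeanBrightness()
    print(metric.calculate_batched([[0.0, 1.0], [1.0, 1.0], []]))
```

Keeping the batch entry point as a thin wrapper means an edit to the non-batch function cannot drift out of sync with it while the batch API is still being planned.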

For testing: this is currently called, but the result is discarded and suppressed by the migration script. Correct integration will be added in follow-up PRs.

* Start * Model definition, testing db creation * New API * SQL queries * Initial migration script * Partially working metric migration script * Add partial embedding support to global db storage. * Extra search terms * Database migrations support. * Bug fixes * Fixes from testing * Misc * Migrate tags * Start of new api endpoint. * Work on storing predictions and more data quality assertions in migration script. * Process predictions into db, main metadata. * Add some constraints to the database * Cleanup of metric definitions * Improve queries * Bugfixes * Bugfixes * Minor prediction changes. * Calculate prediction matches * More prediction calculation support. * Start of prediction metrics support. * Add metric correlations support. * Rename & unify object, classification. * Mutual info regression, WIP * Embedding search & more prediction support * Split new router into 3. * Fix all linting. * Post-rebase regen poetry lock * Fix some bugs with predictions from rebase & others * Bug fixes and start of prediction explorer implementation. * poetry lock fix post rebase * Support metric dissimilarity. * Move db migration script out as it is is needed to execute dynamically for mixed mode. * wip - extra formatting improvements and start of extra backend apis needed. * Rework stages * Re-validate predictions, now fully correct handling of FP/FN/TP split. * Optimize metric performance query. * Summary query filter support, part 1. * Summary query filter support, part 2. * Summary query filter support, part 3. Prediction implementation is partial and broken for the filtered use-case. * Bug fixes for GET filter query parameters. * Change some defaults. 
* wip: project2 actions * Re-enable reduced embedding extension storage * Improve queries * feat: introduce new frontend feat: init new frontend feat: convert the project selection page to req res feat: render old react explorer instead of search feat: adds the project selection menu delete uneeded file remove node modules * feat: adds project selector to the top right * fix: scoped queries unique keys * Backend implementation of synced create subset & upload project actions. Missing attempt at front-end implementation. * Update FE & Fix communication issues with legacy codebase from update. Almost working subset action, need filters to properly test. * Add upload to encord action, still have df issues with subset creation. * Pandas bug-fixes * feat: introduce filters UI feat: enable data filters chore: add proper types for the filters send to BE refactor: unite data and annotations explorer refactor: adds new filters to the predictions page chore: post rebase fixes feat: adds tags filter feat: adds prediction type select * fix: project selection page * fix: remove log * fix: project card show image * fix: add project hash to query keys * Post rebase frontend updates & misc fixes. Partially fix migration script. * Update migration script to work on new formatting changes for certain metrics. * feat: auth on new FE * feat: alembic migration versions. * Tag sync, alembic debugging, & re-enable proper project migration behaviour. * fix: id splitting in migration * fixdxdsxFix db unique constraint & label row hash uniqueness for local subset creation. * wip: serve FE build assets on server & migrate on startup. * WIP: adds download sandbox and disables it * fix: query invalidation & improved local fs serve logic. * fix: proper subset creation logic. * fix: disable upload to encord when filters applied * upload project & create subset: misc improvements / fixes. 
* WIP: fe cleanup * fix: predictions explorer set selected project * fix: display raw images for non-sandbox projects. * wip - feat: filter impl improvement & first attempt at label class filter support. * fix: use relative url only when built. * fix: null api context edge case (rare) * fix: build errors & upload to encord file path. * fix: bring back startup * fix: filter data by label class * fix fe: style & misc bugs * fix: encord upload, invalidate caches before reading updated data. * chore: rename visualize to start * chore: change default port to 8000 * fix: partially working encord-active upload, issue with local data uri only. * fix: remove scroll jitter while explorer is loading. * feat: favicon.ico * fix: debounce filters for perf. * chore: dev mode * fix: prefer relative uri - fixes upload to encord & remote subset. * chore: delete app folder * chore: replace encord-active-components with new implementation * chore: kill streamlit * fix: misc db schema fixes, more correct metric normalisation & misc bugfixes. * WIP: fix lintting * fix: all mypy + other lint issues. * fix: ea-components dev mode & error on missing components on non-dev mode. * fix: project level statistics & ttl cache * fix: black and isort * fix: project description accessor * fix: prediction migration bug. * fix: project comparison project selection * fix: classification predictions * fix: misc style fixes * fix: misc styling & logging on migration * fix: ok style for modals * fix: prediction metrics design & buckets * fix: prediction selection improvements & rename comparison domain to scope. * fix: hide prediction selector while only 1 prediction is present. * fix: upload to encord error boundaries and state updates. * fix: project name and folder missmatch * fix: ts error * fix: tagging query invalidation keys * FE-side fix: generate feature hash mapping for recursive attributes. * fix: FE quartile line overlay & naming of similarity. 
* fix: bounding box prediction * fix: old predictions folder structure * wip-fix: classification prediction handling. * fix: style project comparison. * fix: implement legacy classification hash handling. * fix: minor fe tweaks * fix: better scaling for large prediction v-bars * fix: react key errors * fix: duplicate metrics on predictions page * delete sreamlit file * fix: subset migration version and reduced embeddings * fix: hide metric_random, handle classification annotation_quality, do not move the folder when uploading a project to encord. * WIP: Rework of embeddings and metrics to more abstract and computationally friendly format. * WIP: More proposed cleanup for metric definitions * More changes * wip: metric refactor * wip * james: clean-up evaluation logic for new metric logic. * james: Partly working metric executor engine. * misc fixes * Many fixes, stage 1 is not partly working (mps should be commented out). * feat: finish MVP metric computation engine. Works - now need to define all metrics and maybe change db schema if needed. * fix: misc fixes. * wip: start of batch api for metrics. * Add more metrics, 4 left to implement. * Add alembic migration. * WIP: Cleanup, bug-fixes & extra verification. * fix: lint-fixes * fix: add FIXMEs * nb: add more fixme * fix: mypy collision * fix: rename executor project_dir to database_dir * tuning: use same exhaustive search threshold used internally by umap module. * fix: separate WIP migration script for metric engine. * fix: pylint --------- Co-authored-by: David Sapiro <david.sapiro@encord.com> Co-authored-by: Frederik Hvilshøj <frederik@cord.tech>

james-encord added 30 commits July 21, 2023 17:39

Start

523159b

Model definition, testing db creation

6cc68f5

New API

8eeaaf8

SQL queries

7748c20

Initial migration script

7722f17

Partially working metric migration script

d9a5fa6

Add partial embedding support to global db storage.

ce2d65d

Extra search terms

fdb3017

Database migrations support.

18d25c6

Bug fixes

7e853cb

Fixes from testing

d90ebd0

Misc

797edf8

Migrate tags

7a73f22

Start of new api endpoint.

9c31424

Work on storing predictions and more data quality assertions in migration script.

b05ea7d

Process predictions into db, main metadata.

d9209b0

Add some constraints to the database

8ad861b

Cleanup of metric definitions

66ddbdf

Improve queries

bc130c4

Bugfixes

0353316

Bugfixes

2e26071

Minor prediction changes.

fc899c3

Calculate prediction matches

974615e

More prediction calculation support.

be0ba02

Start of prediction metrics support.

36f537d

Add metric correlations support.

97947a1

Rename & unify object, classification.

c0b321e

Mutual info regression, WIP

21b48f8

Embedding search & more prediction support

73c3686

Split new router into 3.

e110c06

james-encord and others added 17 commits July 27, 2023 13:12

WIP: More proposed cleanup for metric definitions

a2376e5

More changes

ffce2e1

wip: metric refactor

d532eb3

wip

31c414e

james: clean-up evaluation logic for new metric logic.

891db92

james: Partly working metric executor engine.

5e9f7c3

misc fixes

3e5ebcb

Many fixes, stage 1 is not partly working (mps should be commented out).

4705ead

feat: finish MVP metric computation engine. Works - now need to define all metrics and maybe change db schema if needed.

5bba94a

fix: misc fixes.

1e66908

wip: start of batch api for metrics.

2341f4e

Merge branch 'main' into feat/metric-restructure-3

8da864d

Add more metrics, 4 left to implement.

2137a52

Add alembic migration.

1d996b9

WIP: Cleanup, bug-fixes & extra verification.

5b80632

fix: lint-fixes

3a5245f

fix: add FIXMEs

2c42b0b

james-encord requested a review from a team as a code owner August 2, 2023 13:01

james-encord added 4 commits August 2, 2023 14:19

nb: add more fixme

211a549

fix: mypy collision

90f6524

fix: rename executor project_dir to database_dir

af248e3

tuning: use same exhaustive search threshold used internally by umap module.

3b0ac24

Encord-davids approved these changes Aug 3, 2023


james-encord added 3 commits August 4, 2023 14:15

Merge branch 'main' into feat/metric-restructure-3

ae9bea0

fix: separate WIP migration script for metric engine.

ab7adca

fix: pylint

6f1d681

james-encord merged commit b623250 into feat/no-disk Aug 4, 2023
1 check passed

james-encord deleted the feat/metric-restructure-3 branch August 4, 2023 14:50
