Batch evaluation intervals into a single request and a single evaluation process #554
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #554      +/-   ##
==========================================
- Coverage   82.98%   82.63%   -0.35%
==========================================
  Files         221      218       -3
  Lines       10342    10235     -107
==========================================
- Hits         8582     8458     -124
- Misses       1760     1777      +17

☔ View full report in Codecov by Sentry.
Thank you so much! I will go through this tomorrow and also ping @robinholzi to review this since it affects his evaluation stuff.
Thanks so much @XianzheMa for putting some thought into the evaluator side of the evaluation logic! I can surely adjust the batching in the supervisor to comply with the new interface. Just a couple of questions before we finalise the gRPC interface changes.
Sorry, now I understand your comment. This changes the server interface to support batching, but the client does not yet send batched requests :D Now I get it, sorry.
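To make the distinction concrete, here is a minimal sketch of the two client behaviours; all class, field, and function names below are placeholders made up for illustration, not Modyn's actual proto or stub definitions:

```python
# Hypothetical sketch only: message and field names are placeholders,
# not Modyn's actual proto/stub definitions.
from dataclasses import dataclass


@dataclass
class EvalInterval:
    start_timestamp: int
    end_timestamp: int


@dataclass
class EvaluationRequest:
    model_id: int
    dataset_id: str
    intervals: list[EvalInterval]


def evaluate_one_by_one(model_id: int, dataset_id: str, intervals, send) -> None:
    # Current client behaviour: one request (and one evaluation process) per interval.
    for interval in intervals:
        send(EvaluationRequest(model_id, dataset_id, [interval]))


def evaluate_batched(model_id: int, dataset_id: str, intervals, send) -> None:
    # What the follow-up supervisor change would do: all intervals for the same
    # (model, dataset) pair go out in a single request.
    send(EvaluationRequest(model_id, dataset_id, list(intervals)))
```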
Thank you so much Xianzhe! I hope the changes are not too much work. We can chat with Robin about the status queue thingy, but I think we need a way of e.g. transferring exceptions to the supervisor, because we should have those in the logs (I've used them before for debugging; it's helpful to have them in a single place). I think there are some minor issues in the interface where we need to adjust the evaluation started request, but the overall flow looks nice!
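Regarding transferring exceptions to the supervisor, a rough sketch of how that could work with a status/exception queue follows; the queue, payload shape, and function names are assumptions, not the existing Modyn implementation:

```python
# Sketch under the assumption that evaluator work runs in a separate process
# and reports back through a multiprocessing queue; all names are placeholders.
import logging
import multiprocessing as mp
import traceback
from typing import Callable


def evaluation_worker(run_evaluation: Callable[[], None], exception_queue: mp.Queue) -> None:
    try:
        run_evaluation()
    except Exception:
        # Ship the full traceback so the supervisor can keep it in its own logs.
        exception_queue.put(traceback.format_exc())
        raise


def drain_exceptions(exception_queue: mp.Queue, logger: logging.Logger) -> None:
    # Supervisor side: collect and log any tracebacks reported by the worker.
    while not exception_queue.empty():
        logger.error("Evaluator reported an exception:\n%s", exception_queue.get())
```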
The integration test definitely passed on our server; I will check tomorrow why it fails in the CI.
I think there probably is some flakiness somewhere. Sorry, I am too tired to review this properly tonight; I will do it tomorrow morning as well. But I don't think it's because of release/debug/... mode, I think this is rather some concurrency issue somewhere.
Thank you! My only big comment, besides nitpicky stuff, is on the interval_idx. I think we should switch to using the interval boundaries, because looking at the supervisor, it would simplify the logic for the follow-up PR if we directly have the interval bounds instead of some idx. I did not immediately find the reason for the integration test failure, though :(
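To illustrate the suggestion (the keys and values below are made up for illustration, not the actual evaluator response schema):

```python
# Illustrative only: made-up result payloads, not the actual evaluator response schema.

# With interval_idx, the supervisor must keep the original interval list around
# to map the index back to the interval it refers to.
result_by_idx = {"interval_idx": 3, "accuracy": 0.91}

# With explicit boundaries, the result is self-describing, which should simplify
# the follow-up supervisor PR since no index-to-interval lookup is needed.
result_by_bounds = {
    "interval_start": 1_690_000_000,  # epoch seconds, illustrative values
    "interval_end": 1_690_086_400,
    "accuracy": 0.91,
}
```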
Co-authored-by: Maximilian Böther <2116466+MaxiBoether@users.noreply.github.com>
Everything should be fixed now. Let's see.
LGTM! Let's wait for @robinholzi's review (and clarification regarding the docstring) and the integration tests, but I do not have further comments. Thank you so much!
The last commit only involves a comment, and the CI passed before that commit, so I just merged without waiting further.
Per the title, this PR enables batching evaluations on many intervals into a single evaluation request. This resolves #536. The integration test is adjusted to cover this new functionality.
How to review
The best way to review is to first take a look at modyn/protos/evaluator.proto to see, at the API level, what has changed.
Miscellaneous
After this PR, batching is only enabled on the server side; on the client side, one interval is still passed at a time, as I am confused about the current way EvalRequest objects are generated. After this PR, @robinholzi, could you make a PR to collect the intervals associated with the same id_model and dataset_id and pack them into one evaluation round? I think just looking at the function _single_batched_evaluation in modyn/supervisor/internal/pipeline_executor/evaluation_executor.py should be enough to understand the change and how to make that PR. It should be a very easy and straightforward PR; I am just confused by the data models there. Thank you so much!
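For that follow-up PR, the grouping step could look roughly like the sketch below; only the attributes id_model and dataset_id come from the description above, and the EvalRequest shape and function name are assumptions:

```python
# Rough sketch of grouping EvalRequests for batched evaluation. The attributes
# id_model and dataset_id are taken from the PR description above; the rest is assumed.
from collections import defaultdict
from typing import Iterable


def group_eval_requests(eval_requests: Iterable) -> dict:
    """Group requests so that all intervals sharing (id_model, dataset_id)
    can be evaluated in a single batched request."""
    grouped: dict = defaultdict(list)
    for request in eval_requests:
        grouped[(request.id_model, request.dataset_id)].append(request)
    return grouped


# Each group would then be turned into one call to _single_batched_evaluation,
# i.e. one request per (id_model, dataset_id) pair carrying all of its intervals.
```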