Twin RHO Model Step 1: create the Twin RHO Model #547
Conversation
Codecov Report
Attention: Patch coverage is …

```diff
@@            Coverage Diff             @@
##             main     #547      +/-   ##
==========================================
+ Coverage   82.84%   82.92%   +0.08%
==========================================
  Files         220      221       +1
  Lines       10235    10298      +63
==========================================
+ Hits         8479     8540      +61
- Misses       1756     1758       +2
```

View full report in Codecov by Sentry.
The changes mostly look good, but I am a bit confused about who calls `set_extra_state`/updates the current model, and whether we can make this a bit less convoluted. It's not clear to me yet why this dict is necessary. Also, I wonder whether we can avoid calling both models :)
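For context, `get_extra_state`/`set_extra_state` are the standard `torch.nn.Module` hooks for round-tripping non-tensor state through `state_dict()`/`load_state_dict()`. A minimal, self-contained sketch of how the seen-sample-ID bookkeeping could ride along with a checkpoint (the dict layout shown is an assumption for illustration, not necessarily what this PR does):

```python
from typing import Any

from torch import nn


class ExtraStateDemo(nn.Module):
    """Sketch: persist non-tensor bookkeeping via the extra-state hooks."""

    def __init__(self) -> None:
        super().__init__()
        self.linear = nn.Linear(4, 2)
        # One set of seen sample IDs per internal model.
        self.models_seen_ids: list[set[int]] = [set(), set()]

    def get_extra_state(self) -> Any:
        # Included automatically in state_dict() alongside the tensors.
        return {"seen_ids": [sorted(s) for s in self.models_seen_ids]}

    def set_extra_state(self, state: Any) -> None:
        # Called by load_state_dict(); restores the bookkeeping.
        self.models_seen_ids = [set(ids) for ids in state["seen_ids"]]


model = ExtraStateDemo()
model.models_seen_ids[0].update({1, 2, 3})
restored = ExtraStateDemo()
restored.load_state_dict(model.state_dict())
assert restored.models_seen_ids[0] == {1, 2, 3}
```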
Thanks! The updated logic looks good and avoids the double model call when possible. My only remaining comment is regarding the nolints.
This PR is the first of several implementing another way of producing the holdout set, the IL model, and the irreducible loss (an approach typically suitable for small datasets):
Our current architecture only allows one trigger ID to correspond to one model ID. To accommodate two IL models within one trigger, I create a "twin model" that internally consists of two IL models. During training, each IL model memorizes the sample IDs it has seen, so that during evaluation each sample is handled by the IL model that has not seen it.
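To make the routing idea concrete, here is a minimal sketch; names such as `TwinModelSketch`, `make_backbone`, and the `(x, sample_ids)` forward signature are hypothetical illustrations, not the PR's actual interface:

```python
import torch
from torch import nn


class TwinModelSketch(nn.Module):
    """Sketch: two IL models; each remembers the sample IDs it was trained on."""

    def __init__(self, make_backbone) -> None:
        super().__init__()
        self.models = nn.ModuleList([make_backbone(), make_backbone()])
        self.models_seen_ids: list[set[int]] = [set(), set()]
        self.current_model = 0  # which twin the current training pass updates

    def forward(self, x: torch.Tensor, sample_ids: list[int]) -> torch.Tensor:
        if self.training:
            # Only the twin of the current pass sees (and memorizes) these samples.
            self.models_seen_ids[self.current_model].update(sample_ids)
            return self.models[self.current_model](x)
        # Evaluation: each sample is scored by the twin that has NOT seen it,
        # so its predicted loss behaves like a holdout (irreducible) loss.
        # Assumes [batch, num_classes] outputs.
        out0, out1 = self.models[0](x), self.models[1](x)
        seen_by_0 = torch.tensor(
            [sid in self.models_seen_ids[0] for sid in sample_ids],
            device=x.device,
        ).unsqueeze(-1)
        return torch.where(seen_by_0, out1, out0)
```

Note that this naive version always runs both twins at evaluation time; per the review discussion above, the PR's updated logic avoids the double call when possible.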
How it works
1. `RHOLossDownsamplingStrategy` randomly samples half of the training set and marks the `used` column in the `selector_state_metadata` table as `True` for those samples. The strategy issues a request to train a `RHOLOSSTwinModel` on this TSS. (unimplemented)
2. `RHOLOSSTwinModel` is instantiated. Only the 0th model is trained on this dataset (implemented in this PR).
3. `RHOLossDownsamplingStrategy` produces the other half of the training set by selecting the samples with `used == False`. The strategy issues a request to finetune this twin model. (unimplemented)
4. `RHOLOSSTwinModel` is instantiated again. Only the 1st model is trained on this dataset (implemented in this PR).
5. The `used` flags are then reset.

This is admittedly not the optimal way to train a twin RHO model, but it is very straightforward, and we can optimize it depending on how well it performs.
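A hedged sketch of how the two training requests could map onto the two internal models, following the steps above; the `strategy`/`trainer` helpers are hypothetical placeholders for Modyn components, and the switch-on-second-pass rule is my reading of the flow, not the PR's exact code:

```python
def run_twin_training(strategy, trainer) -> None:
    # Steps 1-2: train the 0th model on the randomly chosen half (used == True).
    first_half = strategy.sample_half_and_mark_used()  # hypothetical helper
    twin = trainer.instantiate_model()
    twin.current_model = 0
    trainer.train(twin, first_half)
    checkpoint = twin.state_dict()  # carries weights plus the seen-ID extra state

    # Steps 3-4: finetune on the complementary half (used == False);
    # only the 1st model is trained in this pass.
    second_half = strategy.select_unused_half()  # hypothetical helper
    twin = trainer.instantiate_model()
    twin.load_state_dict(checkpoint)  # restores model 0's weights and seen IDs
    twin.current_model = 1
    trainer.train(twin, second_half)
```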
Current drawbacks
Due to its reliance on the `used` field, `RHOLoss` is currently not compatible with presampling strategies that also use the `used` field, such as `FreshnessSamplingStrategy`.
Next PR
Implementing steps 1 and 3: preparing the split holdout set.
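A hedged sketch of what steps 1 and 3 could look like against the `selector_state_metadata` table, using SQLAlchemy. The table name and `used` column come from the description above; the ORM class, the `sample_key`/`pipeline_id` columns, and the session handling are assumptions about Modyn's internals, for illustration only:

```python
import random

from sqlalchemy import select, update


def mark_random_half_used(session, SelectorStateMetadata, pipeline_id: int) -> None:
    # Step 1: choose a random half of the samples and flag them used = True.
    keys = session.scalars(
        select(SelectorStateMetadata.sample_key).where(
            SelectorStateMetadata.pipeline_id == pipeline_id
        )
    ).all()
    half = random.sample(keys, len(keys) // 2)
    session.execute(
        update(SelectorStateMetadata)
        .where(SelectorStateMetadata.sample_key.in_(half))
        .values(used=True)
    )
    session.commit()


def select_unused_half(session, SelectorStateMetadata, pipeline_id: int) -> list:
    # Step 3: the complementary half is everything still flagged used == False.
    return session.scalars(
        select(SelectorStateMetadata.sample_key).where(
            SelectorStateMetadata.pipeline_id == pipeline_id,
            SelectorStateMetadata.used.is_(False),
        )
    ).all()
```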
How to review
All the main logic is in `modyn/models/rho_loss_twin_model/rho_loss_twin_model.py`.