Online executor for large scale datasets #1252

thekej · 2024-09-17T22:08:10Z

The current brainscore implementation is optimized for studying neural alignment or behavior on small datasets. While this is great, it does not give users the ability to test their model activations on larger-scale datasets.
We tested the practicality of using brainscore to compute metrics on a larger-scale dataset (7k videos) and ran into the following challenges:

No behavioral readout is adapted for videos or, more generally, for any time-dependent data. The current pipeline only supports logistic regression readouts.
No way to apply data augmentation while training the readout. In our experiments, data augmentation was essential for this readout task.
Extraction keeps everything in RAM during execution. While this is fine for image models with limited datasets, it becomes a major limitation when working with large-scale datasets.
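
To make the RAM point concrete, here is a rough back-of-the-envelope sketch. Only the 7k video count comes from the experiments above; the frame count, feature width, batch size, and the `activation_bytes` helper are hypothetical values chosen for illustration, not measurements from brainscore.

```python
def activation_bytes(n_stimuli, n_frames, n_features, bytes_per_value=4):
    """Bytes needed to hold float32 activations for a set of video stimuli."""
    return n_stimuli * n_frames * n_features * bytes_per_value


# Assumed shapes (hypothetical): 16 frames per video, 100k features per frame.
full_dataset = activation_bytes(n_stimuli=7_000, n_frames=16, n_features=100_000)
single_batch = activation_bytes(n_stimuli=32, n_frames=16, n_features=100_000)

print(f"all activations held at once: {full_dataset / 1e9:.1f} GB")
print(f"one batch of 32 videos:       {single_batch / 1e9:.2f} GB")
```

Under these assumed shapes, holding every activation at once needs tens of gigabytes, while processing one batch at a time stays well under a gigabyte.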

What we propose in this PR:

A transformer-based readout mechanism, added as a new task in behavior. It supports the usual training mechanism as well as online training.
An online executor that handles data augmentation, feature extraction, and readout training.
A new inferencer that uses this executor.
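
The online flow described above (augment, extract features, update the readout, one batch at a time) can be sketched roughly as follows. This is not the code in this PR: every name here is hypothetical, the data is synthetic, the "model" is a simple temporal average, and a linear softmax readout stands in for the transformer-based readout to keep the example short.

```python
import numpy as np


def augment(videos, rng):
    # Hypothetical augmentation: random temporal flip plus small Gaussian noise.
    if rng.random() < 0.5:
        videos = videos[:, ::-1]
    return videos + rng.normal(scale=0.01, size=videos.shape)


def extract_features(videos):
    # Stand-in for a model forward pass: (batch, frames, dims) -> (batch, dims).
    return videos.mean(axis=1)


class OnlineLinearReadout:
    """Softmax readout updated one batch at a time (stand-in for the transformer)."""

    def __init__(self, n_features, n_classes, lr=0.1):
        self.W = np.zeros((n_features, n_classes))
        self.b = np.zeros(n_classes)
        self.lr = lr

    def _probs(self, X):
        logits = X @ self.W + self.b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        exp = np.exp(logits)
        return exp / exp.sum(axis=1, keepdims=True)

    def partial_fit(self, X, y):
        grad = self._probs(X)
        grad[np.arange(len(y)), y] -= 1.0  # d(cross-entropy)/d(logits)
        self.W -= self.lr * X.T @ grad / len(y)
        self.b -= self.lr * grad.mean(axis=0)

    def predict(self, X):
        return self._probs(X).argmax(axis=1)


def stream_batches(n_videos, batch_size, n_frames, n_dims, rng):
    # Synthetic two-class "videos" generated lazily, so only one batch is in memory.
    for start in range(0, n_videos, batch_size):
        size = min(batch_size, n_videos - start)
        labels = rng.integers(0, 2, size=size)
        offsets = (2 * labels - 1)[:, None, None]  # class means at -1 and +1
        yield rng.normal(size=(size, n_frames, n_dims)) + offsets, labels


def run_online(n_videos=256, batch_size=32, n_frames=8, n_dims=16, epochs=3, seed=0):
    rng = np.random.default_rng(seed)
    readout = OnlineLinearReadout(n_dims, n_classes=2)
    for _ in range(epochs):
        for videos, labels in stream_batches(n_videos, batch_size, n_frames, n_dims, rng):
            features = extract_features(augment(videos, rng))
            readout.partial_fit(features, labels)  # features are discarded afterwards
    return readout
```

The point of the design is that features never accumulate: each batch is augmented, passed through the model, used for one readout update, and then dropped, so memory use depends on the batch size rather than the dataset size.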

This implementation does not affect any original functionality of brainscore.

mike-ferguson · 2024-09-18T12:34:49Z

@thekej wow - this is great! Thanks so much for opening a PR; our testing suite is currently being ported to AWS, so things might be a bit rocky while we sort this out. @mschrimpf I am tagging you for review here, as support for video has been a long-standing goal of Brain-Score.

mschrimpf · 2024-10-28T12:15:30Z

@YingtianDt

YingtianDt · 2024-10-28T13:22:17Z

In brainscore_vision/model_helpers/activations/temporal/core/executor.py, a readout module is trained directly within the executor so that it is trained online. The executor then returns "dummy_activations", which breaks the APIs that the downstream modules currently rely on.

This is an interesting construct, but I encourage the author to use it locally at this moment. We can talk in the future to see if there is a better way to incorporate these ideas.

Khaled Jedoui and others added 3 commits September 16, 2024 19:42

add online executor for video benchmarks + add video behavior model

80dc5c6

Merge branch 'brain-score:master' into online_executor

42d159c

Update pyproject.toml

1987dfc

mike-ferguson requested a review from mschrimpf September 18, 2024 12:34
