[Examples] refactor examples and add github action to automatically validate the scripts. #138

Merged
merged 14 commits into from
Sep 14, 2024

Conversation

TheaperDeng
Collaborator

@TheaperDeng TheaperDeng commented Sep 13, 2024

Description

1. Motivation and Context

Our example collection is messy and hard to navigate. This PR reorganizes the examples and adds a GitHub Action to test them, so that they stay usable in the long term.

This PR does not add any examples. New example scripts will be added as needed and on request.

2. Summary of the change

  1. Rename the folder /example -> /examples
  2. Reorganize the current examples into three categories (noisy label detection, using the pretrained benchmark, and estimating brittleness). See here for a detailed guide.
  3. Fix all 6 examples so that both CPU and GPU users can run them in a reasonable time (<2 min), and delete some redundant examples.
  4. Add a GitHub Action that runs the examples for each PR, so that broken examples are caught promptly. The action only checks that each example runs to completion. Accuracy monitoring will be added in later PRs.
  5. Some minor changes in the dattri repo: add cifar10 support; AttributionTask now loads checkpoints onto the same device as the model; auc moves the scores to the CPU so that users do not need extra conversion code.
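
The completion-only check described in item 4 could be sketched roughly as below. This is an illustrative sketch, not the workflow added by this PR: the helper name `run_examples`, the recursive `*.py` discovery, and the 120-second timeout (chosen to mirror the "<2 min" target) are all assumptions.

```python
import subprocess
import sys
from pathlib import Path


def run_examples(example_dir, timeout=120):
    """Run every *.py file under example_dir and record whether it finished cleanly.

    Hypothetical helper: only completion is checked (exit code 0 within the
    timeout). No accuracy or output values are inspected.
    """
    results = {}
    for script in sorted(Path(example_dir).rglob("*.py")):
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout,
                check=False,
            )
            results[str(script)] = proc.returncode == 0
        except subprocess.TimeoutExpired:
            # A script that exceeds the time budget counts as a failure.
            results[str(script)] = False
    return results
```

A CI step could call `run_examples("examples")` and exit with a non-zero status if any value in the returned dictionary is `False`.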

3. What tests have been added/updated for the change?

  • Unit test: Typically, this should be included if you implemented a new function/fixed a bug.
  • Application test: If you wrote an example for the toolkit, this test should be added.

@TheaperDeng TheaperDeng changed the title from "[examplesrefactor examples" to "[examples] refactor examples" Sep 13, 2024
@TheaperDeng TheaperDeng changed the title from "[examples] refactor examples" to "[WIP] [examples] refactor examples" Sep 13, 2024
@TheaperDeng TheaperDeng changed the title from "[WIP] [examples] refactor examples" to "[Examples] refactor examples and add github action to automatically validate the scripts." Sep 14, 2024
Contributor

@jiaqima jiaqima left a comment

Left some minor comments. Otherwise LGTM

parser = argparse.ArgumentParser()
parser.add_argument("--method", type=str, default="explicit")
parser.add_argument("--device", type=str, default="cuda")
args = parser.parse_args()

# load the dataset, we only need the train dataset
Contributor

nit: this line is not grammatically correct

parser.add_argument("--device", type=str, default="cuda")
args = parser.parse_args()

# load the dataset, we only need the train dataset
Contributor

same grammatical issue
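
The reviewed snippets default `--device` to `cuda`, while the PR description aims for examples that run on both CPU and GPU. One way an example script could handle this is sketched below. It is an assumption for illustration, not the behaviour implemented in this PR; `resolve_device` and `parse_args` are hypothetical names.

```python
import argparse


def resolve_device(requested):
    """Return `requested` if it is usable, otherwise fall back to "cpu".

    Hypothetical helper, not part of dattri.
    """
    if requested != "cuda":
        return requested
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def parse_args(argv=None):
    """Parse the same flags as the reviewed snippet, then resolve the device."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--method", type=str, default="explicit")
    parser.add_argument("--device", type=str, default="cuda")
    args = parser.parse_args(argv)
    args.device = resolve_device(args.device)
    return args
```

With this fallback, running a script without `--device` on a machine that has no GPU would use the CPU instead of failing when tensors are moved to `cuda`.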

@@ -42,17 +49,18 @@ def get_mnist_indices_and_adjust_labels(dataset):
sampler=SubsetSampler(range(1000)),
)

model = train_mnist_lr(train_loader_full)
model.cuda()
model = train_cifar2_resnet9(train_loader, num_epochs=3, num_classes=10)
Contributor

maybe rename this function as train_cifar_resnet9?

Collaborator Author

Good catch. I have renamed the module to dattri.benchmark.datasets.cifar, which contains both the cifar2 and cifar10 functions.

@@ -0,0 +1,25 @@
# `dattri` examples
This folder contains bite-sized examples which can help users to build their own application by `dattri`.
Contributor

This folder contains bite-sized examples that can help users build their own applications with dattri.

## Noisy label detection
This section contains using different attributors to detect the noisy label in various datasets.

[Use influence function to detect noisy label in Mnist10 + Logistic regression.](./noisy_label_detection/influence_function_noisy_label.py)
Contributor

label -> labels


[Use influence function to detect noisy label in Mnist10 + Logistic regression.](./noisy_label_detection/influence_function_noisy_label.py)

[Use TracIN to detect noisy label in Mnist10 + MLP.](./noisy_label_detection/tracin_noisy_label.py)
Contributor

label -> labels


[Use TracIN to detect noisy label in Mnist10 + MLP.](./noisy_label_detection/tracin_noisy_label.py)

[Use TRAK to detect noisy label in CIFAR10 + ResNet-9.](./noisy_label_detection/trak_noisy_label.py)
Contributor

label -> labels


## Use pretrained checkpoints and pre-calculated ground truth

This section contains examples to use the pretrained checkpoints and pre-calculated ground truth provided by `dattri` to evaluate the data attribution methods.
Contributor

to use -> using


## Estimate the brittleness

This section contains examples to use attribution score to estimate the brittleness of a model.
Contributor

to use -> using

attribution score -> attribution scores

@TheaperDeng TheaperDeng merged commit b653675 into TRAIS-Lab:main Sep 14, 2024
5 checks passed
2 participants