[Examples] refactor examples and add github action to automatically validate the scripts. #138

TheaperDeng · 2024-09-13T19:14:54Z

Description

1. Motivation and Context

Our example collection is messy and hard to navigate. This PR reorganizes the examples and adds a GitHub Action that tests them, so they stay usable in the long term.

This PR does not add any examples. New example scripts will be added as needed or on request.

2. Summary of the change

Rename the folder /example -> /examples.
Reorganize the current examples into three categories (noisy label detection, use pretrained benchmark, and estimate brittleness). See here for a detailed guide.
Fix all 6 examples so that both CPU and GPU users can run them in a reasonable time (<2 min), and delete some redundant examples.
Add a GitHub Action that runs the examples for each PR, so that broken examples are caught promptly. The action only checks that each example runs to completion; accuracy monitoring will be added in later PRs.
Some minor changes in the dattri repo: add cifar10 support; AttributionTask now loads checkpoints onto the same device as the model; auc moves the scores to the CPU so that users do not need extra code.
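
The completion-only check described above could be sketched as a small runner that a CI job invokes. This is an illustrative sketch under stated assumptions, not the workflow added in this PR: the names `find_example_scripts` and `run_examples`, the `extra_args` parameter, and the 120-second timeout are hypothetical and chosen only to mirror the description (each example should finish in under 2 minutes).

```python
"""Hypothetical runner: execute every example script and report failures."""
import subprocess
import sys
from pathlib import Path


def find_example_scripts(root):
    """Return all .py files under `root`, sorted for a stable run order."""
    return sorted(Path(root).rglob("*.py"))


def run_examples(root, extra_args=(), timeout_s=120):
    """Run each script and return (path, reason) pairs for those that failed.

    Only completion is checked (exit code 0 within the timeout), not accuracy.
    """
    failures = []
    for script in find_example_scripts(root):
        # Use the current interpreter so the scripts see the same environment.
        cmd = [sys.executable, str(script), *extra_args]
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            failures.append((script, f"timed out after {timeout_s}s"))
            continue
        if result.returncode != 0:
            failures.append((script, f"exit code {result.returncode}"))
    return failures
```

A CI step could call `run_examples("examples", extra_args=("--device", "cpu"))` and fail the job when the returned list is non-empty. Passing `--device cpu` assumes every example accepts that flag, which the diffs in this PR show for only some of the scripts.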

3. What tests have been added/updated for the change?

Unit test: Typically, this should be included if you implemented a new function/fixed a bug.
Application test: If you wrote an example for the toolkit, this test should be added.

jiaqima

Left some minor comments. Otherwise LGTM

jiaqima · 2024-09-14T04:03:20Z

examples/noisy_label_detection/influence_function_noisy_label.py

    parser = argparse.ArgumentParser()
    parser.add_argument("--method", type=str, default="explicit")
    parser.add_argument("--device", type=str, default="cuda")
    args = parser.parse_args()

+    # load the dataset, we only need the train dataset


nit: this line is not grammatically correct

jiaqima · 2024-09-14T04:04:32Z

examples/noisy_label_detection/tracin_noisy_label.py

+    parser.add_argument("--device", type=str, default="cuda")
+    args = parser.parse_args()
+
+    # load the dataset, we only need the train dataset


same grammatical issue

jiaqima · 2024-09-14T04:05:43Z

examples/noisy_label_detection/trak_noisy_label.py

@@ -42,17 +49,18 @@ def get_mnist_indices_and_adjust_labels(dataset):
        sampler=SubsetSampler(range(1000)),
    )

-    model = train_mnist_lr(train_loader_full)
-    model.cuda()
+    model = train_cifar2_resnet9(train_loader, num_epochs=3, num_classes=10)


maybe rename this function as train_cifar_resnet9?

Good catch. I have renamed the module to dattri.benchmark.datasets.cifar, which contains both the cifar2 and cifar10 functions.

jiaqima · 2024-09-14T04:06:58Z

examples/readme.md

@@ -0,0 +1,25 @@
+# `dattri` examples
+This folder contains bite-sized examples which can help users to build their own application by `dattri`.


This folder contains bite-sized examples that can help users build their own applications with dattri.

jiaqima · 2024-09-14T04:08:52Z

examples/readme.md

+## Noisy label detection
+This section contains using different attributors to detect the noisy label in various datasets.
+
+[Use influence function to detect noisy label in Mnist10 + Logistic regression.](./noisy_label_detection/influence_function_noisy_label.py)


label -> labels

jiaqima · 2024-09-14T04:09:01Z

examples/readme.md

+
+[Use influence function to detect noisy label in Mnist10 + Logistic regression.](./noisy_label_detection/influence_function_noisy_label.py)
+
+[Use TracIN to detect noisy label in Mnist10 + MLP.](./noisy_label_detection/tracin_noisy_label.py)


label -> labels

jiaqima · 2024-09-14T04:09:11Z

examples/readme.md

+
+[Use TracIN to detect noisy label in Mnist10 + MLP.](./noisy_label_detection/tracin_noisy_label.py)
+
+[Use TRAK to detect noisy label in CIFAR10 + ResNet-9.](./noisy_label_detection/trak_noisy_label.py)


label -> labels

jiaqima · 2024-09-14T04:10:14Z

examples/readme.md

+
+## Use pretrained checkpoints and pre-calculated ground truth
+
+This section contains examples to use the pretrained checkpoints and pre-calculated ground truth provided by `dattri` to evaluate the data attribution methods.


to use -> using

jiaqima · 2024-09-14T04:10:38Z

examples/readme.md

+
+## Estimate the brittleness
+
+This section contains examples to use attribution score to estimate the brittleness of a model.


to use -> using

attribution score -> attribution scores

TheaperDeng changed the title ~~[examplesrefactor examples~~ [examples] refactor examples Sep 13, 2024

TheaperDeng added the work-in-progress label Sep 13, 2024

TheaperDeng changed the title ~~[examples] refactor examples~~ [WIP] [examples] refactor examples Sep 13, 2024

TheaperDeng force-pushed the example-refactor branch from e517a1c to 586c344 Compare September 14, 2024 00:05

TheaperDeng removed the work-in-progress label Sep 14, 2024

TheaperDeng changed the title ~~[WIP] [examples] refactor examples~~ [Examples] refactor examples and add github action to automatically validate the scripts. Sep 14, 2024

TheaperDeng requested review from tingwl0122 and jiaqima September 14, 2024 01:58

jiaqima approved these changes Sep 14, 2024

TheaperDeng added 14 commits September 14, 2024 12:17

refactor examples

7dfb559

add cicd examples

7197393

add cicd examples

74517b1

add cicd examples

8adb4da

add cicd examples

70507a6

add cicd examples

00de6dc

add cicd examples

862f447

add cicd examples

9df5ad7

fix bug

34a7e3e

fix bug

8480eb7

fix bug

8867876

change import

119e9b9

fix cifar

0cacef0

fix according to comments

1561cbb

TheaperDeng force-pushed the example-refactor branch from 586c344 to 1561cbb Compare September 14, 2024 19:19

TheaperDeng merged commit b653675 into TRAIS-Lab:main Sep 14, 2024
5 checks passed
