Commit c491a19: Update README.md

1 parent e5d3995
1 file changed: README.md (+21 -0)
@@ -8,3 +8,24 @@ Definition : Given a source domain Ds and a learning task Ts, a target domain Dt

A good explanation of how to use transfer learning in practice is given at http://cs231n.github.io/transfer-learning/

**When and how to fine-tune?** How do you decide what type of transfer learning you should perform on a new dataset? This is a function of several factors, but the two most important ones are the size of the new dataset (small or big) and its similarity to the original dataset (e.g. ImageNet-like in terms of the content of images and the classes, or very different, such as microscope images). Keeping in mind that ConvNet features are more generic in early layers and more original-dataset-specific in later layers, here are some common rules of thumb for navigating the four major scenarios:

- **New dataset is small and similar to the original dataset.** Since the data is small, it is not a good idea to fine-tune the ConvNet due to overfitting concerns. Since the data is similar to the original data, we expect higher-level features in the ConvNet to be relevant to this dataset as well. Hence, the best idea might be to train a linear classifier on the CNN codes (see the first sketch after this list).

- **New dataset is large and similar to the original dataset.** Since we have more data, we can have more confidence that we won't overfit if we were to try to fine-tune through the full network (see the second sketch after this list).

- **New dataset is small but very different from the original dataset.** Since the data is small, it is likely best to only train a linear classifier. Since the dataset is very different, it might not be best to train the classifier from the top of the network, which contains more dataset-specific features. Instead, it might work better to train a linear (e.g. SVM) classifier on activations from somewhere earlier in the network (see the third sketch after this list).

- **New dataset is large and very different from the original dataset.** Since the dataset is very large, we may expect that we can afford to train a ConvNet from scratch. However, in practice it is very often still beneficial to initialize with weights from a pretrained model. In this case, we would have enough data and confidence to fine-tune through the entire network, as in the second sketch below.
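
A minimal PyTorch sketch of the first scenario (fixed feature extractor plus a new linear head), assuming a recent torchvision (0.13+ weights API); `NUM_CLASSES` and the omitted training loop are placeholders for your own dataset:

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # hypothetical number of classes in the new dataset

# ConvNet pretrained on the original (ImageNet) dataset.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze every pretrained weight: the ConvNet becomes a fixed feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer with a fresh linear classifier,
# trained on the "CNN codes" (the activations feeding into it).
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Only the new head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
```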
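A minimal sketch of the second scenario (and of the fourth, where the same fine-tuning applies after pretrained initialization), under the same assumptions: all layers stay trainable, with a smaller learning rate so the pretrained features are not destroyed early in training:

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # hypothetical number of classes in the new dataset

# Initialize from pretrained weights rather than from scratch.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Fine-tune through the full network: every parameter is optimized.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```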
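A minimal sketch of the third scenario, additionally assuming scikit-learn and a hypothetical `loader` (a DataLoader over the new dataset): activations are taken from an earlier, more generic stage of the network, and a linear SVM is fit on them:

```python
import torch
from torchvision import models
from sklearn.svm import LinearSVC

# Pretrained network used purely for inference; no fine-tuning here.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
model.eval()

# Keep only the earlier layers (here: up to layer2), whose features are more
# generic, then pool to a fixed-size vector per image.
feature_extractor = torch.nn.Sequential(
    model.conv1, model.bn1, model.relu, model.maxpool,
    model.layer1, model.layer2,
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
)

features, labels = [], []
with torch.no_grad():
    for images, targets in loader:  # `loader` is a hypothetical DataLoader
        features.append(feature_extractor(images))
        labels.append(targets)

# Fit a linear SVM on the extracted activations.
clf = LinearSVC()
clf.fit(torch.cat(features).numpy(), torch.cat(labels).numpy())
```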
