supervised tutorial and autotune documentation with python tabs
Summary: Docusaurus now allows multiple language tabs. This commit adds a python tab for supervised and autotune examples. It also includes a snippet that activates the same language tab for the whole page.
Reviewed By: EdouardGrave
Differential Revision: D17091834
fbshipit-source-id: 6e6f76aa9408baa08fcd6c0bfd011de2cb477dfb
docs/autotune.md: 65 additions & 3 deletions
@@ -13,28 +13,55 @@ In order to activate hyperparameter optimization, we must provide a validation f

 For example, using the same data as our [tutorial example](/docs/en/supervised-tutorial.html#our-first-classifier), autotune can be used in the following way:

->> ./fasttext test model_cooking.bin data/cooking.valid
+>> ./fasttext test model_cooking.bin cooking.valid
 N 3000
 P@1 0.666
 R@1 0.288
 ```
+<!--Python-->
+```py
+>>> model.test("cooking.valid")
+(3000L, 0.666, 0.288)
+```
+<!--END_DOCUSAURUS_CODE_TABS-->
 By default, the search will take 5 minutes. You can set the timeout in seconds with the `-autotune-duration` argument. For example, if you want to set the limit to 10 minutes:

+>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneDuration=600)
+```
+<!--END_DOCUSAURUS_CODE_TABS-->

 While autotuning, fastText displays the best f1-score found so far. If we decide to stop the tuning before the time limit, we can send one `SIGINT` signal (via `CTRL-C` for example). FastText will then finish the current training, and retrain with the best parameters found so far.
@@ -46,23 +73,42 @@ As you may know, fastText can compress the model with [quantization](/docs/en/ch

 Fortunately, autotune can also find the hyperparameters for this compression task while targeting the desired model size. To this end, we can set the `-autotune-modelsize` argument:

 This will produce a `.ftz` file with the best accuracy having the desired size:

 ```sh
 >> ls -la model_cooking.ftz
 -rw-r--r--. 1 celebio users 1990862 Aug 25 05:39 model_cooking.ftz
->> ./fasttext test model_cooking.ftz data/cooking.valid
+>> ./fasttext test model_cooking.ftz cooking.valid
 N 3000
 P@1 0.57
 R@1 0.246
 ```
+<!--Python-->
+```py
+>>> import fasttext
+>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneModelSize="2M")
+```
+If you save the model, you will obtain a model file with the desired size:
+```py
+>>> model.save_model("model_cooking.ftz")
+>>> import os
+>>> os.stat("model_cooking.ftz").st_size
+1990862
+>>> model.test("cooking.valid")
+(3000L, 0.57, 0.246)
+```
+<!--END_DOCUSAURUS_CODE_TABS-->
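One step the diff leaves implicit: the saved `.ftz` file loads like any other model. A minimal sketch, assuming the same files as above (the example question is made up):

```py
import fasttext

# A quantized .ftz model loads exactly like a .bin model.
model = fasttext.load_model("model_cooking.ftz")

# predict returns (labels, probabilities) for the top k labels.
labels, probs = model.predict("Which baking dish is best to bake a banana bread ?", k=2)
print(labels, probs)
```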
 # How to set the optimization metric?

+<!--DOCUSAURUS_CODE_TABS-->
+<!--Command line-->
+<br />

 By default, autotune will test the validation file you provide, exactly the same way as `./fasttext test model_cooking.bin cooking.valid` and try to optimize to get the highest [f1-score](https://en.wikipedia.org/wiki/F1_score).

 But, if we want to optimize the score of a specific label, say `__label__baking`, we can set the `-autotune-metric` argument:
@@ -74,3 +120,19 @@ But, if we want to optimize the score of a specific label, say `__label__baking`

 This is equivalent to manually optimizing the f1-score we get when we test with `./fasttext test-label model_cooking.bin cooking.valid | grep __label__baking` in the command line.

 Sometimes, you may be interested in predicting more than one label. For example, if you were optimizing the hyperparameters manually to get the best score to predict two labels, you would test with `./fasttext test model_cooking.bin cooking.valid 2`. You can also tell autotune to optimize the parameters by testing two labels with the `-autotune-predictions` argument.
+<!--Python-->
+<br />
+By default, autotune will test the validation file you provide, exactly the same way as `model.test("cooking.valid")` and try to optimize to get the highest [f1-score](https://en.wikipedia.org/wiki/F1_score).
+
+But, if we want to optimize the score of a specific label, say `__label__baking`, we can set the `autotuneMetric` argument:
+
+```py
+>>> import fasttext
+>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneMetric="f1:__label__baking")
+```
+
+This is equivalent to manually optimizing the f1-score we get when we test with `model.test_label('cooking.valid')['__label__baking']`.
+
+Sometimes, you may be interested in predicting more than one label. For example, if you were optimizing the hyperparameters manually to get the best score to predict two labels, you would test with `model.test("cooking.valid", k=2)`. You can also tell autotune to optimize the parameters by testing two labels with the `autotunePredictions` argument.
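To make the two new keyword arguments concrete, here is a hedged sketch that combines them; the per-label dict returned by `test_label` is assumed to carry `precision`, `recall` and `f1score` keys, which is worth checking against your installed bindings:

```py
import fasttext

# Optimize f1 for a single label while scoring the top two predictions,
# combining the autotuneMetric and autotunePredictions arguments above.
model = fasttext.train_supervised(
    input="cooking.train",
    autotuneValidationFile="cooking.valid",
    autotuneMetric="f1:__label__baking",
    autotunePredictions=2,
)

# test_label returns per-label metrics; key names assumed, verify locally.
scores = model.test_label("cooking.valid")
print(scores["__label__baking"])
```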