forked from fengdu78/Coursera-ML-AndrewNg-Notes
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path10 - 1 - Deciding What to Try Next (6 min).srt
890 lines (712 loc) · 16.9 KB
/
10 - 1 - Deciding What to Try Next (6 min).srt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
1
00:00:00,300 --> 00:00:02,290
By now you have seen a lot of different learning algorithms.
到目前为止 我们已经介绍了许多不同的学习算法
(字幕整理:中国海洋大学 黄海广,haiguang2000@qq.com )
2
00:00:03,330 --> 00:00:04,450
And if you've been following along
如果你一直跟着这些视频的进度学习
3
00:00:04,770 --> 00:00:06,030
these videos you should consider
你会发现自己已经不知不觉地
4
00:00:06,770 --> 00:00:09,530
yourself an expert on many state-of-the-art machine learning techniques.
成为一个了解许多先进机器学习技术的专家了
5
00:00:09,730 --> 00:00:12,310
But even among
然而 在懂机器学习的人当中
6
00:00:12,560 --> 00:00:14,460
people that know a certain learning algorithm.
依然存在着很大的差距
7
00:00:15,250 --> 00:00:16,830
There's often a huge difference between
一部分人确实掌握了
8
00:00:17,090 --> 00:00:18,240
someone that really knows how
怎样高效有力地
9
00:00:18,410 --> 00:00:20,130
to powerfully and effectively apply
运用这些学习算法
10
00:00:20,450 --> 00:00:22,270
that algorithm, versus someone that's
而另一些人
11
00:00:22,950 --> 00:00:24,090
less familiar with some of
他们可能对我马上要讲的东西
12
00:00:24,160 --> 00:00:25,080
the material that I'm about
就不是那么熟悉了
13
00:00:25,420 --> 00:00:26,900
to teach and who doesn't really
他们可能没有完全理解
14
00:00:27,080 --> 00:00:28,090
understand how to apply these
怎样运用这些算法
15
00:00:28,250 --> 00:00:29,180
algorithms and can end up
因此总是
16
00:00:29,570 --> 00:00:30,760
wasting a lot of
把时间浪费在
17
00:00:30,870 --> 00:00:33,320
their time trying things out that don't really make sense.
毫无意义的尝试上
18
00:00:34,380 --> 00:00:35,180
What I would like to do is
我想做的是
19
00:00:35,340 --> 00:00:36,350
make sure that if you
确保你在设计
20
00:00:36,560 --> 00:00:37,830
are developing machine learning systems,
机器学习的系统时
21
00:00:38,600 --> 00:00:39,780
that you know how to choose
你能够明白怎样选择
22
00:00:40,400 --> 00:00:42,900
one of the most promising avenues to spend your time pursuing.
一条最合适 最正确的道路
23
00:00:43,890 --> 00:00:45,050
And on this and the next
因此 在这节视频
24
00:00:45,190 --> 00:00:46,530
few videos I'm going to
和之后的几段视频中
25
00:00:46,750 --> 00:00:47,890
give a number of practical
我将向你介绍一些实用的
26
00:00:48,380 --> 00:00:51,150
suggestions, advice, guidelines on how to do that.
建议和指导 帮助你明白怎样进行选择
27
00:00:51,520 --> 00:00:53,410
And concretely what we'd
具体来讲
28
00:00:53,600 --> 00:00:54,460
focus on is the problem
我将重点关注的问题是
29
00:00:54,940 --> 00:00:56,380
of, suppose you are
假如你在开发
30
00:00:56,580 --> 00:00:57,760
developing a machine learning system
一个机器学习系统
31
00:00:58,390 --> 00:00:59,390
or trying to improve the performance
或者想试着改进
32
00:00:59,950 --> 00:01:01,810
of a machine learning system, how
一个机器学习系统的性能
33
00:01:02,000 --> 00:01:03,630
do you go about deciding what are
你应如何决定
34
00:01:03,700 --> 00:01:05,260
the proxy avenues to try
接下来应该
35
00:01:07,620 --> 00:01:07,620
next?
选择哪条道路?
36
00:01:09,300 --> 00:01:11,200
To explain this, let's continue using
为了解释这一问题
37
00:01:11,670 --> 00:01:13,210
our example of learning to
我想仍然使用
38
00:01:13,350 --> 00:01:15,280
predict housing prices.
预测房价的学习例子
39
00:01:15,570 --> 00:01:17,760
And let's say you've implement and regularize linear regression.
假如你已经完成了正则化线性回归
40
00:01:18,700 --> 00:01:20,090
Thus minimizing that cost function
也就是最小化代价函数J的值
41
00:01:20,520 --> 00:01:22,870
j. Now suppose that
假如
42
00:01:23,130 --> 00:01:24,310
after you take your learn parameters,
在你得到你的学习参数以后
43
00:01:24,820 --> 00:01:26,570
if you test your hypothesis on
如果你要将你的假设函数
44
00:01:26,720 --> 00:01:28,360
the new set of houses, suppose you
放到一组新的房屋样本上进行测试
45
00:01:28,540 --> 00:01:29,470
find that this is making huge
假如说你发现在预测房价时
46
00:01:29,860 --> 00:01:31,770
errors in this prediction of the housing prices.
产生了巨大的误差
47
00:01:33,220 --> 00:01:34,490
The question is what should
现在你的问题是 要想改进这个算法
48
00:01:34,670 --> 00:01:37,600
you then try mixing in order to improve the learning algorithm?
接下来应该怎么办?
49
00:01:39,000 --> 00:01:40,000
There are many things that one
实际上你可以想出
50
00:01:40,210 --> 00:01:41,460
can think of that could improve
很多种方法来改进
51
00:01:41,950 --> 00:01:43,660
the performance of the learning algorithm.
这个算法的性能
52
00:01:44,800 --> 00:01:47,510
One thing they could try, is to get more training examples.
其中一种办法是 使用更多的训练样本
53
00:01:48,060 --> 00:01:49,240
And concretely, you can imagine, maybe, you
具体来讲 也许你能想到
54
00:01:49,600 --> 00:01:51,150
know, setting up phone surveys, going
通过电话调查
55
00:01:51,570 --> 00:01:52,820
door to door, to try to
或上门调查
56
00:01:52,930 --> 00:01:54,050
get more data on how much
来获取更多的
57
00:01:54,310 --> 00:01:56,660
different houses sell for.
不同的房屋出售数据
58
00:01:57,730 --> 00:01:58,770
And the sad thing is I've seen
遗憾的是
59
00:01:59,010 --> 00:02:00,060
a lot of people spend a
我看到好多人花费了好多时间
60
00:02:00,200 --> 00:02:01,400
lot of time collecting more training
想收集更多的训练样本
61
00:02:01,760 --> 00:02:03,270
examples, thinking oh, if we have
他们总认为 噢 要是我有
62
00:02:03,760 --> 00:02:04,780
twice as much or ten times
两倍甚至十倍数量的训练数据
63
00:02:05,050 --> 00:02:07,250
as much training data, that is certainly going to help, right?
那就一定会解决问题的 是吧?
64
00:02:07,990 --> 00:02:09,020
But sometimes getting more training
但有时候 获得更多的训练数据
65
00:02:09,380 --> 00:02:10,680
data doesn't actually help
实际上并没有作用
66
00:02:11,240 --> 00:02:11,920
and in the next few videos
在接下来的几段视频中
67
00:02:12,430 --> 00:02:13,670
we will see why, and we
我们将解释原因
68
00:02:13,720 --> 00:02:15,270
will see how you
我们也将知道
69
00:02:15,500 --> 00:02:16,780
can avoid spending a lot
怎样避免把过多的时间
70
00:02:16,950 --> 00:02:18,160
of time collecting more training data
浪费在收集更多的训练数据上
71
00:02:18,910 --> 00:02:20,660
in settings where it is just not going to help.
这实际上是于事无补的
72
00:02:22,370 --> 00:02:23,510
Other things you might try are
另一个方法 你也许能想到的
73
00:02:23,730 --> 00:02:25,830
to well maybe try a smaller set of features.
是尝试选用更少的特征集
74
00:02:26,470 --> 00:02:27,270
So if you have some set
因此如果你有一系列特征
75
00:02:27,450 --> 00:02:29,030
of features such as x1,
比如x1
76
00:02:29,270 --> 00:02:30,330
x2, x3 and so on,
x2 x3等等
77
00:02:30,680 --> 00:02:31,840
maybe a large number of features.
也许有很多特征
78
00:02:32,570 --> 00:02:33,460
Maybe you want to spend time
也许你可以花一点时间
79
00:02:33,860 --> 00:02:35,240
carefully selecting some small
从这些特征中
80
00:02:35,590 --> 00:02:37,410
subset of them to prevent overfitting.
仔细挑选一小部分来防止过拟合
81
00:02:38,670 --> 00:02:40,730
Or maybe you need to get additional features.
或者也许你需要用更多的特征
82
00:02:41,330 --> 00:02:42,390
Maybe the current set of features
也许目前的特征集
83
00:02:42,570 --> 00:02:44,740
aren't informative enough and you
对你来讲并不是很有帮助
84
00:02:44,840 --> 00:02:47,460
want to collect more data in the sense of getting more features.
你希望从获取更多特征的角度 来收集更多的数据
85
00:02:48,510 --> 00:02:49,590
And once again this is the
同样地
86
00:02:49,730 --> 00:02:50,900
sort of project that can scale
你可以把这个问题
87
00:02:51,180 --> 00:02:52,260
up the huge projects can you
扩展为一个很大的项目
88
00:02:52,580 --> 00:02:54,110
imagine getting phone
比如使用电话调查
89
00:02:54,350 --> 00:02:55,280
surveys to find out more
来得到更多的房屋案例
90
00:02:55,490 --> 00:02:57,230
houses, or extra land
或者再进行土地测量
91
00:02:57,640 --> 00:02:58,620
surveys to find out more
来获得更多有关
92
00:02:58,800 --> 00:03:01,130
about the pieces of land and so on, so a huge project.
这块土地的信息等等 因此这是一个复杂的问题
93
00:03:01,690 --> 00:03:02,820
And once again it would be
同样的道理
94
00:03:02,930 --> 00:03:04,140
nice to know in advance if
我们非常希望
95
00:03:04,330 --> 00:03:05,210
this is going to help before we
在花费大量时间完成这些工作之前
96
00:03:05,760 --> 00:03:07,690
spend a lot of time doing something like this.
我们就能知道其效果如何
97
00:03:07,920 --> 00:03:09,390
We can also try
我们也可以尝试
98
00:03:10,360 --> 00:03:12,100
adding polynomial features things like
增加多项式特征的方法
99
00:03:12,180 --> 00:03:13,100
x2 square x2 square and product
比如x1的平方 x2的平方
100
00:03:13,860 --> 00:03:14,700
features x1, x2.
x1 x2的乘积
101
00:03:14,930 --> 00:03:16,040
We can still spend quite a
我们可以花很多时间
102
00:03:16,180 --> 00:03:17,830
lot of time thinking about that and
来考虑这一方法
103
00:03:18,270 --> 00:03:19,340
we can also try other things like
我们也可以考虑其他方法
104
00:03:19,540 --> 00:03:21,390
decreasing lambda, the regularization parameter or increasing lambda.
减小或增大正则化参数lambda的值
105
00:03:23,840 --> 00:03:25,160
Given a menu of options
我们列出的这个单子
106
00:03:25,520 --> 00:03:26,680
like these, some of which
上面的很多方法
107
00:03:26,970 --> 00:03:28,240
can easily scale up to
都可以扩展开来
108
00:03:28,950 --> 00:03:30,000
six month or longer projects.
扩展成一个六个月或更长时间的项目
109
00:03:31,310 --> 00:03:32,660
Unfortunately, the most common
遗憾的是
110
00:03:32,760 --> 00:03:34,010
method that people use to
大多数人用来选择这些方法的标准
111
00:03:34,170 --> 00:03:36,040
pick one of these is to go by gut feeling.
是凭感觉
112
00:03:36,520 --> 00:03:37,670
In which what many people
也就是说
113
00:03:38,170 --> 00:03:39,520
will do is sort of randomly
大多数人的选择方法是
114
00:03:39,940 --> 00:03:41,100
pick one of these options and
随便从这些方法中选择一种
115
00:03:41,250 --> 00:03:43,050
maybe say, "Oh, lets go and get more training data."
比如他们会说 “噢 我们来多找点数据吧”
116
00:03:43,980 --> 00:03:45,480
And easily spend six months collecting
然后花上六个月的时间
117
00:03:45,880 --> 00:03:47,540
more training data or maybe someone
收集了一大堆数据
118
00:03:47,780 --> 00:03:48,860
else would rather be saying, "Well,
然后也许另一个人说
119
00:03:49,430 --> 00:03:51,810
let's go collect a lot more features on these houses in our data set."
“好吧 让我们来从这些房子的数据中多找点特征吧”
120
00:03:52,780 --> 00:03:54,010
And I have a lot
我很遗憾不止一次地看到
121
00:03:54,220 --> 00:03:55,870
of times, sadly seen people spend, you know,
很多人花了
122
00:03:56,630 --> 00:03:58,360
literally 6 months doing one
不夸张地说 至少六个月时间
123
00:03:58,530 --> 00:03:59,680
of these avenues that they have
来完成他们随便选择的一种方法
124
00:04:00,240 --> 00:04:01,810
sort of at random only to
而在六个月或者更长时间后
125
00:04:01,920 --> 00:04:03,220
discover six months later that
他们很遗憾地发现
126
00:04:03,460 --> 00:04:05,610
that really wasn't a promising avenue to pursue.
自己选择的是一条不归路
127
00:04:07,090 --> 00:04:08,170
Fortunately, there is a
幸运的是
128
00:04:08,310 --> 00:04:10,650
pretty simple technique that can
有一系列简单的方法
129
00:04:10,930 --> 00:04:12,640
let you very quickly rule
能让你事半功倍
130
00:04:12,900 --> 00:04:14,190
out half of the things
排除掉单子上的
131
00:04:14,500 --> 00:04:16,440
on this list as being potentially
至少一半的方法
132
00:04:16,970 --> 00:04:17,990
promising things to pursue.
留下那些确实有前途的方法
133
00:04:18,390 --> 00:04:19,310
And there is a very simple technique,
同时也有一种很简单的方法
134
00:04:19,830 --> 00:04:21,080
that if you run, can easily
只要你使用
135
00:04:21,710 --> 00:04:22,820
rule out many of these options,
就能很轻松地排除掉很多选择
136
00:04:24,120 --> 00:04:25,470
and potentially save you
从而为你节省
137
00:04:25,580 --> 00:04:28,600
a lot of time pursuing something that's just is not going to work.
大量不必要花费的时间
138
00:04:29,610 --> 00:04:30,950
In the next two videos
在接下来的两段视频中
139
00:04:31,320 --> 00:04:32,450
after this, I'm going to
我首先介绍
140
00:04:32,560 --> 00:04:35,420
first talk about how to evaluate learning algorithms.
怎样评估机器学习算法的性能
141
00:04:36,540 --> 00:04:37,810
And in the next few
然后在之后的几段视频中
142
00:04:38,080 --> 00:04:39,770
videos after that, I'm
我将开始
143
00:04:40,070 --> 00:04:41,130
going to talk about these techniques,
讨论这些方法
144
00:04:42,470 --> 00:04:44,270
which are called the machine learning diagnostics.
它们也被称为"机器学习诊断法"
145
00:04:46,690 --> 00:04:47,980
And what a diagnostic is, is
“诊断法”的意思是
146
00:04:48,120 --> 00:04:49,080
a test you can run,
这是一种测试法
147
00:04:49,900 --> 00:04:52,240
to get insight into what
你通过执行这种测试
148
00:04:52,430 --> 00:04:53,740
is or isn't working with
能够深入了解
149
00:04:54,130 --> 00:04:55,810
an algorithm, and which will
某种算法到底是否有用
150
00:04:56,070 --> 00:04:57,720
often give you insight as to
这通常也能够告诉你
151
00:04:57,940 --> 00:04:59,360
what are promising things to try
要想改进一种算法的效果
152
00:04:59,920 --> 00:05:01,100
to improve a learning algorithm's
什么样的尝试
153
00:05:03,910 --> 00:05:03,910
performance.
才是有意义的
154
00:05:04,730 --> 00:05:07,140
We'll talk about specific diagnostics later in this video sequence.
在这一系列的视频中我们将介绍具体的诊断法
155
00:05:08,050 --> 00:05:09,230
But I should mention in advance
但我要提前说明一点的是
156
00:05:09,440 --> 00:05:10,780
that diagnostics can take
这些诊断法的执行和实现
157
00:05:11,100 --> 00:05:12,280
time to implement and can sometimes,
是需要花些时间的
158
00:05:12,820 --> 00:05:14,300
you know, take quite a
有时候
159
00:05:14,340 --> 00:05:15,610
lot of time to implement and
确实需要花很多时间
160
00:05:15,740 --> 00:05:17,120
understand but doing so
来理解和实现
161
00:05:17,410 --> 00:05:18,330
can be a very good use
但这样做的确是
162
00:05:18,610 --> 00:05:19,380
of your time when you are
把时间用在了刀刃上
163
00:05:19,660 --> 00:05:21,460
developing learning algorithms because they
因为这些方法
164
00:05:21,560 --> 00:05:22,660
can often save you from
让你在开发学习算法时
165
00:05:22,880 --> 00:05:24,670
spending many months pursuing an
节省了几个月的时间
166
00:05:24,840 --> 00:05:26,580
avenue that you could
早点从不必要的尝试中解脱出来
167
00:05:26,870 --> 00:05:29,460
have found out much earlier just was not going to be fruitful.
早日脱离苦海
168
00:05:32,220 --> 00:05:33,070
So in the next few
因此
169
00:05:33,250 --> 00:05:34,250
videos, I'm going to first
在接下来几节课中
170
00:05:34,570 --> 00:05:36,220
talk about how evaluate your
我将先来介绍
171
00:05:36,450 --> 00:05:38,210
learning algorithms and after
如何评价你的学习算法
172
00:05:38,410 --> 00:05:39,210
that I'm going to talk
在此之后
173
00:05:39,300 --> 00:05:41,490
about some of these diagnostics which will hopefully
我将介绍一些诊断法
174
00:05:41,810 --> 00:05:42,950
let you much more
希望能让你更清楚
175
00:05:43,110 --> 00:05:44,470
effectively select more of the
在接下来的尝试中
176
00:05:44,770 --> 00:05:45,880
useful things to try mixing
如何选择更有意义的方法
177
00:05:46,560 --> 00:05:48,200
if your goal to improve
最终达到改进机器学习系统性能的目的
178
00:05:48,760 --> 00:05:50,430
the machine learning system.