10 - 6 - Learning Curves (12 min).srt
(forked from fengdu78/Coursera-ML-AndrewNg-Notes)
1
00:00:00,090 --> 00:00:02,040
In this video, I'd like to tell you about learning curves.
(Subtitles compiled by Huang Haiguang, Ocean University of China, haiguang2000@qq.com)

2
00:00:03,310 --> 00:00:05,850
Learning curves are often a very useful thing to plot.

3
00:00:06,710 --> 00:00:08,170
If you either want to sanity check

4
00:00:08,430 --> 00:00:09,590
that your algorithm is working correctly,

5
00:00:10,400 --> 00:00:12,730
or if you want to improve the performance of the algorithm.

6
00:00:13,950 --> 00:00:15,200
And learning curves are a

7
00:00:15,310 --> 00:00:16,410
tool that I actually use

8
00:00:16,820 --> 00:00:17,920
very often to try to

9
00:00:18,290 --> 00:00:20,030
diagnose if a particular learning algorithm may be

10
00:00:20,180 --> 00:00:23,220
suffering from a bias problem, a variance problem, or a bit of both.

11
00:00:27,170 --> 00:00:28,070
Here's what a learning curve is.

12
00:00:28,830 --> 00:00:30,550
To plot a learning curve, what

13
00:00:30,700 --> 00:00:31,760
I usually do is plot

14
00:00:32,210 --> 00:00:33,950
Jtrain, which is, say,

15
00:00:35,030 --> 00:00:36,050
the average squared error on my training

16
00:00:36,440 --> 00:00:39,090
set, or Jcv, which is

17
00:00:39,340 --> 00:00:41,130
the average squared error on my cross validation set.

18
00:00:41,590 --> 00:00:42,900
And I'm going to plot

19
00:00:43,140 --> 00:00:44,160
that as a function

20
00:00:44,500 --> 00:00:46,380
of m, that is, as a function

21
00:00:47,230 --> 00:00:51,260
of the number of training examples I have.

22
00:00:51,950 --> 00:00:53,420
And so m is usually a constant; maybe I just have, you know, 100

23
00:00:53,650 --> 00:00:55,220
training examples, but what I'm

24
00:00:55,330 --> 00:00:57,670
going to do is artificially reduce

25
00:00:57,860 --> 00:00:59,280
my training set size. So, I

26
00:00:59,500 --> 00:01:01,460
deliberately limit myself to using only,

27
00:01:01,840 --> 00:01:03,440
say, 10 or 20 or

28
00:01:03,660 --> 00:01:06,040
30 or 40 training examples, and

29
00:01:06,170 --> 00:01:07,610
plot what the training error is and

30
00:01:07,740 --> 00:01:09,640
what the cross validation error is for these

31
00:01:10,040 --> 00:01:12,260
smaller training set sizes. So

32
00:01:12,620 --> 00:01:14,090
let's see what these plots may look
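The subsampling procedure the lecture just described (fit on only the first m examples, then measure Jtrain on those m examples and Jcv on the whole cross validation set) can be sketched as follows. This is a minimal illustration using least-squares linear regression; the function name `learning_curve` and the synthetic data are ours, not the lecture's:

```python
import numpy as np

def learning_curve(x_train, y_train, x_cv, y_cv, sizes):
    """For each training set size m in `sizes`, fit a hypothesis to the
    first m training examples only, then record the average squared error
    on those m examples (Jtrain) and on the full cross validation set (Jcv)."""
    j_train, j_cv = [], []
    for m in sizes:
        xm, ym = x_train[:m], y_train[:m]
        # Least-squares fit of a linear hypothesis theta0 + theta1 * x.
        A = np.column_stack([np.ones(m), xm])
        theta, *_ = np.linalg.lstsq(A, ym, rcond=None)
        h = lambda x: theta[0] + theta[1] * x
        # Jtrain is measured only on the m examples the hypothesis was fit to.
        j_train.append(np.mean((h(xm) - ym) ** 2) / 2)
        # Jcv is always measured on the full cross validation set.
        j_cv.append(np.mean((h(x_cv) - y_cv) ** 2) / 2)
    return j_train, j_cv

# Example: 100 training examples available, but the curve is evaluated
# at the artificially restricted sizes m = 10, 20, 30, 40.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 140)
y = 2 * x + 1 + 0.1 * rng.normal(size=140)
jt, jc = learning_curve(x[:100], y[:100], x[100:], y[100:], [10, 20, 30, 40])
```

The two returned lists are exactly the points one would plot against m to obtain the learning curve.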
33
00:01:14,270 --> 00:01:15,530
like. Suppose I have only

34
00:01:15,730 --> 00:01:17,210
one training example, like that

35
00:01:17,390 --> 00:01:18,450
shown in this first example

36
00:01:18,860 --> 00:01:19,970
here, and let's say I'm fitting a quadratic function. Well, I

37
00:01:22,470 --> 00:01:24,490
have only one training example. I'm

38
00:01:25,040 --> 00:01:26,100
going to be able to fit it perfectly,

39
00:01:26,650 --> 00:01:28,590
right? You know, just fit the quadratic function. I'm

40
00:01:28,760 --> 00:01:30,000
going to have 0

41
00:01:30,150 --> 00:01:32,240
error on the one training example. If I

42
00:01:32,570 --> 00:01:34,170
have two training examples, well, the quadratic function can also fit that very well. So,

43
00:01:37,050 --> 00:01:38,550
even if I am using regularization,

44
00:01:38,750 --> 00:01:40,220
I can probably fit this quite well.

45
00:01:41,080 --> 00:01:41,970
And if I am using no regularization,

46
00:01:42,030 --> 00:01:45,200
I'm going to fit this perfectly, and

47
00:01:45,440 --> 00:01:46,400
if I have three training examples

48
00:01:47,260 --> 00:01:48,380
again, yeah, I can fit a quadratic

49
00:01:48,660 --> 00:01:51,320
function perfectly. So if

50
00:01:51,550 --> 00:01:52,590
m equals 1 or m equals 2 or m equals 3,

51
00:01:54,850 --> 00:01:56,770
my training error

52
00:01:57,350 --> 00:01:58,870
on my training set is

53
00:01:59,110 --> 00:02:01,180
going to be 0, assuming I'm

54
00:02:01,220 --> 00:02:02,760
not using regularization, or it may

55
00:02:03,150 --> 00:02:04,290
be slightly larger than 0 if

56
00:02:04,560 --> 00:02:06,400
I'm using regularization. And

57
00:02:06,500 --> 00:02:07,350
by the way, if I have

58
00:02:07,740 --> 00:02:08,980
a large training set and I'm artificially

59
00:02:09,940 --> 00:02:11,040
restricting the size of my

60
00:02:11,120 --> 00:02:13,080
training set in order to plot Jtrain,

61
00:02:13,830 --> 00:02:14,770
here, if I set

62
00:02:15,110 --> 00:02:16,720
m equals 3, say, and I

63
00:02:17,040 --> 00:02:18,290
train on only three examples,

64
00:02:19,270 --> 00:02:21,030
then, for this figure, I

65
00:02:21,110 --> 00:02:22,430
am going to measure my training error

66
00:02:22,830 --> 00:02:24,450
only on the three examples that

67
00:02:24,550 --> 00:02:25,580
I've actually fit my data to.

68
00:02:27,150 --> 00:02:28,130
And so even if I have,

69
00:02:28,290 --> 00:02:31,160
say, 100 training examples, if I want to plot what my

70
00:02:31,430 --> 00:02:32,620
training error is when m equals 3, what I'm going to do

71
00:02:34,270 --> 00:02:35,200
is measure the

72
00:02:35,340 --> 00:02:36,660
training error on the

73
00:02:36,750 --> 00:02:39,870
three examples that I've actually fit my hypothesis to,
74
00:02:41,290 --> 00:02:42,900
and not on all the other examples that I have

75
00:02:43,010 --> 00:02:44,940
deliberately omitted from the training

76
00:02:45,140 --> 00:02:46,750
process. So just to summarize, what we've

77
00:02:46,960 --> 00:02:48,460
seen is that if the training set

78
00:02:48,820 --> 00:02:50,560
size is small, then the

79
00:02:50,630 --> 00:02:52,630
training error is going to be small as well.

80
00:02:52,960 --> 00:02:53,900
Because, you know, with a

81
00:02:53,930 --> 00:02:55,150
small training set it's

82
00:02:55,350 --> 00:02:56,790
going to be very easy to

83
00:02:56,900 --> 00:02:58,080
fit your training set

84
00:02:58,720 --> 00:02:59,490
very well, maybe even

85
00:02:59,790 --> 00:03:02,970
perfectly. Now say

86
00:03:03,190 --> 00:03:04,460
we have m equals 4, for example. Well, then

87
00:03:04,680 --> 00:03:06,800
a quadratic function may no

88
00:03:06,920 --> 00:03:07,900
longer fit this data set

89
00:03:08,100 --> 00:03:09,680
perfectly, and if I

90
00:03:09,790 --> 00:03:11,350
have m equals 5, then, you

91
00:03:11,460 --> 00:03:13,830
know, maybe a quadratic function will fit this data set just okay.

92
00:03:14,090 --> 00:03:15,940
Then, as my training set gets larger,

93
00:03:16,980 --> 00:03:18,460
it becomes harder and harder to

94
00:03:18,620 --> 00:03:19,860
ensure that I can

95
00:03:20,060 --> 00:03:21,820
find a quadratic function that passes through

96
00:03:21,960 --> 00:03:25,460
all my examples perfectly. So

97
00:03:25,840 --> 00:03:27,300
in fact, as the training set size

98
00:03:27,690 --> 00:03:28,770
grows, what you find

99
00:03:29,300 --> 00:03:30,960
is that my average training error

100
00:03:31,310 --> 00:03:33,080
actually increases, and so if you plot

101
00:03:33,500 --> 00:03:34,650
this figure, what you find

102
00:03:35,220 --> 00:03:36,860
is that the training set

103
00:03:37,130 --> 00:03:38,520
error, that is, the average

104
00:03:38,940 --> 00:03:40,660
error of your hypothesis, grows

105
00:03:41,300 --> 00:03:44,730
as m grows. And just to repeat, the intuition is that when

106
00:03:45,020 --> 00:03:46,200
m is small, when you have very

107
00:03:46,500 --> 00:03:48,070
few training examples, it's pretty

108
00:03:48,350 --> 00:03:49,420
easy to fit every single

109
00:03:49,790 --> 00:03:51,350
one of your training examples perfectly, and

110
00:03:51,610 --> 00:03:52,840
so your error is going

111
00:03:52,940 --> 00:03:54,540
to be small, whereas

112
00:03:54,710 --> 00:03:56,100
when m is larger, it gets

113
00:03:56,460 --> 00:03:57,900
harder to fit all the training

114
00:03:58,220 --> 00:03:59,900
examples perfectly, and so

115
00:04:00,430 --> 00:04:01,830
your training set error becomes

116
00:04:02,370 --> 00:04:05,840
larger. Now, how about the cross validation error?
117
00:04:06,720 --> 00:04:08,460
Well, the cross validation error is

118
00:04:08,590 --> 00:04:10,100
my error on the cross

119
00:04:10,350 --> 00:04:12,660
validation set that I haven't seen, and

120
00:04:12,880 --> 00:04:14,600
so, you know, when I have

121
00:04:14,720 --> 00:04:15,900
a very small training set, I'm

122
00:04:16,080 --> 00:04:16,890
not going to generalize well, just

123
00:04:17,020 --> 00:04:19,610
not going to do well on that.

124
00:04:19,850 --> 00:04:21,220
So, right, this hypothesis here doesn't

125
00:04:21,620 --> 00:04:22,720
look like a good one, and

126
00:04:23,020 --> 00:04:23,970
it's only when I get

127
00:04:24,050 --> 00:04:25,270
a larger training set that,

128
00:04:25,500 --> 00:04:26,380
you know, I'm starting to get

129
00:04:26,890 --> 00:04:28,100
hypotheses that maybe fit

130
00:04:28,480 --> 00:04:30,810
the data somewhat better.

131
00:04:31,380 --> 00:04:32,050
So your cross validation error and

132
00:04:32,260 --> 00:04:35,650
your test set error will tend

133
00:04:35,890 --> 00:04:37,160
to decrease as your training

134
00:04:37,470 --> 00:04:39,150
set size increases, because the

135
00:04:39,250 --> 00:04:40,700
more data you have, the better

136
00:04:40,990 --> 00:04:43,410
you do at generalizing to new examples.

137
00:04:44,010 --> 00:04:46,730
So, just, the more data you have, the better the hypothesis you fit.

138
00:04:47,560 --> 00:04:48,560
So if you plot Jtrain

139
00:04:49,420 --> 00:04:51,670
and Jcv, this is the sort of thing that you get.
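The intuition above, that Jtrain starts at essentially zero and grows with m while Jcv falls as more data improves generalization, can be checked numerically. Below is an illustrative sketch with synthetic data and a quadratic hypothesis fit by least squares; the helper names and the data-generating process are ours, not the lecture's:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data from a quadratic target with a little noise.
x = rng.uniform(-1, 1, 200)
y = x ** 2 - x + 0.1 * rng.normal(size=200)
x_train, y_train, x_cv, y_cv = x[:100], y[:100], x[100:], y[100:]

def fit_quadratic(xs, ys):
    """Least-squares fit of theta0 + theta1*x + theta2*x^2."""
    A = np.column_stack([np.ones(len(xs)), xs, xs ** 2])
    theta, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return lambda x: theta[0] + theta[1] * x + theta[2] * x ** 2

def j(h, xs, ys):
    """Average squared error, halved as in the course's cost function."""
    return np.mean((h(xs) - ys) ** 2) / 2

# With m = 3, a quadratic passes through all three points exactly,
# so Jtrain is essentially 0 (up to floating point).
h3 = fit_quadratic(x_train[:3], y_train[:3])
# With m = 100, the quadratic can no longer fit every noisy point,
# so Jtrain is clearly positive.
h100 = fit_quadratic(x_train, y_train)

print(j(h3, x_train[:3], y_train[:3]), j(h100, x_train, y_train))
print(j(h3, x_cv, y_cv), j(h100, x_cv, y_cv))
```

The first line of output shows the training error rising from (numerically) zero as m grows; the second shows the cross validation error of each hypothesis on held-out data.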
140
00:04:52,490 --> 00:04:53,550
Now let's look at what

141
00:04:53,770 --> 00:04:54,940
the learning curves may look like

142
00:04:55,360 --> 00:04:56,550
if we have either high

143
00:04:56,930 --> 00:04:58,210
bias or high variance problems.

144
00:04:58,920 --> 00:05:00,530
Suppose your hypothesis has high

145
00:05:00,830 --> 00:05:02,150
bias, and to explain this

146
00:05:02,370 --> 00:05:03,780
I'm going to use an

147
00:05:03,940 --> 00:05:05,250
example of fitting a straight

148
00:05:05,440 --> 00:05:06,500
line to data that, you

149
00:05:06,770 --> 00:05:08,240
know, can't really be fit well by a straight line.

150
00:05:09,540 --> 00:05:12,330
So we end up with a hypothesis that maybe looks like that.

151
00:05:13,910 --> 00:05:15,450
Now let's think what would

152
00:05:15,750 --> 00:05:16,840
happen if we were to increase

153
00:05:17,470 --> 00:05:18,880
the training set size. So if

154
00:05:19,160 --> 00:05:20,480
instead of five examples like

155
00:05:20,590 --> 00:05:22,400
what I've drawn there, imagine that

156
00:05:22,570 --> 00:05:24,080
we have a lot more training examples.

157
00:05:25,280 --> 00:05:27,230
Well, what happens if you fit a straight line to this?

158
00:05:27,980 --> 00:05:29,700
What you find is that you

159
00:05:30,040 --> 00:05:31,360
end up with, you know, pretty much the same straight line.

160
00:05:31,690 --> 00:05:32,940
I mean, a straight line that

161
00:05:33,530 --> 00:05:35,110
just cannot fit this

162
00:05:35,270 --> 00:05:37,320
data, and getting a ton more data, well,

163
00:05:37,890 --> 00:05:39,460
the straight line isn't going to change that much.

164
00:05:40,230 --> 00:05:41,400
This is the best possible straight-line

165
00:05:41,840 --> 00:05:42,770
fit to this data, but the

166
00:05:42,890 --> 00:05:44,160
straight line just can't fit this

167
00:05:44,320 --> 00:05:45,630
data set that well. So,

168
00:05:45,870 --> 00:05:47,420
if you plot the cross validation error,

169
00:05:49,260 --> 00:05:50,170
this is what it will look like.
170
00:05:51,320 --> 00:05:54,470
Out on the left, if you have a miniscule training set size, like, you know,

171
00:05:55,410 --> 00:05:57,710
maybe just one training example, it's not going to do well.

172
00:05:58,550 --> 00:05:59,470
But by the time you have

173
00:05:59,660 --> 00:06:00,760
reached a certain number of training

174
00:06:00,940 --> 00:06:02,350
examples, you will have almost

175
00:06:02,810 --> 00:06:04,010
fit the best possible straight

176
00:06:04,200 --> 00:06:05,400
line, and even if

177
00:06:05,490 --> 00:06:06,260
you end up with a much

178
00:06:06,480 --> 00:06:07,790
larger training set size, a

179
00:06:07,970 --> 00:06:09,170
much larger value of m,

180
00:06:10,010 --> 00:06:12,040
you know, you're basically getting the same straight line,

181
00:06:12,370 --> 00:06:14,190
and so the cross validation error

182
00:06:14,480 --> 00:06:15,420
- let me label that -

183
00:06:15,650 --> 00:06:17,040
or test set error will

184
00:06:17,140 --> 00:06:18,660
plateau out, or flatten out,

185
00:06:18,990 --> 00:06:20,480
pretty soon, once you've reached

186
00:06:20,910 --> 00:06:22,920
beyond a certain number

187
00:06:23,270 --> 00:06:24,700
of training examples, once you've

188
00:06:25,130 --> 00:06:27,480
pretty much fit the best possible straight line.

189
00:06:28,390 --> 00:06:29,540
And how about the training error?

190
00:06:30,120 --> 00:06:33,050
Well, the training error will again be small.

191
00:06:34,620 --> 00:06:36,280
And what you find

192
00:06:36,760 --> 00:06:38,080
in the high bias case is

193
00:06:38,210 --> 00:06:40,770
that the training error will end

194
00:06:41,000 --> 00:06:42,510
up close to the cross

195
00:06:42,830 --> 00:06:44,700
validation error, because you

196
00:06:44,810 --> 00:06:46,370
have so few parameters and so

197
00:06:46,590 --> 00:06:48,070
much data; at least when m is large,

198
00:06:48,900 --> 00:06:49,840
the performance on the training

199
00:06:50,220 --> 00:06:52,500
set and the cross validation set will be very similar.
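The high-bias picture just described, a straight line fit to data no straight line can capture, can be reproduced numerically: as m grows, Jtrain and Jcv converge toward the same high plateau, and gathering more data stops helping. This is a minimal sketch with synthetic quadratic data; the setup and names are ours, not the lecture's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Data a straight line cannot fit well: y is quadratic in x, plus noise.
x_all = rng.uniform(-1, 1, 600)
y_all = x_all ** 2 + 0.05 * rng.normal(size=600)
x_train, y_train = x_all[:500], y_all[:500]
x_cv, y_cv = x_all[500:], y_all[500:]

def errors_at(m):
    """Fit a straight line to the first m training examples and return
    (Jtrain on those m examples, Jcv on the full validation set)."""
    A = np.column_stack([np.ones(m), x_train[:m]])
    theta, *_ = np.linalg.lstsq(A, y_train[:m], rcond=None)
    pred = lambda x: theta[0] + theta[1] * x
    j_train = np.mean((pred(x_train[:m]) - y_train[:m]) ** 2) / 2
    j_cv = np.mean((pred(x_cv) - y_cv) ** 2) / 2
    return j_train, j_cv

j_train_small, j_cv_small = errors_at(5)
j_train_large, j_cv_large = errors_at(500)
# High bias signature: with lots of data, training and cross validation
# error end up close together, and both stay high.
```

Here `j_train_large` and `j_cv_large` sit near the same plateau, which is the signature of high bias that the lecture's plot illustrates.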
200
00:06:53,800 --> 00:06:54,750
And so, this is what your