14 - 3 - Principal Component Analysis Problem Formulation (9 min).srt
1
00:00:00,090 --> 00:00:01,010
For the problem of dimensionality
对于降维问题来说
(字幕整理:中国海洋大学 黄海广,haiguang2000@qq.com )
2
00:00:01,920 --> 00:00:03,420
reduction, by far the
目前
3
00:00:03,490 --> 00:00:04,620
most popular, by far the
最流行
4
00:00:04,690 --> 00:00:06,180
most commonly used algorithm is
最常用的算法是
5
00:00:06,390 --> 00:00:08,460
something called principal components analysis or PCA.
主成分分析方法(Principal Component Analysis, PCA)
6
00:00:10,200 --> 00:00:11,160
In this video, I would
在这段视频中
7
00:00:11,220 --> 00:00:12,610
like to start talking about the
我想首先开始讨论
8
00:00:12,740 --> 00:00:14,240
problem formulation for PCA,
PCA问题的公式描述
9
00:00:14,910 --> 00:00:16,090
in other words let us try
也就是说
10
00:00:16,260 --> 00:00:18,630
to formulate precisely exactly what
我们用公式准确地精确地描述
11
00:00:18,900 --> 00:00:19,980
we would like PCA to do.
我们想让 PCA 来做什么
12
00:00:20,670 --> 00:00:21,820
Let's say we have a dataset like
假设 我们有这样的一个数据集
13
00:00:22,020 --> 00:00:23,050
this, so this is a dataset
这个数据集含有
14
00:00:23,360 --> 00:00:24,710
of example X in R2,
二维实数R^2空间内的样本X
15
00:00:25,040 --> 00:00:26,140
and let's say I want
假设我想
16
00:00:26,470 --> 00:00:27,640
to reduce the dimension of the
对数据进行降维
17
00:00:27,810 --> 00:00:29,850
data from two dimensional to one dimensional.
从二维降到一维
18
00:00:31,170 --> 00:00:32,130
In other words I would like to find
也就是说,我想找到
19
00:00:32,690 --> 00:00:34,400
a line onto which to project the data.
一条直线,将数据投影到这条直线上
20
00:00:35,140 --> 00:00:37,680
So what seems like a good line onto which to project the data?
那怎么找到一条好的直线来投影这些数据呢?
21
00:00:38,730 --> 00:00:40,760
A line like this might be a pretty good choice.
这样的一条直线也许是个不错的选择
22
00:00:41,510 --> 00:00:42,790
And the reason you think
你认为
23
00:00:43,020 --> 00:00:43,990
this might be a good choice is
这是一个不错的选择的原因是
24
00:00:44,150 --> 00:00:45,420
that if you look at
如果你观察
25
00:00:46,020 --> 00:00:48,230
where the projected versions of the points goes.
投影到直线上的点的位置
26
00:00:48,530 --> 00:00:51,180
I'm gonna take this point and project it down here and get that.
我将这个点 投影到直线上 得到这个点
27
00:00:51,640 --> 00:00:53,500
This point gets projected here, to
这点被投影到这里
28
00:00:53,640 --> 00:00:55,220
here, to here, to here
这里 这里 以及这里
29
00:00:56,120 --> 00:00:57,360
what we find is that the
我们发现
30
00:00:57,420 --> 00:00:58,860
distance between each point
每个点到它们对应的
31
00:00:59,460 --> 00:01:02,520
and the projected version is pretty small.
投影到直线上的点之间的距离非常小
32
00:01:03,790 --> 00:01:06,490
That is, these blue
也就是说 这些蓝色的
33
00:01:06,690 --> 00:01:08,210
line segments are pretty short.
线段非常的短
34
00:01:09,270 --> 00:01:10,260
So what PCA
所以 正式的说
35
00:01:10,430 --> 00:01:11,730
does, formally, is it tries
PCA所做的就是
36
00:01:12,180 --> 00:01:14,320
to find a lower-dimensional surface
寻找一个低维的面
37
00:01:14,340 --> 00:01:15,250
really a line in
在这个例子中
38
00:01:15,330 --> 00:01:16,660
this case, onto which to
其实是一条直线
39
00:01:16,740 --> 00:01:18,260
project the data, so that
数据投射在上面 使得
40
00:01:18,520 --> 00:01:20,130
the sum of squares of these
这些蓝色小线段的平方和
41
00:01:20,360 --> 00:01:22,570
little blue line segments is minimized.
达到最小化
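
A minimal NumPy sketch of the quantity being minimized here, assuming mean-normalized data (projection_error is a hypothetical helper, not course code): for a candidate direction u, project each point onto the line through the origin along u and add up the squared lengths of those blue segments.

import numpy as np

def projection_error(X, u):
    """Sum of squared distances from each row of X to its projection onto the line along u."""
    u = u / np.linalg.norm(u)        # unit vector defining the line through the origin
    proj = np.outer(X @ u, u)        # each point's orthogonal projection onto that line
    return np.sum((X - proj) ** 2)   # squared lengths of the blue segments
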
42
00:01:23,550 --> 00:01:24,780
The length of those blue line
这些蓝色线段的长度
43
00:01:25,020 --> 00:01:26,530
segments, that's sometimes also called
时常被叫做
44
00:01:27,100 --> 00:01:29,710
the projection error, and so what PCA
投影误差
45
00:01:29,750 --> 00:01:30,480
does is it tries to find
所以PCA所做的就是寻找
46
00:01:30,770 --> 00:01:31,840
the surface onto which to
一个投影平面
47
00:01:32,010 --> 00:01:33,350
project the data so as to
对数据进行投影
48
00:01:33,480 --> 00:01:35,050
minimize that. As an aside,
使得这个能够最小化
49
00:01:35,090 --> 00:01:37,460
before applying PCA
另外,在应用PCA之前
50
00:01:37,960 --> 00:01:39,750
it's standard practice to
常规的做法是
51
00:01:39,960 --> 00:01:41,300
first perform mean normalization and
先进行均值归一化和
52
00:01:41,820 --> 00:01:43,190
feature scaling so that the
特征规范化,使得
53
00:01:43,560 --> 00:01:44,760
features, X1 and X2,
特征X1和X2
54
00:01:44,880 --> 00:01:46,770
should have zero mean and
均值为0
55
00:01:46,880 --> 00:01:48,740
should have comparable ranges of values.
数值在可比较的范围之内
56
00:01:49,110 --> 00:01:50,320
I've already done this for this
在这个例子里,我已经这么做了
57
00:01:50,490 --> 00:01:51,590
example, but I'll come
但是
58
00:01:51,680 --> 00:01:52,990
back to this later and talk more
在后面,我将回过来,讨论更多的
59
00:01:53,190 --> 00:01:54,960
about feature scaling and mean normalization in the context of PCA later.
PCA背景下的特征规范化和均值归一化问题
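
A hedged sketch of that preprocessing step, assuming NumPy (dividing by the per-feature standard deviation is one common choice for putting features on comparable scales):

import numpy as np

def mean_normalize_and_scale(X):
    """Give each feature zero mean and a comparable range of values."""
    mu = X.mean(axis=0)              # per-feature means
    sigma = X.std(axis=0)            # per-feature standard deviations
    return (X - mu) / sigma, mu, sigma
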
60
00:01:58,600 --> 00:01:59,420
Coming back to this example,
回到这个例子
61
00:02:00,260 --> 00:02:01,470
in contrast to the red
对比
62
00:02:01,710 --> 00:02:03,300
lines that I just drew here's
我刚画好的红线
63
00:02:03,530 --> 00:02:05,970
a different line onto which I could project my data.
这是我对数据进行投影的一条不同的直线
64
00:02:06,810 --> 00:02:08,260
This magenta line.
这条品红色的线
65
00:02:08,520 --> 00:02:09,260
And as you can see, you
如你所见
66
00:02:09,370 --> 00:02:10,660
know, this magenta line is a
你知道,这条品红色直线
67
00:02:10,810 --> 00:02:13,920
much worse direction onto which to project my data, right?
是一个非常糟糕的方向来投影我的数据,对吧?
68
00:02:14,090 --> 00:02:15,020
So if I were to project my
所以,如果我将数据
69
00:02:15,120 --> 00:02:16,430
data onto the magenta
投影到这条品红色的直线上
70
00:02:16,730 --> 00:02:18,050
line, like the other set of points like that.
像我们刚才做的那样
71
00:02:19,140 --> 00:02:21,240
And the projection errors, that
这样,投影误差
72
00:02:21,420 --> 00:02:24,460
is these blue line segments would be huge.
就是这些蓝色的线段,将会很大
73
00:02:24,910 --> 00:02:25,930
So these points have to
所以,这些点将会
74
00:02:26,010 --> 00:02:28,170
move a huge distance in
移动很长一段距离
75
00:02:28,320 --> 00:02:29,840
order to get onto
才能投影到
76
00:02:30,360 --> 00:02:31,760
the, in order to
才能
77
00:02:31,930 --> 00:02:33,440
get projected onto the magenta line.
投影到这条品红色直线上
78
00:02:33,740 --> 00:02:35,390
And so that's why PCA
因此,这就是为什么PCA
79
00:02:36,010 --> 00:02:37,540
principle component analysis would choose
主成分分析方法会选择
80
00:02:37,860 --> 00:02:38,840
something like the red line
红色的这条直线
81
00:02:39,230 --> 00:02:41,410
rather than like the magenta line down here.
而不是品红色的这条直线
82
00:02:42,870 --> 00:02:45,280
Let's write out the PCA problem a little more formally.
我们正式一点地写出PCA问题
83
00:02:46,140 --> 00:02:47,660
The goal of PCA if we
PCA的目标是
84
00:02:47,810 --> 00:02:49,150
want to reduce data from two-
将数据从二维
85
00:02:49,360 --> 00:02:50,580
dimensional to one-dimensional is
降到一维
86
00:02:51,450 --> 00:02:52,160
we're going to try to find
我们将试着寻找
87
00:02:52,640 --> 00:02:54,590
a vector, that is
一个向量
88
00:02:54,970 --> 00:02:56,160
a vector Ui
向量Ui
89
00:02:57,150 --> 00:02:58,250
which is going to be in Rn,
属于Rn空间中的向量
90
00:02:58,780 --> 00:03:00,170
so that would be in R2 in this case,
在这个例子中是R2空间中的
91
00:03:01,130 --> 00:03:02,300
going to find direction onto
寻找一个对数据进行投影的方向
92
00:03:02,600 --> 00:03:04,990
which to project the data so as to minimize the projection error.
使得投影误差能够最小
93
00:03:05,400 --> 00:03:06,710
So in this example I'm
在这个例子里
94
00:03:07,190 --> 00:03:09,180
hoping that PCA will find this
希望PCA寻找到
95
00:03:09,380 --> 00:03:10,590
vector, which I'm going
向量,我将它叫做
96
00:03:10,720 --> 00:03:12,960
to call U1, so that
U1,所以
97
00:03:13,120 --> 00:03:14,340
when I project the data onto
当我把数据投影到
98
00:03:15,590 --> 00:03:17,620
the line that I defined
我定义的这条直线
99
00:03:18,170 --> 00:03:19,840
by extending out this vector,
通过延长这个向量得到的直线
100
00:03:20,370 --> 00:03:21,650
I end up with pretty small
最后我得到非常小的
101
00:03:22,100 --> 00:03:23,400
reconstruction errors and reference
重建误差
102
00:03:24,310 --> 00:03:25,220
data looks like this.
参考数据看上去是这样的
103
00:03:26,180 --> 00:03:26,640
And by the way,
此外
104
00:03:26,840 --> 00:03:28,310
I should mention that whether PCA
我应该指出的是,无论PCA
105
00:03:28,920 --> 00:03:32,150
gives me U1 or negative U1, it doesn't matter.
给出的是这个U1,还是负的U1,都没关系
106
00:03:32,650 --> 00:03:33,630
So if it gives me a positive
如果它给出的是正的向量
107
00:03:33,890 --> 00:03:35,530
vector in this direction that's fine, if it gives me,
在这个方向上,这没问题
108
00:03:35,950 --> 00:03:37,910
sort of the opposite vector facing
如果给出的是相反的向量
109
00:03:38,330 --> 00:03:40,160
in the opposite direction, so that
在相反的方向上
110
00:03:40,720 --> 00:03:43,150
would be -U1, draw that in
也就是-U1,用蓝色来画
111
00:03:43,300 --> 00:03:44,400
blue instead, whether it gives me positive
无论给的是正的
112
00:03:45,120 --> 00:03:46,310
U1 negative U1,
还是负的U1
113
00:03:46,440 --> 00:03:48,120
it doesn't matter, because each of
都没关系,因为
114
00:03:48,230 --> 00:03:50,030
these vectors defines the
每一个方向都定义了
115
00:03:50,110 --> 00:03:51,660
same red line onto which
相同的红色直线
116
00:03:51,870 --> 00:03:54,430
I'm projecting my data. So this
也就是我将投影的方向
117
00:03:54,610 --> 00:03:56,300
is a case of reducing data
这就是将
118
00:03:56,680 --> 00:03:58,120
from 2 dimensional to 1 dimensional.
2维数据降到1维的例子
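
One way to sketch this 2D-to-1D case, assuming NumPy and mean-normalized data (using the SVD is a shortcut taken by this sketch, not part of the formulation above; the course presents the actual PCA algorithm separately):

import numpy as np

def first_direction(X):
    """Direction u1 (up to sign) minimizing the squared projection error onto a line."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[0]                     # u1; -u1 would define the same line
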
119
00:03:58,920 --> 00:04:00,220
In the more general case we
更一般的情况是
120
00:04:00,350 --> 00:04:01,680
have N dimensional data and
我们有N维的数据
121
00:04:01,840 --> 00:04:03,790
we want to reduce it K dimensions.
想降到K维
122
00:04:04,970 --> 00:04:06,010
In that case, we want to
在这种情况下
123
00:04:06,160 --> 00:04:07,450
find not just a single vector
我们不仅仅只寻找单个的向量
124
00:04:07,940 --> 00:04:09,020
onto which to project the data
来对数据进行投影
125
00:04:09,320 --> 00:04:10,660
but we want to find K dimensions
我们想寻找K个方向
126
00:04:11,520 --> 00:04:12,420
onto which to project the data.
来对数据进行投影
127
00:04:13,290 --> 00:04:15,680
So as to minimize this projection error.
为了最小化投影误差
128
00:04:16,440 --> 00:04:17,100
So here's an example.
这是一个例子
129
00:04:17,480 --> 00:04:19,100
If I have a 3D point
如果我有3维数据点
130
00:04:19,390 --> 00:04:21,030
cloud like this then maybe
比如说像这样的
131
00:04:21,290 --> 00:04:22,620
what I want to do is find
我想要做的是
132
00:04:23,880 --> 00:04:26,120
vectors, sorry find a pair of vectors,
寻找向量,对不起,是寻找两个向量
133
00:04:27,020 --> 00:04:28,180
and I'm gonna call these vectors
我将这些向量叫做...
134
00:04:29,080 --> 00:04:30,530
lets draw these in red, I'm going
我们用红线画出来
135
00:04:30,710 --> 00:04:32,210
to find a pair of vectors,
我要寻找两个向量
136
00:04:32,580 --> 00:04:33,580
extending for the origin here's
从原点延伸出来
137
00:04:34,490 --> 00:04:37,280
U1, and here's
这是U1
138
00:04:37,580 --> 00:04:39,800
my second vector U2,
这是第二个向量U2
139
00:04:40,180 --> 00:04:42,110
and together these two
这两个向量一起
140
00:04:42,320 --> 00:04:43,850
vectors define a plane,
定义了一个平面
141
00:04:44,400 --> 00:04:45,590
or they define a 2D surface,
或者说,定义了一个2维平面
142
00:04:46,790 --> 00:04:47,900
kind of like this, sort of,
类似于这样的
143
00:04:48,270 --> 00:04:51,140
2D surface, onto which I'm going to project my data.
2维平面,我将把数据投影到上面
144
00:04:52,050 --> 00:04:52,900
For those of you that are
对于你们其中
145
00:04:53,080 --> 00:04:54,980
familiar with linear algebra, for
熟悉线性代数的人来说
146
00:04:55,170 --> 00:04:56,010
those of you that are really experts
对于你们其中真的
147
00:04:56,230 --> 00:04:57,380
in linear algebra, the formal
精通线性代数的人来说,
148
00:04:57,780 --> 00:04:58,820
definition of this is that
对这个正式的定义是
149
00:04:59,230 --> 00:05:00,500
we're going to find a set of
我们将寻找一组向量
150
00:05:00,610 --> 00:05:01,680
vectors, U1 U2 maybe up
U1,U2,也许
151
00:05:01,800 --> 00:05:03,370
to Uk, and what
一直到Uk
152
00:05:03,460 --> 00:05:04,490
we're going to do is project
我们将要做的是
153
00:05:04,980 --> 00:05:06,600
the data onto the linear
将数据投影到
154
00:05:06,830 --> 00:05:09,520
subspace spanned by this set of k vectors.
这k个向量张成的线性子空间上
155
00:05:10,520 --> 00:05:11,570
But if you are not familiar
但是如果你不熟悉
156
00:05:12,070 --> 00:05:13,200
with linear algebra, just think
线性代数,那就想成是
157
00:05:13,400 --> 00:05:14,790
of it as finding k directions
寻找k个方向
158
00:05:15,510 --> 00:05:18,380
instead of just one direction, onto which to project the data.
而不是只寻找一个方向,对数据进行投影
159
00:05:18,740 --> 00:05:19,950
So, finding a k-dimensional surface,
所以,寻找一个k维的平面
160
00:05:20,610 --> 00:05:21,560
really finding a 2D plane
在这里是寻找2维的平面
161
00:05:22,370 --> 00:05:23,870
in this case, shown in this
如图所示
162
00:05:24,040 --> 00:05:25,340
figure, where we can
这里我们用
163
00:05:26,800 --> 00:05:29,700
define the position of the points in the plane using k directions.
k个方向来定义平面中这些点的位置
164
00:05:30,410 --> 00:05:31,690
That's why for PCA, we
这就是为什么,对于PCA
165
00:05:31,950 --> 00:05:34,440
want to find k vectors onto which to project the data.
我们要寻找k个向量来对数据进行投影
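
A minimal sketch of the general n-to-k case, again assuming NumPy and mean-normalized data (the names Uk and Z are hypothetical): stack u1, ..., uk as columns, and the k coordinates X @ Uk locate each example within the spanned subspace.

import numpy as np

def project_onto_k_directions(X, k):
    """Project X onto the subspace spanned by the first k principal directions."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    Uk = Vt[:k].T                    # n x k matrix whose columns are u1, ..., uk
    Z = X @ Uk                       # k-dimensional coordinates of each example
    X_approx = Z @ Uk.T              # the projections, back in the original space
    return Z, X_approx
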
166
00:05:35,030 --> 00:05:36,920
And so, more formally, in
因此,更正式一点的说
167
00:05:37,050 --> 00:05:38,430
PCA what we want
在PCA中,我们想做的就是
168
00:05:38,700 --> 00:05:40,400
to do is find this way
寻找到这种方式
169
00:05:40,590 --> 00:05:41,940
to project the data so as
对数据进行投影
170
00:05:42,040 --> 00:05:43,570
to minimize the, sort of,
进而最小化投影距离
171
00:05:43,850 --> 00:05:46,210
projection distance, which is distance between points and projections.
也就是数据点和投影后的点之间的距离
172
00:05:47,060 --> 00:05:48,060
So in this 3D example,
在这个3维的例子里
173
00:05:48,560 --> 00:05:50,100
too, given a point we
给定一个点
174
00:05:50,280 --> 00:05:51,450
would take the point and project
我们想将这个点
175
00:05:51,980 --> 00:05:53,950
it onto this 2D surface.
投影到2维平面上
176
00:05:55,560 --> 00:05:56,580
When you're done with that,
当你完成了那个
177
00:05:57,280 --> 00:05:58,690
and so the projection error would
因此投影误差就是
178
00:05:58,870 --> 00:06:00,830
be, you know, the distance between the
也就是
179
00:06:01,440 --> 00:06:03,160
point and where it gets projected down
这点与投影到
180
00:06:03,970 --> 00:06:05,360
to my 2D surface,
2维平面之后的点之间的距离
181
00:06:05,880 --> 00:06:06,990
and so what PCA does is it'll
因此PCA做的是
182
00:06:07,070 --> 00:06:08,480
try to find a line or
寻找一条直线
183
00:06:08,620 --> 00:06:10,430
plane or whatever onto which
或者平面,诸如此类等等
184
00:06:10,660 --> 00:06:11,810
to project the data, to try
对数据进行投影
185
00:06:12,010 --> 00:06:14,160
to minimize that squared projection,
来最小化平方投影
186
00:06:15,100 --> 00:06:17,430
that 90 degree, or that orthogonal projection error.
90度的或者正交的投影误差
187
00:06:18,100 --> 00:06:19,240
Finally, one question that I
最后
188
00:06:19,280 --> 00:06:20,060
sometimes get asked is how
一个我有时会被问到的问题是
189
00:06:20,280 --> 00:06:22,100
does PCA relate to
PCA和线性回归有怎么样的关系?
190
00:06:22,350 --> 00:06:24,180
linear regression, because when explaining
因为当我解释PCA的时候
191
00:06:24,600 --> 00:06:25,780
PCA I sometimes end up
我有时候会以
192
00:06:26,190 --> 00:06:28,720
drawing diagrams like these and that looks a little bit like linear regression.
画这样的图表来结束,看上去有点像线性回归
193
00:06:30,790 --> 00:06:32,130
It turns out PCA is not
但是,事实是
194
00:06:32,370 --> 00:06:33,950
linear regression, and despite
PCA不是线性回归
195
00:06:34,350 --> 00:06:37,560
some cosmetic similarity these are actually totally different algorithms.
尽管看上去有一些相似性,但是它们确实是两种不同的算法
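
A rough sketch of that distinction, assuming NumPy (theta0 and theta1 are hypothetical fitted parameters): linear regression penalizes vertical distances from the line used to predict y, while PCA penalizes the orthogonal projection distances described above.

import numpy as np

def regression_cost(x, y, theta0, theta1):
    """Linear regression: sum of squared vertical errors when predicting y from x."""
    return np.sum((y - (theta0 + theta1 * x)) ** 2)

def pca_cost(X, u):
    """PCA: sum of squared orthogonal distances from the points to the line along u."""
    u = u / np.linalg.norm(u)
    return np.sum((X - np.outer(X @ u, u)) ** 2)
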
196
00:06:38,680 --> 00:06:39,680
If we were doing linear regression
如果我们做线性回归
197
00:06:40,770 --> 00:06:42,170
what we would do would be, on
我们做的是
198
00:06:42,270 --> 00:06:42,940
the left we would be trying
看左边,我们想要
199
00:06:43,230 --> 00:06:44,400
to predict the value of some
在给定某个输入特征x的情况下
200
00:06:44,540 --> 00:06:45,830
variable y given some input
预测某个变量y的数值