forked from fengdu78/Coursera-ML-AndrewNg-Notes
2 - 1 - Model Representation (8 min).srt
1
00:00:00,338 --> 00:00:04,677
Our first learning algorithm will be linear regression. In this video, you'll see
我们的第一个学习算法将 线性回归。在这段视频中,你会看到
(字幕整理:中国海洋大学 黄海广,haiguang2000@qq.com )
2
00:00:04,678 --> 00:00:09,234
what the model looks like and more importantly you'll see what the overall
什么样的模型看起来像多 重要的是,你会看到什么整体
3
00:00:09,234 --> 00:00:14,801
process of supervised learning looks like. Let's use a motivating example of predicting
监督学习过程中的模样。让我们 使用一些激励的例子预测
4
00:00:14,801 --> 00:00:20,036
housing prices. We're going to use a data set of housing prices from the city of
住房价格上涨。我们将使用数据 从城市的住房价格设置
5
00:00:20,036 --> 00:00:25,205
Portland, Oregon. And here I'm gonna plot my data set of a number of houses
俄勒冈州波特兰市。在这里,我要去 绘制数据集的一些房屋
6
00:00:25,205 --> 00:00:30,833
that were different sizes that were sold for a range of different prices. Let's say
不同尺寸已售出 对于不同的价格范围内。比方说,
7
00:00:30,833 --> 00:00:35,872
that given this data set, you have a friend that's trying to sell a house and
这组数据,你有一个 朋友,试图把房子卖了,
8
00:00:35,872 --> 00:00:41,238
let's say your friend's house is 1250 square feet in size and you want to tell them
让我们来看看,如果朋友的房子是大小 1250平方英尺,你要告诉他们
9
00:00:41,238 --> 00:00:46,459
how much they might be able to sell the house for. Well one thing you could do is
他们也许能卖多少 房子。好了,你可以做的一件事是
10
00:00:46,648 --> 00:00:53,039
fit a model. Maybe fit a straight line to this data. Looks something like that and based
拟合模型。也许适合直线 此数据。看起来是这样的,根据
11
00:00:53,039 --> 00:00:59,168
on that, maybe you could tell your friend that, let's say, he can sell the
,也许你可以告诉你的朋友 比方说,也许他可以卖
12
00:00:59,168 --> 00:01:03,575
house for around $220,000. So this is an example of a
房子周围220,000元。 因此,这是一个例子的
13
00:01:03,575 --> 00:01:08,834
supervised learning algorithm. And it's supervised learning because we're given
监督的学习算法。和它的 因为我们的监督学习
14
00:01:08,834 --> 00:01:14,287
the, quotes, "right answer" for each of our examples. Namely we're told what was
,报价,“正确的答案”为每个 我们的例子。即告诉我们是什么
15
00:01:14,287 --> 00:01:19,351
the actual price that each of the houses in our data
实际的房子,什么是实际 每个房子的价格在我们的数据
16
00:01:19,351 --> 00:01:24,441
set was sold for. Moreover, this is an example of a regression problem where
集已售出,而且,这是 的一个例子的回归问题中
17
00:01:24,441 --> 00:01:29,545
the term regression refers to the fact that we are predicting a real-valued output
回归一词是指这样的事实 我们预测一个真正的值输出
18
00:01:29,545 --> 00:01:34,586
namely the price. And just to remind you the other most common type of supervised
即价格。只是提醒你 其他监督的最常见的类型
19
00:01:34,586 --> 00:01:39,006
learning problem is called the classification problem where we predict
学习问题被称为 分类问题,我们预测
20
00:01:39,006 --> 00:01:45,202
discrete-valued outputs such as if we are looking at cancer tumors and trying to
比如,如果我们的离散值输出 看癌症肿瘤,并试图
21
00:01:45,202 --> 00:01:52,032
decide if a tumor is malignant or benign. So that's a zero-one valued discrete output. More
决定如果肿瘤是良性或恶性。 所以这是一个零一值离散输出。更多
22
00:01:52,032 --> 00:01:57,087
formally, in supervised learning, we have a data set and this data set is called a
正式监督学习,我们有 数据集并在这样的数据组被称为一个
23
00:01:57,087 --> 00:02:02,018
training set. So for the housing prices example, we have a training set of
训练集。因此,住房价格 例如,我们有一个训练集
24
00:02:02,018 --> 00:02:07,386
different housing prices and our job is to learn from this data how to predict prices
不同的房价和我们的工作是 从这个数据中学习如何预测价格
25
00:02:07,386 --> 00:02:11,907
of the houses. Let's define some notation that we'll be using throughout this course.
的房子。让我们来定义一些符号 我们正在使用的整个过程。
26
00:02:11,907 --> 00:02:16,100
We're going to define quite a lot of symbols. It's okay if you don't remember
我们要定义颇多 符号。没关系,如果你不记得
27
00:02:16,100 --> 00:02:20,075
all the symbols right now but as the course progresses it will be useful
所有的符号,但现在作为 课程的进展,将是有益的
28
00:02:20,075 --> 00:02:24,267
[inaudible] convenient notation. So I'm gonna use lower case m throughout this course to
[听不清]方便的符号。所以,我会使用 整个本课程小写米
29
00:02:24,267 --> 00:02:28,897
denote the number of training examples. So in this data set, if I have, you know,
培训的例子的数字表示。所以 在这组数据中,如果我有,你知道,
30
00:02:28,897 --> 00:02:34,366
let's say 47 rows in this table. Then I have 47 training examples and m equals 47.
让我们说,在此表中有47行。然后我 有47个训练实例和m等于47。
31
00:02:34,366 --> 00:02:39,497
Let me use lowercase x to denote the input variables often also called the
让我用小写字母x表示 输入变量经常也被称为
32
00:02:39,497 --> 00:02:44,290
features. So the x's here would be the input features. And I'm gonna
功能。这将是x是在这里,它会输入功能。我要去
33
00:02:44,290 --> 00:02:49,556
use y to denote my output variables or the target variable which I'm going to
用y来表示我的输出变量或 目标变量,我要去
34
00:02:49,556 --> 00:02:54,552
predict and so that's the second column here. [inaudible] notation, I'm
预测,所以这是第二 列在这里。 [听不清]符号,我
35
00:02:54,552 --> 00:03:05,749
going to use (x, y) to denote a single training example. So, a single row in this
要使用(X,Y)来表示一个单 培训的例子。所以,在此单排
36
00:03:05,749 --> 00:03:12,068
table corresponds to a single training example and to refer to a specific
表对应一个单一的培训 的例子,并参照特定的
37
00:03:12,068 --> 00:03:19,708
training example, I'm going to use this notation (x(i), y(i)). And we're
培训例子中,我将使用这个 符号X(I)逗号给了我Y(I),我们
38
00:03:25,322 --> 00:03:30,935
going to use this to refer to the ith training example. So this superscript i
要利用这个参阅第i个 培训的例子。所以这个上标i
39
00:03:30,935 --> 00:03:37,864
over here, this is not exponentiation right? This (x(i), y(i)), the superscript i in
在这里,这是不求幂 对不对? (X(I),Y(I)),上标i
40
00:03:37,864 --> 00:03:44,873
parentheses that's just an index into my training set and refers to the ith row in
那只是一个索引我的括号 培训是指第i行
41
00:03:44,873 --> 00:03:51,629
this table, okay? So this is not x to the power of i, y to the power of i. Instead
此表,好吗?所以这不是X到 电源I,Y i的功率。代替
42
00:03:51,629 --> 00:03:58,216
(x(i), y(i)) just refers to the ith row of this table. So for example, x(1) refers to the
(×(i)中,Y(I)),在此指的是第i行 表中。因此,例如,x(1)指的是
43
00:03:58,216 --> 00:04:04,972
input value for the first training example so that's 2104. That's this x in the first
输入值的第一次训练的例子,所以 那是2104。这是这个x在第一
44
00:04:04,972 --> 00:04:11,685
row. x(2) will be equal to 1416 right? That's the second x
一行。 ×(2)将等于 1416吧?这是第二个X
45
00:04:11,685 --> 00:04:17,385
and y(1) will be equal to 460. The first, the y value for my first
和y(1)将等于460。 第一,我的第一个y值
46
00:04:17,385 --> 00:04:24,526
training example, that's what that (1) refers to. So as mentioned, occasionally I'll ask you a
培训的例子,这就是:(1) 指。所以提到的,偶尔我会问你一个
47
00:04:24,526 --> 00:04:28,345
question to let you check your understanding, and in a few seconds in this
质疑让你检查你的 理解,在这几秒钟
48
00:04:28,345 --> 00:04:34,044
video a multiple-choice question will pop up. When it does,
视频选择题 会弹出视频。当它,
49
00:04:34,044 --> 00:04:40,362
please use your mouse to select what you think is the right answer. What defined by
请使用鼠标来选择你 我认为是正确的答案。定义的
50
00:04:40,362 --> 00:04:45,124
the training set is. So here's how this supervised learning algorithm works.
训练集。因此,这里是如何 监督学习算法的工作原理。
51
00:04:45,124 --> 00:04:50,513
We start with a training set, like our training set of housing prices, and we feed
我们看到,像我们的训练集 训练集的住房价格,我们养活
52
00:04:50,513 --> 00:04:55,715
that to our learning algorithm. It is the job of the learning algorithm to then output a
就我们的学习算法。是对工作 学习算法,然后输出
53
00:04:55,715 --> 00:05:00,101
function, which by convention is usually denoted by lowercase h, and h
按照约定的功能 通常表示小写h和h
54
00:05:00,101 --> 00:05:06,574
stands for hypothesis. And the hypothesis is a function that
看台假说和什么样的工作, 的假设是,是,是一个函数,
55
00:05:06,574 --> 00:05:12,471
takes as input the size of a house like maybe the size of the new house your friend's
作为输入那样的房子的大小 也许你的朋友的新房子的大小
56
00:05:12,471 --> 00:05:18,368
trying to sell so it takes in the value of x and it tries to output the estimated
正想出售的,所以它接收 x的值,并试图输出估计的
57
00:05:18,368 --> 00:05:31,630
value of y for the corresponding house. So h is a function that maps from x's
y值对应的房子。 因此,h是一个函数,从x映射到
58
00:05:31,630 --> 00:05:37,729
to y's. People often ask me, you know, why is this function called
y的。人们经常问我,你 知道了,这是为什么函数调用
59
00:05:37,729 --> 00:05:42,121
hypothesis. Some of you may know the meaning of the term hypothesis, from the
假说。你们有些人可能知道 这意味着,长期假设,从
60
00:05:42,121 --> 00:05:46,744
dictionary or from science or whatever. It turns out that in machine learning, this
字典或从科学或什么的。它 原来,在机器学习,这
61
00:05:46,744 --> 00:05:51,310
is a name that was used in the early days of machine learning and it kinda stuck. 'Cause
在初期使用的名称 机器学习和它有点卡住。因为
62
00:05:51,310 --> 00:05:55,239
maybe not a great name for this sort of function, for mapping from sizes of
也许不是一个伟大的名字为这种 功能,从尺寸的映射
63
00:05:55,239 --> 00:05:59,978
houses to the predictions, that you know.... I think the term hypothesis, maybe isn't
房子的预言,你知道.... 我认为长期假设,也许是不
64
00:05:59,978 --> 00:06:04,543
the best possible name for this, but this is the standard terminology that people use in
此最好的可能的名称,但是这是 标准术语的人使用
65
00:06:04,543 --> 00:06:09,362
machine learning. So don't worry too much about why people call it that. When
机器学习。所以不要太担心 人们为什么称呼它。何时
66
00:06:09,362 --> 00:06:14,332
designing a learning algorithm, the next thing we need to decide is how do we
设计一个学习算法,下一个 需要决定的事情,我们是怎么做的,我们
67
00:06:14,332 --> 00:06:20,540
represent this hypothesis h. For this and the next few videos, I'm going to choose
代表这假设h。为了这个以及 接下来的几个视频,我要选择
68
00:06:20,540 --> 00:06:26,978
our initial choice for representing the hypothesis to be the following. We're going to
我们最初的选择,代表 假设,会出现下面的。我们要
69
00:06:26,978 --> 00:06:33,009
represent h as follows. And we will write this as h_theta(x) equals theta_0
表示h为如下。我们将这样写: h_theta(x)等于theta_0
70
00:06:33,009 --> 00:06:39,254
plus theta_1 times x. And as a shorthand, sometimes instead of writing, you
加theta_1乘以x。而作为一个缩写, 有时,而不是写作,你
71
00:06:39,254 --> 00:06:45,441
know, h subscript theta of x, I'll just write this as h of x.
知道标西塔,H,X,有时 有一个速记,我就写一个h的x。
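The hypothesis written out here, h_theta(x) = theta_0 + theta_1 times x, can be sketched in Python roughly as follows. This is only an illustrative sketch: the function name `h` and the parameter values are assumptions made up for the example, not values from the lecture, which leaves the choice of parameters to later videos.

```python
def h(x, theta0, theta1):
    """Hypothesis for linear regression with one variable.

    Maps an input x (for example, house size in square feet) to an
    estimated output y along the straight line theta0 + theta1 * x.
    """
    return theta0 + theta1 * x


# Hypothetical parameter values, chosen only to illustrate the call.
theta0, theta1 = 50.0, 0.125
estimate = h(1250, theta0, theta1)  # estimated y for a 1250 sq ft house
print(estimate)
```

Changing theta0 shifts the line up or down, and changing theta1 changes its slope; the function itself stays the same.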
72
00:06:45,441 --> 00:06:51,627
But more often I'll write it with a subscript theta, as over there. And plotting
但更多的时候我会写它作为一个 标西塔那边。和绘图
73
00:06:51,627 --> 00:06:58,210
this in the pictures, all this means is that, we are going to predict that y is a linear
这在图片中,所有这一切意味着, 我们将要预测,y是一个线性
74
00:06:58,210 --> 00:07:04,634
function of x. Right, so that's the data set and what this function is doing,
x的函数。没错,所以这是 数据集,这个功能是做什么,
75
00:07:04,634 --> 00:07:11,698
is predicting that y is some straight line function of x. That's h of x equals theta 0
预测y是x的某个直线函数。这就是h(x)等于theta 0
76
00:07:11,698 --> 00:07:18,450
plus theta 1 x, okay? And why a linear function? Well, sometimes we'll want to
加THETA 1个,好吗?为什么线性 功能?嗯,有时候我们会想
77
00:07:18,450 --> 00:07:23,405
fit more complicated, perhaps non-linear functions as well. But since this linear
适应更加复杂,或许非线性 功能。但是,由于这种线性
78
00:07:23,405 --> 00:07:28,298
case is the simple building block, we will start with this example first of fitting
案件是简单的积木,我们将 从这个例子,首先拟合
79
00:07:28,298 --> 00:07:32,943
linear functions, and we will build on this to eventually have more complex
线性函数,我们将建立 这最终有更复杂的
80
00:07:32,943 --> 00:07:37,403
models, and more complex learning algorithms. Let me also give this
模型,以及更复杂的学习 算法。让我也给这个
81
00:07:37,403 --> 00:07:42,628
particular model a name. This model is called linear regression or this, for
特定型号的名称。这种模式是 称为线性回归,
82
00:07:42,628 --> 00:07:48,271
example, is actually linear regression with one variable, with the variable being
例如,实际上是线性回归 一个变量,该变量是
83
00:07:48,271 --> 00:07:53,914
x. Predicting all the prices as functions of one variable X. And another name for
x的所有的价格预测功能 一个变量X的另一个名字
84
00:07:53,914 --> 00:07:58,852
this model is univariate linear regression. And univariate is just a
这种模式是单变量线性 回归。单因素仅仅是一个
85
00:07:58,852 --> 00:08:04,400
fancy way of saying one variable. So, that's linear regression. In the next
奇特的方式说一个变量。因此, 这就是线性回归。在接下来的
86
00:08:04,400 --> 00:08:09,760
video we'll start to talk about just how we go about implementing this model.
我们将开始谈论多么视频 我们去实施这一模式。
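As a recap of the notation from this video, the sketch below stores a small training set and looks up the i-th example (x(i), y(i)). The values 2104, 1416 and 460 are the ones quoted in the transcript; the second price (232) and the helper name `example` are assumptions added for illustration. The superscript (i) is an index, not a power, and it counts from 1, while Python lists count from 0.

```python
# Each row is one training example (x, y): house size and sale price.
training_set = [
    (2104, 460),  # x(1), y(1): both quoted in the transcript
    (1416, 232),  # x(2) is quoted; y(2) = 232 is an assumed value
]

m = len(training_set)  # m = number of training examples


def example(i):
    """Return (x(i), y(i)), the i-th training example, for i from 1 to m."""
    if not 1 <= i <= m:
        raise IndexError(f"i must be between 1 and {m}, got {i}")
    return training_set[i - 1]  # shift to Python's 0-based indexing


x1, y1 = example(1)
x2, _ = example(2)
print(m, x1, y1, x2)
```

Keeping the 1-based lookup in one small helper means the rest of the code can follow the lecture's numbering directly.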