forked from fengdu78/Coursera-ML-AndrewNg-Notes
1 - 1 - Welcome (7 min).srt
1
00:00:00,000 --> 00:00:04,262
Welcome to this free online class on
machine learning. Machine learning is one
欢迎来到机器学习免费在线课程。机器学习是
(字幕整理:中国海洋大学 黄海广,haiguang2000@qq.com )
2
00:00:04,262 --> 00:00:08,579
of the most exciting recent technologies.
And in this class, you learn about the
目前最激动人心的技术之一。本课程中,你将学习
3
00:00:08,579 --> 00:00:13,115
state of the art and also gain practice
implementing and deploying these algorithms
机器学习的发展,并且实现这些算法。
4
00:00:13,115 --> 00:00:17,487
yourself. You probably use a learning
algorithm dozens of times a day without
你每天都要多次使用学习算法,但并没有意识到。
5
00:00:17,487 --> 00:00:21,422
knowing it. Every time you use a web
search engine like Google or Bing to
每次当你使用Google或Bing等搜索引擎时,
6
00:00:21,422 --> 00:00:25,794
search the internet, one of the reasons
that works so well is because a learning
它能给出如此满意的结果,原因之一就是使用了学习算法。
7
00:00:25,794 --> 00:00:30,002
algorithm, one implemented by Google or
Microsoft, has learned how to rank web
由Google或微软实现的算法学会如何给网页排序。
8
00:00:30,002 --> 00:00:35,144
pages. Every time you use Facebook or
Apple's photo tagging application and it
每次你使用Facebook或苹果的相片分类功能,
9
00:00:35,144 --> 00:00:40,595
recognizes your friends' photos, that's
also machine learning. Every time you read
它能识别出你朋友的相片,这也是机器学习。每次当你阅读邮件时,
10
00:00:40,595 --> 00:00:46,054
your email and your spam filter saves you
from having to wade through tons of spam
你的垃圾邮件过滤器帮助你过滤大量的垃圾邮件,
11
00:00:46,054 --> 00:00:50,980
email, that's also a learning algorithm.
For me one of the reasons I'm excited is
这也是学习算法。对我而言,我兴奋的原因之一是
12
00:00:50,980 --> 00:00:55,643
the AI dream of someday building machines
as intelligent as you or me. We're a long
AI的梦想就是有一天能建造像你我一样智能的机器。
13
00:00:55,643 --> 00:01:00,076
way away from that goal, but many AI
researchers believe that the best way to
我们离这个目标还很远,但是许多AI研究者相信实现这个目标的最好方法
14
00:01:00,076 --> 00:01:04,567
make progress towards that goal is through learning
algorithms that try to mimic how the human
就是采用学习算法试图模拟人类大脑是如何学习的。
15
00:01:04,567 --> 00:01:08,994
brain learns. I'll tell you a little bit
about that too in this class. In this
在本课程中,我将会向你们介绍部分这方面的内容。
16
00:01:08,994 --> 00:01:13,542
class you learn about state-of-the-art
machine learning algorithms. But it turns
本课程中,你将学习机器学习算法的发展。
17
00:01:13,542 --> 00:01:17,919
out just knowing the algorithms and
knowing the math isn't that much good if
但是,仅知道算法以及算法的数学含义,
18
00:01:17,919 --> 00:01:22,466
you don't also know how to actually get
this stuff to work on problems that you
却不知道如何在你关心的问题上运用,是远远不够的。
19
00:01:22,466 --> 00:01:26,844
care about. So, we've also spent a lot
of time developing exercises for you to
因此,我们也会花很多时间让你练习如何
20
00:01:26,844 --> 00:01:32,088
implement each of these algorithms and
see how they work for yourself. So why is
实现每一个算法,如何被你所用。
21
00:01:32,088 --> 00:01:37,075
machine learning so prevalent today?
It turns out that machine learning is a
这就是为什么机器学习如此流行。机器学习
22
00:01:37,075 --> 00:01:41,713
field that had grown out of the field of
AI, or artificial intelligence. We wanted
是从AI发展出来的一个领域。我们想
23
00:01:41,713 --> 00:01:46,642
to build intelligent machines and it turns
out that there are a few basic things that
建造智能机器,那就是说我们要编程使机器能做很多基本的事情,
24
00:01:46,642 --> 00:01:51,454
we could program a machine to do such as
how to find the shortest path from A to B.
比如找到从A到B的最短路径。
25
00:01:51,454 --> 00:01:56,267
But for the most part we just did not know
how to write AI programs to do the more
但大多数情况下,我们不知道如何编写AI程序使机器做更多
26
00:01:56,267 --> 00:02:00,905
interesting things such as web search or
photo tagging or email anti-spam. There
有趣的事情,如网页搜索、相片标记、反垃圾邮件。
26
00:02:00,905 --> 00:02:05,718
was a realization that the only way to do
these things was to have a machine learn
人们认识到做到这些事情,唯一的方法就是使机器本身学习如何去做。
27
00:02:05,718 --> 00:02:11,237
to do it by itself. So, machine learning
was developed as a new capability for
因此,机器学习是计算机需要开发的一项新能力,
28
00:02:11,237 --> 00:02:16,950
computers and today it touches many
segments of industry and basic science.
并且它涉及工业和基础科学中的许多内容。
29
00:02:16,950 --> 00:02:21,496
For me, I work on machine learning and
in a typical week I might end up talking to
对我而言,我研究机器学习,并且在有代表性的一周中,我可能会与
30
00:02:21,496 --> 00:02:25,698
helicopter pilots, biologists, a bunch
of computer systems people (so my
直升飞机飞行员,生物学家,很多计算机系统的人员交流
31
00:02:25,698 --> 00:02:30,590
colleagues here at Stanford) and averaging
two or three times a week I get email from
并且每周2~3次与
32
00:02:30,590 --> 00:02:35,021
people in industry from Silicon Valley
contacting me who have an interest in
硅谷工业界的人员互通email,他们对在他们的问题上
33
00:02:35,021 --> 00:02:39,741
applying learning algorithms to their own
problems. This is a sign of the range of
应用学习算法感兴趣。以下是一些
34
00:02:39,741 --> 00:02:44,000
problems that machine learning touches.
There is autonomous robotics, computational
机器学习涉及的领域,自主机器人,计算生物学,
35
00:02:44,000 --> 00:02:48,777
biology, tons of things in Silicon Valley
that machine learning is having an impact
以及其它一些被机器学习影响的领域。
36
00:02:48,777 --> 00:02:55,320
on. Here are some other examples of
machine learning. There's database mining.
还有一些其它的例子,如数据挖掘。
37
00:02:55,320 --> 00:03:00,063
One of the reasons machine learning has so
pervaded is the growth of the web and the
机器学习如此普遍的原因之一就是网络的快速发展和
38
00:03:00,063 --> 00:03:04,751
growth of automation. All this means that
we have much larger data sets than ever
自动化技术的快速发展。这意味着我们拥有了前所未有的大量的数据集。
39
00:03:04,751 --> 00:03:09,272
before. So, for example tons of Silicon
Valley companies are today collecting web
因此,现在大量硅谷公司收集网络点击数据,
40
00:03:09,272 --> 00:03:13,737
click data, also called clickstream data,
and are trying to use machine learning
被称为点击流数据,并试图采用机器学习算法
41
00:03:13,737 --> 00:03:18,481
algorithms to mine this data to understand
the users better and to serve the users
来挖掘数据,更好地理解用户,并更好地为用户服务。
42
00:03:18,481 --> 00:03:22,327
better, that's a huge segment of
Silicon Valley right now. Medical
这占目前硅谷工作的很大部分。
43
00:03:22,327 --> 00:03:27,483
records. With the advent of automation, we
now have electronic medical records, so if
医疗记录。随着自动化的出现,现在我们使用电子医疗记录,
44
00:03:27,483 --> 00:03:32,640
we can turn medical records into medical
knowledge, then we can start to understand
因此,假如我们能将医疗记录转化为医疗知识,那我们就能更好地理解疾病。
45
00:03:32,640 --> 00:03:37,238
disease better. Computational biology.
With automation again, biologists are
计算生物学。还是因为自动化,生物学家
46
00:03:37,238 --> 00:03:41,774
collecting lots of data about gene
sequences, DNA sequences, and so on, and
收集了大量的数据,关于基因序列,DNA序列等。
47
00:03:41,774 --> 00:03:46,931
machine learning algorithms are giving us
a much better understanding of the human
机器学习算法让我们更好地理解基因组,
48
00:03:46,931 --> 00:03:51,376
genome, and what it means to be human.
And in engineering as well, in all fields of
以及它对人类的意义。在工程学中,在工程学所有的领域,
49
00:03:51,376 --> 00:03:55,034
engineering, we have larger and larger,
and larger and larger data sets, that
我们有越来越大,越来越大的数据集,
50
00:03:55,034 --> 00:03:59,249
we're trying to understand using learning
algorithms. A second range of machine learning
我们正设法采用学习算法来理解。
51
00:03:59,249 --> 00:04:03,440
applications is ones that we cannot
program by hand. So for example, I've
机器应用的第二个领域是我们无法手动编写程序。例如,
52
00:04:03,440 --> 00:04:08,328
worked on autonomous helicopters for many
years. We just did not know how to write a
我们已经在自动直升飞机领域研究了很多年,仍不知道如何编写
53
00:04:08,328 --> 00:04:18,023
computer program to make this helicopter
fly by itself. The only thing that worked
计算机程序使得直升飞机自己飞行。唯一有用的
54
00:04:18,023 --> 00:04:35,580
was having a computer learn by itself how
to fly this helicopter. [Helicopter whirring]
就是使计算机自己学习如何使直升飞机飞行。
55
00:04:37,120 --> 00:04:42,880
Handwriting recognition. It turns out one
of the reasons it's so inexpensive today to
手写体识别。今天,邮寄不再昂贵的原因之一,
56
00:04:42,880 --> 00:04:47,330
route a piece of mail across the
country, in the US and internationally,
无论在美国或国际上,
57
00:04:47,330 --> 00:04:51,899
is that when you write an envelope like
this, it turns out there's a learning
就是当你写好信封后,有学习算法
58
00:04:51,899 --> 00:04:56,943
algorithm that has learned how to read your
handwriting so that it can automatically
来学习如何读取你的手写体,使得它能自动地
59
00:04:56,943 --> 00:05:01,749
route this envelope on its way, and so it
costs us a few cents to send this thing
给你的信规划路线。因此,邮寄几千公里之外的信也只需要花费几分钱。
60
00:05:01,749 --> 00:05:06,318
thousands of miles. And in fact if you've
seen the fields of natural language
事实上,假如你知道自然语言处理
61
00:05:06,318 --> 00:05:10,531
processing or computer vision,
these are the fields of AI pertaining to
或计算机视觉,这些都是AI中有关
62
00:05:10,531 --> 00:05:15,321
understanding language or understanding
images. Most of natural language processing
理解语言或理解图像的领域。今天,大部分的自然语言处理和
63
00:05:15,321 --> 00:05:20,689
and most of computer vision today is
applied machine learning. Learning
大部分的计算机视觉采用机器学习。
64
00:05:20,689 --> 00:05:25,576
algorithms are also widely used for self-
customizing programs. Every time you go to
学习算法还应用在自我定制程序中。每次当你使用
65
00:05:25,576 --> 00:05:30,286
Amazon or Netflix or iTunes Genius, and it
recommends the movies or products and
亚马逊或Netflix或iTunes天才,它就会推荐电影或产品或音乐给你,
66
00:05:30,286 --> 00:05:35,073
music to you, that's a learning algorithm.
If you think about it they have a million
这就是学习算法。假如你想象一下,他们有百万用户,
67
00:05:35,073 --> 00:05:39,999
users; there is no way to write a million
different programs for your million users.
不可能为百万用户编写百万个不同的程序。
68
00:05:39,999 --> 00:05:44,807
The only way to have software give these
customized recommendations is to have it
用软件给出这些客户推荐的唯一方法就是
69
00:05:44,807 --> 00:05:49,258
learn by itself to customize itself to
your preferences. Finally learning
自我学习并为你定制你偏爱的东西。
70
00:05:49,258 --> 00:05:53,294
algorithms are being used today to
understand human learning and to
另外,今天学习算法还被使用来理解人类的学习
71
00:05:53,294 --> 00:05:58,042
understand the brain. We'll talk about
how researchers are using this to make
和大脑。我们将会讨论研究者是如何使用这些来
72
00:05:58,042 --> 00:06:03,182
progress towards the big AI dream. A few
months ago, a student showed me an article
朝着AI梦前进的。几个月前,一个学生给我看了篇文章,
73
00:06:03,182 --> 00:06:07,996
on the top twelve IT skills. The skills
that information technology hiring
列出了12个IT技术。这些技术是信息科技雇主
74
00:06:07,996 --> 00:06:13,006
managers cannot say no to. It was a
slightly older article, but at the top of
最喜爱的。这篇文章有点旧了,但是在这张表中,
75
00:06:13,006 --> 00:06:17,988
this list of the twelve most desirable IT
skills was machine learning. Here at
机器学习位列第一。
76
00:06:17,988 --> 00:06:21,793
Stanford, the number of recruiters
that contact me asking if I know any
在斯坦福,联系我的招聘人员,需要的机器学习毕业的学生
77
00:06:21,793 --> 00:06:25,920
graduating machine learning students
is far larger than the machine learning
的数量远大于每年毕业的机器学习的学生。
78
00:06:25,920 --> 00:06:30,047
students we graduate each year. So I
think there is a vast, unfulfilled demand
因此,我认为这项技术有大量的未完成的需求。
79
00:06:30,047 --> 00:06:34,280
for this skill set, and this is a great time to
be learning about machine learning, and I
这也是一个学习机器学习的好时机。
80
00:06:34,280 --> 00:06:38,454
hope to teach you a lot about machine
learning in this class. In the next video,
我希望能在这门课中教给你机器学习的知识。在下一个视频中,
81
00:06:38,454 --> 00:06:42,123
we'll start to give a more formal
definition of what is machine learning.
我们将开始给出机器学习的更正式的定义。
82
00:06:42,123 --> 00:06:46,044
And we'll begin to talk about the main
types of machine learning problems and
我们将开始谈到机器学习问题的主要类型,及算法。
83
00:06:46,044 --> 00:06:49,864
algorithms. You'll pick up some of the
main machine learning terminology, and
你们将会学到一些主要的机器学习术语,
84
00:06:49,864 --> 00:06:53,684
start to get a sense of what are the
different algorithms, and when each one
及开始理解不同的机器学习算法,及每个算法在什么时候是合适的。
85
00:06:53,684 --> 00:06:54,740
might be appropriate.
也许管用。