18 - 3 - Getting Lots of Data and Artificial Data (16 min).srt
1
00:00:00,090 --> 00:00:01,270
I've seen over and over that
我多次看到(字幕翻译:中国海洋大学,刘竞)
2
00:00:01,570 --> 00:00:03,160
one of the most reliable ways to
一个最可靠的得到
3
00:00:03,300 --> 00:00:04,800
get a high performance machine learning
一个高性能的机器学习
4
00:00:05,040 --> 00:00:06,170
system is to take
系统的方法是
5
00:00:06,550 --> 00:00:07,860
a low bias learning algorithm
采取一个低偏差的机器学习算法
6
00:00:08,750 --> 00:00:10,220
and to train it on a massive training set.
并且用大量的数据集去训练它。
7
00:00:11,230 --> 00:00:12,830
But where did you get so much training data from?
但是你该从哪里去获得这么多的训练数据呢?
8
00:00:13,510 --> 00:00:14,440
It turns out that in machine learning
事实证明,在机器学习中
9
00:00:14,820 --> 00:00:16,520
there's a fascinating idea called artificial
有一个非常吸引人的思想叫做
10
00:00:17,220 --> 00:00:19,000
data synthesis, this doesn't
人工数据合成,这种思想并不
11
00:00:19,370 --> 00:00:20,740
apply to every single problem, and
适用于每个单独的问题,当把它
12
00:00:20,980 --> 00:00:22,120
to apply to a specific
运用于一个具体的特定
13
00:00:22,360 --> 00:00:25,060
problem, often takes some thought and innovation and insight.
问题时,需要经过一些思考、创新和洞察力。
14
00:00:25,780 --> 00:00:27,170
But if this idea applies
但是假如这个思想应用在
15
00:00:27,580 --> 00:00:29,120
to your machine learning problem, it
到你的机器学习问题上,
16
00:00:29,230 --> 00:00:30,270
can sometimes be an
有时,为你的学习算法
17
00:00:30,510 --> 00:00:31,600
easy way to get a
获得一个巨大的
18
00:00:31,680 --> 00:00:33,470
huge training set to give to your learning algorithm.
训练集将是很容易的。
19
00:00:33,900 --> 00:00:35,520
The idea of artificial
人工数据合成
20
00:00:36,230 --> 00:00:38,410
data synthesis comprises of two
包含两个
21
00:00:38,590 --> 00:00:40,210
main variations. The first
主要的变化形式。第一种
22
00:00:40,650 --> 00:00:41,940
is where we are essentially creating
是我们从无到有地
23
00:00:22,360 --> 00:00:25,060
data from scratch, creating new data from scratch.
生成数据,也就是从零开始创造新的数据。
24
00:00:45,380 --> 00:00:46,700
And the second is if
第二种是我们
25
00:00:46,930 --> 00:00:48,350
we already have a small
是否已经有了一小
26
00:00:48,590 --> 00:00:49,970
labeled training set and we
部分标签训练集
27
00:00:50,210 --> 00:00:51,490
somehow amplify that training
并且以某种方式扩充了训练集
28
00:00:51,840 --> 00:00:52,680
set or use a small training
或者是把一小部分训练集
29
00:00:52,980 --> 00:00:54,390
set to turn that into
转换成了
30
00:00:54,660 --> 00:00:56,290
a larger training set and in
一个较大的数据集
31
00:00:56,450 --> 00:00:58,120
this video we'll go over both those ideas.
在这一个视频中,我们将仔细学习这两种思路。
32
00:01:00,350 --> 00:01:02,220
To talk about the artificial data
在讲到人工数据
33
00:01:02,440 --> 00:01:04,030
synthesis idea, let's use
合成思想时,让我们
34
00:01:04,330 --> 00:01:06,930
the character portion of
借用一下成像光学字符识别
35
00:01:07,090 --> 00:01:08,470
the photo OCR pipeline, we
管道中的字符部分,
36
00:01:08,690 --> 00:01:09,710
want to take an input image
我们想采用它的输入图像
37
00:01:10,060 --> 00:01:11,370
and recognize what character it is.
去识别出它是什么字符。
38
00:01:13,330 --> 00:01:14,690
If we go out and collect
假如我们出去收集到
39
00:01:14,880 --> 00:01:16,270
a large labeled data set,
一个大的标签数据集,
40
00:01:16,890 --> 00:01:17,980
here's what it is and what it looks like.
它就是这个样子。
41
00:01:18,580 --> 00:01:21,770
For this particular example, I've chosen a square aspect ratio.
在这一个特殊的例子中,我选择了一个正方形的长宽比。
42
00:01:22,130 --> 00:01:23,250
So we're taking square image patches.
所以我们采用正方形的图像块。
43
00:01:24,180 --> 00:01:25,110
And the goal is to take
我们的目标是获得
44
00:01:25,770 --> 00:01:27,420
an image patch and recognize the
一个图像块并且识别出
45
00:01:27,530 --> 00:01:29,270
character in the middle of that image patch.
图像块中央的字符。
46
00:01:31,090 --> 00:01:31,990
And for the sake of simplicity,
为了简单,
47
00:01:32,660 --> 00:01:33,740
I'm going to treat these images
我打算把这些图像当做
48
00:01:34,240 --> 00:01:36,380
as grey scale images, rather than color images.
灰度图像来处理,不是把它们当做彩色图像。
49
00:01:36,870 --> 00:01:38,550
It turns out that using color
事实证明把它们当做彩色图像来处理
50
00:01:38,930 --> 00:01:41,180
doesn't seem to help that much for this particular problem.
对于这个具体的问题而言看起来帮助不大。
51
00:01:42,190 --> 00:01:43,530
So given this image patch, we'd
对于这个给定的图像块,
52
00:01:43,660 --> 00:01:44,890
like to recognize that that's a
我们会把它识别为"T"。
53
00:01:45,010 --> 00:01:46,230
T. Given this image patch,
对于这个图像块,
54
00:01:46,550 --> 00:01:47,920
we'd like to recognize that it's an 'S'.
我们会把它识别为一个"S".
55
00:01:49,540 --> 00:01:50,740
Given that image patch we
对于这个图像块,我们
56
00:01:50,850 --> 00:01:52,950
would like to recognize that as an 'I' and so on.
会把它识别为一个“I”,等等。
57
00:01:54,110 --> 00:01:55,310
So all of these are
所有这些都是
58
00:01:57,380 --> 00:01:59,460
examples of raw images, how
原始图像的样例,那么
59
00:01:57,380 --> 00:01:59,460
can we come up with a much larger training set?
我们该如何得到一个更大的训练集呢?
60
00:02:00,000 --> 00:02:01,580
Modern computers often have a
现代计算机通常有一个
61
00:02:01,640 --> 00:02:03,700
huge font library and
庞大的字体库,
62
00:02:03,890 --> 00:02:05,330
if you use a word processing
并且假如你使用一个字处理
63
00:02:05,950 --> 00:02:07,090
software, depending on what word
软件,主要看你使用的
64
00:02:07,240 --> 00:02:08,580
processor you use, you might
是什么字处理软件,你可能
65
00:02:08,800 --> 00:02:09,980
have all of these fonts and
有所有这些字体,
66
00:02:10,120 --> 00:02:12,490
many, many more already stored inside.
并且还有更多的已经存储在里面了。
67
00:02:12,950 --> 00:02:14,350
And, in fact, if you go different websites, there
并且,事实上,假如你去不同的网站,
68
00:02:14,680 --> 00:02:16,280
are, again, huge, free font
网上还有其它的大的,
69
00:02:16,690 --> 00:02:18,200
libraries on the internet we
免费的字体库,从那里
70
00:02:18,370 --> 00:02:19,960
can download many, many different
我们能下载许多许多不同
71
00:02:20,250 --> 00:02:22,580
types of fonts, hundreds or perhaps thousands of different fonts.
类型的字体,几百甚至是几千种不同的字体。
72
00:02:23,960 --> 00:02:25,180
So if you want more
所以,假如你想
73
00:02:25,570 --> 00:02:27,020
training examples, one thing you
得到更多训练实例,一件
74
00:02:27,100 --> 00:02:28,340
can do is just take
你可以做的事情正是
75
00:02:28,870 --> 00:02:30,220
characters from different fonts
得到不同的字体的字符,
76
00:02:31,240 --> 00:02:32,870
and paste these characters against
并且把这些字符粘贴到
77
00:02:33,290 --> 00:02:35,890
different random backgrounds.
任意不同的背景下。
78
00:02:36,730 --> 00:02:39,500
So you might take this C and paste that C against a random background.
所以,你可以取这个字符C,并且把它粘贴到一个随机的背景上。
79
00:02:40,680 --> 00:02:41,640
If you do that you now have
假如你做了这些,那么你现在
80
00:02:42,060 --> 00:02:43,830
a training example of an
就有了一个关于
81
00:02:44,080 --> 00:02:45,260
image of the character C.
字符C的图像的训练样例。
82
00:02:46,360 --> 00:02:47,500
So after some amount of
所以在完成一定数量的
83
00:02:47,570 --> 00:02:48,920
work, you know this,
工作之后,你就会发现
84
00:02:48,980 --> 00:02:49,710
and it is a little bit of
要合成逼真的数据
85
00:02:49,830 --> 00:02:51,760
work to synthesize realistic looking data.
是需要花一些功夫的。
86
00:02:52,180 --> 00:02:53,080
But after some amount of work,
在完成一定量的工作之后,
87
00:02:53,700 --> 00:02:56,130
you can get a synthetic training set like that.
你可以得到像那样的合成的训练集。
88
00:02:57,180 --> 00:02:59,910
Every image shown on the right was actually a synthesized image.
在右侧显示的图像实际上是一个合成的图像。
89
00:03:00,360 --> 00:03:02,080
Where you take a font,
当你采用一个字体时,
90
00:03:02,810 --> 00:03:04,240
maybe a random font downloaded off
可能是一个从网上下载的字体,
91
00:03:04,400 --> 00:03:05,680
the web and you paste
你把基于这种字体的
92
00:03:06,160 --> 00:03:07,320
an image of one character
一个字符的图像或者
93
00:03:07,800 --> 00:03:08,870
or a few characters from that font
是几个字符的图像
94
00:03:09,570 --> 00:03:11,440
against this other random background image.
粘贴到另一个任意的背景图像下。
95
00:03:12,140 --> 00:03:12,840
And then apply maybe a little
可以应用一点
96
00:03:13,540 --> 00:03:15,160
blurring operators, affine
模糊算子,以及仿射
97
00:03:15,680 --> 00:03:17,380
distortions, affine
失真,所谓仿射,
98
00:03:17,620 --> 00:03:18,650
meaning just the shearing
也就是剪切
99
00:03:19,350 --> 00:03:20,740
and scaling and little rotation
缩放和轻微的
100
00:03:21,000 --> 00:03:22,260
operations and if you
旋转操作,假如你
101
00:03:22,370 --> 00:03:23,330
do that you get a synthetic
做了这些,你得到一个
102
00:03:23,580 --> 00:03:25,520
training set, like the one shown here.
合成的训练集,就像这里显示的这个。
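The synthesis pipeline just described (take a character from a font, paste it over a random background crop, then blur) might be sketched as follows. This is a minimal illustrative sketch, not the course's code: the function name and patch size are assumptions, it uses Pillow's built-in bitmap font where a real pipeline would draw from hundreds of downloaded TrueType fonts.

```python
# A minimal sketch of the synthesis step: render a character, paste it
# over a random crop of a background image, and blur slightly.
# Assumes Pillow is installed; names and defaults are illustrative.
import random
from PIL import Image, ImageDraw, ImageFilter, ImageFont

def synthesize_patch(char, background, size=32):
    """Return a grey-scale size x size patch with `char` drawn over a
    random crop of `background`."""
    # A random square crop of the background serves as the canvas.
    x = random.randint(0, background.width - size)
    y = random.randint(0, background.height - size)
    patch = background.crop((x, y, x + size, y + size)).convert("L")

    # Draw the character roughly centred. A real pipeline would pick a
    # random TrueType font here instead of Pillow's default bitmap font.
    draw = ImageDraw.Draw(patch)
    draw.text((size // 3, size // 4), char,
              fill=random.choice([0, 255]),
              font=ImageFont.load_default())

    # A mild blur helps the pasted character blend into the background.
    return patch.filter(ImageFilter.GaussianBlur(radius=0.5))
```

Repeated over many fonts and many background images, this yields an effectively unlimited supply of synthetic character patches.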
103
00:03:26,510 --> 00:03:28,050
And this is work,
这种工作,
104
00:03:28,530 --> 00:03:29,640
granted, it does take
诚然,它确实需要
105
00:03:29,970 --> 00:03:31,460
thought and work, in order to
为了使合成的数据更逼真,
106
00:03:31,700 --> 00:03:33,250
make the synthetic data look realistic,
花费心思和功夫,
107
00:03:34,020 --> 00:03:34,710
and if you do a sloppy
如果在生成合成数据的工作中
108
00:03:35,120 --> 00:03:36,200
job in terms of how
你没有认真去做,
109
00:03:36,250 --> 00:03:38,910
you create the synthetic data then it actually won't work well.
那么所生成的合成数据实际上是不能有效工作的。
110
00:03:39,620 --> 00:03:40,600
But if you look at
但是,如果你看看
111
00:03:40,790 --> 00:03:43,940
this synthetic data, it looks remarkably similar to the real data.
这些合成的数据,它们看起来与真实数据非常相似。
112
00:03:45,120 --> 00:03:46,850
And so by using synthetic data
那么使用合成的数据,
113
00:03:47,340 --> 00:03:48,550
you have essentially an unlimited
你基本上就能为
114
00:03:48,990 --> 00:03:50,970
supply of training examples for
你的人工训练合成提供
115
00:03:51,310 --> 00:03:53,060
artificial data synthesis. And
无限的训练数据样例。
116
00:03:53,150 --> 00:03:54,110
so, if you use this
因此,假如你使用
117
00:03:54,330 --> 00:03:55,820
sort of synthetic data, you have
这种合成数据,你基本上
118
00:03:56,150 --> 00:03:58,100
an essentially unlimited supply of
就可以利用这些无限的
119
00:03:58,560 --> 00:04:00,000
labeled data to create
标签数据为字符识别
120
00:04:00,140 --> 00:04:01,610
an improved learning algorithm
问题生成一个改进的
121
00:04:02,300 --> 00:04:03,990
for the character recognition problem.
学习算法。
122
00:04:05,120 --> 00:04:06,540
So this is an example of
所以这是一个
123
00:04:07,000 --> 00:04:08,500
artificial data synthesis where you're
人工数据合成的例子,
124
00:04:09,040 --> 00:04:10,870
basically creating new data from
你基本是从零开始
125
00:04:11,080 --> 00:04:13,780
scratch, you're just generating brand new images from scratch.
产生数据,也就是你从零开始产生新的图像。
126
00:04:14,880 --> 00:04:16,450
The other main approach to artificial data
另一个主要的人工
127
00:04:16,710 --> 00:04:18,210
synthesis is where you
数据合成方式是
128
00:04:18,370 --> 00:04:19,610
take examples that you
你使用一个当前
129
00:04:19,740 --> 00:04:20,780
currently have, where we take
已经有的样例,也就是说
130
00:04:21,020 --> 00:04:22,430
a real example, maybe from
我们有一个真实的样例,
131
00:04:22,700 --> 00:04:24,130
real image, and you create
可能是一个真实的图像,
132
00:04:24,770 --> 00:04:26,130
additional data, so as to
你产生附加的数据,
133
00:04:26,380 --> 00:04:27,900
amplify your training set.
以扩充你的训练集。
134
00:04:28,070 --> 00:04:28,810
So here is an image of the character A
这是一个字符A的图像,
135
00:04:28,910 --> 00:04:30,490
taken from a real image,
它来自一张真实的图像,
136
00:04:31,410 --> 00:04:32,550
not a synthesized image, and
不是一个合成的图像,
137
00:04:32,630 --> 00:04:33,790
I have overlayed this with
我在上面覆盖了网格线
138
00:04:33,880 --> 00:04:35,750
the grid lines just for the purpose of illustration.
只是为了说明问题。
139
00:04:36,430 --> 00:04:36,880
Actually have these ----.
实际上有这许多。
140
00:04:36,970 --> 00:04:39,030
So what you
所以你能做
141
00:04:39,100 --> 00:04:40,110
can do is then take this
的是把字母放在
142
00:04:40,480 --> 00:04:41,500
alphabet here, take this image
这里,向这图像中
143
00:04:42,240 --> 00:04:43,760
and introduce artificial warpings
引入一些人工的扭曲
144
00:04:44,290 --> 00:04:45,810
or artificial distortions into the
或者是一些
145
00:04:46,040 --> 00:04:47,030
image so they can
人工的失真,
146
00:04:47,220 --> 00:04:48,240
take the image A and turn
经过这些操作,可以把
147
00:04:48,430 --> 00:04:50,060
that into 16 new examples.
字母A变成这16个新的样例。
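The amplification step just described, turning one real patch into 16 warped copies, might be sketched like this. It is an illustrative sketch assuming Pillow; the function name and the rotation and shear ranges are assumptions, not values from the lecture.

```python
# A minimal sketch of amplifying one real example into many via small
# random warps: a rotation plus a horizontal shear, applied with Pillow.
import random
from PIL import Image

def amplify(example, n=16):
    """Return n randomly warped copies of a single image patch."""
    variants = []
    for _ in range(n):
        # Small random rotation; exposed corners are filled with grey.
        img = example.rotate(random.uniform(-10, 10), fillcolor=128)
        # Small random horizontal shear via an affine transform.
        shear = random.uniform(-0.2, 0.2)
        img = img.transform(img.size, Image.AFFINE,
                            (1, shear, 0, 0, 1, 0), fillcolor=128)
        variants.append(img)
    return variants
```

Run over each patch in a small labeled set, this multiplies the set's size by a factor of n, provided the chosen distortions are ones the classifier should be invariant to.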
148
00:04:51,110 --> 00:04:52,000
So in this way you can
所以采用这种办法,
149
00:04:52,450 --> 00:04:53,740
take a small label training set
你可以得到一个小的标签训练集
150
00:04:54,090 --> 00:04:55,360
and amplify your training set
并且你扩充你的训练集,
151
00:04:56,180 --> 00:04:57,190
to suddenly get a lot
突然得到
152
00:04:57,300 --> 00:05:00,020
more examples, all of it.
更多样例,所有的这些图像。
153
00:05:01,210 --> 00:05:02,360
Again, in order to do
此外,在这一应用中
154
00:05:02,560 --> 00:05:03,940
this for application, it does
所做的这些,
155
00:05:04,120 --> 00:05:05,060
take thought and it does
需要花费心思,
156
00:05:05,140 --> 00:05:06,270
take insight to figure out
需要洞察力去
157
00:05:06,430 --> 00:05:07,840
what are reasonable sets of
判断出合理的失真
158
00:05:08,420 --> 00:05:09,460
distortions, or whether these
操作集,或者是这些操作
159
00:05:09,720 --> 00:05:11,000
are ways that amplify and multiply
是否是扩充和增加
160
00:05:11,470 --> 00:05:12,760
your training set, and for
训练集的方法,
161
00:05:13,070 --> 00:05:15,130
the specific example of
对于字符识别这一
162
00:05:15,260 --> 00:05:17,170
character recognition, introducing these
具体的例子,引入这些
163
00:05:17,480 --> 00:05:18,310
warping seems like a natural
拉伸看起来是一个
164
00:05:18,780 --> 00:05:19,910
choice, but for a
很自然的选择,但是对于不同
165
00:05:20,090 --> 00:05:21,970
different machine learning application, there may
机器学习应用来说,可能
166
00:05:22,080 --> 00:05:24,180
be different distortions that might make more sense.
另外一些不同的失真将会更合理。
167
00:05:24,860 --> 00:05:25,600
Let me just show one example
让我给大家展示一个
168
00:05:26,180 --> 00:05:28,750
from the totally different domain of speech recognition.
完全不同的语音识别领域的问题。
169
00:05:30,230 --> 00:05:31,480
So the speech recognition, let's say
对于语音识别,假如说
170
00:05:31,580 --> 00:05:33,450
you have audio clips and you
你有音频片段,
171
00:05:33,600 --> 00:05:35,010
want to learn from the audio
你想从中
172
00:05:35,350 --> 00:05:37,240
clip to recognize what were
识别出哪些单词出现在了
173
00:05:37,460 --> 00:05:38,780
the words spoken in that clip.
语音片段中。
174
00:05:39,510 --> 00:05:41,340
So let's see, here's one labeled training example.
让我们来看一个加了标签的训练样例。
175
00:05:42,290 --> 00:05:43,190
So let's say you have one
那么,让我们说假定你已经有了一个
176
00:05:43,400 --> 00:05:45,000
labeled training example, of someone
加了标签的训练样例,就是
177
00:05:45,330 --> 00:05:46,660
saying a few specific words.
某个人在说一些特定的单词。
178
00:05:46,860 --> 00:05:48,720
So let's play that audio clip here.
因此,让我们播放一下这个语音片段。
179
00:05:49,150 --> 00:05:51,230
0 -1-2-3-4-5.
0,1,2,3,4,5.
180
00:05:51,570 --> 00:05:53,810
Alright, so someone
好吧,有人在
181
00:05:54,220 --> 00:05:55,110
counting from 0 to 5,
从0数到5.
182
00:05:55,450 --> 00:05:57,180
and so you want to
然后你想要应用
183
00:05:57,290 --> 00:05:58,460
try to apply a learning algorithm
一个学习算法去试图
184
00:05:59,380 --> 00:06:01,320
to try to recognize the words said in that.
识别出那个人说了哪些单词。
185
00:06:02,040 --> 00:06:04,030
So, how can we amplify the data set?
那么,我们该如何扩充数据集呢?
186
00:06:04,390 --> 00:06:05,340
Well, one thing we do is
嗯,我们可以做的一件事情就是
187
00:06:06,020 --> 00:06:09,180
introduce additional audio distortions into the data set.
引入附加的语音失真到数据集中。
188
00:06:09,970 --> 00:06:10,960
So here I'm going to
所以,我将加入
189
00:06:11,640 --> 00:06:14,700
add background sounds to simulate a bad cell phone connection.
一些背景声音去模拟一个较差的手机通话连接。
190
00:06:15,360 --> 00:06:16,800
When you hear beeping sounds, that's
当你听到蜂鸣声,实际上
191
00:06:16,980 --> 00:06:17,710
actually part of the audio
这是音频记录的一部分,
192
00:06:17,740 --> 00:06:20,350
track; there's nothing wrong with your speakers. I'm going to play this now.
这不是你的扬声器出了问题,现在我开始播放。
193
00:06:20,580 --> 00:06:21,379
0-1-2-3-4-5.
0,1,2,3,4,5.
194
00:06:21,380 --> 00:06:22,260
Right, so you can listen
好了,你可以听
195
00:06:22,640 --> 00:06:24,890
to that sort of audio clip and
那种音频片段, 并且
196
00:06:25,720 --> 00:06:28,600
recognize the sounds,
识别出声音,
197
00:06:28,960 --> 00:06:30,800
that seems like another useful training
这看起来像是另外一种
198
00:06:31,370 --> 00:06:33,230
example to have, here's another example, noisy background.
值得拥有的训练样例。这是另外一种例子,吵杂的背景。
199
00:06:34,890 --> 00:06:36,870
Zero, one, two, three
0,1,2,3
200
00:06:37,560 --> 00:06:39,060
four five you know
4,5,在背景中还有
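The audio amplification demonstrated above, mixing background noise into a clean clip, might be sketched like this with NumPy. The function name and the 10 dB default target are illustrative assumptions; both inputs are assumed to be 1-D float arrays at the same sample rate.

```python
# A minimal sketch of audio data amplification: mix background noise
# into a clean clip at a chosen signal-to-noise ratio (SNR).
import numpy as np

def add_background(clip, noise, snr_db=10.0):
    """Return `clip` with `noise` mixed in at roughly `snr_db` dB SNR."""
    # Tile or trim the noise to match the clip's length.
    reps = int(np.ceil(len(clip) / len(noise)))
    noise = np.tile(noise, reps)[:len(clip)]

    # Scale the noise power to hit the target signal-to-noise ratio.
    sig_pow = np.mean(clip ** 2)
    noise_pow = np.mean(noise ** 2)
    target_noise_pow = sig_pow / (10 ** (snr_db / 10))
    noise = noise * np.sqrt(target_noise_pow / (noise_pow + 1e-12))
    return clip + noise
```

Each distinct noise source (a crowd, traffic, a simulated bad phone connection) applied this way turns one labeled clip into several, which matches the lecture's point that the added distortions should be ones actually expected in the test data.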