srt/2 - 1 - Model Representation (8 min).srt

﻿1
00:00:00,338 --> 00:00:04,677
Our first learning algorithm will be linear regression. In this video, you'll see
我们的第一个学习算法将 线性回归。在这段视频中，你会看到
(字幕整理：中国海洋大学 黄海广,haiguang2000@qq.com )

2
00:00:04,678 --> 00:00:09,234
what the model looks like and more importantly you'll see what the overall
什么样的模型看起来像多 重要的是，你会看到什么整体

3
00:00:09,234 --> 00:00:14,801
process of supervised learning looks like. Let's use some motivating example of predicting
监督学习过程中的模样。让我们 使用一些激励的例子预测

4
00:00:14,801 --> 00:00:20,036
housing prices. We're going to use a data set of housing prices from the city of
住房价格上涨。我们将使用数据 从城市的住房价格设置

5
00:00:20,036 --> 00:00:25,205
Portland, Oregon. And here I'm gonna plot my data set of a number of houses
俄勒冈州波特兰市。在这里，我要去 绘制数据集的一些房屋

6
00:00:25,205 --> 00:00:30,833
that were different sizes that were sold for a range of different prices. Let's say
不同尺寸已售出 对于不同的价格范围内。比方说，

7
00:00:30,833 --> 00:00:35,872
that given this data set, you have a friend that's trying to sell a house and
这组数据，你有一个 朋友，试图把房子卖了，

8
00:00:35,872 --> 00:00:41,238
let's see if friend's house is size of 1250 square feet and you want to tell them
让我们来看看，如果朋友的房子是大小 1250平方英尺，你要告诉他们

9
00:00:41,238 --> 00:00:46,459
how much they might be able to sell the house for. Well one thing you could do is
他们也许能卖多少 房子。好了，你可以做的一件事是

10
00:00:46,648 --> 00:00:53,039
fit a model. Maybe fit a straight line to this data. Looks something like that and based
拟合模型。也许适合直线 此数据。看起来是这样的，根据

11
00:00:53,039 --> 00:00:59,168
on that, maybe you could tell your friend that let's say maybe he can sell the
，也许你可以告诉你的朋友 比方说，也许他可以卖

12
00:00:59,168 --> 00:01:03,575
house for around $220,000. So this is an example of a
房子周围220,000元。 因此，这是一个例子的

13
00:01:03,575 --> 00:01:08,834
supervised learning algorithm. And it's supervised learning because we're given
监督的学习算法。和它的 因为我们的监督学习

14
00:01:08,834 --> 00:01:14,287
the, quotes, "right answer" for each of our examples. Namely we're told what was
，报价，“正确的答案”为每个 我们的例子。即告诉我们是什么

15
00:01:14,287 --> 00:01:19,351
the actual house, what was the actual price of each of the houses in our data
实际的房子，什么是实际 每个房子的价格在我们的数据

16
00:01:19,351 --> 00:01:24,441
set were sold for and moreover, this is an example of a regression problem where
集已售出，而且，这是 的一个例子的回归问题中

17
00:01:24,441 --> 00:01:29,545
the term regression refers to the fact that we are predicting a real-valued output
回归一词是指这样的事实 我们预测一个真正的值输出

18
00:01:29,545 --> 00:01:34,586
namely the price. And just to remind you the other most common type of supervised
即价格。只是提醒你 其他监督的最常见的类型

19
00:01:34,586 --> 00:01:39,006
learning problem is called the classification problem where we predict
学习问题被称为 分类问题，我们预测

20
00:01:39,006 --> 00:01:45,202
discrete-valued outputs such as if we are looking at cancer tumors and trying to
比如，如果我们的离散值输出 看癌症肿瘤，并试图

21
00:01:45,202 --> 00:01:52,032
decide if a tumor is malignant or benign. So that's a zero-one valued discrete output. More
决定如果肿瘤是良性或恶性。 所以这是一个零一值离散输出。更多

22
00:01:52,032 --> 00:01:57,087
formally, in supervised learning, we have a data set and this data set is called a
正式监督学习，我们有 数据集并在这样的数据组被称为一个

23
00:01:57,087 --> 00:02:02,018
training set. So for housing prices example, we have a training set of
训练集。因此，住房价格 例如，我们有一个训练集

24
00:02:02,018 --> 00:02:07,386
different housing prices and our job is to learn from this data how to predict prices
不同的房价和我们的工作是 从这个数据中学习如何预测价格

25
00:02:07,386 --> 00:02:11,907
of the houses. Let's define some notation that we're using throughout this course.
的房子。让我们来定义一些符号 我们正在使用的整个过程。

26
00:02:11,907 --> 00:02:16,100
We're going to define quite a lot of symbols. It's okay if you don't remember
我们要定义颇多 符号。没关系，如果你不记得

27
00:02:16,100 --> 00:02:20,075
all the symbols right now but as the course progresses it will be useful
所有的符号，但现在作为 课程的进展，将是有益的

28
00:02:20,075 --> 00:02:24,267
[inaudible] convenient notation. So I'm gonna use lower case m throughout this course to
[听不清]方便的符号。所以，我会使用 整个本课程小写米

29
00:02:24,267 --> 00:02:28,897
denote the number of training examples. So in this data set, if I have, you know,
培训的例子的数字表示。所以 在这组数据中，如果我有，你知道，

30
00:02:28,897 --> 00:02:34,366
let's say 47 rows in this table. Then I have 47 training examples and m equals 47.
让我们说，在此表中的47列。然后我 有47个训练实例和m等于47。

31
00:02:34,366 --> 00:02:39,497
Let me use lowercase x  to denote the input variables often also called the
让我用小写字母x表示 输入变量经常也被称为

32
00:02:39,497 --> 00:02:44,290
features. That would be the x is here, it would the input features. And I'm gonna
功能。这将是x是在这里，它会输入功能。我要去

33
00:02:44,290 --> 00:02:49,556
use y to denote my output variables or the target variable which I'm going to
用y来表示我的输出变量或 目标变量，我要去

34
00:02:49,556 --> 00:02:54,552
predict and so that's the second column here. [inaudible] notation, I'm
预测，所以这是第二 列在这里。 [听不清]符号，我

35
00:02:54,552 --> 00:03:05,749
going to use (x, y) to denote a single training example. So, a single row in this
要使用（X，Y）来表示一个单 培训的例子。所以，在此单排

36
00:03:05,749 --> 00:03:12,068
table corresponds to a single training example and to refer to a specific
表对应一个单一的培训 的例子，并参照特定的

37
00:03:12,068 --> 00:03:19,708
training example, I'm going to use this notation x(i) comma gives me y(i) And, we're
培训例子中，我将使用这个 符号X（I）逗号给了我Y（I），我们

38
00:03:25,322 --> 00:03:30,935
going to use this to refer to the ith training example. So this superscript i
要利用这个参阅第i个 培训的例子。所以这个下标i

39
00:03:30,935 --> 00:03:37,864
over here, this is not exponentiation right? This (x(i), y(i)), the superscript i in
在这里，这是不求幂 对不对？ （X（I），Y（I）），上标i

40
00:03:37,864 --> 00:03:44,873
parentheses that's just an index into my training set and refers to the ith row in
那只是一个索引我的括号 培训是指第i行

41
00:03:44,873 --> 00:03:51,629
this table, okay? So this is not x to the power of i, y to the power of i. Instead
此表，好吗？所以这不是X到 电源I，Y i的功率。代替

42
00:03:51,629 --> 00:03:58,216
(x(i), y(i)) just refers to the ith row of this table. So for example, x(1) refers to the
（×（i）中，Y（I）），在此指的是第i行 表中。因此，例如，x（1）指的是

43
00:03:58,216 --> 00:04:04,972
input value for the first training example so that's 2104. That's this x in the first
输入值的第一次训练的例子，所以 那是2104。这是这个x在第一

44
00:04:04,972 --> 00:04:11,685
row. x(2) will be equal to 1416 right? That's the second x
一行。 ×（2）将等于 1416吧？这是第二个X

45
00:04:11,685 --> 00:04:17,385
and y(1) will be equal to 460. The first, the y value for my first
和y（1）将等于460。 第一，我的第一个y值

46
00:04:17,385 --> 00:04:24,526
training example, that's what that (1) refers to. So as mentioned, occasionally I'll ask you a
培训的例子，这就是：（1） 指。所以提到的，偶尔我会问你一个

47
00:04:24,526 --> 00:04:28,345
question to let you check your understanding and a few seconds in this
质疑让你检查你的 理解，在这几秒钟

48
00:04:28,345 --> 00:04:34,044
video a multiple-choice question will pop up in the video. When it does,
视频选择题 会弹出视频。当它，

49
00:04:34,044 --> 00:04:40,362
please use your mouse to select what you think is the right answer. What defined by
请使用鼠标来选择你 我认为是正确的答案。定义的

50
00:04:40,362 --> 00:04:45,124
the training set is. So here's how this supervised learning algorithm works.
训练集。因此，这里是如何 监督学习算法的工作原理。

51
00:04:45,124 --> 00:04:50,513
We saw that with the training set like our training set of housing prices and we feed
我们看到，像我们的训练集 训练集的住房价格，我们养活

52
00:04:50,513 --> 00:04:55,715
that to our learning algorithm. Is the job of a learning algorithm to then output a
就我们的学习算法。是对工作 学习算法，然后输出

53
00:04:55,715 --> 00:05:00,101
function which by convention is usually denoted lowercase h and h
按照约定的功能 通常表示小写h和h

54
00:05:00,101 --> 00:05:06,574
stands for hypothesis And what the job of the hypothesis is, is, is a function that
看台假说和什么样的工作， 的假设是，是，是一个函数，

55
00:05:06,574 --> 00:05:12,471
takes as input the size of a house like maybe the size of the new house your friend's
作为输入那样的房子的大小 也许你的朋友的新房子的大小

56
00:05:12,471 --> 00:05:18,368
trying to sell so it takes in the value of x and it tries to output the estimated
挂羊头卖狗肉，所以它需要的价值 x和它试图输出估计

57
00:05:18,368 --> 00:05:31,630
value of y for the corresponding house. So h is a function that maps from x's
y值对应的房子。 因此，h是从x的一个??函数，映射

58
00:05:31,630 --> 00:05:37,729
to y's. People often ask me, you know, why is this function called
y的。人们经常问我，你 知道了，这是为什么函数调用

59
00:05:37,729 --> 00:05:42,121
hypothesis. Some of you may know the meaning of the term hypothesis, from the
假说。你们有些人可能知道 这意味着，长期假设，从

60
00:05:42,121 --> 00:05:46,744
dictionary or from science or whatever. It turns out that in machine learning, this
字典或从科学或什么的。它 原来，在机器学习，这

61
00:05:46,744 --> 00:05:51,310
is a name that was used in the early days of machine learning and it kinda stuck. 'Cause
在初期使用的名称 机器学习和它有点卡住。因为

62
00:05:51,310 --> 00:05:55,239
maybe not a great name for this sort of function, for mapping from sizes of
也许不是一个伟大的名字为这种 功能，从尺寸的映射

63
00:05:55,239 --> 00:05:59,978
houses to the predictions, that you know.... I think the term hypothesis, maybe isn't
房子的预言，你知道.... 我认为长期假设，也许是不

64
00:05:59,978 --> 00:06:04,543
the best possible name for this, but this is the standard terminology that people use in
此最好的可能的名称，但是这是 标准术语的人使用

65
00:06:04,543 --> 00:06:09,362
machine learning. So don't worry too much about why people call it that. When
机器学习。所以不要太担心 人们为什么称呼它。何时

66
00:06:09,362 --> 00:06:14,332
designing a learning algorithm, the next thing we need to decide is how do we
设计一个学习算法，下一个 需要决定的事情，我们是怎么做的，我们

67
00:06:14,332 --> 00:06:20,540
represent this hypothesis h. For this and the next few videos, I'm going to choose
代表这假设h。为了这个以及 接下来的几个视频，我要选择

68
00:06:20,540 --> 00:06:26,978
our initial choice , for representing the hypothesis, will be the following. We're going to
我们最初的选择，代表 假设，会出现下面的。我们要

69
00:06:26,978 --> 00:06:33,009
represent h as follows. And we will write this as h<u>theta(x) equals theta<u>0</u></u>
表示h为如下。我们将这样写： ?<U>西塔（X）等于THETA <U> 0 </ U> </ U>

70
00:06:33,009 --> 00:06:39,254
plus theta<u>1 of x. And as a shorthand, sometimes instead of writing, you</u>
加θ<U>来的x 1。而作为一个缩写， 有时，而不是写作，你</ U>

71
00:06:39,254 --> 00:06:45,441
know, h subscript theta of x, sometimes there's a shorthand, I'll just write as a h of x.
知道标西塔，H，X，有时 有一个速记，我就写一个h的x。

72
00:06:45,441 --> 00:06:51,627
But more often I'll write it as a subscript theta over there. And plotting
但更多的时候我会写它作为一个 标西塔那边。和绘图

73
00:06:51,627 --> 00:06:58,210
this in the pictures, all this means is that, we are going to predict that y is a linear
这在图片中，所有这一切意味着， 我们将要预测，y是一个线性

74
00:06:58,210 --> 00:07:04,634
function of x. Right, so that's the data set and what this function is doing,
x的函数。没错，所以这是 数据集，这个功能是做什么，

75
00:07:04,634 --> 00:07:11,698
is predicting that y is some straight line function of x. That's h of x equals theta 0
预测y是一些直 直线x的函数。这是x的?等于THETA 0

76
00:07:11,698 --> 00:07:18,450
plus theta 1 x, okay? And why a linear function? Well, sometimes we'll want to
加THETA 1个，好吗？为什么线性 功能？嗯，有时候我们会想

77
00:07:18,450 --> 00:07:23,405
fit more complicated, perhaps non-linear functions as well. But since this linear
适应更加复杂，或许非线性 功能。但是，由于这种线性

78
00:07:23,405 --> 00:07:28,298
case is the simple building block, we will start with this example first of fitting
案件是简单的积木，我们将 从这个例子，首先拟合

79
00:07:28,298 --> 00:07:32,943
linear functions, and we will build on this to eventually have more complex
线性函数，我们将建立 这最终有更复杂的

80
00:07:32,943 --> 00:07:37,403
models, and more complex learning algorithms. Let me also give this
模型，以及更复杂的学习 算法。让我也给这个

81
00:07:37,403 --> 00:07:42,628
particular model a name. This model is called linear regression or this, for
特定型号的名称。这种模式是 称为线性回归，

82
00:07:42,628 --> 00:07:48,271
example, is actually linear regression with one variable, with the variable being
例如，实际上是线性回归 一个变量，该变量是

83
00:07:48,271 --> 00:07:53,914
x. Predicting all the prices as functions of one variable X. And another name for
x的所有的价格预测功能 一个变量X的另一个名字

84
00:07:53,914 --> 00:07:58,852
this model is univariate linear regression. And univariate is just a
这种模式是单变量线性 回归。单因素仅仅是一个

85
00:07:58,852 --> 00:08:04,400
fancy way of saying one variable. So, that's linear regression. In the next
奇特的方式说一个变量。因此， 这就是线性回归。在接下来的

86
00:08:04,400 --> 00:08:09,760
video we'll start to talk about just how we go about implementing this model.
我们将开始谈论多么视频 我们去实施这一模式。