Commit

Standardize variables
fengdu78 committed Mar 27, 2018
1 parent ff009ef commit 69c5a6b
Showing 23 changed files with 55 additions and 43 deletions.
2 changes: 1 addition & 1 deletion code/ex4-NN back propagation/1- NN back propagation.ipynb
Original file line number Diff line number Diff line change
@@ -1847,7 +1847,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion code/ex4-NN back propagation/ML-Exercise4.ipynb
@@ -781,7 +781,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion code/ex7-kmeans and PCA/2- 2D kmeans.ipynb
@@ -745,7 +745,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
Original file line number Diff line number Diff line change
@@ -410,7 +410,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
10 changes: 7 additions & 3 deletions code/ex7-kmeans and PCA/4- 2D PCA.ipynb
@@ -10,7 +10,9 @@
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
@@ -283,7 +285,9 @@
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"U, S, V = pca(X_norm)"
@@ -497,7 +501,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
6 changes: 4 additions & 2 deletions code/ex7-kmeans and PCA/5- PCA on face data.ipynb
@@ -206,7 +206,9 @@
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"U, _, _ = pca(X)"
@@ -461,7 +463,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
6 changes: 4 additions & 2 deletions code/ex7-kmeans and PCA/ML-Exercise7.ipynb
@@ -1225,7 +1225,9 @@
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"U, S, V = pca(X)\n",
@@ -1297,7 +1299,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
Original file line number Diff line number Diff line change
@@ -586,7 +586,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
Original file line number Diff line number Diff line change
@@ -247,7 +247,9 @@
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"n_movie, n_user = Y.shape\n",
@@ -557,7 +559,9 @@
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"res = opt.minimize(fun=regularized_cost,\n",
@@ -737,7 +741,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
Original file line number Diff line number Diff line change
@@ -1218,7 +1218,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
"version": "3.6.2"
}
},
"nbformat": 4,
Binary file added docx/week1.docx
Binary file not shown.
Binary file added docx/week10.docx
Binary file not shown.
Binary file added docx/week2.docx
Binary file not shown.
Binary file added docx/week3.docx
Binary file not shown.
Binary file added docx/week4.docx
Binary file not shown.
Binary file added docx/week5.docx
Binary file not shown.
Binary file added docx/week6.docx
Binary file not shown.
Binary file added docx/week7.docx
Binary file not shown.
Binary file added docx/week8.docx
Binary file not shown.
Binary file added docx/week9.docx
Binary file not shown.
36 changes: 18 additions & 18 deletions html/week9.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions markdown/week1.md
@@ -86,7 +86,7 @@

Because these several discrete outputs correspond to benign, or to the first, second, or third class of cancer, in a classification problem we can plot these data points in another way.

Now I will use different symbols to represent these data. Since we treat tumor size as the feature that distinguishes malignant from benign, I can draw it like this: I will use different symbols to represent benign and malignant tumors, or in other words negative and positive examples. Instead of drawing everything as an X, benign tumors are now drawn as an O, while malignant ones are still drawn as an X, in order to predict whether a tumor is malignant.
Now I will use different symbols to represent these data. Since we treat tumor size as the feature that distinguishes malignant from benign, I can draw it like this: I will use different symbols to represent benign and malignant tumors, or in other words negative and positive examples. Instead of drawing everything as an **X**, benign tumors are now drawn as an **O**, while malignant ones are still drawn as an **X**, in order to predict whether a tumor is malignant.

In some other machine learning problems, we may encounter more than one feature. For example, we know not only the tumor size but also the corresponding patient's age. In other machine learning problems we often have more features; when my friend studied this problem, she typically used features such as clump thickness, uniformity of tumor cell size, uniformity of cell shape, and some other features as well. This is one of the most interesting learning algorithms we are about to learn.

@@ -150,7 +150,7 @@

![](../images/743c1d46d4288f8884f0981d437a15c1.png)

Look at this unsupervised learning algorithm: implementing it must be really complicated, right? It seems that, to build this application and do this audio processing, you would need to write a great deal of code, or link to a pile of synthesizer JAVA libraries and audio-processing libraries; it looks like it would be an absolutely complex program to separate one audio source out of the audio. In fact, the algorithm for the problem you just saw can be implemented with a single line of code.
Look at this unsupervised learning algorithm: implementing it must be really complicated, right? It seems that, to build this application and do this audio processing, you would need to write a great deal of code, or link to a pile of synthesizer **JAVA** libraries and audio-processing libraries; it looks like it would be an absolutely complex program to separate one audio source out of the audio. In fact, the algorithm for the problem you just saw can be implemented with a single line of code.

This is the code shown here: `[W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x');`
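As an aside for readers working in Python rather than Octave/MATLAB, the one-liner above can be reproduced with NumPy. This is a sketch with made-up random data, not the course's code; note that `np.linalg.svd` returns the third factor already transposed:

```python
import numpy as np

# Hypothetical mixed-signal matrix: rows are microphones, columns are samples.
x = np.random.randn(2, 1000)

# Same computation as the MATLAB one-liner: scale each column of x by its
# squared norm, multiply by x transpose, then take the SVD of the result.
M = (np.sum(x * x, axis=0) * x) @ x.T

# In NumPy, the third return value is V transposed (MATLAB returns V itself).
W, s, v = np.linalg.svd(M)
```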

14 changes: 7 additions & 7 deletions markdown/week9.md
@@ -12,7 +12,7 @@

What is anomaly detection? To explain the concept, let me give an example:

Imagine you are an aircraft engine manufacturer. As the engines you produce roll off the production line, you need to perform QA (quality assurance testing), and as part of this test you measure some feature variables of the engine, such as the heat generated when the engine runs, or the engine's vibration, and so on.
Imagine you are an aircraft engine manufacturer. As the engines you produce roll off the production line, you need to perform **QA** (quality assurance testing), and as part of this test you measure some feature variables of the engine, such as the heat generated when the engine runs, or the engine's vibration, and so on.

![](../images/93d6dfe7e5cb8a46923c178171889747.png)

@@ -33,8 +33,8 @@
$$
if \quad p(x)
\begin{cases}
\leq \varepsilon & anomaly \\\
\> \varepsilon & normal
< \varepsilon & anomaly \\
\geq \varepsilon & normal
\end{cases}
$$
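The decision rule above can be sketched in Python. This is a minimal illustration, not the course's code: the toy engine data, the Gaussian fitting helper, and the threshold ε = 10⁻³ are all invented for the example:

```python
import numpy as np

def estimate_gaussian(X):
    """Per-feature mean and variance of the (mostly normal) training set."""
    return X.mean(axis=0), X.var(axis=0)

def p(X, mu, sigma2):
    """Product of independent univariate Gaussian densities, one per feature."""
    d = np.exp(-(X - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
    return d.prod(axis=1)

# Toy data: normal engines cluster near (1.0, 2.0); the second test point is far away.
X_train = np.array([[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.05, 2.05], [0.95, 1.95]])
X_test = np.array([[1.0, 2.0], [5.0, 9.0]])

mu, sigma2 = estimate_gaussian(X_train)
epsilon = 1e-3
is_anomaly = p(X_test, mu, sigma2) < epsilon   # flag x as anomalous when p(x) < epsilon
```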

@@ -149,13 +149,13 @@ $p(x)=\prod\limits_{j=1}^np(x_j;\mu_j,\sigma_j^2)=\prod\limits_{j=1}^n\frac{1}{\sqrt{2\pi}\sigma_j}e^{-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}}$

For an anomaly detection algorithm, the features we use are crucial. Below we discuss how to choose features:

Anomaly detection assumes that the features follow a Gaussian distribution. If the data are not Gaussian, the algorithm can still work, but it is better to transform the data toward a Gaussian distribution first, for example using a log transform: $x= log(x+c)$, where $c$ is a non-negative constant; or $x=x^c$, where $c$ is a fraction between 0 and 1, among other methods.
Anomaly detection assumes that the features follow a Gaussian distribution. If the data are not Gaussian, the algorithm can still work, but it is better to transform the data toward a Gaussian distribution first, for example using a log transform: $x= log(x+c)$, where $c$ is a non-negative constant; or $x=x^c$, where $c$ is a fraction between 0 and 1, among other methods. (Editor's note: in **python**, the function `np.log1p()` is commonly used; $log1p$ is $log(x+1)$, which avoids negative results, and its inverse is `np.expm1()`.)
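A quick illustration of the editor's note (a sketch; the sample feature values are made up):

```python
import numpy as np

x = np.array([0.0, 0.5, 3.0, 10.0])  # a skewed, non-negative feature

# log1p(x) = log(x + 1): well-defined and non-negative at x = 0,
# where a plain log(x) would be -inf.
x_log = np.log1p(x)

# expm1 inverts the transform: expm1(log1p(x)) recovers x.
x_back = np.expm1(x_log)
```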

![](../images/0990d6b7a5ab3c0036f42083fe2718c6.jpg)

Error analysis:

A common problem is that some anomalous examples may also have a fairly high value of p(x) and are therefore considered normal by the algorithm. In this case, error analysis can help us: we can examine the examples that the algorithm wrongly predicted as normal and see whether we can identify any issues. From these issues we may find that we need to add some new features, and the new algorithm obtained with these added features can help us perform anomaly detection better.
A common problem is that some anomalous examples may also have a fairly high value of $p(x)$ and are therefore considered normal by the algorithm. In this case, error analysis can help us: we can examine the examples that the algorithm wrongly predicted as normal and see whether we can identify any issues. From these issues we may find that we need to add some new features, and the new algorithm obtained with these added features can help us perform anomaly detection better.

Anomaly detection error analysis:

@@ -327,7 +327,7 @@ $x^{(i)}$: the feature vector of movie $i$

For user $j$, the cost of this linear regression model is the sum of squared prediction errors plus a regularization term:
$$
\min_{\theta (j)}\frac{1}{2}\sum_{i:r(i,j)=1}\left((\theta^{(j)})^Tx^{i}-y^{(i,j)}\right)^2+\frac{\lambda}{2}\left(\theta_{k}^{(j)}\right)^2
\min_{\theta (j)}\frac{1}{2}\sum_{i:r(i,j)=1}\left((\theta^{(j)})^Tx^{(i)}-y^{(i,j)}\right)^2+\frac{\lambda}{2}\left(\theta_{k}^{(j)}\right)^2
$$
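As a small numeric check of this cost (a sketch with invented data: the movie features, ratings, parameter vector, and λ are all made up, and the regularization here is summed over every component $k$ of $\theta^{(j)}$):

```python
import numpy as np

# Hypothetical data for one user j: movie features x^(i) and ratings y^(i, j).
X = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])  # one row per rated movie i
y = np.array([4.5, 1.0, 3.0])                       # this user's ratings
theta_j = np.array([5.0, 0.0])                      # user j's parameter vector
lam = 1.0

# Half the squared prediction error over rated movies, plus the regularization term.
cost = 0.5 * np.sum((X @ theta_j - y) ** 2) + lam / 2 * np.sum(theta_j ** 2)
```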


@@ -381,7 +381,7 @@
Note: in the collaborative filtering algorithm, we usually do not use the bias term; if it is needed, the algorithm will learn it automatically.
The collaborative filtering algorithm is used as follows:

1. Initialize $x^{(1)},x^{(1)},...x^{(nm)},\ \theta^{(1)},\theta^{(2)},...,\theta^{(nu)}$ to small random values
1. Initialize $x^{(1)},x^{(2)},...,x^{(n_m)},\ \theta^{(1)},\theta^{(2)},...,\theta^{(n_u)}$ to small random values

2. Use gradient descent to minimize the cost function

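The initialization and gradient-descent steps above can be sketched in Python (a minimal illustration: the dimensions, ratings, λ, and learning rate are all made up, and the vectorized cost and gradients follow the standard collaborative-filtering formulas rather than the course's own code):

```python
import numpy as np

def cofi_cost(X, Theta, Y, R, lam):
    """Regularized collaborative-filtering cost over all users and movies."""
    err = (X @ Theta.T - Y) * R          # errors only where R[i, j] == 1
    return 0.5 * np.sum(err ** 2) + lam / 2 * (np.sum(X ** 2) + np.sum(Theta ** 2))

def gradient_step(X, Theta, Y, R, lam, alpha):
    """One simultaneous gradient-descent update on movie and user parameters."""
    err = (X @ Theta.T - Y) * R
    X_new = X - alpha * (err @ Theta + lam * X)
    Theta_new = Theta - alpha * (err.T @ X + lam * Theta)
    return X_new, Theta_new

rng = np.random.default_rng(0)
n_m, n_u, n = 5, 4, 3                                # movies, users, features
Y = rng.integers(1, 6, size=(n_m, n_u)).astype(float)
R = np.ones((n_m, n_u))                              # pretend every rating is observed
X = 0.1 * rng.standard_normal((n_m, n))              # step 1: small random values
Theta = 0.1 * rng.standard_normal((n_u, n))

cost_before = cofi_cost(X, Theta, Y, R, lam=0.1)
for _ in range(50):                                  # step 2: gradient descent
    X, Theta = gradient_step(X, Theta, Y, R, lam=0.1, alpha=0.01)
cost_after = cofi_cost(X, Theta, Y, R, lam=0.1)
```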
