class pca init
SmirkCao committed May 31, 2019
1 parent cdfca18 commit a7b09d7
Showing 3 changed files with 51 additions and 1 deletion.
5 changes: 4 additions & 1 deletion CH16/README.md
@@ -28,6 +28,7 @@
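- The book notes that in practical problems different variables may have different units, so computing the principal components directly can sometimes give unreasonable results. **To remove this effect**, each random variable is usually standardized so that it has mean 0 and variance 1.
- As for the properties of principal components, the discussion of the population principal components of standardized variables revolves mainly around eigenvalues and eigenvectors.
- For the distinction between population and sample, see the relevant discussion in Chapter 12 of Strang's book[^1].
- On the choice of $k$, a paper from 2000 proposes an automatic selection method[^2].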
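
The notes above on standardization, eigenvalues, and the choice of $k$ can be illustrated with a minimal sketch. It assumes a simple cumulative-variance threshold for picking $k$, which is a common heuristic and not the Bayesian method of [^2]; the helper names `standardize` and `choose_k` and the `threshold=0.85` default are hypothetical and do not appear in this repository.

```python
import numpy as np


def standardize(x):
    """Scale each column of x (n_samples, n_features) to mean 0 and variance 1."""
    mean = x.mean(axis=0)
    std = x.std(axis=0, ddof=1)  # sample standard deviation
    return (x - mean) / std


def choose_k(x, threshold=0.85):
    """Return the smallest k whose cumulative variance ratio reaches threshold.

    threshold is a hypothetical default chosen only for illustration.
    """
    z = standardize(x)
    # For standardized data the sample covariance matrix is the correlation matrix.
    r = z.T @ z / (z.shape[0] - 1)
    eigvals = np.linalg.eigvalsh(r)[::-1]  # eigenvalues in descending order
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # First index where the cumulative ratio is >= threshold, converted to a count.
    k = int(np.searchsorted(ratio, threshold) + 1)
    return k, eigvals


# Data from the unit test in this commit, transposed to (n_samples, n_features).
x = np.array([[2, 3, 3, 4, 5, 7],
              [2, 4, 5, 5, 6, 8]], dtype=float).T
k, eigvals = choose_k(x)
print(k, eigvals)
```

Because the variables are standardized, the eigenvalues sum to the number of features, so each eigenvalue divided by that count is the share of variance explained by the corresponding component.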

## Contents

@@ -152,4 +153,6 @@ $$

## References

[^1]: [Introduction to Linear Algebra](https://github.com/J-Mourad/Introduction-to-Linear-Algebra-5th-Edition---EE16A/raw/master/Ed%205%2C%20Gilbert%20Strang%20-%20Introduction%20to%20Linear%20Algebra%20(2016%2C%20Wellesley-Cambridge%20Press).pdf)
[^2]: [Automatic choice of dimensionality for PCA](https://papers.nips.cc/paper/1853-automatic-choice-of-dimensionality-for-pca.pdf)

27 changes: 27 additions & 0 deletions CH16/pca.py
@@ -0,0 +1,27 @@
#! /usr/bin/env python
#! -*- coding=utf-8 -*-
# Project: Lihang
# Filename: pca
# Date: 5/31/19
# Author: 😏 <smirk dot cao at gmail dot com>
# from svd import SVD # someday
import numpy as np


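class PCA(object):
    def __init__(self, n_components=2):
        # number of principal components to keep
        self.n_components_ = n_components
        # to be filled in by fit(); same attribute names as sklearn.decomposition.PCA
        self.explained_variance_ratio_ = None
        self.singular_values_ = None

    def __str__(self):
        rst = "PCA algorithms:\n"
        rst += "n_components: " + str(self.n_components_)
        return rst

    def fit(self, x):
        # check n_components and min(n_samples, n_features)
        pass

    def fit_transform(self, x):
        # placeholder: returns the input unchanged until fit() is implemented
        return x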
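
In this commit `fit` is still a stub. As a rough illustration of where it could go, the following is a minimal sketch that assumes an SVD of the centered data, which is one common way to compute PCA and is not confirmed to be the approach this repository later adopts. The class name `PCASketch` and the attributes `mean_` and `components_` are hypothetical; `explained_variance_ratio_` and `singular_values_` reuse the names from the class above.

```python
import numpy as np


class PCASketch(object):
    """Hypothetical SVD-based PCA; not the implementation from this repository."""

    def __init__(self, n_components=2):
        self.n_components_ = n_components
        self.mean_ = None
        self.components_ = None
        self.explained_variance_ratio_ = None
        self.singular_values_ = None

    def fit(self, x):
        x = np.asarray(x, dtype=float)  # expected shape: (n_samples, n_features)
        n_samples, n_features = x.shape
        # n_components cannot exceed min(n_samples, n_features)
        if not 1 <= self.n_components_ <= min(n_samples, n_features):
            raise ValueError(
                "n_components must be in [1, min(n_samples, n_features)]")
        self.mean_ = x.mean(axis=0)
        # SVD of the centered data: the rows of vt are the principal directions
        _, s, vt = np.linalg.svd(x - self.mean_, full_matrices=False)
        variance = s ** 2 / (n_samples - 1)
        k = self.n_components_
        self.components_ = vt[:k]
        self.singular_values_ = s[:k]
        self.explained_variance_ratio_ = variance[:k] / variance.sum()
        return self

    def transform(self, x):
        # project centered data onto the retained principal directions
        return (np.asarray(x, dtype=float) - self.mean_) @ self.components_.T

    def fit_transform(self, x):
        return self.fit(x).transform(x)


# Same data as the unit test in this commit, as (n_samples, n_features).
x = np.array([[2, 3, 3, 4, 5, 7],
              [2, 4, 5, 5, 6, 8]], dtype=float).T
pca = PCASketch(n_components=2).fit(x)
print(pca.explained_variance_ratio_)
print(pca.singular_values_)
```

Working from the SVD avoids forming the covariance matrix explicitly, and the squared singular values divided by `n_samples - 1` give the variance along each principal direction.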
20 changes: 20 additions & 0 deletions CH16/unit_test.py
@@ -9,6 +9,8 @@
# Ref to : https://code.visualstudio.com/docs/python/unit-testing
import unittest
from sklearn import datasets
from pca import PCA as smirkpca
from sklearn.decomposition import PCA as skpca
import matplotlib.pyplot as plt
import numpy as np

@@ -108,3 +110,21 @@ def test_ex1601(self):
        # plt.scatter(rst[:, 0], rst[:, 1])
        # plt.show()

    def test_pca(self):
        # raw data
        x = np.array([[2, 3, 3, 4, 5, 7],
                      [2, 4, 5, 5, 6, 8]])
        # for sklearn x.shape == (n_samples, n_features)
        pca_sklearn = skpca(n_components=2)
        pca_sklearn.fit(x.T)
        print("\n")
        print(40*"*"+"sklearn_pca"+40*"*")
        print(pca_sklearn.explained_variance_ratio_)
        print(pca_sklearn.singular_values_)

        print(40*"*"+"smirk_pca"+40*"*")
        pca_test = smirkpca(n_components=2)
        print(pca_test)

    def test_pca_get_fig(self):
        pass
