Chapter 10 Unsupervised learning

Principal Components

* Quiz
	You are analyzing a dataset where each observation is an age, height, length, and width of a particular turtle. You want to know if the data can be well described by fewer than four dimensions (maybe for plotting), so you decide to do Principal Component Analysis. Which of the following is most likely to be the loadings of the first Principal Component?

	answer: 
			[]	(1, 1, 1, 1)
			[x] (.5, .5, .5, .5) correct
			[]	(.71, -.71, 0, 0)
			[]	(1, -1, -1, -1)
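
	Why (.5, .5, .5, .5): loadings are constrained to form a unit vector (their squared values sum to 1), which rules out (1, 1, 1, 1) and (1, -1, -1, -1), and all four turtle measurements tend to grow together, so the first PC loads on all of them with roughly equal weights of the same sign. A minimal simulation sketch (made-up data, not from the course) illustrating this:

		# simulated turtles: all four measurements driven by a common "size" factor
		set.seed(1)
		n    <- 200
		size <- rnorm(n)
		X <- cbind(age    = size + 0.3 * rnorm(n),
		           height = size + 0.3 * rnorm(n),
		           length = size + 0.3 * rnorm(n),
		           width  = size + 0.3 * rnorm(n))
		pr <- prcomp(X, scale = TRUE)
		pr$rotation[, 1]         # all four loadings close to +/-0.5, same sign
		sum(pr$rotation[, 1]^2)  # loadings form a unit vector, so (1, 1, 1, 1) is impossible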

Higher Order Principal Components

* Quiz
	Suppose we have a data set where each data point represents a single student's scores on a math test, a physics test, a reading comprehension test, and a vocabulary test.

	We find the first two principal components, which capture 90% of the variability in the data, and interpret their loadings. We conclude that the first principal component represents overall academic ability, and the second represents a contrast between quantitative ability and verbal ability.

	What loadings would be consistent with that interpretation? Choose all that apply.


	[]  (0.5, 0.5, 0.5, 0.5) and (0.71, 0.71, 0, 0)
	[]  (0.5, 0.5, 0.5, 0.5) and (0, 0, -0.71, -0.71)
	[x] (0.5, 0.5, 0.5, 0.5) and (0.5, 0.5, -0.5, -0.5)
	[x] (0.5, 0.5, 0.5, 0.5) and (-0.5, -0.5, 0.5, 0.5)
	[]  (0.71, 0.71, 0, 0) and (0, 0, 0.71, 0.71)
	[]  (0.71, 0, -0.71, 0) and (0, 0.71, 0, -0.71)

K-Means Clustering

1/ True or False: If we use k-means clustering, we will get the same cluster assignments for each point whether or not we standardize the variables.

	[] True
	[x] False correct
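
	Standardizing the variables changes the distances that k-means minimizes, so the assignments can change. A minimal sketch (simulated data, not from the course) in which one variable's large scale dominates the raw Euclidean distances:

		# without scaling, x2 dominates the distance; after scale(), both variables count equally
		set.seed(2)
		x1 <- rnorm(50)
		x2 <- rnorm(50, sd = 100)
		X  <- cbind(x1, x2)
		km.raw    <- kmeans(X, centers = 2, nstart = 20)
		km.scaled <- kmeans(scale(X), centers = 2, nstart = 20)
		table(km.raw$cluster, km.scaled$cluster)  # the two assignments generally disagree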

Hierarchical Clustering

* Quiz

	True or False: If we cut the dendrogram at a lower point, we will tend to get more clusters (and cannot get fewer clusters).

		[x] True correct
		[] False
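
	A minimal sketch (simulated data) with cutree: cutting the dendrogram at a lower height can only split existing clusters further, never merge them:

		set.seed(3)
		X  <- matrix(rnorm(50 * 2), ncol = 2)
		hc <- hclust(dist(X), method = "complete")
		length(unique(cutree(hc, h = 4)))  # number of clusters at a higher cut
		length(unique(cutree(hc, h = 2)))  # lower cut: at least as many clusters, usually more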

	Breast cancer example
		In the heat map for breast cancer data, which of the following depended on the output of hierarchical clustering?


		[x] The ordering of the rows
		[x] The ordering of the columns
		[] The coloring of the cells as red or green
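
		A minimal sketch (random matrix, not the breast cancer data): in heatmap(), hierarchical clustering of the rows and columns determines their ordering via the dendrograms, while the cell colors depend only on the data values:

			set.seed(5)
			M <- matrix(rnorm(20 * 10), nrow = 20)
			heatmap(M)                        # row and column order come from hierarchical clustering
			heatmap(M, Rowv = NA, Colv = NA)  # same colors, but no clustering-based reordering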

Unsupervised Learning in R

* Quiz

	0/ Suppose we want to fit a linear regression, but the number of variables is much larger than the number of observations. In some cases, we may improve the fit by reducing the dimension of the features before performing the regression.

	In this problem, we use a data set with n = 300 and p = 200, so we have more observations than variables, but not by much. Load the data x, y, x.test, and y.test from 10.R.RData.

	First, concatenate x and x.test using the rbind function and perform a principal components analysis on the concatenated data frame (use the "scale=TRUE" option). To within 10% relative error, what proportion of the variance is explained by the first five principal components?

		Answer: 0.3498565



	1/ The previous answer suggests that a relatively small number of "latent variables" account for a substantial fraction of the features' variability. We might believe that these latent variables are more important than linear combinations of the features that have low variance.
	We can try forgetting about the raw features and using the first five principal components (computed on rbind(x,x.test)) instead as low-dimensional derived features. What is the mean-squared test error if we regress y on the first five principal components, and use the resulting model to predict y.test?

		Answer: 1
			In the actual data generating model for this example, the features may be noisy proxies for a few latent variables that actually drive the response. This is not an uncommon situation when we have high-dimensional data.

	2/ Now, try an OLS linear regression of y on the matrix x. What is the mean squared prediction error if we use the fitted model to predict y.test from x.test?

		Answer: 3.657. The mean squared error is worse because of the variance involved in fitting a very high-dimensional model. As it turned out here, the large-variance directions of x were the important ones for predicting y. Note that this need not always be the case, but it often is.


		-- Run code:

		> load("/Users/tungthanhle/Box Sync/MOOCs/Statistical-Learning-Stanford/R session/10.R.RData")
		> ls()
		 [1] "alpha"           "alpha.fn"        "boot.out"       
		 [4] "cv.error"        "cv.error10"      "d"              
		 [7] "degree"          "df"              "df.melt"        
		[10] "df1"             "df2"             "df3"            
		[13] "df4"             "df9a1"           "df9a2"          
		[16] "Direction.2005"  "esoph_melt"      "fit1"           
		[19] "g"               "glm.fit"         "glm.pred"       
		[22] "glm.probs"       "hc.average"      "hc.complete"    
		[25] "hc.cut"          "hc.single"       "km.out"         
		[28] "legend.key"      "legend.key.size" "legend.title"   
		[31] "loocv"           "p"               "pca.out"        
		[34] "regplot"         "test"            "train"          
		[37] "which"           "x"               "x.test"         
		[40] "xmean"           "Xy"              "y"              
		[43] "y.test"         
		> xvars = rbind(x, x.test)  # stack the 300 training and 1000 test observations row-wise
		> dataset1 = data.frame(xvars)  # create a data frame
		> pca.out = prcomp(dataset1, scale = TRUE)  # principal components analysis on the scaled features
		> pca.out$sdev   # gives standard deviations 
		  [1] 5.0564664 4.5965404 3.7229223 2.6971281 1.4630811 1.1682700
		  [7] 1.1584782 1.1554447 1.1459109 1.1393333 1.1361861 1.1318983
		 [13] 1.1104748 1.1081031 1.1022576 1.0996006 1.0925670 1.0862245
		 [19] 1.0851878 1.0747535 1.0694183 1.0626451 1.0596740 1.0579027
		 [25] 1.0543595 1.0497312 1.0461509 1.0451297 1.0355655 1.0293788
		 [31] 1.0246770 1.0236170 1.0212631 1.0174942 1.0143562 1.0111483
		 [37] 1.0051017 1.0002184 0.9971100 0.9942072 0.9913248 0.9854912
		 [43] 0.9803071 0.9796830 0.9771657 0.9738685 0.9681446 0.9576192
		 [49] 0.9560902 0.9531962 0.9516221 0.9496676 0.9464778 0.9408281
		 [55] 0.9393607 0.9318866 0.9263620 0.9257830 0.9229648 0.9195461
		 [61] 0.9177641 0.9152911 0.9101935 0.9082005 0.9019688 0.8981175
		 [67] 0.8965373 0.8946101 0.8870541 0.8824777 0.8798454 0.8781490
		 [73] 0.8760034 0.8751288 0.8733763 0.8693101 0.8651374 0.8610424
		 [79] 0.8569558 0.8562169 0.8548111 0.8498707 0.8464987 0.8451881
		 [85] 0.8424197 0.8376215 0.8363309 0.8338686 0.8289554 0.8241035
		 [91] 0.8221352 0.8177683 0.8173337 0.8110224 0.8101821 0.8086392
		 [97] 0.8031119 0.8009746 0.7976889 0.7960114 0.7931180 0.7894603
		[103] 0.7865821 0.7849213 0.7819922 0.7772328 0.7740821 0.7733873
		[109] 0.7677353 0.7668402 0.7639293 0.7607650 0.7591061 0.7551551
		[115] 0.7548301 0.7502366 0.7477301 0.7439305 0.7408162 0.7380059
		[121] 0.7335575 0.7320845 0.7276506 0.7266989 0.7233278 0.7218310
		[127] 0.7191747 0.7163773 0.7105583 0.7095616 0.7038146 0.7029900
		[133] 0.6993003 0.6973292 0.6937365 0.6912564 0.6903477 0.6885033
		[139] 0.6857440 0.6830917 0.6820117 0.6776269 0.6748846 0.6733988
		[145] 0.6660594 0.6600808 0.6579576 0.6551212 0.6498976 0.6485745
		[151] 0.6471284 0.6455747 0.6444500 0.6401032 0.6363138 0.6345960
		[157] 0.6314878 0.6304404 0.6255655 0.6201083 0.6182002 0.6150316
		[163] 0.6117353 0.6103122 0.6060084 0.6056634 0.6012351 0.5991488
		[169] 0.5948291 0.5906015 0.5859830 0.5836761 0.5811308 0.5745975
		[175] 0.5729187 0.5705985 0.5674488 0.5648463 0.5606856 0.5573744
		[181] 0.5502935 0.5491188 0.5467863 0.5449206 0.5413659 0.5396379
		[187] 0.5353621 0.5264635 0.5244417 0.5207019 0.5153204 0.5124382
		[193] 0.5106356 0.5030117 0.5011382 0.4911595 0.4872318 0.4848873
		[199] 0.4809934 0.4338230
		> screeplot(pca.out)   # Scree plot shows variance explained per principal component
		> (pca.out$sdev)^2/ sum(pca.out$sdev^2)
		  [1] 0.1278392623 0.1056409183 0.0693007523 0.0363725007
		  [5] 0.0107030317 0.0068242745 0.0067103583 0.0066752627
		  [9] 0.0065655588 0.0064904024 0.0064545937 0.0064059692
		 [13] 0.0061657709 0.0061394622 0.0060748595 0.0060456069
		 [17] 0.0059685132 0.0058994183 0.0058881627 0.0057754755
		 [21] 0.0057182773 0.0056460726 0.0056145445 0.0055957908
		 [25] 0.0055583696 0.0055096776 0.0054721590 0.0054614799
		 [29] 0.0053619795 0.0052981033 0.0052498150 0.0052389584
		 [33] 0.0052148915 0.0051764724 0.0051445927 0.0051121049
		 [37] 0.0050511467 0.0050021844 0.0049711415 0.0049422401
		 [41] 0.0049136244 0.0048559649 0.0048050097 0.0047988943
		 [45] 0.0047742642 0.0047420998 0.0046865195 0.0045851723
		 [49] 0.0045705424 0.0045429154 0.0045279231 0.0045093423
		 [53] 0.0044791009 0.0044257876 0.0044119930 0.0043420631
		 [57] 0.0042907331 0.0042853704 0.0042593199 0.0042278251
		 [61] 0.0042114544 0.0041887894 0.0041422607 0.0041241409
		 [65] 0.0040677386 0.0040330751 0.0040188957 0.0040016357
		 [69] 0.0039343249 0.0038938347 0.0038706398 0.0038557287
		 [73] 0.0038369096 0.0038292523 0.0038139310 0.0037785000
		 [77] 0.0037423137 0.0037069703 0.0036718658 0.0036655366
		 [81] 0.0036535101 0.0036114012 0.0035828003 0.0035717146
		 [85] 0.0035483547 0.0035080490 0.0034972468 0.0034766843
		 [89] 0.0034358354 0.0033957331 0.0033795316 0.0033437247
		 [93] 0.0033401718 0.0032887867 0.0032819748 0.0032694866
		 [97] 0.0032249440 0.0032078015 0.0031815380 0.0031681707
		[101] 0.0031451808 0.0031162379 0.0030935568 0.0030805076
		[105] 0.0030575590 0.0030204542 0.0029960152 0.0029906398
		[109] 0.0029470877 0.0029402197 0.0029179396 0.0028938172
		[113] 0.0028812100 0.0028512962 0.0028488426 0.0028142748
		[117] 0.0027955018 0.0027671627 0.0027440431 0.0027232635
		[121] 0.0026905334 0.0026797384 0.0026473773 0.0026404564
		[125] 0.0026160158 0.0026052002 0.0025860610 0.0025659821
		[129] 0.0025244652 0.0025173882 0.0024767750 0.0024709747
		[133] 0.0024451047 0.0024313399 0.0024063519 0.0023891770
		[137] 0.0023828997 0.0023701842 0.0023512243 0.0023330710
		[141] 0.0023257001 0.0022958914 0.0022773462 0.0022673295
		[145] 0.0022181756 0.0021785331 0.0021645410 0.0021459191
		[149] 0.0021118347 0.0021032444 0.0020938758 0.0020838333
		[153] 0.0020765788 0.0020486608 0.0020244761 0.0020135604
		[157] 0.0019938841 0.0019872753 0.0019566612 0.0019226713
		[161] 0.0019108571 0.0018913195 0.0018711006 0.0018624051
		[165] 0.0018362310 0.0018341405 0.0018074183 0.0017948964
		[169] 0.0017691084 0.0017440507 0.0017168802 0.0017033890
		[173] 0.0016885651 0.0016508115 0.0016411794 0.0016279131
		[177] 0.0016099909 0.0015952569 0.0015718418 0.0015533311
		[181] 0.0015141146 0.0015076574 0.0014948762 0.0014846925
		[185] 0.0014653853 0.0014560455 0.0014330627 0.0013858190
		[189] 0.0013751954 0.0013556525 0.0013277753 0.0013129643
		[193] 0.0013037434 0.0012651038 0.0012556974 0.0012061884
		[197] 0.0011869744 0.0011755787 0.0011567731 0.0009410121
		> sum(0.1278392623, 0.1056409183, 0.0693007523, 0.0363725007, 0.0107030317)  #Take sum of first five
		[1] 0.3498565
		> #alternative method is to use the cumulative sum function, which gives the same 0.3498565 on fifth entry
		> cumsum((pca.out$sdev)^2) / sum(pca.out$sdev^2)
		  [1] 0.1278393 0.2334802 0.3027809 0.3391534 0.3498565 0.3566807
		  [7] 0.3633911 0.3700664 0.3766319 0.3831223 0.3895769 0.3959829
		 [13] 0.4021487 0.4082881 0.4143630 0.4204086 0.4263771 0.4322765
		 [19] 0.4381647 0.4439402 0.4496584 0.4553045 0.4609190 0.4665148
		 [25] 0.4720732 0.4775829 0.4830550 0.4885165 0.4938785 0.4991766
		 [31] 0.5044264 0.5096654 0.5148803 0.5200567 0.5252013 0.5303134
		 [37] 0.5353646 0.5403668 0.5453379 0.5502802 0.5551938 0.5600497
		 [43] 0.5648548 0.5696536 0.5744279 0.5791700 0.5838565 0.5884417
		 [49] 0.5930122 0.5975552 0.6020831 0.6065924 0.6110715 0.6154973
		 [55] 0.6199093 0.6242514 0.6285421 0.6328275 0.6370868 0.6413146
		 [61] 0.6455261 0.6497149 0.6538571 0.6579813 0.6620490 0.6660821
		 [67] 0.6701010 0.6741026 0.6780369 0.6819308 0.6858014 0.6896571
		 [73] 0.6934940 0.6973233 0.7011372 0.7049157 0.7086580 0.7123650
		 [79] 0.7160369 0.7197024 0.7233559 0.7269673 0.7305501 0.7341218
		 [85] 0.7376702 0.7411782 0.7446755 0.7481522 0.7515880 0.7549837
		 [91] 0.7583633 0.7617070 0.7650472 0.7683360 0.7716179 0.7748874
		 [97] 0.7781124 0.7813202 0.7845017 0.7876699 0.7908151 0.7939313
		[103] 0.7970249 0.8001054 0.8031629 0.8061834 0.8091794 0.8121700
		[109] 0.8151171 0.8180573 0.8209753 0.8238691 0.8267503 0.8296016
		[115] 0.8324504 0.8352647 0.8380602 0.8408274 0.8435714 0.8462947
		[121] 0.8489852 0.8516650 0.8543123 0.8569528 0.8595688 0.8621740
		[127] 0.8647601 0.8673261 0.8698505 0.8723679 0.8748447 0.8773157
		[133] 0.8797608 0.8821921 0.8845985 0.8869876 0.8893705 0.8917407
		[139] 0.8940919 0.8964250 0.8987507 0.9010466 0.9033239 0.9055913
		[145] 0.9078095 0.9099880 0.9121525 0.9142984 0.9164103 0.9185135
		[151] 0.9206074 0.9226912 0.9247678 0.9268165 0.9288409 0.9308545
		[157] 0.9328484 0.9348357 0.9367923 0.9387150 0.9406259 0.9425172
		[163] 0.9443883 0.9462507 0.9480869 0.9499211 0.9517285 0.9535234
		[169] 0.9552925 0.9570365 0.9587534 0.9604568 0.9621454 0.9637962
		[175] 0.9654374 0.9670653 0.9686753 0.9702705 0.9718424 0.9733957
		[181] 0.9749098 0.9764175 0.9779123 0.9793970 0.9808624 0.9823185
		[187] 0.9837515 0.9851373 0.9865125 0.9878682 0.9891960 0.9905089
		[193] 0.9918127 0.9930778 0.9943335 0.9955397 0.9967266 0.9979022
		[199] 0.9990590 1.0000000
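		> # (a third option, assuming the same pca.out object: read the cumulative proportion
		> # directly from the importance matrix returned by summary(); row 3 is "Cumulative Proportion")
		> summary(pca.out)$importance[3, 5]  # about 0.3499, matching the sum above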
		> summary(pca.out)
		Importance of components:
		                          PC1    PC2    PC3     PC4    PC5
		Standard deviation     5.0565 4.5965 3.7229 2.69713 1.4631
		Proportion of Variance 0.1278 0.1056 0.0693 0.03637 0.0107
		Cumulative Proportion  0.1278 0.2335 0.3028 0.33915 0.3499
		                           PC6     PC7     PC8     PC9    PC10
		Standard deviation     1.16827 1.15848 1.15544 1.14591 1.13933
		Proportion of Variance 0.00682 0.00671 0.00668 0.00657 0.00649
		Cumulative Proportion  0.35668 0.36339 0.37007 0.37663 0.38312
		                          PC11    PC12    PC13    PC14    PC15
		Standard deviation     1.13619 1.13190 1.11047 1.10810 1.10226
		Proportion of Variance 0.00645 0.00641 0.00617 0.00614 0.00607
		Cumulative Proportion  0.38958 0.39598 0.40215 0.40829 0.41436
		                          PC16    PC17   PC18    PC19    PC20
		Standard deviation     1.09960 1.09257 1.0862 1.08519 1.07475
		Proportion of Variance 0.00605 0.00597 0.0059 0.00589 0.00578
		Cumulative Proportion  0.42041 0.42638 0.4323 0.43816 0.44394
		                          PC21    PC22    PC23   PC24    PC25
		Standard deviation     1.06942 1.06265 1.05967 1.0579 1.05436
		Proportion of Variance 0.00572 0.00565 0.00561 0.0056 0.00556
		Cumulative Proportion  0.44966 0.45530 0.46092 0.4665 0.47207
		                          PC26    PC27    PC28    PC29   PC30
		Standard deviation     1.04973 1.04615 1.04513 1.03557 1.0294
		Proportion of Variance 0.00551 0.00547 0.00546 0.00536 0.0053
		Cumulative Proportion  0.47758 0.48306 0.48852 0.49388 0.4992
		                          PC31    PC32    PC33    PC34    PC35
		Standard deviation     1.02468 1.02362 1.02126 1.01749 1.01436
		Proportion of Variance 0.00525 0.00524 0.00521 0.00518 0.00514
		Cumulative Proportion  0.50443 0.50967 0.51488 0.52006 0.52520
		                          PC36    PC37   PC38    PC39    PC40
		Standard deviation     1.01115 1.00510 1.0002 0.99711 0.99421
		Proportion of Variance 0.00511 0.00505 0.0050 0.00497 0.00494
		Cumulative Proportion  0.53031 0.53536 0.5404 0.54534 0.55028
		                          PC41    PC42    PC43   PC44    PC45
		Standard deviation     0.99132 0.98549 0.98031 0.9797 0.97717
		Proportion of Variance 0.00491 0.00486 0.00481 0.0048 0.00477
		Cumulative Proportion  0.55519 0.56005 0.56485 0.5696 0.57443
		                          PC46    PC47    PC48    PC49    PC50
		Standard deviation     0.97387 0.96814 0.95762 0.95609 0.95320
		Proportion of Variance 0.00474 0.00469 0.00459 0.00457 0.00454
		Cumulative Proportion  0.57917 0.58386 0.58844 0.59301 0.59756
		                          PC51    PC52    PC53    PC54    PC55
		Standard deviation     0.95162 0.94967 0.94648 0.94083 0.93936
		Proportion of Variance 0.00453 0.00451 0.00448 0.00443 0.00441
		Cumulative Proportion  0.60208 0.60659 0.61107 0.61550 0.61991
		                          PC56    PC57    PC58    PC59    PC60
		Standard deviation     0.93189 0.92636 0.92578 0.92296 0.91955
		Proportion of Variance 0.00434 0.00429 0.00429 0.00426 0.00423
		Cumulative Proportion  0.62425 0.62854 0.63283 0.63709 0.64131
		                          PC61    PC62    PC63    PC64    PC65
		Standard deviation     0.91776 0.91529 0.91019 0.90820 0.90197
		Proportion of Variance 0.00421 0.00419 0.00414 0.00412 0.00407
		Cumulative Proportion  0.64553 0.64971 0.65386 0.65798 0.66205
		                          PC66    PC67   PC68    PC69    PC70
		Standard deviation     0.89812 0.89654 0.8946 0.88705 0.88248
		Proportion of Variance 0.00403 0.00402 0.0040 0.00393 0.00389
		Cumulative Proportion  0.66608 0.67010 0.6741 0.67804 0.68193
		                          PC71    PC72    PC73    PC74    PC75
		Standard deviation     0.87985 0.87815 0.87600 0.87513 0.87338
		Proportion of Variance 0.00387 0.00386 0.00384 0.00383 0.00381
		Cumulative Proportion  0.68580 0.68966 0.69349 0.69732 0.70114
		                          PC76    PC77    PC78    PC79    PC80
		Standard deviation     0.86931 0.86514 0.86104 0.85696 0.85622
		Proportion of Variance 0.00378 0.00374 0.00371 0.00367 0.00367
		Cumulative Proportion  0.70492 0.70866 0.71237 0.71604 0.71970
		                          PC81    PC82    PC83    PC84    PC85
		Standard deviation     0.85481 0.84987 0.84650 0.84519 0.84242
		Proportion of Variance 0.00365 0.00361 0.00358 0.00357 0.00355
		Cumulative Proportion  0.72336 0.72697 0.73055 0.73412 0.73767
		                          PC86   PC87    PC88    PC89   PC90
		Standard deviation     0.83762 0.8363 0.83387 0.82896 0.8241
		Proportion of Variance 0.00351 0.0035 0.00348 0.00344 0.0034
		Cumulative Proportion  0.74118 0.7447 0.74815 0.75159 0.7550
		                          PC91    PC92    PC93    PC94    PC95
		Standard deviation     0.82214 0.81777 0.81733 0.81102 0.81018
		Proportion of Variance 0.00338 0.00334 0.00334 0.00329 0.00328
		Cumulative Proportion  0.75836 0.76171 0.76505 0.76834 0.77162
		                          PC96    PC97    PC98    PC99   PC100
		Standard deviation     0.80864 0.80311 0.80097 0.79769 0.79601
		Proportion of Variance 0.00327 0.00322 0.00321 0.00318 0.00317
		Cumulative Proportion  0.77489 0.77811 0.78132 0.78450 0.78767
		                         PC101   PC102   PC103   PC104   PC105
		Standard deviation     0.79312 0.78946 0.78658 0.78492 0.78199
		Proportion of Variance 0.00315 0.00312 0.00309 0.00308 0.00306
		Cumulative Proportion  0.79082 0.79393 0.79702 0.80011 0.80316
		                         PC106  PC107   PC108   PC109   PC110
		Standard deviation     0.77723 0.7741 0.77339 0.76774 0.76684
		Proportion of Variance 0.00302 0.0030 0.00299 0.00295 0.00294
		Cumulative Proportion  0.80618 0.8092 0.81217 0.81512 0.81806
		                         PC111   PC112   PC113   PC114   PC115
		Standard deviation     0.76393 0.76077 0.75911 0.75516 0.75483
		Proportion of Variance 0.00292 0.00289 0.00288 0.00285 0.00285
		Cumulative Proportion  0.82098 0.82387 0.82675 0.82960 0.83245
		                         PC116  PC117   PC118   PC119   PC120
		Standard deviation     0.75024 0.7477 0.74393 0.74082 0.73801
		Proportion of Variance 0.00281 0.0028 0.00277 0.00274 0.00272
		Cumulative Proportion  0.83526 0.8381 0.84083 0.84357 0.84629
		                         PC121   PC122   PC123   PC124   PC125
		Standard deviation     0.73356 0.73208 0.72765 0.72670 0.72333
		Proportion of Variance 0.00269 0.00268 0.00265 0.00264 0.00262
		Cumulative Proportion  0.84899 0.85166 0.85431 0.85695 0.85957
		                         PC126   PC127   PC128   PC129   PC130
		Standard deviation     0.72183 0.71917 0.71638 0.71056 0.70956
		Proportion of Variance 0.00261 0.00259 0.00257 0.00252 0.00252
		Cumulative Proportion  0.86217 0.86476 0.86733 0.86985 0.87237
		                         PC131   PC132   PC133   PC134   PC135
		Standard deviation     0.70381 0.70299 0.69930 0.69733 0.69374
		Proportion of Variance 0.00248 0.00247 0.00245 0.00243 0.00241
		Cumulative Proportion  0.87484 0.87732 0.87976 0.88219 0.88460
		                         PC136   PC137   PC138   PC139   PC140
		Standard deviation     0.69126 0.69035 0.68850 0.68574 0.68309
		Proportion of Variance 0.00239 0.00238 0.00237 0.00235 0.00233
		Cumulative Proportion  0.88699 0.88937 0.89174 0.89409 0.89643
		                         PC141  PC142   PC143   PC144   PC145
		Standard deviation     0.68201 0.6776 0.67488 0.67340 0.66606
		Proportion of Variance 0.00233 0.0023 0.00228 0.00227 0.00222
		Cumulative Proportion  0.89875 0.9011 0.90332 0.90559 0.90781
		                         PC146   PC147   PC148   PC149  PC150
		Standard deviation     0.66008 0.65796 0.65512 0.64990 0.6486
		Proportion of Variance 0.00218 0.00216 0.00215 0.00211 0.0021
		Cumulative Proportion  0.90999 0.91215 0.91430 0.91641 0.9185
		                         PC151   PC152   PC153   PC154   PC155
		Standard deviation     0.64713 0.64557 0.64445 0.64010 0.63631
		Proportion of Variance 0.00209 0.00208 0.00208 0.00205 0.00202
		Cumulative Proportion  0.92061 0.92269 0.92477 0.92682 0.92884
		                         PC156   PC157   PC158   PC159   PC160
		Standard deviation     0.63460 0.63149 0.63044 0.62557 0.62011
		Proportion of Variance 0.00201 0.00199 0.00199 0.00196 0.00192
		Cumulative Proportion  0.93085 0.93285 0.93484 0.93679 0.93872
		                         PC161   PC162   PC163   PC164   PC165
		Standard deviation     0.61820 0.61503 0.61174 0.61031 0.60601
		Proportion of Variance 0.00191 0.00189 0.00187 0.00186 0.00184
		Cumulative Proportion  0.94063 0.94252 0.94439 0.94625 0.94809
		                         PC166   PC167   PC168   PC169   PC170
		Standard deviation     0.60566 0.60124 0.59915 0.59483 0.59060
		Proportion of Variance 0.00183 0.00181 0.00179 0.00177 0.00174
		Cumulative Proportion  0.94992 0.95173 0.95352 0.95529 0.95704
		                         PC171  PC172   PC173   PC174   PC175
		Standard deviation     0.58598 0.5837 0.58113 0.57460 0.57292
		Proportion of Variance 0.00172 0.0017 0.00169 0.00165 0.00164
		Cumulative Proportion  0.95875 0.9605 0.96215 0.96380 0.96544
		                         PC176   PC177  PC178   PC179   PC180
		Standard deviation     0.57060 0.56745 0.5648 0.56069 0.55737
		Proportion of Variance 0.00163 0.00161 0.0016 0.00157 0.00155
		Cumulative Proportion  0.96707 0.96868 0.9703 0.97184 0.97340
		                         PC181   PC182   PC183   PC184   PC185
		Standard deviation     0.55029 0.54912 0.54679 0.54492 0.54137
		Proportion of Variance 0.00151 0.00151 0.00149 0.00148 0.00147
		Cumulative Proportion  0.97491 0.97642 0.97791 0.97940 0.98086
		                         PC186   PC187   PC188   PC189   PC190
		Standard deviation     0.53964 0.53536 0.52646 0.52444 0.52070
		Proportion of Variance 0.00146 0.00143 0.00139 0.00138 0.00136
		Cumulative Proportion  0.98232 0.98375 0.98514 0.98651 0.98787
		                         PC191   PC192  PC193   PC194   PC195
		Standard deviation     0.51532 0.51244 0.5106 0.50301 0.50114
		Proportion of Variance 0.00133 0.00131 0.0013 0.00127 0.00126
		Cumulative Proportion  0.98920 0.99051 0.9918 0.99308 0.99433
		                         PC196   PC197   PC198   PC199   PC200
		Standard deviation     0.49116 0.48723 0.48489 0.48099 0.43382
		Proportion of Variance 0.00121 0.00119 0.00118 0.00116 0.00094
		Cumulative Proportion  0.99554 0.99673 0.99790 0.99906 1.00000
		> # Q.2 The previous answer suggests that a relatively small number of "latent variables"
		> # account for a substantial fraction of the features' variability. We might believe that
		> # these latent variables are more important than linear combinations of the features that
		> # have low variance. We can try forgetting about the raw features and using the first five
		> # principal components (computed on rbind(x, x.test)) instead as low-dimensional derived
		> # features. What is the mean-squared test error if we regress y on the first five principal
		> # components, and use the resulting model to predict y.test?
		> xols <- pca.out$x[1:300, 1:5]  # PC scores for the 300 training observations
		> fit0 <- lm(y ~ xols)           # regress y on the first five principal components
		> summary(fit0)

		Call:
		lm(formula = y ~ xols)

		Residuals:
		    Min      1Q  Median      3Q     Max 
		-3.3289 -0.6992  0.0319  0.8075  2.5240 

		Coefficients:
		            Estimate Std. Error t value Pr(>|t|)    
		(Intercept)  0.09541    0.06107   1.562 0.119314    
		xolsPC1      0.07608    0.01159   6.564 2.36e-10 ***
		xolsPC2     -0.02276    0.01314  -1.732 0.084309 .  
		xolsPC3     -0.04023    0.01538  -2.616 0.009352 ** 
		xolsPC4     -0.06368    0.02237  -2.847 0.004722 ** 
		xolsPC5     -0.16069    0.04299  -3.738 0.000223 ***
		---
		Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

		Residual standard error: 1.056 on 294 degrees of freedom
		Multiple R-squared:  0.1906,	Adjusted R-squared:  0.1769 
		F-statistic: 13.85 on 5 and 294 DF,  p-value: 3.704e-12

		> yhat0 = predict(fit0, x.test)  # buggy: x.test is effectively ignored here (see the corrected sketch below)
		Warning message:
		'newdata' had 1000 rows but variables found have 300 rows 
		> mean((yhat0-y.test)^2)
		[1] 1.413063
		Warning message:
		In yhat0 - y.test :
		  longer object length is not a multiple of shorter object length
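		> # Note: the warnings above mean predict() could not find the model variable 'xols' in x.test,
		> # so it returned the 300 training fitted values, which were then recycled against the
		> # 1000-element y.test -- 1.413 is not a valid test MSE. A corrected sketch, assuming the
		> # test observations are rows 301-1300 of the concatenated PCA scores (x stacked above x.test):
		> pcs <- data.frame(pca.out$x[, 1:5])            # PC scores for all 1300 observations
		> fit0 <- lm(y ~ ., data = pcs[1:300, ])         # regress y on the first five PCs (training rows)
		> yhat0 <- predict(fit0, newdata = pcs[301:1300, ])
		> mean((yhat0 - y.test)^2)                       # test MSE; should be close to the quoted answer of 1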
		> fit <- lm(y ~ ., data = x)  # OLS of y on all 200 raw features
		> summary(fit)

		Call:
		lm(formula = y ~ ., data = x)

		Residuals:
		     Min       1Q   Median       3Q      Max 
		-2.07787 -0.39188 -0.01094  0.46463  2.07281 

		Coefficients:
		              Estimate Std. Error t value Pr(>|t|)   
		(Intercept)  0.2038165  0.1084088   1.880  0.06304 . 
		X1           0.0956991  0.1083461   0.883  0.37923   
		X2          -0.0768577  0.0905108  -0.849  0.39784   
		X3          -0.1450540  0.1006050  -1.442  0.15251   
		X4          -0.1129411  0.0977190  -1.156  0.25056   
		X5           0.1998471  0.1028714   1.943  0.05489 . 
		X6           0.0995771  0.1132813   0.879  0.38152   
		X7          -0.0597859  0.1082694  -0.552  0.58206   
		X8           0.0943231  0.1134184   0.832  0.40761   
		X9          -0.1473773  0.0997019  -1.478  0.14253   
		X10         -0.0678923  0.1179652  -0.576  0.56624   
		X11          0.1484546  0.1060616   1.400  0.16473   
		X12          0.1061323  0.1119930   0.948  0.34561   
		X13         -0.0478415  0.0997639  -0.480  0.63261   
		X14         -0.1972534  0.1166065  -1.692  0.09386 . 
		X15          0.1791277  0.1011492   1.771  0.07965 . 
		X16          0.0628750  0.1145631   0.549  0.58436   
		X17          0.0485600  0.1023332   0.475  0.63617   
		X18          0.1903937  0.1062388   1.792  0.07617 . 
		X19         -0.1722703  0.1022425  -1.685  0.09515 . 
		X20         -0.2073261  0.0918588  -2.257  0.02621 * 
		X21         -0.0036302  0.1181700  -0.031  0.97555   
		X22         -0.1129806  0.1121757  -1.007  0.31631   
		X23          0.0923017  0.1201585   0.768  0.44422   
		X24          0.3016380  0.1136175   2.655  0.00925 **
		X25         -0.0468127  0.1221786  -0.383  0.70243   
		X26         -0.0017065  0.1148268  -0.015  0.98817   
		X27         -0.1411393  0.1041841  -1.355  0.17859   
		X28          0.0285920  0.1065200   0.268  0.78893   
		X29          0.0237577  0.1093413   0.217  0.82844   
		X30          0.2165345  0.1053510   2.055  0.04248 * 
		X31         -0.0848819  0.1105101  -0.768  0.44426   
		X32          0.0720145  0.1049882   0.686  0.49436   
		X33          0.1120851  0.1057827   1.060  0.29191   
		X34         -0.1106502  0.0980774  -1.128  0.26197   
		X35          0.0884572  0.1164613   0.760  0.44933   
		X36         -0.1026070  0.1063570  -0.965  0.33703   
		X37          0.1097935  0.1074168   1.022  0.30921   
		X38          0.0768076  0.1022340   0.751  0.45426   
		X39          0.0054946  0.1086705   0.051  0.95978   
		X40          0.0550993  0.1040337   0.530  0.59755   
		X41         -0.1938989  0.1126176  -1.722  0.08824 . 
		X42          0.0344070  0.1087861   0.316  0.75245   
		X43         -0.0238439  0.1154725  -0.206  0.83683   
		X44         -0.1549566  0.1021085  -1.518  0.13231   
		X45          0.1881682  0.1122008   1.677  0.09668 . 
		X46          0.0405429  0.0991426   0.409  0.68347   
		X47         -0.0910576  0.1065085  -0.855  0.39465   
		X48          0.0466373  0.1059641   0.440  0.66081   
		X49         -0.0617196  0.0922411  -0.669  0.50498   
		X50          0.0596620  0.1091763   0.546  0.58597   
		X51         -0.0434226  0.1042990  -0.416  0.67807   
		X52          0.2683813  0.1145112   2.344  0.02109 * 
		X53          0.0489790  0.1164433   0.421  0.67494   
		X54          0.0111122  0.1049820   0.106  0.91592   
		X55         -0.1492262  0.1219943  -1.223  0.22415   
		X56          0.1014138  0.1054024   0.962  0.33831   
		X57         -0.0943702  0.1168626  -0.808  0.42130   
		X58         -0.0054378  0.1076259  -0.051  0.95981   
		X59         -0.0273607  0.1050842  -0.260  0.79512   
		X60          0.1303712  0.0915173   1.425  0.15743   
		X61         -0.0759084  0.1035465  -0.733  0.46524   
		X62         -0.1552165  0.0965579  -1.607  0.11113   
		X63         -0.0593877  0.1174564  -0.506  0.61425   
		X64          0.0379123  0.1078096   0.352  0.72584   
		X65          0.0074897  0.1024193   0.073  0.94185   
		X66         -0.0156056  0.1166708  -0.134  0.89387   
		X67          0.2668173  0.1068497   2.497  0.01417 * 
		X68          0.0892880  0.1050937   0.850  0.39760   
		X69         -0.0107980  0.1137296  -0.095  0.92455   
		X70          0.1842993  0.1116974   1.650  0.10211   
		X71          0.0785456  0.1126727   0.697  0.48737   
		X72         -0.0682916  0.1125840  -0.607  0.54552   
		X73         -0.1886139  0.1080375  -1.746  0.08394 . 
		X74          0.0001625  0.1019790   0.002  0.99873   
		X75          0.0185058  0.1007334   0.184  0.85462   
		X76          0.1129122  0.1024367   1.102  0.27302   
		X77          0.0319910  0.1054799   0.303  0.76231   
		X78          0.1498720  0.1034477   1.449  0.15056   
		X79         -0.1118129  0.1021143  -1.095  0.27618   
		X80          0.1393635  0.0942044   1.479  0.14222   
		X81          0.1237867  0.1154491   1.072  0.28623   
		X82         -0.1098310  0.1010234  -1.087  0.27960   
		X83         -0.0546254  0.1146525  -0.476  0.63481   
		X84          0.0167891  0.0940203   0.179  0.85864   
		X85          0.1235006  0.1206492   1.024  0.30850   
		X86         -0.0863701  0.1013823  -0.852  0.39631   
		X87         -0.1943033  0.0979702  -1.983  0.05011 . 
		X88          0.0260281  0.0967675   0.269  0.78851   
		X89          0.0537113  0.1027550   0.523  0.60234   
		X90          0.1756692  0.1116231   1.574  0.11873   
		X91         -0.0270830  0.0985053  -0.275  0.78394   
		X92          0.0845792  0.1086375   0.779  0.43810   
		X93          0.1764917  0.1088027   1.622  0.10796   
		X94         -0.0659248  0.0985649  -0.669  0.50515   
		X95         -0.0129704  0.1120624  -0.116  0.90809   
		X96         -0.0436070  0.0936786  -0.465  0.64260   
		X97          0.0398414  0.1238983   0.322  0.74846   
		X98         -0.0441317  0.1022291  -0.432  0.66690   
		X99         -0.1223492  0.1226529  -0.998  0.32094   
		X100         0.0403547  0.1075952   0.375  0.70842   
		X101         0.0417642  0.0996903   0.419  0.67617   
		X102        -0.1840968  0.1154864  -1.594  0.11410   
		X103        -0.1982751  0.1051755  -1.885  0.06234 . 
		X104         0.1093796  0.1032898   1.059  0.29220   
		X105        -0.1108043  0.1120751  -0.989  0.32524   
		X106         0.0236541  0.1114778   0.212  0.83240   
		X107        -0.2165246  0.1064929  -2.033  0.04471 * 
		X108        -0.0733669  0.0986834  -0.743  0.45897   
		X109         0.1511465  0.1105149   1.368  0.17452   
		X110        -0.0411139  0.1131549  -0.363  0.71712   
		X111        -0.0013255  0.1143815  -0.012  0.99078   
		X112        -0.0338146  0.1161045  -0.291  0.77148   
		X113        -0.1075306  0.1042257  -1.032  0.30472   
		X114         0.0781963  0.1062654   0.736  0.46356   
		X115         0.1904687  0.1115838   1.707  0.09096 . 
		X116         0.1119430  0.1213317   0.923  0.35845   
		X117        -0.0598707  0.0967647  -0.619  0.53752   
		X118         0.0412772  0.1059136   0.390  0.69758   
		X119        -0.1395021  0.0998264  -1.397  0.16540   
		X120        -0.0214465  0.0952355  -0.225  0.82229   
		X121        -0.0061630  0.1136857  -0.054  0.95688   
		X122         0.1363865  0.1092217   1.249  0.21471   
		X123        -0.0715526  0.1069072  -0.669  0.50486   
		X124        -0.0094576  0.1009250  -0.094  0.92553   
		X125        -0.0527586  0.1184331  -0.445  0.65695   
		X126        -0.1747779  0.1083930  -1.612  0.11005   
		X127        -0.1506615  0.0950797  -1.585  0.11625   
		X128        -0.1332091  0.1074574  -1.240  0.21804   
		X129        -0.0227213  0.1078456  -0.211  0.83357   
		X130        -0.0066388  0.1065831  -0.062  0.95046   
		X131         0.0056137  0.1008164   0.056  0.95571   
		X132        -0.1126923  0.1058529  -1.065  0.28964   
		X133        -0.0478067  0.1131040  -0.423  0.67345   
		X134        -0.0187423  0.1062751  -0.176  0.86037   
		X135        -0.0126841  0.1186105  -0.107  0.91505   
		X136         0.0566899  0.1196393   0.474  0.63666   
		X137        -0.1878898  0.0999338  -1.880  0.06303 . 
		X138        -0.0499201  0.0939563  -0.531  0.59639   
		X139         0.0604688  0.0913343   0.662  0.50947   
		X140         0.0812187  0.1059396   0.767  0.44511   
		X141         0.0797878  0.0993465   0.803  0.42383   
		X142        -0.0936362  0.1125129  -0.832  0.40728   
		X143         0.0724414  0.1023761   0.708  0.48086   
		X144        -0.0496059  0.1073318  -0.462  0.64497   
		X145        -0.1134710  0.1133926  -1.001  0.31942   
		X146         0.0251678  0.1082066   0.233  0.81656   
		X147        -0.0618058  0.0991916  -0.623  0.53465   
		X148         0.0630591  0.1006678   0.626  0.53249   
		X149         0.1186819  0.1070596   1.109  0.27031   
		X150         0.2332173  0.1180847   1.975  0.05105 . 
		X151        -0.0522802  0.1047342  -0.499  0.61877   
		X152        -0.0024918  0.1146501  -0.022  0.98270   
		X153        -0.0400985  0.1056010  -0.380  0.70497   
		X154        -0.0186653  0.1067954  -0.175  0.86161   
		X155        -0.0713475  0.0961410  -0.742  0.45978   
		X156         0.0931428  0.0968907   0.961  0.33873   
		X157         0.1201675  0.1245571   0.965  0.33702   
		X158         0.0977367  0.1123059   0.870  0.38626   
		X159        -0.0424229  0.1134814  -0.374  0.70933   
		X160        -0.1991074  0.1060539  -1.877  0.06341 . 
		X161         0.0940010  0.1078042   0.872  0.38534   
		X162         0.0565452  0.1026372   0.551  0.58293   
		X163        -0.0123934  0.0958702  -0.129  0.89740   
		X164         0.0930203  0.1087025   0.856  0.39421   
		X165        -0.0537499  0.1082785  -0.496  0.62071   
		X166        -0.1673161  0.1210329  -1.382  0.16996   
		X167        -0.2352736  0.1100655  -2.138  0.03501 * 
		X168         0.0784007  0.1240198   0.632  0.52874   
		X169         0.1044123  0.1061787   0.983  0.32783   
		X170        -0.1811877  0.1059420  -1.710  0.09035 . 
		X171         0.1611620  0.0952439   1.692  0.09377 . 
		X172        -0.0506896  0.1122349  -0.452  0.65252   
		X173        -0.3030179  0.1010782  -2.998  0.00344 **
		X174        -0.1222357  0.0984114  -1.242  0.21714   
		X175         0.0133538  0.0937384   0.142  0.88701   
		X176        -0.0488612  0.1024626  -0.477  0.63451   
		X177        -0.1311311  0.1177862  -1.113  0.26828   
		X178        -0.0731235  0.1077644  -0.679  0.49901   
		X179        -0.1856217  0.1228441  -1.511  0.13396   
		X180        -0.2979048  0.1177915  -2.529  0.01301 * 
		X181        -0.0012166  0.1162361  -0.010  0.99167   
		X182        -0.2258966  0.1081018  -2.090  0.03921 * 
		X183        -0.1205862  0.1113622  -1.083  0.28151   
		X184        -0.0673809  0.1006523  -0.669  0.50477   
		X185        -0.0071448  0.1259900  -0.057  0.95489   
		X186        -0.0614695  0.1172649  -0.524  0.60132   
		X187        -0.0305385  0.1047558  -0.292  0.77126   
		X188        -0.0613496  0.1057987  -0.580  0.56332   
		X189        -0.1565336  0.0995340  -1.573  0.11899   
		X190        -0.0196097  0.0999919  -0.196  0.84492   
		X191        -0.0235088  0.1087213  -0.216  0.82925   
		X192         0.0319547  0.0969477   0.330  0.74239   
		X193        -0.0226363  0.1002944  -0.226  0.82190   
		X194         0.0274039  0.1121933   0.244  0.80754   
		X195         0.0265554  0.1049318   0.253  0.80074   
		X196        -0.0802388  0.1082499  -0.741  0.46030   
		X197         0.0732440  0.1040338   0.704  0.48306   
		X198        -0.0222079  0.1008502  -0.220  0.82616   
		X199        -0.0403121  0.1030368  -0.391  0.69646   
		 [ reached getOption("max.print") -- omitted 1 row ]
		---
		Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

		Residual standard error: 1.072 on 99 degrees of freedom
		Multiple R-squared:  0.7195,	Adjusted R-squared:  0.1527 
		F-statistic: 1.269 on 200 and 99 DF,  p-value: 0.09151

		> yhat = predict(fit, newdata = x.test)  # use the fitted model to predict y.test from x.test
		> mean((yhat - y.test)^2)  # mean squared prediction error on the test set
		[1] 3.657197



	4/ K-means is a seemingly complicated clustering algorithm. Here is a simpler one:

	Given k, the number of clusters, and n, the number of observations, try all possible assignments of the n observations into k clusters. Then, select one of the assignments that minimizes Within-Cluster Variation as defined on page 30.

	Assume that you implemented the most naive version of the above algorithm. Here, by naive we mean that you try all possible assignments even though some of them might be redundant (for example, the algorithm tries assigning all of the observations to cluster 1 and it also tries to assign them all to cluster 2 even though those are effectively the same solution).

	In terms of n and k, how many potential solutions will your algorithm try?

		Answer: k^n
			For each of the n observations we have k options for assignment, and each assignment is made independently, so there are k^n potential solutions.
			Note that this exponential explosion in the number of potential solutions is the reason we need greedy algorithms like K-means to perform clustering.
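
		A toy sketch (made-up data; the wcv helper below is illustrative, not from the course) that carries out this exhaustive search for a tiny n, measuring within-cluster variation as squared distances to the cluster means (equivalent, up to a factor of 2, to the pairwise definition in the slides):

			set.seed(4)
			n <- 6; k <- 2
			X <- matrix(rnorm(n * 2), ncol = 2)
			wcv <- function(X, cl, k) {  # within-cluster variation for one cluster assignment cl
			  sum(sapply(1:k, function(j) {
			    pts <- X[cl == j, , drop = FALSE]
			    if (nrow(pts) == 0) 0 else sum(scale(pts, scale = FALSE)^2)
			  }))
			}
			grid   <- expand.grid(rep(list(1:k), n))  # all k^n = 64 assignments, redundant ones included
			scores <- apply(grid, 1, function(a) wcv(X, a, k))
			unlist(grid[which.min(scores), ])         # the best assignment found by brute force
			nrow(grid)                                # number of potential solutions tried: k^n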