@@ -23,7 +23,7 @@ def return_feature_selection():
    st.markdown("""
        Correlation is a statistical term that refers to how closely two variables are linearly related to each other.
        Variables that have a linear relationship tell us less about our dataset, since measuring one tells you something about the other.
-       In other words, if two variables have a high correlation, we can drop on of the two!
+       In other words, if two variables have a high correlation, we can drop one of the two!
    """)
    import pandas as pd
    # iris_correlation = pd.read_csv("https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv")
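For reference, here is a minimal sketch of how the correlation table in this tutorial can be computed, assuming the iris CSV from the commented-out URL above; the variable name iris_correlation matches the code in this diff:

import pandas as pd

# Load the iris dataset (same URL as the commented-out line above)
iris_correlation = pd.read_csv(
    "https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv"
)

# Pairwise Pearson correlation of the numeric columns, rounded for display
# (numeric_only=True skips the non-numeric species column; assumes a recent pandas)
corr = iris_correlation.corr(numeric_only=True).round(2)
print(corr)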
@@ -40,6 +40,12 @@ def return_feature_selection():
        ], overwrite=False)\
        .set_caption('Table 1.'))
+
+   st.markdown("""
+       Here you can see a correlation table, where a value close to 1 means two variables are strongly correlated and a value close to 0 means they aren't.
+       If you want to test this on your own data, try out the Data Analytics tool!
+   """)
+
    corr = iris_correlation.corr().round(2)
    corr.style.background_gradient(cmap='coolwarm')
    st.table(corr.style.background_gradient(cmap='coolwarm')\
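Acting on the point in the added markdown (drop one of each highly correlated pair), here is a hedged sketch that thresholds the upper triangle of the correlation matrix; the 0.9 cutoff and the helper name drop_correlated are illustrative assumptions, not part of this commit:

import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop one column of every pair whose absolute correlation exceeds threshold."""
    corr = df.corr(numeric_only=True).abs()
    # Keep only the upper triangle so each pair is considered once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)

# Usage: reduced = drop_correlated(iris_correlation, threshold=0.9)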
@@ -116,7 +122,7 @@ def return_feature_selection():
    st.title('PCA Analysis')
    st.markdown('''
        Another technique to reduce the dimensionality of your dataset is to perform Principal Component Analysis.
-       PCA uses a set of large variables by combining them together to retain as much as information as possible.
+       PCA combines a large set of variables into a smaller set that retains as much information as possible.
        PCA dates back to the early 1900s and is one of the most widely used analysis techniques in Data Science.
    ''')
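To complement the PCA text, a minimal sketch of running PCA on the same iris data; scikit-learn and the species column name are assumptions here, since this diff only touches the markdown:

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Standardise first: PCA is sensitive to the scale of each variable
features = iris_correlation.drop(columns=["species"])  # "species" assumed from the iris CSV
scaled = StandardScaler().fit_transform(features)

# Project onto the first two principal components
pca = PCA(n_components=2)
components = pca.fit_transform(scaled)
print(pca.explained_variance_ratio_)  # share of variance each component retains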