red-wine.Rmd
18 additions & 1 deletion
@@ -14,6 +14,7 @@ library(ggplot2)
 library(cowplot)
 library(GGally)
 library(corrplot)
+library(psych)
 ```
 
 # Univariate Plots Section
@@ -214,7 +215,7 @@ It looks like most wines have sulphate between 0.5 and 0.9.
 ### What is the structure of your dataset?
 There are 1,599 wines in the dataset with 12 features (fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol, and quality). The "quality" variable can be represented as a factor variable.
 
-(worst) —————-> (best)
+(worst) -----> (best)
 
 **Quality**: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
 
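Review note: the remark that "quality" can be represented as a factor could be sketched roughly as follows. This is only an illustration under assumptions: `rwd_demo` and `quality_f` are made-up names, and the toy values stand in for the real data frame, which is loaded elsewhere in red-wine.Rmd.

```r
# Hypothetical stand-in for the wine data (values invented for illustration).
rwd_demo <- data.frame(
  alcohol = c(9.4, 9.8, 10.5, 11.2, 12.8),
  quality = c(5L, 5L, 6L, 7L, 8L)
)

# Ordered factor over the full 0-10 scale, so unused scores remain as levels.
rwd_demo$quality_f <- factor(rwd_demo$quality, levels = 0:10, ordered = TRUE)

# Inspect the levels and the count per level (unused levels show a count of 0).
print(levels(rwd_demo$quality_f))
print(table(rwd_demo$quality_f))
```

Storing the factor in a separate column keeps the numeric `quality` column available for functions such as `cor()`, which require numeric input.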
@@ -241,3 +242,19 @@ Not necessarily. The only consideration is to make quality variable to be a fact
 For the variable "alcohol", I zoomed in on the distribution a bit. I found that alcohol does not behave like a truly continuous variable: values rarely use the second decimal place, and the alcohol value mostly moves in steps of the first decimal place.
 
 # Bivariate Plots Section
+```{r echo=FALSE}
+rwd_cor <- round(cor(rwd), 3)
+```
+
+The variable most correlated with quality is alcohol, with a correlation coefficient of 0.476. Judging from the correlation coefficients above, no variable is strongly correlated with quality. A few variables are weakly correlated, but most appear to be uncorrelated. I think I should look into each of them in more detail.
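Review note: one way to read the strongest relationship off the correlation matrix, rather than scanning it by eye, is to take the "quality" column and sort it by absolute value. The sketch below is illustrative only: `rwd_demo`, `quality_cor`, and `ranked` are made-up names, and the toy values are invented, so the numbers it prints are not the ones reported in the Rmd.

```r
# Hypothetical toy data; in the Rmd, `rwd` is the full red-wine data frame.
rwd_demo <- data.frame(
  alcohol = c(9.4, 9.8, 10.5, 11.2, 12.8, 9.5),
  pH      = c(3.51, 3.20, 3.26, 3.16, 3.30, 3.40),
  quality = c(5, 5, 6, 7, 8, 5)
)

# Same call as in the diff, applied to the toy data.
rwd_cor <- round(cor(rwd_demo), 3)

# Take the "quality" column and drop quality's correlation with itself.
quality_cor <- rwd_cor[, "quality"]
quality_cor <- quality_cor[names(quality_cor) != "quality"]

# Sort by absolute value so the strongest relationships come first.
ranked <- quality_cor[order(abs(quality_cor), decreasing = TRUE)]
print(ranked)
```

Sorting on the absolute value keeps strong negative correlations from being pushed to the bottom of the list.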