📊 R Script for Data Analysis

Welcome to this data analysis project! Below is a summary of the tasks and analyses conducted using R across multiple days.

🛒 Association Rule Mining with `GrocBinary24.csv`

Day 4 Tasks

Frequent Itemsets:
- 🔍 Identified all frequent itemsets with a minimum support of 30%.
Association Rules:
- 📏 Extracted rules with at least 40% support and 60% confidence.
Advanced Rules:
- ⚖️ Found rules with at least 30% support, 70% confidence, and lift > 1.
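
The three thresholds above (support, confidence, lift) can be illustrated with a small base-R sketch. The `baskets` data frame is a made-up stand-in for `GrocBinary24.csv`, whose columns are not shown here, and the helper names `itemset_support` and `rule_metrics` are hypothetical. In practice a package such as `arules` would typically do the mining; this only shows how the metrics are defined.

```r
# Hypothetical binary basket data: 1 = item present in the transaction.
baskets <- data.frame(
  bread  = c(1, 1, 1, 0, 1, 1, 0, 1, 1, 1),
  milk   = c(1, 1, 0, 1, 1, 1, 0, 1, 0, 1),
  butter = c(1, 0, 1, 0, 1, 1, 0, 1, 0, 0)
)

# Support: fraction of transactions that contain every item in `items`.
itemset_support <- function(items) {
  mean(rowSums(baskets[, items, drop = FALSE]) == length(items))
}

# Support, confidence, and lift for the rule lhs => rhs.
rule_metrics <- function(lhs, rhs) {
  supp <- itemset_support(c(lhs, rhs))
  conf <- supp / itemset_support(lhs)   # P(rhs | lhs)
  lift <- conf / itemset_support(rhs)   # confidence relative to the baseline P(rhs)
  data.frame(rule = paste(lhs, "=>", rhs),
             support = supp, confidence = conf, lift = lift)
}

# Enumerate all single-item rules, then apply the "advanced" thresholds.
items <- names(baskets)
rules <- do.call(rbind, lapply(items, function(lhs) {
  do.call(rbind, lapply(setdiff(items, lhs), function(rhs) rule_metrics(lhs, rhs)))
}))
strong <- subset(rules, support >= 0.30 & confidence >= 0.70 & lift > 1)
print(strong)
```

A lift above 1 means the left-hand item makes the right-hand item more likely than its overall frequency would suggest, which is why it is used as an extra filter on top of support and confidence.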

🚗 Analyzing the `auto-mpg.csv` Dataset

Day 4 Tasks

Data Loading and Initial Exploration:
- 📄 Loaded the dataset and displayed the first few rows.
- 📊 Showed the number of rows, columns, and summary statistics.
- 🏷️ Listed the column names.
Working with Factors:
- 🔄 Converted the cylinders column to a factor with descriptive labels.
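
A minimal sketch of the loading, exploration, and factor steps above. Because `auto-mpg.csv` is not bundled here, the built-in `mtcars` data set is used as a stand-in (an assumption), so the column is `cyl` rather than `cylinders`, and the label text is illustrative only.

```r
# Stand-in for: cars <- read.csv("auto-mpg.csv")
cars <- mtcars

print(head(cars))      # first few rows
print(dim(cars))       # number of rows and columns
print(summary(cars))   # summary statistics per column
print(names(cars))     # column names

# Convert the cylinder count to a factor with descriptive labels.
# The levels must match the values that actually occur in the data.
cars$cyl <- factor(cars$cyl,
                   levels = c(4, 6, 8),
                   labels = c("4 cylinders", "6 cylinders", "8 cylinders"))
print(table(cars$cyl))
```

With the real file, the `levels` vector would need to list whatever cylinder counts the `cylinders` column contains; values missing from `levels` become `NA`.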
Visualizations:
- 📈 Histogram of Acceleration: Plotted to show the distribution of acceleration.
- 📉 Histogram of MPG: Plotted to show the distribution of miles per gallon.
- 📊 Barplot of MPG: Visualized the miles per gallon as a bar plot.
- 🔢 Frequency Count of Cylinders: Counted and plotted the frequency of each cylinder type.
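
The plots in this group can be sketched with base graphics. The built-in `mtcars` data set again stands in for `auto-mpg.csv` (an assumption). It has no acceleration column, so `qsec` (quarter-mile time) is used as a rough proxy, and titles and axis labels are illustrative.

```r
cars <- mtcars  # stand-in for the auto-mpg data

# Histograms of the two numeric variables.
hist(cars$qsec, main = "Acceleration (proxy: quarter-mile time)", xlab = "Seconds")
hist(cars$mpg,  main = "Miles per gallon", xlab = "MPG")

# Bar plot with one bar per car.
barplot(cars$mpg, names.arg = rownames(cars), las = 2, cex.names = 0.6,
        main = "MPG per car", ylab = "MPG")

# Frequency count of each cylinder type, then plotted.
cyl_counts <- table(cars$cyl)
print(cyl_counts)
barplot(cyl_counts, main = "Cars per cylinder count",
        xlab = "Cylinders", ylab = "Frequency")
```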
Boxplots:
- 📦 MPG Distribution: Created a boxplot for MPG.
- 🚗 MPG by Cylinders: Boxplot of MPG grouped by the number of cylinders.
Pair Plots:
- 🔗 MPG vs. Displacement: Pair plot to explore relationships between MPG and displacement.
- 🔗 MPG, Displacement, and Horsepower: Pair plot to explore relationships among MPG, displacement, and horsepower.
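
A short sketch of the boxplots and pair plots, again using the built-in `mtcars` data set as a stand-in for `auto-mpg.csv` (an assumption). Its columns `mpg`, `disp`, `hp`, and `cyl` play the roles of MPG, displacement, horsepower, and cylinders.

```r
cars <- mtcars  # stand-in for the auto-mpg data

# Overall MPG distribution.
boxplot(cars$mpg, main = "MPG distribution", ylab = "MPG")

# MPG grouped by number of cylinders; the formula interface does the grouping.
by_cyl <- boxplot(mpg ~ cyl, data = cars,
                  main = "MPG by cylinders", xlab = "Cylinders", ylab = "MPG")

# Pair plots: every selected column plotted against every other one.
pairs(cars[, c("mpg", "disp")], main = "MPG vs. displacement")
pairs(cars[, c("mpg", "disp", "hp")], main = "MPG, displacement, horsepower")
```

`boxplot()` invisibly returns the statistics it draws, so `by_cyl$stats` and `by_cyl$n` give the five-number summaries and group sizes without reading them off the plot.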

📊 Decision Trees with C5.0

Day 7 Overview

This section focuses on decision trees using the C5.0 algorithm. We will explore concepts like accuracy, sensitivity, and specificity of classifiers using different training/test splits on multiple datasets.
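
As a quick reference, the three metrics can be computed by hand from a confusion matrix. The labels below are made up for illustration, with "yes" treated as the positive class.

```r
# Hypothetical true labels and classifier predictions.
truth <- factor(c("yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "no"),
                levels = c("yes", "no"))
pred  <- factor(c("yes", "yes", "yes", "no", "no", "no", "no", "no", "yes", "no"),
                levels = c("yes", "no"))

cm <- table(Predicted = pred, Actual = truth)
print(cm)

tp <- cm["yes", "yes"]  # true positives
fn <- cm["no",  "yes"]  # false negatives
tn <- cm["no",  "no"]   # true negatives
fp <- cm["yes", "no"]   # false positives

accuracy    <- (tp + tn) / sum(cm)  # share of all cases classified correctly
sensitivity <- tp / (tp + fn)       # share of actual positives that were found
specificity <- tn / (tn + fp)       # share of actual negatives that were found

print(c(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity))
```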

Requirements

Make sure you have the following libraries installed:

- `caret`: For creating and evaluating classification models.
- `C50`: To implement the C5.0 algorithm for decision trees.
- `modeldata`: To use sample datasets for training and testing.

install.packages("caret")
install.packages("C50")
install.packages("modeldata")

Load the libraries:

library(caret)
library(C50)
library(modeldata)

🔍 Problems and Solutions

This repository works through the following problems:

- Training and testing C5.0 models with different data splits.
- Analyzing accuracy, sensitivity, and specificity on both the training and test sets.
- Evaluating models fit as plain decision trees and as rule sets (with and without rules).
- Comparing model performance across training partitions of 40%, 50%, 60%, 70%, and 80%.
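
One possible way to organize the comparison is sketched below; it is not necessarily how the scripts in this repository do it. It assumes `caret` and `C50` are installed, uses a two-class subset of the built-in `iris` data as a stand-in data set, and treats "virginica" as the positive class. The helper `evaluate_split` is a hypothetical name.

```r
library(caret)
library(C50)

# Stand-in two-class data set (assumption): drop one iris species.
dat <- droplevels(subset(iris, Species != "setosa"))

set.seed(42)  # make the random partitions reproducible

# Fit C5.0 on a fraction `p` of the data and score it on both partitions.
evaluate_split <- function(p, rules = FALSE) {
  idx   <- createDataPartition(dat$Species, p = p, list = FALSE)
  train <- dat[idx, ]
  test  <- dat[-idx, ]

  fit <- C5.0(Species ~ ., data = train, rules = rules)

  cm_train <- confusionMatrix(predict(fit, train), train$Species, positive = "virginica")
  cm_test  <- confusionMatrix(predict(fit, test),  test$Species,  positive = "virginica")

  data.frame(
    train_fraction   = p,
    rules            = rules,
    train_accuracy   = unname(cm_train$overall["Accuracy"]),
    test_accuracy    = unname(cm_test$overall["Accuracy"]),
    test_sensitivity = unname(cm_test$byClass["Sensitivity"]),
    test_specificity = unname(cm_test$byClass["Specificity"])
  )
}

# One row per training fraction, for the tree and the rule-based model.
splits  <- c(0.4, 0.5, 0.6, 0.7, 0.8)
results <- do.call(rbind, lapply(splits, function(p) {
  rbind(evaluate_split(p, rules = FALSE), evaluate_split(p, rules = TRUE))
}))
print(results)
```

Each call draws a fresh partition, so the tree and rule-set rows for the same fraction are trained on different samples. Drawing the partition once per fraction and passing it to both fits would make that comparison more direct.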

🔧 How to Use the Analysis Scripts

Run the Association Rule Mining:
- Load the GrocBinary24.csv dataset and execute the analysis.
Analyze the auto-mpg.csv Dataset:
- Load the dataset and run the visualizations and boxplots.
Implement Decision Trees:
- Follow the requirements to set up the environment and execute the C5.0 decision tree analyses.

