getdata-014.project

This repository contains my submission for the Getting and Cleaning Data Course Project (May 2015).

This ReadMe file describes the environment the script expects and the steps the script performs.

A separate Codebook.md file describes the output dataset.

Environment

The script is designed to run offline. It assumes that:
- the raw input dataset has already been downloaded and unzipped into the working directory (where the script is located). The script is not expected to read input data directly from the downloaded zip file. If the input folder is not found, the script reports the error and stops.
- all required packages are already installed.
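
The two environment checks above could be sketched roughly as follows. This is a minimal illustration, not the code from run_analysis.R: the helper name check_environment, the folder name and the package name in the usage comment are all assumptions.

```r
# Hypothetical helper: stops with a readable error if the input folder
# is missing or if any required package is not installed.
check_environment <- function(input_dir, packages = character(0)) {
  if (!dir.exists(input_dir)) {
    stop("Input folder not found: ", input_dir, call. = FALSE)
  }
  # requireNamespace() returns FALSE (rather than failing) for a missing package
  installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  not_installed <- packages[!installed]
  if (length(not_installed) > 0) {
    stop("Required packages not installed: ",
         paste(not_installed, collapse = ", "), call. = FALSE)
  }
  invisible(TRUE)
}

# Example call (folder and package names are assumed, not taken from the script):
# check_environment("UCI HAR Dataset", c("reshape2"))
```

Checking up front keeps the failure message close to its cause, instead of a less obvious error surfacing later when a file is read.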

The script (run_analysis.R) performs the following steps:

1. Loads libraries and checks for the data input folder. Exits with an error message if the folder is not found.
2. Loads the raw test and train data files into R data frames:
   - experimental observations
   - activities
   - subjects
3. Loads supporting data into R data frames:
   - activity labels
   - feature names (variables)
4. Combines all test and train datasets:
   - combines train_ and test_ data (the experimental observations)
   - combines train_ and test_ labels and assigns a meaningful variable name
   - combines train_ and test_ subjects and assigns a meaningful variable name
5. Trims all_data down to the required variables: means and standard deviations only.
6. Assigns meaningful variable_names to the all_data column names, using the variable names provided in features.txt.
7. Column-binds subjects and labels onto all_data.
8. Merges activity labels onto all_data and drops the activity_id column.
9. Creates the final tidy dataset and writes it out:
   - sorts the data into activity / subject sequence
   - reshapes into summarised data
   - output data is in the wide data form
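
The combine, trim, merge and summarise steps can be illustrated with a small base R sketch. It is not the actual script: the toy data frames stand in for the raw files (which the script presumably reads with something like read.table), the values are made up, the pattern used to pick mean and std columns is an assumption, and so is the choice of mean as the summary function. The names all_data and activity_id follow this ReadMe; the other object names are hypothetical.

```r
# Toy stand-ins for features.txt and the activity labels file
features <- data.frame(id = 1:4,
                       name = c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
                                "tBodyAcc-max()-X", "tBodyAcc-mean()-Y"),
                       stringsAsFactors = FALSE)
activity_labels <- data.frame(activity_id = 1:2,
                              activity = c("WALKING", "SITTING"),
                              stringsAsFactors = FALSE)

# Toy stand-ins for the train and test observations, labels and subjects
train_data <- data.frame(V1 = c(1, 3), V2 = c(10, 30), V3 = c(0, 0), V4 = c(5, 7))
test_data  <- data.frame(V1 = c(5, 7), V2 = c(50, 70), V3 = c(0, 0), V4 = c(9, 11))
train_labels   <- data.frame(V1 = c(1, 1))
test_labels    <- data.frame(V1 = c(2, 2))
train_subjects <- data.frame(V1 = c(1, 1))
test_subjects  <- data.frame(V1 = c(2, 2))

# Combine train and test, giving labels and subjects meaningful names
all_data     <- rbind(train_data, test_data)
all_labels   <- rbind(train_labels, test_labels)
all_subjects <- rbind(train_subjects, test_subjects)
names(all_labels)   <- "activity_id"
names(all_subjects) <- "subject"

# Keep only mean and std columns (assumed pattern), then name them from features
keep <- grepl("mean\\(\\)|std\\(\\)", features$name)
all_data <- all_data[, keep, drop = FALSE]
names(all_data) <- features$name[keep]

# Column-bind subjects and labels, merge in activity names, drop the id
all_data <- cbind(all_subjects, all_labels, all_data)
all_data <- merge(all_data, activity_labels, by = "activity_id")
all_data$activity_id <- NULL

# Summarise each variable per activity / subject (mean is an assumption), then sort
measure_cols <- setdiff(names(all_data), c("subject", "activity"))
tidy <- aggregate(all_data[measure_cols],
                  by = list(activity = all_data$activity, subject = all_data$subject),
                  FUN = mean)
tidy <- tidy[order(tidy$activity, tidy$subject), ]

# Write the wide-form result; the temp-dir location is only for this sketch
out_file <- file.path(tempdir(), "final_ds_wide_form.txt")
write.table(tidy, out_file, row.names = FALSE)
```

The result is wide because each retained variable keeps its own column, with one row per activity / subject pair.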
