📊 Projects for the Udacity Data Analyst nanodegree

Laura-O/Udacity-Data-Analyst-Nanodegree

Udacity-Data-Analyst-Nanodegree

📊 These are projects I have made for the Udacity Data Analyst Nanodegree.

Does the code work?

I graduated from the nanodegree in May 2017. The projects in this repository are in their final state: each one is the version I submitted and that was accepted.

Reading projects other people have put on GitHub helped me a lot throughout the course, so I hope my projects will be helpful for someone as well.

P1: Test a Perceptual Phenomenon - Use descriptive statistics and a statistical test to analyze the Stroop effect

This project is a descriptive analysis and test of the Stroop effect. It was implemented in RMarkdown.
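The README does not say which statistical test the project uses. Since the Stroop task measures the same participants under two conditions, a paired t-test is one common choice, and the sketch below assumes it. The function name `paired_t_statistic` and the reaction times are made up for illustration and are not taken from the project.

```python
import math
import statistics


def paired_t_statistic(first, second):
    """Return (t, degrees_of_freedom) for a paired-samples t-test.

    Each position in `first` and `second` must belong to the same participant.
    """
    if len(first) != len(second):
        raise ValueError("both samples must have the same length")
    if len(first) < 2:
        raise ValueError("need at least two pairs")
    # Work on the per-participant differences, not on the raw samples.
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    std_err = sd_diff / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1


# Hypothetical reaction times in seconds, NOT the project's data.
congruent = [12.0, 14.0, 11.0, 15.0, 13.0]
incongruent = [18.0, 21.0, 17.0, 20.0, 22.0]

t, df = paired_t_statistic(congruent, incongruent)
print(f"t = {t:.2f}, df = {df}")
```

The resulting t value would then be compared against the critical value of the t distribution for `df` degrees of freedom.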

P2: Investigate a dataset: Titanic

For this project, I analyzed the Titanic dataset provided by kaggle.com. The main goal was to find factors associated with passengers surviving the disaster.

Here are two example plots I created:

Survival rates by gender and passenger class:

Violin plot for fare vs. passenger class

The project was created with Python in a Jupyter Notebook. You can find the notebook file here.
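As a rough idea of the kind of breakdown behind the gender and passenger class plot, here is a minimal sketch that computes survival rates per group. It uses only the standard library rather than the notebook's actual code, the helper `survival_rate_by` is hypothetical, and the rows are invented. Only the column names `Sex`, `Pclass` and `Survived` follow the Kaggle dataset.

```python
from collections import defaultdict


def survival_rate_by(rows, *keys):
    """Return {group: share of survivors} for the given grouping columns."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        group = tuple(row[key] for key in keys)
        totals[group] += 1
        # int() also handles the "0"/"1" strings produced by csv.DictReader.
        survived[group] += int(row["Survived"])
    return {group: survived[group] / totals[group] for group in totals}


# Invented rows for illustration, NOT real passenger records.
passengers = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 0},
    {"Sex": "female", "Pclass": 3, "Survived": 1},
    {"Sex": "male", "Pclass": 1, "Survived": 0},
    {"Sex": "male", "Pclass": 1, "Survived": 1},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
]

rates = survival_rate_by(passengers, "Sex", "Pclass")
for group, rate in sorted(rates.items()):
    print(group, f"{rate:.0%}")
```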

P3: Wrangle OpenStreetMap Data

In this project, I analyzed OpenStreetMap data for the area of Paderborn, Germany. This includes cleaning up the data as well as some further exploration. The project was implemented in a Jupyter Notebook with Python. The data was imported into a MongoDB instance.
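The README does not list the cleanup steps, so the following is only a sketch of what such a step could look like: it parses OSM XML, expands an abbreviated street suffix, and shapes each node into a dictionary that could be inserted into MongoDB. The sample XML, the `Str.` to `Straße` rule, and the helper names `clean_street` and `shape_node` are assumptions for illustration, and the actual database insert is left out.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written sample in OSM XML format, NOT data from the project.
SAMPLE_OSM = """<osm>
  <node id="1" lat="51.7189" lon="8.7575">
    <tag k="addr:street" v="Warburger Str."/>
    <tag k="addr:housenumber" v="100"/>
  </node>
  <node id="2" lat="51.7200" lon="8.7600">
    <tag k="amenity" v="cafe"/>
  </node>
</osm>"""


def clean_street(name):
    """Expand an abbreviated street suffix (assumed cleanup rule)."""
    if name.endswith("Str."):
        return name[: -len("Str.")] + "Straße"
    if name.endswith("str."):
        return name[: -len("str.")] + "straße"
    return name


def shape_node(node):
    """Turn an OSM <node> element into a plain dict (one document)."""
    doc = {
        "id": int(node.get("id")),
        "pos": [float(node.get("lat")), float(node.get("lon"))],
        "tags": {},
    }
    for tag in node.iter("tag"):
        key, value = tag.get("k"), tag.get("v")
        if key == "addr:street":
            value = clean_street(value)
        doc["tags"][key] = value
    return doc


root = ET.fromstring(SAMPLE_OSM)
documents = [shape_node(node) for node in root.iter("node")]
for doc in documents:
    print(doc)
```

Keeping the tags in a nested dictionary leaves the original OSM keys untouched, which makes it easy to audit what the cleanup changed.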

You can find the results here

P4: Explore and summarize data

The goal of this project was to explore the relationship between several variables describing the quality and characteristics of white wine.

This is an overview of the given dataset:

Here are some more plots to visualize the data:

The project was implemented in RMarkdown. You can download the results here (click on "Download", save the file locally and open it in a web browser).

P5: Identifying Fraud from Enron Emails

The goal of the project is to use a machine learning algorithm to identify persons of interest (POIs) in the Enron fraud case. The results are described in a Jupyter Notebook which you can find here.

Most important libraries used in this project: scikit-learn, pandas, numpy

P6: Visualize data

App on Heroku

The goal of the project is to visualize some interesting data. I chose to visualize data from the database of the World Cube Association, showing all competitors who have competed in more than 50 speedcubing competitions.

Languages and frameworks used: D3.js, HTML, Bootstrap, R

P7: Design and Analyze an A/B Test

Compared to the other projects, this project is very theoretical. It is a report about several design decisions for an A/B test, e.g. what metrics should be chosen or how long the test should run.
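To make the "how long should the test run" question concrete, here is a minimal sketch using a standard normal-approximation sample size formula for comparing two proportions. This is not necessarily the method used in the report (which was written in R). The function names, the baseline rate, the minimum detectable effect, and the traffic figure are all hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(baseline_rate, min_detectable_effect,
                          alpha=0.05, power=0.8):
    """Approximate sample size per group for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate + min_detectable_effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)


def days_needed(n_per_group, daily_visitors, groups=2):
    """Days until all groups are filled, if every visitor enters the test."""
    return math.ceil(groups * n_per_group / daily_visitors)


# Hypothetical inputs: 10% baseline rate, detect a 2 percentage point change.
n = sample_size_per_group(baseline_rate=0.10, min_detectable_effect=0.02)
days = days_needed(n, daily_visitors=1000)
print(f"{n} users per group, about {days} days")
```

A smaller detectable effect or a lower share of traffic diverted to the test pushes the duration up quickly, which is why these choices are usually made before the experiment starts.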

Languages used: R, RMarkdown