|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "# Assignment 1: About The Bike Sharing Dataset\n" |
| 8 | + ] |
| 9 | + }, |
| 10 | + { |
| 11 | + "cell_type": "markdown", |
| 12 | + "metadata": {}, |
| 13 | + "source": [ |
| 14 | + "## Step 1: Sample description\n", |
| 15 | + "\n", |
| 16 | + "1. *Study population*: People renting bikes from the Capital Bikeshare system, Washington D.C., USA, in 2011 and 2012\n", |
| 17 | + "2. *Level of analysis studied*: The data provides information on the number of bikes rented and several other factors that might influence bike rental, aggregated on an hourly basis.\n", |
| 18 | + "3. *Number of observations*: 17379\n", |
| 19 | + "4. *My analytic sample*: all data" |
| 20 | + ] |
| 21 | + }, |
| 22 | + { |
| 23 | + "cell_type": "markdown", |
| 24 | + "metadata": {}, |
| 25 | + "source": [ |
| 26 | + "## Step 2: Procedures used to collect the data\n", |
| 27 | + "The core data set is obtained from the two-year historical log for the years 2011 and 2012 of the Capital Bikeshare system, Washington D.C., USA, which is publicly available at http://capitalbikeshare.com/system-data. This data was aggregated on an hourly basis. The corresponding weather and seasonal information, extracted from http://www.freemeteo.com, was then added to it.\n", |
| 28 | + "\n", |
| 29 | + "The purpose of the dataset was anomaly and event detection (see Fanaee-T, Hadi, and Gama, Joao, \"Event labeling combining ensemble detectors and background knowledge\", Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg, doi:10.1007/s13748-013-0040-3). That is, to detect when the data is atypical and then to infer that some event or anomaly occurred (for example, they find that 'weird' values for bike rental occur during Hurricane Sandy).\n" |
| 30 | + ] |
| 31 | + }, |
| 32 | + { |
| 33 | + "cell_type": "markdown", |
| 34 | + "metadata": {}, |
| 35 | + "source": [ |
| 36 | + "## Step 3: Variables\n", |
| 37 | + "### Response variables:\n", |
| 38 | + "\n", |
| 39 | + " * *casual*: the number of bikes rented by casual users\n", |
| 40 | + " * *registered*: the number of bikes rented by registered users\n", |
| 41 | + " * *cnt*: the total number of bikes rented (=casual+registered)\n", |
| 42 | + " \n", |
| 43 | + "### Explanatory variables\n", |
| 44 | + " * *season*: coded 1 to 4; this is effectively the quarter of the year, not really the season\n", |
| 45 | + " * *yr*: year (0: 2011, 1: 2012)\n", |
| 46 | + " * *mnth*: month (1 to 12)\n", |
| 47 | + " * *hr* : hour (0 to 23)\n", |
| 48 | + " * *holiday* : whether day is holiday or not (1: holiday, 0: no holiday)\n", |
| 49 | + " * *weekday* : day of the week (0 to 6, 0 is Sunday)\n", |
| 50 | + " * *workingday* : 1 if the day is neither a weekend nor a holiday, otherwise 0.\n", |
| 51 | + " * *weathersit* : \n", |
| 52 | + " * 1: Clear, Few clouds, Partly cloudy\n", |
| 53 | + " * 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist\n", |
| 54 | + " * 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds\n", |
| 55 | + " * 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog\n", |
| 56 | + " * *temp* : Normalized temperature in Celsius (0..1). The values are divided by 41 (max)\n", |
| 57 | + " * *atemp*: Normalized feeling temperature in Celsius (0..1). The values are divided by 50 (max)\n", |
| 58 | + " * *hum*: Normalized humidity (0..1). The values are divided by 100 (max)\n", |
| 59 | + " * *windspeed*: Normalized wind speed (0..1). The values are divided by 67 (max)" |
| 60 | + ] |
| 61 | + }, |
| 62 | + { |
| 63 | + "cell_type": "markdown", |
| 64 | + "metadata": {}, |
| 65 | + "source": [ |
| 66 | + "## Managing explanatory and response variables\n", |
| 67 | + "I'm interested in predicting the number of rented bikes from the explanatory variables, that is, figuring out which of the variables best predict the number of rented bikes." |
| 68 | + ] |
| 69 | + } |
| 70 | + ], |
| 71 | + "metadata": { |
| 72 | + "kernelspec": { |
| 73 | + "display_name": "Python 3", |
| 74 | + "language": "python", |
| 75 | + "name": "python3" |
| 76 | + }, |
| 77 | + "language_info": { |
| 78 | + "codemirror_mode": { |
| 79 | + "name": "ipython", |
| 80 | + "version": 3 |
| 81 | + }, |
| 82 | + "file_extension": ".py", |
| 83 | + "mimetype": "text/x-python", |
| 84 | + "name": "python", |
| 85 | + "nbconvert_exporter": "python", |
| 86 | + "pygments_lexer": "ipython3", |
| 87 | + "version": "3.5.1" |
| 88 | + } |
| 89 | + }, |
| 90 | + "nbformat": 4, |
| 91 | + "nbformat_minor": 0 |
| 92 | +} |
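
The Step 3 cell states that *temp*, *atemp*, *hum* and *windspeed* are stored as values divided by a fixed maximum, and that *cnt* equals *casual* + *registered*. A minimal sketch of how a record could be rescaled and sanity-checked is given below. It assumes that plain multiplication by the stated maxima (41, 50, 100, 67) undoes the normalization, as the notebook describes. The function names and the example record are hypothetical and not part of the notebook or the dataset.

```python
# Scaling constants as stated in the Step 3 variable descriptions.
SCALE = {"temp": 41.0, "atemp": 50.0, "hum": 100.0, "windspeed": 67.0}


def denormalize(row):
    """Return a copy of `row` with the normalized weather columns rescaled.

    Assumes each column was normalized by dividing by the constant in SCALE.
    """
    out = dict(row)
    for name, factor in SCALE.items():
        if name in out:
            out[name] = float(out[name]) * factor
    return out


def counts_consistent(row):
    """Check that cnt equals casual + registered for one record."""
    return int(row["cnt"]) == int(row["casual"]) + int(row["registered"])


# Hypothetical record, made up for illustration (not taken from the dataset).
example = {"temp": 0.5, "atemp": 0.5, "hum": 0.8, "windspeed": 0.0,
           "casual": 3, "registered": 13, "cnt": 16}
restored = denormalize(example)
```

Applied to every row of the hourly file, `counts_consistent` would flag any record where the three response variables disagree, and `denormalize` would give weather values on their original scales for plotting.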