|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "markdown", |
| 5 | + "metadata": {}, |
| 6 | + "source": [ |
| 7 | + "# Visualizing Data with Graphs" |
| 8 | + ] |
| 9 | + }, |
| 10 | + { |
| 11 | + "cell_type": "markdown", |
| 12 | + "metadata": {}, |
| 13 | + "source": [ |
| 14 | + "### Learning Objectives" |
| 15 | + ] |
| 16 | + }, |
| 17 | + { |
| 18 | + "cell_type": "markdown", |
| 19 | + "metadata": {}, |
| 20 | + "source": [ |
| 21 | + "* Understand the components of a point on a graph: an $x$ value and a $y$ value\n", |
| 22 | + "* Understand how to plot a point on a graph from its $x$ and $y$ values\n", |
| 23 | + "* Get a sense of how to use a graphing library, like Plotly, to answer questions about our data" |
| 24 | + ] |
| 25 | + }, |
| 26 | + { |
| 27 | + "cell_type": "markdown", |
| 28 | + "metadata": {}, |
| 29 | + "source": [ |
| 30 | + "### A common problem" |
| 31 | + ] |
| 32 | + }, |
| 33 | + { |
| 34 | + "cell_type": "markdown", |
| 35 | + "metadata": {}, |
| 36 | + "source": [ |
| 37 | + "Imagine that Molly is selling cupcakes out of her kitchen. Things are beginning to pick up, so she decides to hire her friend Bob to make deliveries. Molly asks us -- seeing us as a go-to problem solver -- to figure out which customers are closest to and furthest from Bob. This way, she can compensate him -- and, let's be honest, monitor his performance -- appropriately." |
| 38 | + ] |
| 39 | + }, |
| 40 | + { |
| 41 | + "cell_type": "markdown", |
| 42 | + "metadata": {}, |
| 43 | + "source": [ |
| 44 | + "Molly gives us a list of all of the customer locations, along with Bob's. Here they are:" |
| 45 | + ] |
| 46 | + }, |
| 47 | + { |
| 48 | + "cell_type": "markdown", |
| 49 | + "metadata": {}, |
| 50 | + "source": [ |
| 51 | + "| Name | Avenue #| Block # | \n", |
| 52 | + "|------|------| ------ |\n", |
| 53 | + "| Bob | 4 | 8 | \n", |
| 54 | + "| Suzie | 1 | 11 | \n", |
| 55 | + "| Fred | 5 | 8 | \n", |
| 56 | + "| Edgar | 6 | 13 | \n", |
| 57 | + "| Steven | 3 | 6 | \n", |
| 58 | + "| Natalie| 5 | 4 | " |
| 59 | + ] |
| 60 | + }, |
| 61 | + { |
| 62 | + "cell_type": "markdown", |
| 63 | + "metadata": {}, |
| 64 | + "source": [ |
| 65 | + "Now, to figure out who is closest to Bob, we decide to plot each customer's location, as well as Bob's, on a graph." |
| 66 | + ] |
| 67 | + }, |
| 68 | + { |
| 69 | + "cell_type": "markdown", |
| 70 | + "metadata": {}, |
| 71 | + "source": [ |
| 72 | + "### Visualizing Data with Graphs" |
| 73 | + ] |
| 74 | + }, |
| 75 | + { |
| 76 | + "cell_type": "markdown", |
| 77 | + "metadata": {}, |
| 78 | + "source": [ |
| 79 | + "We want to ease into graphing data, so let's start off with a scatter plot of just one random point, the point $(2, 1)$." |
| 80 | + ] |
| 81 | + }, |
| 82 | + { |
| 83 | + "cell_type": "markdown", |
| 84 | + "metadata": {}, |
| 85 | + "source": [ |
| 86 | + "" |
| 87 | + ] |
| 88 | + }, |
| 89 | + { |
| 90 | + "cell_type": "markdown", |
| 91 | + "metadata": {}, |
| 92 | + "source": [ |
| 93 | + "OK, so the graph above is our first introduction to the **Cartesian coordinate system**. The coordinate system is used to display data along both an x-axis and a y-axis. The **x-axis** runs horizontally, from left to right, and you can see it as the labeled gray line along the bottom. The **y-axis** runs vertically, from bottom to top. You can see it labeled on the far left of our graph." |
| 94 | + ] |
| 95 | + }, |
| 96 | + { |
| 97 | + "cell_type": "markdown", |
| 98 | + "metadata": {}, |
| 99 | + "source": [ |
| 100 | + "Our graph may show the x-axis starting at -4 and the y-axis starting at -1, but that's just the portion being displayed. In reality, you can imagine the x-axis and y-axis both extending from negative infinity to positive infinity. The marker in the center of our graph represents the point where $x = 2$ and $y = 1$. Do you see why? It's the place where the $x$ value is $2$ and the $y$ value is $1$. As a shorthand, mathematicians express this point as $(2, 1)$. The format is $(x, y)$, with the $x$ coordinate always coming first.\n", |
| 101 | + "\n", |
| 102 | + "The light-gray lines form a grid to help us see where any given **point** is on the graph. Now, test your knowledge by moving your mouse to the point $(4, 2)$. Did you get it? It's the spot at the top right of the graph." |
| 103 | + ] |
| 104 | + }, |
| 105 | + { |
| 106 | + "cell_type": "markdown", |
| 107 | + "metadata": {}, |
| 108 | + "source": [ |
| 109 | + "### Plotting our data" |
| 110 | + ] |
| 111 | + }, |
| 112 | + { |
| 113 | + "cell_type": "markdown", |
| 114 | + "metadata": {}, |
| 115 | + "source": [ |
| 116 | + "Ok, now let's plot the data given. \n", |
| 117 | + "\n", |
| 118 | + "\n", |
| 119 | + "| Name | Avenue #| Block # | \n", |
| 120 | + "|------|------| ------ |\n", |
| 121 | + "| Bob | 4 | 8 | \n", |
| 122 | + "| Suzie | 1 | 11 | \n", |
| 123 | + "| Fred | 5 | 8 | \n", |
| 124 | + "| Edgar | 6 | 13 | \n", |
| 125 | + "| Steven | 3 | 6 | \n", |
| 126 | + "| Natalie| 5 | 4 | \n" |
| 127 | + ] |
| 128 | + }, |
| 129 | + { |
| 130 | + "cell_type": "markdown", |
| 131 | + "metadata": {}, |
| 132 | + "source": [ |
| 133 | + "Python does not come with graphing tools built in, so we need to install a library. This is easy enough: go to your terminal, type `pip install plotly`, and press Enter. Alternatively, press Shift+Enter on the cell below. If `plotly` is already installed, you will see a message saying so, which you can safely ignore." |
| 134 | + ] |
| 135 | + }, |
| 136 | + { |
| 137 | + "cell_type": "code", |
| 138 | + "execution_count": null, |
| 139 | + "metadata": { |
| 140 | + "collapsed": true |
| 141 | + }, |
| 142 | + "outputs": [], |
| 143 | + "source": [ |
| 144 | + "!pip install plotly" |
| 145 | + ] |
| 146 | + }, |
| 147 | + { |
| 148 | + "cell_type": "markdown", |
| 149 | + "metadata": {}, |
| 150 | + "source": [ |
| 151 | + "Now we have `plotly` on our computer. The next step is to get it into this notebook. We do so with the following two lines." |
| 152 | + ] |
| 153 | + }, |
| 154 | + { |
| 155 | + "cell_type": "code", |
| 156 | + "execution_count": 22, |
| 157 | + "metadata": {}, |
| 158 | + "outputs": [ |
| 159 | + { |
| 160 | + "data": { |
| 161 | + "text/html": [ |
| 162 | + "<script>requirejs.config({paths: { 'plotly': ['https://cdn.plot.ly/plotly-latest.min']},});if(!window.Plotly) {{require(['plotly'],function(plotly) {window.Plotly=plotly;});}}</script>" |
| 163 | + ], |
| 164 | + "text/vnd.plotly.v1+html": [ |
| 165 | + "<script>requirejs.config({paths: { 'plotly': ['https://cdn.plot.ly/plotly-latest.min']},});if(!window.Plotly) {{require(['plotly'],function(plotly) {window.Plotly=plotly;});}}</script>" |
| 166 | + ] |
| 167 | + }, |
| 168 | + "metadata": {}, |
| 169 | + "output_type": "display_data" |
| 170 | + } |
| 171 | + ], |
| 172 | + "source": [ |
| 173 | + "import plotly\n", |
| 174 | + "\n", |
| 175 | + "plotly.offline.init_notebook_mode(connected=True)\n", |
| 176 | + "# use offline mode to avoid initial registration" |
| 177 | + ] |
| 178 | + }, |
| 179 | + { |
| 180 | + "cell_type": "markdown", |
| 181 | + "metadata": {}, |
| 182 | + "source": [ |
| 183 | + "We bring in the `plotly` library by using the keyword `import` followed by our library name, `plotly`. Next, we create a new dictionary in Python with the `dict` constructor, passing **named arguments** to create an `x` key that points to a list of $x$ values and a `y` key that points to a list of $y$ values. Note that the $x$ values correspond to avenue numbers and the $y$ values correspond to block numbers. We display this data by assigning our dictionary to the variable `trace0` and passing it, inside a list, as an argument to the `plotly.offline.iplot` function. " |
| 184 | + ] |
| 185 | + }, |
| 186 | + { |
| 187 | + "cell_type": "code", |
| 188 | + "execution_count": 23, |
| 189 | + "metadata": {}, |
| 190 | + "outputs": [ |
| 191 | + { |
| 192 | + "data": { |
| 193 | + "application/vnd.plotly.v1+json": { |
| 194 | + "data": [ |
| 195 | + { |
| 196 | + "x": [ |
| 197 | + 4, |
| 198 | + 1, |
| 199 | + 5, |
| 200 | + 6, |
| 201 | + 3, |
| 202 | + 2 |
| 203 | + ], |
| 204 | + "y": [ |
| 205 | + 8, |
| 206 | + 11, |
| 207 | + 8, |
| 208 | + 13, |
| 209 | + 6, |
| 210 | + 4 |
| 211 | + ] |
| 212 | + } |
| 213 | + ], |
| 214 | + "layout": {} |
| 215 | + }, |
| 216 | + "text/html": [ |
| 217 | + "<div id=\"b3d96e38-ac22-49d2-a5ea-0f81a1d59caf\" style=\"height: 525px; width: 100%;\" class=\"plotly-graph-div\"></div><script type=\"text/javascript\">require([\"plotly\"], function(Plotly) { window.PLOTLYENV=window.PLOTLYENV || {};window.PLOTLYENV.BASE_URL=\"https://plot.ly\";Plotly.newPlot(\"b3d96e38-ac22-49d2-a5ea-0f81a1d59caf\", [{\"x\": [4, 1, 5, 6, 3, 2], \"y\": [8, 11, 8, 13, 6, 4]}], {}, {\"showLink\": true, \"linkText\": \"Export to plot.ly\"})});</script>" |
| 218 | + ], |
| 219 | + "text/vnd.plotly.v1+html": [ |
| 220 | + "<div id=\"b3d96e38-ac22-49d2-a5ea-0f81a1d59caf\" style=\"height: 525px; width: 100%;\" class=\"plotly-graph-div\"></div><script type=\"text/javascript\">require([\"plotly\"], function(Plotly) { window.PLOTLYENV=window.PLOTLYENV || {};window.PLOTLYENV.BASE_URL=\"https://plot.ly\";Plotly.newPlot(\"b3d96e38-ac22-49d2-a5ea-0f81a1d59caf\", [{\"x\": [4, 1, 5, 6, 3, 2], \"y\": [8, 11, 8, 13, 6, 4]}], {}, {\"showLink\": true, \"linkText\": \"Export to plot.ly\"})});</script>" |
| 221 | + ] |
| 222 | + }, |
| 223 | + "metadata": {}, |
| 224 | + "output_type": "display_data" |
| 225 | + } |
| 226 | + ], |
| 227 | + "source": [ |
| 228 | + "trace0 = dict(x=[4, 1, 5, 6, 3, 2], y=[8, 11, 8, 13, 6, 4])\n", |
| 229 | + "\n", |
| 230 | + "# All that, and it doesn't even look good :(\n", |
| 231 | + "plotly.offline.iplot([trace0])" |
| 232 | + ] |
| 233 | + }, |
| 234 | + { |
| 235 | + "cell_type": "markdown", |
| 236 | + "metadata": {}, |
| 237 | + "source": [ |
| 238 | + "The points were plotted correctly, but they are connected by a line, which doesn't represent anything in particular." |
| 239 | + ] |
| 240 | + }, |
| 241 | + { |
| 242 | + "cell_type": "markdown", |
| 243 | + "metadata": {}, |
| 244 | + "source": [ |
| 245 | + "Let's remove the lines by setting `mode=\"markers\"`. Then, let's also add a label to each dot by setting `text` equal to a list of names. " |
| 246 | + ] |
| 247 | + }, |
| 248 | + { |
| 249 | + "cell_type": "code", |
| 250 | + "execution_count": 25, |
| 251 | + "metadata": {}, |
| 252 | + "outputs": [ |
| 253 | + { |
| 254 | + "data": { |
| 255 | + "application/vnd.plotly.v1+json": { |
| 256 | + "data": [ |
| 257 | + { |
| 258 | + "mode": "markers", |
| 259 | + "text": [ |
| 260 | + "bob", |
| 261 | + "suzie", |
| 262 | + "fred", |
| 263 | + "edgar", |
| 264 | + "steven", |
| 265 | + "natalie" |
| 266 | + ], |
| 267 | + "x": [ |
| 268 | + 4, |
| 269 | + 1, |
| 270 | + 5, |
| 271 | + 6, |
| 272 | + 3, |
| 273 | + 2 |
| 274 | + ], |
| 275 | + "y": [ |
| 276 | + 8, |
| 277 | + 11, |
| 278 | + 8, |
| 279 | + 13, |
| 280 | + 6, |
| 281 | + 4 |
| 282 | + ] |
| 283 | + } |
| 284 | + ], |
| 285 | + "layout": {} |
| 286 | + }, |
| 287 | + "text/html": [ |
| 288 | + "<div id=\"d46f853c-0243-4540-8171-7f902e9ffa77\" style=\"height: 525px; width: 100%;\" class=\"plotly-graph-div\"></div><script type=\"text/javascript\">require([\"plotly\"], function(Plotly) { window.PLOTLYENV=window.PLOTLYENV || {};window.PLOTLYENV.BASE_URL=\"https://plot.ly\";Plotly.newPlot(\"d46f853c-0243-4540-8171-7f902e9ffa77\", [{\"x\": [4, 1, 5, 6, 3, 2], \"y\": [8, 11, 8, 13, 6, 4], \"mode\": \"markers\", \"text\": [\"bob\", \"suzie\", \"fred\", \"edgar\", \"steven\", \"natalie\"]}], {}, {\"showLink\": true, \"linkText\": \"Export to plot.ly\"})});</script>" |
| 289 | + ], |
| 290 | + "text/vnd.plotly.v1+html": [ |
| 291 | + "<div id=\"d46f853c-0243-4540-8171-7f902e9ffa77\" style=\"height: 525px; width: 100%;\" class=\"plotly-graph-div\"></div><script type=\"text/javascript\">require([\"plotly\"], function(Plotly) { window.PLOTLYENV=window.PLOTLYENV || {};window.PLOTLYENV.BASE_URL=\"https://plot.ly\";Plotly.newPlot(\"d46f853c-0243-4540-8171-7f902e9ffa77\", [{\"x\": [4, 1, 5, 6, 3, 2], \"y\": [8, 11, 8, 13, 6, 4], \"mode\": \"markers\", \"text\": [\"bob\", \"suzie\", \"fred\", \"edgar\", \"steven\", \"natalie\"]}], {}, {\"showLink\": true, \"linkText\": \"Export to plot.ly\"})});</script>" |
| 292 | + ] |
| 293 | + }, |
| 294 | + "metadata": {}, |
| 295 | + "output_type": "display_data" |
| 296 | + } |
| 297 | + ], |
| 298 | + "source": [ |
| 299 | + "trace1 = dict(x=[4, 1, 5, 6, 3, 2],\n", |
| 300 | + " y=[8, 11, 8, 13, 6, 4], \n", |
| 301 | + " mode=\"markers\", \n", |
| 302 | + " text=[\"bob\", \"suzie\", \"fred\", \"edgar\", \"steven\", \"natalie\"],)\n", |
| 303 | + "\n", |
| 304 | + "\n", |
| 305 | + "plotly.offline.iplot([trace1])\n", |
| 306 | + "\n", |
| 307 | + "# much better :)" |
| 308 | + ] |
| 309 | + }, |
| 310 | + { |
| 311 | + "cell_type": "markdown", |
| 312 | + "metadata": {}, |
| 313 | + "source": [ |
| 314 | + "OK, so if you move your mouse over the dots, you can see the name that corresponds to each point. For example, when we hover over the dot at $(4, 8)$, we can see that it is Bob's point, just as it should be. Now, who is closest to Bob? It looks like Fred is, so his is the easiest delivery for Bob." |
| 315 | + ] |
| 316 | + }, |
| 317 | + { |
| 318 | + "cell_type": "markdown", |
| 319 | + "metadata": {}, |
| 320 | + "source": [ |
| 321 | + "### Summary" |
| 322 | + ] |
| 323 | + }, |
| 324 | + { |
| 325 | + "cell_type": "markdown", |
| 326 | + "metadata": {}, |
| 327 | + "source": [ |
| 328 | + "In this section, we saw how to use data visualizations to better understand our data. A Cartesian coordinate system nicely represents two-dimensional data. It represents a point's $x$ value by placing the point at the correct spot along the horizontal x-axis, and it represents a point's $y$ value by placing the point at the correct spot along the vertical y-axis.\n", |
| 329 | + "\n", |
| 330 | + "To display the data with `plotly`, we need to do a couple of things. First, we install `plotly` by going to our terminal and running `pip install plotly`. Then, to use the library, we import `plotly` into our notebook. Once the library is loaded, we create a new dictionary with keys `x` and `y`, each pointing to a list of the $x$ or $y$ values of our points, and pass it inside a list to `plotly.offline.iplot`. To clean up the appearance, we set the `mode` attribute equal to `'markers'`." |
| 331 | + ] |
| 332 | + } |
| 333 | + ], |
| 334 | + "metadata": { |
| 335 | + "kernelspec": { |
| 336 | + "display_name": "Python 3", |
| 337 | + "language": "python", |
| 338 | + "name": "python3" |
| 339 | + }, |
| 340 | + "language_info": { |
| 341 | + "codemirror_mode": { |
| 342 | + "name": "ipython", |
| 343 | + "version": 3 |
| 344 | + }, |
| 345 | + "file_extension": ".py", |
| 346 | + "mimetype": "text/x-python", |
| 347 | + "name": "python", |
| 348 | + "nbconvert_exporter": "python", |
| 349 | + "pygments_lexer": "ipython3", |
| 350 | + "version": "3.6.1" |
| 351 | + } |
| 352 | + }, |
| 353 | + "nbformat": 4, |
| 354 | + "nbformat_minor": 2 |
| 355 | +} |