Moved the exercises for lecture 5 into the lecture notes.

Christian Jacobs · Christian Jacobs · commit e40797f7518c · 2014-11-03T21:27:23.000Z
diff --git a/notebook/Lecture-5-Introduction-to-programming-for-geoscientists.ipynb b/notebook/Lecture-5-Introduction-to-programming-for-geoscientists.ipynb
@@ -1,18 +1,18 @@
 {
  "metadata": {
-  "name": ""
+  "name": "Lecture-5-Introduction-to-programming-for-geoscientists"
  },
  "nbformat": 3,
  "nbformat_minor": 0,
  "worksheets": [
   {
    "cells": [
     {
-     "cell_type": "heading",
-     "level": 1,
+     "cell_type": "markdown",
      "metadata": {},
      "source": [
-      "Introduction to programming for Geoscientists (through Python)"
+      "#Introduction to programming for Geoscientists (through Python)\n",
+      "###[Gerard Gorman](http://www.imperial.ac.uk/people/g.gorman), [Christian Jacobs](http://www.imperial.ac.uk/people/c.jacobs10)"
      ]
     },
     {
@@ -21,13 +21,11 @@
      "source": [
       "#Lecture 5: Files, strings, and dictionaries\n",
       "\n",
-      "[Gerard J. Gorman](http://www.imperial.ac.uk/people/g.gorman) <g.gorman@imperial.ac.uk>\n",
-      "\n",
       "Learning objectives: You will learn how to:\n",
       "\n",
-      "1. Read data in from a file\n",
-      "2. Parse strings to extract specific data of interest.\n",
-      "3. Use dictionaries to index data using any type of key."
+      "* Read data in from a file\n",
+      "* Parse strings to extract specific data of interest.\n",
+      "* Use dictionaries to index data using any type of key."
      ]
     },
     {
@@ -259,6 +257,99 @@
       "You will notice in the above example that we used the *split()* string member function. This is a very useful function for grabbing individual words on a line. When called without any arguments it assumes that the [delimiter](http://en.wikipedia.org/wiki/Delimiter) is a blank space. However, you can use this to split a string with any delimiter, *e.g.*, *line.split(';')*, *line.split(':')*."
      ]
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 1: Read a two-column data file</span>\n",
+      "The file *data/xy.dat* contains two columns of numbers, corresponding to *x* and *y* coordinates on a curve. The start of the file looks like this:\n",
+      "\n",
+      "-1.0000   -0.0000<br>\n",
+      "-0.9933   -0.0087<br>\n",
+      "-0.9867   -0.0179<br>\n",
+      "-0.9800   -0.0274<br>\n",
+      "-0.9733   -0.0374<br>\n",
+      "\n",
+      "Make a program that reads the first column into a list *x* and the second column into a list *y*. Then convert the lists to arrays, and plot the curve. Print out the maximum and minimum *y* coordinates. (Hint: Read the file line by line, split each line into words, convert to float, and append to *x* and *y*.)"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 2: Read a data file</span>\n",
+      "The files *data/density_water.dat* and *data/density_air.dat* contain data about the density of water and air (respectively) at different temperatures. The data files have some comment lines starting with # and some blank lines. The remaining lines contain density data: the temperature in the first column and the corresponding density in the second column. The goal of this exercise is to read the data in such a file and plot the density versus the temperature as distinct (small) circles for each data point. Let the program take the name of the data file via *raw_input*. Apply the program to both files."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 3: Read acceleration data and find velocities</span>\n",
+      "A file data/acc.dat contains measurements $a_0, a_1, \\ldots, a_{n-1}$ of the acceleration of an object moving along a straight line. The measurement $a_k$ is taken at time point $t_k = k\\Delta t$, where $\\Delta t$ is the time spacing between the measurements. The purpose of the exercise is to load the acceleration data into a program and compute the velocity $v(t)$ of the object at some time $t$.\n",
+      "\n",
+      "In general, the acceleration $a(t)$ is related to the velocity $v(t)$ through $v^\\prime(t) = a(t)$. This means that\n",
+      "\n",
+      "$$\n",
+      "v(t) = v(0) + \\int_0^t{a(\\tau)d\\tau}\n",
+      "$$\n",
+      "\n",
+      "If $a(t)$ is only known at some discrete, equally spaced points in time, $a_0, \\ldots, a_{n-1}$ (which is the case in this exercise), we must compute the integral above numerically, for example by the Trapezoidal rule:\n",
+      "\n",
+      "$$\n",
+      "v(t_k) \\approx \\Delta t \\left(\\frac{1}{2}a_0 + \\frac{1}{2}a_k + \\sum_{i=1}^{k-1}a_i \\right), \\ \\ 1 \\leq k \\leq n-1. \n",
+      "$$\n",
+      "\n",
+      "We assume $v(0) = 0$ so that also $v_0 = 0$.\n",
+      "Read the values $a_0, \\ldots, a_{n-1}$ from file into an array, plot the acceleration versus time, and use the Trapezoidal rule to compute one $v(t_k)$ value, where $\\Delta t$ and $k \\geq 1$ are specified using raw_input."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 4: Read acceleration data and plot velocities</span>\n",
+      "The task in this exercise is the same as the one above, except that we now want to compute $v(t_k)$ for all time points $t_k = k\\Delta t$ and plot the velocity versus time. Repeated use of the Trapezoidal rule for all $k$ values is very inefficient. A more efficient formula arises if we add the area of a new trapezoid to the previous integral:\n",
+      "\n",
+      "$$\n",
+      "v(t_k) = v(t_{k-1}) + \\int_{t_{k-1}}^{t_k}a(\\tau)\\ d\\tau \\approx v(t_{k-1}) + \\Delta t \\frac{1}{2}\\left(a_{k-1} + a_k\\right), \n",
+      "$$\n",
+      "\n",
+      "for $k = 1, 2, \\ldots, n-1$, while $v_0 = 0$. Use this formula to fill an array *v* with velocity values. Now only $\\Delta t$ is given via raw_input, and the $a_0, \\ldots, a_{n-1}$ values must be read from file as in the previous exercise."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
@@ -588,6 +679,90 @@
      ],
      "prompt_number": 227
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 5: Make a dictionary from a table</span>\n",
+      "The file *data/constants.txt* contains a table of the values and the dimensions of some fundamental constants from physics. We want to load this table into a dictionary *constants*, where the keys are the names of the constants. For example, *constants['gravitational constant']* holds the value of the gravitational constant (6.67259 $\\times$ 10$^{-11}$) in Newton's law of gravitation. Make a function that reads and interprets the text in the file, and thereafter returns the dictionary."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 6: Explore syntax differences: lists vs. dictionaries</span>\n",
+      "Consider this code:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "t1 = {}\n",
+      "t1[0] = -5\n",
+      "t1[1] = 10.5"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "Explain why the lines above work fine while the ones below do not:"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "t2 = []\n",
+      "t2[0] = -5\n",
+      "t2[1] = 10.5"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "What must be done in the last code snippet to make it work properly?"
+     ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 7: Compute the area of a triangle</span>\n",
+      "An arbitrary triangle can be described by the coordinates of its three vertices: $(x_1, y_1), (x_2, y_2), (x_3, y_3)$, numbered in a counterclockwise direction. The area of the triangle is given by the formula:\n",
+      "\n",
+      "$A = \\frac{1}{2}|x_2y_3 - x_3y_2 - x_1y_3 + x_3y_1 + x_1y_2 - x_2y_1|.$\n",
+      "\n",
+      "Write a function *area(vertices)* that returns the area of a triangle whose vertices are specified by the argument vertices, which is a nested list of the vertex coordinates. For example, vertices can be [[0,0], [1,0], [0,2]] if the three corners of the triangle have coordinates (0, 0), (1, 0), and (0, 2).\n",
+      "\n",
+      "Then, assume that the vertices of the triangle are stored in a dictionary and not a list. The keys in the dictionary correspond to the vertex number (1, 2, or 3) while the values are 2-tuples with the x and y coordinates of the vertex. For example, in a triangle with vertices (0, 0), (1, 0), and (0, 2) the vertices argument becomes {1: (0,0), 2: (1,0), 3: (0,2)}."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
@@ -1140,12 +1315,57 @@
      ],
      "prompt_number": 255
     },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 8: Improve a program</span>\n",
+      "The file *data/densities.dat* contains a table of densities of various substances measured in g/cm$^3$. The following program reads the data in this file and produces a dictionary whose keys are the names of substances, and the values are the corresponding densities."
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "def read_densities(filename):\n",
+      "    infile = open(filename, 'r')\n",
+      "    densities = {}\n",
+      "    for line in infile:\n",
+      "        words = line.split()\n",
+      "        density = float(words[-1])\n",
+      "    \n",
+      "        if len(words[:-1]) == 2:\n",
+      "            substance = words[0] + ' ' + words[1]\n",
+      "        else:\n",
+      "            substance = words[0]\n",
+      "        \n",
+      "        densities[substance] = density\n",
+      "    \n",
+      "    infile.close()\n",
+      "    return densities\n",
+      "\n",
+      "densities = read_densities('data/densities.dat')"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "One problem we face when implementing the program above is that the name of the substance can contain one or two words, and maybe more words in a more comprehensive table. The purpose of this exercise is to use string operations to shorten the code and make it more general. Implement the following two methods in separate functions in the same program, and check that they give the same result.\n",
+      "\n",
+      "1. Let *substance* consist of all the words but the last, using the join method of string objects to combine the words.\n",
+      "2. Observe that all the densities start in the same column of the file and use substrings to divide each line into two parts. (Hint: Remember to strip the first part such that, e.g., the density of ice is obtained as *densities['ice']* and not *densities['ice     ']*.)"
+     ]
+    },
     {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
       "##File writing\n",
-      "\ufffc\ufffc\ufffc\ufffc\ufffcWriting a file in Python is simple. You just collect the text you want to write in one or more strings and, for each string, use a statement along the lines of"
+      "Writing a file in Python is simple. You just collect the text you want to write in one or more strings and, for each string, use a statement along the lines of"
      ]
     },
     {
@@ -1205,6 +1425,24 @@
      "source": [
       "And that's it - run the above cell and take a look at the file that was generated in the folder you run IPython from."
      ]
+    },
+    {
+     "cell_type": "markdown",
+     "metadata": {},
+     "source": [
+      "## <span style=\"color:blue\">Exercise 9: Write function data to a file</span>\n",
+      "We want to dump $x$ and $f(x)$ values to a file named function_data.dat, where the $x$ values appear in the first column and the $f(x)$ values appear in the second. Choose $n$ equally spaced $x$ values in the interval [-4, 4]. Here, the function $f(x)$ is given by:\n",
+      "\n",
+      "$f(x) = \\frac{1}{\\sqrt{2\\pi}}\\exp(-0.5x^2)$"
+     ]
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
     }
    ],
    "metadata": {}