Merge pull request #23 from jonathanrocher/feature/update_pandas_0.18

jonathanrocher · web-flow · commit b878c2025e9d · 2016-06-29T17:47:18.000-06:00
Feature: update to pandas 0.18
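
The notebook changes below follow three API migrations: `resample()` now needs an explicit aggregation, `pd.rolling_mean()` becomes the `.rolling()` method, and `DataFrame.sort()` becomes `sort_values()`. A minimal sketch of the new call style, assuming pandas 0.18 or later; the series and the station table here are hypothetical stand-ins, not the tutorial's data:

```python
import numpy as np
import pandas as pd

# Hypothetical daily series standing in for the notebook's temperature data.
index = pd.date_range("2000-01-01", periods=90, freq="D")
temps = pd.Series(np.arange(90, dtype=float), index=index)

# resample() returns a deferred Resampler object; the aggregation is explicit.
monthly = temps.resample("MS").mean()

# pd.rolling_mean(obj, 10) becomes obj.rolling(window=10).mean().
smoothed = temps.rolling(window=10, center=False).mean()

# DataFrame.sort(col) becomes DataFrame.sort_values(by=col).
stations = pd.DataFrame({"Country": ["NO", "FR", "DE"],
                         "Date": [1990, 1985, 2001]})
by_date = stations.sort_values(by="Date")
```

Calling `resample()` or `rolling()` without a following aggregation yields an intermediate object rather than data, which is why each affected cell in the diff gains a `.mean()` call.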
diff --git a/README.rst b/README.rst
@@ -47,10 +47,10 @@ Packages needed
 If you already have a working distribution, you will need to make sure that you
 install or update all needed packages. To be able to run the examples, demoes
 and exercises, you must have the following packages installed:
-- pandas 0.15+
-- numpy 1.9+
-- matplotlib 1.4+
-- pytables 3.1.1+
+- pandas 0.18+
+- numpy 1.10+
+- matplotlib 1.5+
+- pytables 3.1+
 - jupyter 1.0 or ipython 4.0+ (for running, experimenting and doing exercises)
 - nose (only to test your python installation)
 
diff --git a/climate_timeseries/climate_timeseries.ipynb b/climate_timeseries/climate_timeseries.ipynb
@@ -76,8 +76,7 @@
     "import numpy as np\n",
     "import matplotlib.pyplot as plt\n",
     "\n",
-    "from pandas import set_option\n",
-    "set_option(\"display.max_rows\", 16)\n",
+    "pd.set_option(\"display.max_rows\", 16)\n",
     "\n",
     "LARGE_FIGSIZE = (12, 8)"
    ]
@@ -91,7 +90,7 @@
    "outputs": [],
    "source": [
     "# Change this cell to the demo location on YOUR machine\n",
-    "%cd ~/Projects/SciPy2015_pandas_tutorial/demos/climate_timeseries/\n",
+    "%cd ~/Projects/pandas_tutorial/climate_timeseries/\n",
     "%ls"
    ]
   },
@@ -1714,7 +1713,7 @@
    "source": [
     "# Frequencies can be specified as strings: \"us\", \"ms\", \"S\", \"T\", \"H\", \"D\", \"B\", \"W\", \"M\", \"A\", \"3min\", \"2h20\", ...\n",
     "# More aliases at http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases\n",
-    "full_globe_temp.resample(\"M\")"
+    "full_globe_temp.resample(\"M\").mean()"
    ]
   },
   {
@@ -1725,7 +1724,7 @@
    },
    "outputs": [],
    "source": [
-    "full_globe_temp.resample(\"10A\", how=\"mean\")"
+    "full_globe_temp.resample(\"10A\").mean()"
    ]
   },
   {
@@ -1920,7 +1919,7 @@
    },
    "outputs": [],
    "source": [
-    "local_sea_level_stations.sort(\"Date\")"
+    "local_sea_level_stations.sort_values(by=\"Date\")"
    ]
   },
   {
@@ -1938,7 +1937,7 @@
    },
    "outputs": [],
    "source": [
-    "local_sea_level_stations.sort([\"Date\", \"Country\"], ascending=False)"
+    "local_sea_level_stations.sort_values(by=[\"Date\", \"Country\"], ascending=False)"
    ]
   },
   {
@@ -2186,7 +2185,9 @@
    "outputs": [],
    "source": [
     "full_globe_temp.plot()\n",
-    "pd.rolling_mean(full_globe_temp, 10).plot(figsize=LARGE_FIGSIZE)"
+    "rolled_series = full_globe_temp.rolling(window=10, center=False)\n",
+    "print rolled_series\n",
+    "rolled_series.mean().plot(figsize=LARGE_FIGSIZE)"
    ]
   },
   {
@@ -2648,7 +2649,7 @@
    },
    "outputs": [],
    "source": [
-    "european_stations.sort(\"Country\")"
+    "european_stations.sort_values(by=\"Country\")"
    ]
   },
   {
@@ -2817,7 +2818,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "There are 2 objects constructors inside Pandas and inside `statsmodels`. There has been talks about merging the 2 into SM, but that hasn't happened yet. OLS in statsmodels allows more complex formulas:"
+    "The recommended way to build ordinary least squares regressions is by using `statsmodels`."
    ]
   },
   {
@@ -2888,39 +2889,6 @@
     "plt.legend(loc=\"upper left\")"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "OLS in pandas requires to pass a `y` series and an `x` series to do a fit of the form `y ~ x`. But the formula can be more complex by providing a `DataFrame` for x and reproduce a formula of the form `y ~ x1 + x2`. \n",
-    "\n",
-    "Also, OLS in pandas allows to do rolling and expanding OLS:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [],
-   "source": [
-    "from pandas.stats.api import ols as pdols"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "collapsed": true
-   },
-   "outputs": [],
-   "source": [
-    "# Same fit as above:\n",
-    "pd_model = pdols(y=mean_sea_level[\"mean_global\"], x=mean_sea_level[[\"northern_hem\", \"southern_hem\"]])\n",
-    "pd_model"
-   ]
-  },
   {
    "cell_type": "code",
    "execution_count": null,
@@ -3140,7 +3108,7 @@
    "source": [
     "# Not constant reads apparently. Let's downscale the frequency of the sea levels \n",
     "# to monthly, like the temperature reads we have:\n",
-    "monthly_mean_sea_level = mean_sea_level.resample(\"MS\").to_period()\n",
+    "monthly_mean_sea_level = mean_sea_level.resample(\"MS\").mean().to_period()\n",
     "monthly_mean_sea_level"
    ]
   },
@@ -3259,8 +3227,9 @@
    },
    "outputs": [],
    "source": [
-    "model = sm.ols(\"southern_hem ~ global_temp\", data=aligned_monthly_data).fit()\n",
-    "model.rsquared"
+    "model = sm.ols(\"southern_hem ~ global_temp\", data=aligned_monthly_data)\n",
+    "params = model.fit()\n",
+    "params.rsquared"
    ]
   },
   {
@@ -3278,7 +3247,7 @@
    },
    "outputs": [],
    "source": [
-    "aligned_yearly_data = aligned_monthly_data.resample(\"A\")\n",
+    "aligned_yearly_data = aligned_monthly_data.resample(\"A\").mean()\n",
     "aligned_yearly_data.plot()"
    ]
   },
@@ -3329,7 +3298,7 @@
    "source": [
     "import statsmodels as sm\n",
     "# Let's remove seasonal variations by resampling annually\n",
-    "data = giss_temp_series.resample(\"A\").to_timestamp()\n",
+    "data = giss_temp_series.resample(\"A\").mean().to_timestamp()\n",
     "ar_model = sm.tsa.ar_model.AR(data, freq='A')\n",
     "ar_res = ar_model.fit(maxlag=60, disp=True)"
    ]
@@ -3370,36 +3339,6 @@
    "source": [
     "# Your code here"
    ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Want to practice more?"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "**EXERCISE (computations):** Refer to `exercises/stock_returns/stock_returns.py`"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "**EXERCISE (stats, groupby, timeseries):** Refer to `exercises/pandas_wind_statistics/pandas_wind_statistics.py`"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
-   "outputs": [],
-   "source": []
   }
  ],
  "metadata": {