DOC: fix-up docs for 0.15.2 release #9058

Merged
4 changes: 2 additions & 2 deletions doc/source/io.rst
@@ -3403,7 +3403,7 @@ writes ``data`` to the database in batches of 1000 rows at a time:
data.to_sql('data_chunked', engine, chunksize=1000)

SQL data types
""""""""""""""
++++++++++++++

:func:`~pandas.DataFrame.to_sql` will try to map your data to an appropriate
SQL data type based on the dtype of the data. When you have columns of dtype
@@ -3801,7 +3801,7 @@ is lost when exporting.
Labeled data can similarly be imported from *Stata* data files as ``Categorical``
variables using the keyword argument ``convert_categoricals`` (``True`` by default).
The keyword argument ``order_categoricals`` (``True`` by default) determines
whether imported ``Categorical`` variables are ordered.
whether imported ``Categorical`` variables are ordered.

.. note::

4 changes: 3 additions & 1 deletion doc/source/release.rst
@@ -50,7 +50,9 @@ pandas 0.15.2

**Release date:** (December 12, 2014)

This is a minor release from 0.15.1 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes.
This is a minor release from 0.15.1 and includes a large number of bug fixes
along with several new features, enhancements, and performance improvements.
A small number of API changes were necessary to fix existing bugs.

ok by me


See the :ref:`v0.15.2 Whatsnew <whatsnew_0152>` overview for an extensive list
of all API changes, enhancements and bugs that have been fixed in 0.15.2.
157 changes: 75 additions & 82 deletions doc/source/whatsnew/v0.15.2.txt
@@ -3,9 +3,10 @@
v0.15.2 (December 12, 2014)
---------------------------

This is a minor release from 0.15.1 and includes a small number of API changes, several new features,
enhancements, and performance improvements along with a large number of bug fixes. We recommend that all
users upgrade to this version.
This is a minor release from 0.15.1 and includes a large number of bug fixes
along with several new features, enhancements, and performance improvements.
A small number of API changes were necessary to fix existing bugs.
We recommend that all users upgrade to this version.

- :ref:`Enhancements <whatsnew_0152.enhancements>`
- :ref:`API Changes <whatsnew_0152.api>`
@@ -16,6 +17,7 @@ users upgrade to this version.

API changes
~~~~~~~~~~~

- Indexing in ``MultiIndex`` beyond lex-sort depth is now supported, though
a lexically sorted index will have better performance. (:issue:`2646`)

@@ -38,24 +40,30 @@ API changes
df2.index.lexsort_depth
df2.loc[(1,'z')]

- Bug in concat of Series with ``category`` dtype, which was coercing the result to ``object``. (:issue:`8641`)

- Bug in unique of Series with ``category`` dtype, which returned all categories regardless
of whether they were "used" or not (see :issue:`8559` for the discussion).
Previous behaviour was to return all categories:

- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters. ``Series.all``, ``Series.any``, ``Index.all``, and ``Index.any`` no longer support the ``out`` and ``keepdims`` parameters, which existed for compatibility with ndarray. Various index types no longer support the ``all`` and ``any`` aggregation functions and will now raise ``TypeError``. (:issue:`8302`):
.. code-block:: python

.. ipython:: python
In [3]: cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b', 'c'])

s = pd.Series([False, True, False], index=[0, 0, 1])
s.any(level=0)
In [4]: cat
Out[4]:
[a, b, a]
Categories (3, object): [a < b < c]

- ``Panel`` now supports the ``all`` and ``any`` aggregation functions. (:issue:`8302`):
In [5]: cat.unique()
Out[5]: array(['a', 'b', 'c'], dtype=object)

Now, only the categories that actually occur in the array are returned:

.. ipython:: python

p = pd.Panel(np.random.rand(2, 5, 4) > 0.1)
p.all()
cat = pd.Categorical(['a', 'b', 'a'], categories=['a', 'b', 'c'])
cat.unique()

- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters. ``Series.all``, ``Series.any``, ``Index.all``, and ``Index.any`` no longer support the ``out`` and ``keepdims`` parameters, which existed for compatibility with ndarray. Various index types no longer support the ``all`` and ``any`` aggregation functions and will now raise ``TypeError``. (:issue:`8302`).

- Allow equality comparisons of Series with a categorical dtype and object dtype; previously these would raise ``TypeError`` (:issue:`8938`)

@@ -90,25 +98,70 @@ API changes

- ``Timestamp('now')`` is now equivalent to ``Timestamp.now()`` in that it returns the local time rather than UTC. Also, ``Timestamp('today')`` is now equivalent to ``Timestamp.today()`` and both have ``tz`` as a possible argument. (:issue:`9000`)

- Fix negative step support for label-based slices (:issue:`8753`)

Old behavior:

.. code-block:: python

In [1]: s = pd.Series(np.arange(3), ['a', 'b', 'c'])
Out[1]:
a 0
b 1
c 2
dtype: int64

In [2]: s.loc['c':'a':-1]
Out[2]:
c 2
dtype: int64

New behavior:

.. ipython:: python

s = pd.Series(np.arange(3), ['a', 'b', 'c'])
s.loc['c':'a':-1]
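
The negative-step fix above can be checked with a short sketch. It assumes a reasonably recent pandas; the variable names are illustrative only. With the fix, a label-based slice with step ``-1`` walks from the start label back to the stop label, and both end labels are included, as they are for forward label slices.

```python
import numpy as np
import pandas as pd

# Same Series as in the release note: labels 'a', 'b', 'c' map to 0, 1, 2.
s = pd.Series(np.arange(3), index=["a", "b", "c"])

# Label-based slice with a negative step: runs from 'c' back to 'a',
# including both end labels.
reversed_by_label = s.loc["c":"a":-1]

# A shorter label range, still inclusive at both ends.
partial = s.loc["c":"b":-1]

# Purely positional reversal, shown for comparison.
reversed_by_position = s.iloc[::-1]

print(reversed_by_label)
print(partial)
```

For this Series the label-based and positional reversals should agree, which is a quick way to confirm the new behavior on an installed version.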


.. _whatsnew_0152.enhancements:

Enhancements
~~~~~~~~~~~~

``Categorical`` enhancements:

- Added ability to export Categorical data to Stata (:issue:`8633`). See :ref:`here <io.stata-categorical>` for limitations of categorical variables exported to Stata data files.
- Added flag ``order_categoricals`` to ``StataReader`` and ``read_stata`` to select whether to order imported categorical data (:issue:`8836`). See :ref:`here <io.stata-categorical>` for more information on importing categorical variables from Stata data files.
- Added ability to store Categorical data to and read it from HDF5 (:issue:`7621`). Queries work the same as if it was an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
- Added support for ``searchsorted()`` on ``Categorical`` class (:issue:`8420`).

Other enhancements:

- Added the ability to specify the SQL type of columns when writing a DataFrame
to a database (:issue:`8778`).
For example, specifying to use the sqlalchemy ``String`` type instead of the
default ``Text`` type for string columns:

.. code-block::
.. code-block:: python

from sqlalchemy.types import String
data.to_sql('data_dtype', engine, dtype={'Col_1': String})

- Added ability to export Categorical data to Stata (:issue:`8633`). See :ref:`here <io.stata-categorical>` for limitations of categorical variables exported to Stata data files.
- Added flag ``order_categoricals`` to ``StataReader`` and ``read_stata`` to select whether to order imported categorical data (:issue:`8836`). See :ref:`here <io.stata-categorical>` for more information on importing categorical variables from Stata data files.
- Added ability to store Categorical data to and read it from HDF5 (:issue:`7621`). Queries work the same as if it was an object array. However, the ``category`` dtyped data is stored in a more efficient manner. See :ref:`here <io.hdf5-categorical>` for an example and caveats w.r.t. prior versions of pandas.
- Added support for ``searchsorted()`` on ``Categorical`` class (:issue:`8420`).
- ``Series.all`` and ``Series.any`` now support the ``level`` and ``skipna`` parameters (:issue:`8302`):

.. ipython:: python

s = pd.Series([False, True, False], index=[0, 0, 1])
s.any(level=0)

- ``Panel`` now supports the ``all`` and ``any`` aggregation functions. (:issue:`8302`):

.. ipython:: python

p = pd.Panel(np.random.rand(2, 5, 4) > 0.1)
p.all()

- Added support for ``utcfromtimestamp()``, ``fromtimestamp()``, and ``combine()`` on ``Timestamp`` class (:issue:`5351`).
- Added Google Analytics (``pandas.io.ga``) basic documentation (:issue:`8835`). See :ref:`here<remote_data.ga>`.
- ``Timedelta`` arithmetic returns ``NotImplemented`` in unknown cases, allowing extensions by custom classes (:issue:`8813`).
@@ -122,19 +175,22 @@ Enhancements
- Added ability to read table footers to read_html (:issue:`8552`)
- ``to_sql`` now infers datatypes of non-NA values for columns that contain NA values and have dtype ``object`` (:issue:`8778`).
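
The ``to_sql`` enhancements above (explicit column types through ``dtype``, batched writes through ``chunksize``) can be tried without a database server. The following is a minimal sketch, not the example from the documentation: it assumes a recent pandas and uses the standard-library ``sqlite3`` connection instead of a sqlalchemy engine. In that fallback mode the ``dtype`` values are given as plain SQL type strings rather than sqlalchemy types such as ``String``. The table and column names are illustrative.

```python
import sqlite3

import pandas as pd

# Illustrative frame: a string column containing a missing value, plus a float column.
data = pd.DataFrame({"Col_1": ["a", "b", None], "Col_2": [1.0, 2.0, 3.0]})

# Assumption: an in-memory SQLite database via the stdlib driver, not sqlalchemy.
conn = sqlite3.connect(":memory:")

# Write in batches of two rows and override the SQL type of one column.
data.to_sql(
    "data_dtype",
    conn,
    index=False,
    chunksize=2,
    dtype={"Col_1": "VARCHAR(20)"},
)

# Inspect the declared column types; PRAGMA rows are (cid, name, type, ...).
column_types = {
    row[1]: row[2] for row in conn.execute("PRAGMA table_info(data_dtype)")
}
row_count = conn.execute("SELECT COUNT(*) FROM data_dtype").fetchone()[0]

print(column_types)
print(row_count)
```

With a sqlalchemy engine the same call would take sqlalchemy type objects in ``dtype``, as in the ``String`` example in the release note.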


.. _whatsnew_0152.performance:

Performance
~~~~~~~~~~~
- Reduce memory usage when skiprows is an integer in read_csv (:issue:`8681`)

- Reduce memory usage when skiprows is an integer in read_csv (:issue:`8681`)
- Performance boost for ``to_datetime`` conversions with a passed ``format=``, and the ``exact=False`` (:issue:`8904`)


.. _whatsnew_0152.bug_fixes:

Bug Fixes
~~~~~~~~~

- Bug in concat of Series with ``category`` dtype, which was coercing the result to ``object``. (:issue:`8641`)
- Bug in Timestamp-Timestamp not returning a Timedelta type and datelike-datelike ops with timezones (:issue:`8865`)
- Made the timezone mismatch exception consistent (either tz operated with None or incompatible timezone); it will now raise ``TypeError`` rather than ``ValueError`` (a couple of edge cases only) (:issue:`8865`)
- Bug in using a ``pd.Grouper(key=...)`` with no level/axis or level only (:issue:`8795`, :issue:`8866`)
@@ -154,95 +210,32 @@ Bug Fixes
- Bug in ``merge`` where ``how='left'`` and ``sort=False`` would not preserve left frame order (:issue:`7331`)
- Bug in ``MultiIndex.reindex`` where reindexing at level would not reorder labels (:issue:`4088`)
- Bug in certain operations with dateutil timezones, manifesting with dateutil 2.3 (:issue:`8639`)

- Fix negative step support for label-based slices (:issue:`8753`)

Old behavior:

.. code-block:: python

In [1]: s = pd.Series(np.arange(3), ['a', 'b', 'c'])
Out[1]:
a 0
b 1
c 2
dtype: int64

In [2]: s.loc['c':'a':-1]
Out[2]:
c 2
dtype: int64

New behavior:

.. ipython:: python

s = pd.Series(np.arange(3), ['a', 'b', 'c'])
s.loc['c':'a':-1]

- Regression in DatetimeIndex iteration with a Fixed/Local offset timezone (:issue:`8890`)
- Bug in ``to_datetime`` when parsing nanoseconds using the ``%f`` format (:issue:`8989`)
- ``io.data.Options`` now raises ``RemoteDataError`` when no expiry dates are available from Yahoo and when it receives no data from Yahoo (:issue:`8761`), (:issue:`8783`).
- Fix: The font size was only set on x axis if vertical or the y axis if horizontal. (:issue:`8765`)
- Fixed division by 0 when reading big csv files in python 3 (:issue:`8621`)
- Bug in outputting a ``MultiIndex`` with ``to_html(index=False)`` which would add an extra column (:issue:`8452`)







- Imported categorical variables from Stata files retain the ordinal information in the underlying data (:issue:`8836`).



- Defined ``.size`` attribute across ``NDFrame`` objects to provide compat with numpy >= 1.9.1; buggy with ``np.array_split`` (:issue:`8846`)


- Skip testing of histogram plots for matplotlib <= 1.2 (:issue:`8648`).






- Bug where ``get_data_google`` returned object dtypes (:issue:`3995`)

- Bug in ``DataFrame.stack(..., dropna=False)`` when the DataFrame's ``columns`` is a ``MultiIndex``
whose ``labels`` do not reference all its ``levels``. (:issue:`8844`)


- Bug in that Option context applied on ``__enter__`` (:issue:`8514`)


- Bug in resample that causes a ValueError when resampling across multiple days
and the last offset is not calculated from the start of the range (:issue:`8683`)



- Bug where ``DataFrame.plot(kind='scatter')`` fails when checking if an np.array is in the DataFrame (:issue:`8852`)



- Bug in ``pd.infer_freq/DataFrame.inferred_freq`` that prevented proper sub-daily frequency inference when the index contained DST days (:issue:`8772`).
- Bug where index name was still used when plotting a series with ``use_index=False`` (:issue:`8558`).
- Bugs when trying to stack multiple columns, when some (or all) of the level names are numbers (:issue:`8584`).
- Bug in ``MultiIndex`` where ``__contains__`` returns wrong result if index is not lexically sorted or unique (:issue:`7724`)
- BUG CSV: fix problem with trailing whitespace in skipped rows, (:issue:`8679`), (:issue:`8661`), (:issue:`8983`)
- Regression in ``Timestamp`` does not parse 'Z' zone designator for UTC (:issue:`8771`)






- Bug in ``StataWriter`` that writes strings with 244 characters irrespective of actual size (:issue:`8969`)


- Fixed ValueError raised by cummin/cummax when datetime64 Series contains NaT. (:issue:`8965`)
- Bug where DataReader returns object dtype if there are missing values (:issue:`8980`)
- Bug in plotting if sharex was enabled and index was a timeseries, would show labels on multiple axes (:issue:`3964`).

- Bug where passing a unit to the TimedeltaIndex constructor applied the conversion to nanoseconds twice. (:issue:`9011`).
- Bug in plotting of a period-like array (:issue:`9012`)
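
One of the fixes listed above, ``merge`` with ``how='left'`` and ``sort=False`` preserving the order of the left frame (:issue:`7331`), is easy to check by hand. The sketch below assumes a recent pandas; the frames and column names are made up for illustration.

```python
import pandas as pd

# Left frame with keys deliberately out of sorted order.
left = pd.DataFrame({"key": ["c", "a", "b"], "lval": [1, 2, 3]})

# Right frame with the same keys in sorted order.
right = pd.DataFrame({"key": ["a", "b", "c"], "rval": [10, 20, 30]})

# A left join without sorting should keep the row order of `left`.
merged = pd.merge(left, right, on="key", how="left", sort=False)

print(merged)
```

The ``key`` column of the result should read ``c, a, b``, matching the left frame rather than the sorted order of the right frame.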