Commit 9376d85

ARROW-3910: [Python] Set date_as_objects=True as default in to_pandas methods
This does not add a deprecation warning, primarily because it's a bit difficult to do: we would need to check whether the data type is a date (or, in the case of a table, whether any field is a date) and then warn if so. `True` is the correct option, though, in order to accurately round-trip data to and from pandas. Some users might have some workarounds floating around, but this is sufficiently advanced stuff already.

With this patch, date data round-trips with no special options:

```
In [2]: import pyarrow as pa

In [3]: import datetime

In [4]: arr = pa.array([datetime.date(2000, 1, 1), None])

In [5]: arr
Out[5]:
<pyarrow.lib.Date32Array object at 0x0000022CCDB1BBD8>
[
  10957,
  null
]

In [6]: arr.to_pandas()
Out[6]: array([datetime.date(2000, 1, 1), None], dtype=object)

In [7]: pa.array(arr.to_pandas())
Out[7]:
<pyarrow.lib.Date32Array object at 0x0000022CCDC7FE58>
[
  10957,
  null
]
```

If others strongly feel it's worth going to the effort of raising a deprecation warning, please chime in.

Author: Wes McKinney <wesm+git@apache.org>

Closes #3272 from wesm/ARROW-3910 and squashes the following commits:

308afe5 <Wes McKinney> Add Windows makefile for Sphinx, add section about date conversions to pandas.rst
f77c296 <Wes McKinney> Set date_as_objects=True as default in to_pandas methods
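The value `10957` in the session above reflects how `date32` stores a date: as a count of days since the UNIX epoch (1970-01-01). `date64` uses milliseconds since the same epoch. The following standard-library sketch reproduces that encoding; the helper names are illustrative only and are not part of pyarrow.

```python
from datetime import date, timedelta

# Both Arrow date types count from the UNIX epoch.
EPOCH = date(1970, 1, 1)
MILLIS_PER_DAY = 86_400_000


def to_date32(d):
    """Days since the epoch: the integer a date32 slot holds (hypothetical helper)."""
    return (d - EPOCH).days


def from_date32(days):
    """Inverse of to_date32 (hypothetical helper)."""
    return EPOCH + timedelta(days=days)


def to_date64(d):
    """Milliseconds since the epoch: the integer a date64 slot holds (hypothetical helper)."""
    return to_date32(d) * MILLIS_PER_DAY


print(to_date32(date(2000, 1, 1)))  # 10957
print(from_date32(10957))           # 2000-01-01
```

Because a `datetime.date` carries no time-of-day component, converting to Python date objects loses nothing relative to the stored integer, which is presumably why object output round-trips cleanly.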
1 parent 71ccba9 commit 9376d85

7 files changed: +231 -116 lines changed

docs/make.bat

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+@rem Licensed to the Apache Software Foundation (ASF) under one
+@rem or more contributor license agreements. See the NOTICE file
+@rem distributed with this work for additional information
+@rem regarding copyright ownership. The ASF licenses this file
+@rem to you under the Apache License, Version 2.0 (the
+@rem "License"); you may not use this file except in compliance
+@rem with the License. You may obtain a copy of the License at
+@rem
+@rem http://www.apache.org/licenses/LICENSE-2.0
+@rem
+@rem Unless required by applicable law or agreed to in writing,
+@rem software distributed under the License is distributed on an
+@rem "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+@rem KIND, either express or implied. See the License for the
+@rem specific language governing permissions and limitations
+@rem under the License.
+
+@ECHO OFF
+
+pushd %~dp0
+
+REM Command file for Sphinx documentation
+
+if "%SPHINXBUILD%" == "" (
+    set SPHINXBUILD=sphinx-build
+)
+set SOURCEDIR=source
+set BUILDDIR=_build
+
+if "%1" == "" goto help
+
+%SPHINXBUILD% >NUL 2>NUL
+if errorlevel 9009 (
+    echo.
+    echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
+    echo.installed, then set the SPHINXBUILD environment variable to point
+    echo.to the full path of the 'sphinx-build' executable. Alternatively you
+    echo.may add the Sphinx directory to PATH.
+    echo.
+    echo.If you don't have Sphinx installed, grab it from
+    echo.http://sphinx-doc.org/
+    exit /b 1
+)
+
+%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
+goto end
+
+:help
+%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
+
+:end
+popd

docs/source/building.rst

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements. See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership. The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License. You may obtain a copy of the License at
+
+.. http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied. See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+Building the Documentation
+==========================
+
+Prerequisites
+-------------
+
+The documentation build process uses `Doxygen <http://www.doxygen.nl/>`_ and
+`Sphinx <http://www.sphinx-doc.org/>`_ along with a few extensions.
+
+If you're using Conda, the required software can be installed in a single line:
+
+.. code-block:: shell
+
+   conda install -c conda-forge --file ci/conda_env_sphinx.yml
+
+Otherwise, you'll first need to install `Doxygen <http://www.doxygen.nl/>`_
+yourself (for example from your distribution's official repositories, if
+using Linux). Then you can install the Python-based requirements with the
+following command:
+
+.. code-block:: shell
+
+   pip install -r docs/requirements.txt
+
+Building
+--------
+
+.. note::
+
+   If you are building the documentation on Windows, not all sections
+   may build properly.
+
+These two steps are mandatory and must be executed in order.
+
+#. Process the C++ API using Doxygen
+
+   .. code-block:: shell
+
+      pushd cpp/apidoc
+      doxygen
+      popd
+
+#. Build the complete documentation using Sphinx
+
+   .. code-block:: shell
+
+      pushd docs
+      make html
+      popd
+
+After these steps are completed, the documentation is rendered in HTML
+format in ``docs/_build/html``. In particular, you can point your browser
+at ``docs/_build/html/index.html`` to read the docs and review any changes
+you made.

docs/source/index.rst

Lines changed: 6 additions & 0 deletions
@@ -40,3 +40,9 @@ messaging and interprocess communication.
 
    cpp/index
    python/index
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Other Topics
+
+   building

docs/source/python/development.rst

Lines changed: 0 additions & 50 deletions
@@ -364,53 +364,3 @@ Getting ``python-test.exe`` to run is a bit tricky because your
    set PYTHONHOME=%CONDA_PREFIX%
 
 Now ``python-test.exe`` or simply ``ctest`` (to run all tests) should work.
-
-Building the Documentation
-==========================
-
-Prerequisites
--------------
-
-The documentation build process uses `Doxygen <http://www.doxygen.nl/>`_ and
-`Sphinx <http://www.sphinx-doc.org/>`_ along with a few extensions.
-
-If you're using Conda, the required software can be installed in a single line:
-
-.. code-block:: shell
-
-   conda install -c conda-forge --file ci/conda_env_sphinx.yml
-
-Otherwise, you'll first need to install `Doxygen <http://www.doxygen.nl/>`_
-yourself (for example from your distribution's official repositories, if
-using Linux). Then you can install the Python-based requirements with the
-following command:
-
-.. code-block:: shell
-
-   pip install -r docs/requirements.txt
-
-Building
---------
-
-These two steps are mandatory and must be executed in order.
-
-#. Process the C++ API using Doxygen
-
-   .. code-block:: shell
-
-      pushd cpp/apidoc
-      doxygen
-      popd
-
-#. Build the complete documentation using Sphinx
-
-   .. code-block:: shell
-
-      pushd docs
-      make html
-      popd
-
-After these steps are completed, the documentation is rendered in HTML
-format in ``docs/_build/html``. In particular, you can point your browser
-at ``docs/_build/html/index.html`` to read the docs and review any changes
-you made.

docs/source/python/pandas.rst

Lines changed: 67 additions & 1 deletion
@@ -29,6 +29,13 @@ to them.
 (such as a different type system, and support for null values) that this
 is a separate topic from :ref:`numpy_interop`.
 
+To follow examples in this document, make sure to run:
+
+.. ipython:: python
+
+   import pandas as pd
+   import pyarrow as pa
+
 DataFrames
 ----------
 
@@ -120,5 +127,64 @@ Arrow -> pandas Conversion
 +-------------------------------------+--------------------------------------------------------+
 | ``TIMESTAMP(unit=*)``               | ``pd.Timestamp`` (``np.datetime64[ns]``)               |
 +-------------------------------------+--------------------------------------------------------+
-| ``DATE``                            | ``pd.Timestamp`` (``np.datetime64[ns]``)               |
+| ``DATE``                            | ``object``(with ``datetime.date`` objects)             |
 +-------------------------------------+--------------------------------------------------------+
+
+Categorical types
+~~~~~~~~~~~~~~~~~
+
+TODO
+
+Datetime (Timestamp) types
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+TODO
+
+Date types
+~~~~~~~~~~
+
+While dates can be handled using the ``datetime64[ns]`` type in
+pandas, some systems work with object arrays of Python's built-in
+``datetime.date`` object:
+
+.. ipython:: python
+
+   from datetime import date
+   s = pd.Series([date(2018, 12, 31), None, date(2000, 1, 1)])
+   s
+
+When converting to an Arrow array, the ``date32`` type will be used by
+default:
+
+.. ipython:: python
+
+   arr = pa.array(s)
+   arr.type
+   arr[0]
+
+To use the 64-bit ``date64``, specify this explicitly:
+
+.. ipython:: python
+
+   arr = pa.array(s, type='date64')
+   arr.type
+
+When converting back with ``to_pandas``, object arrays of
+``datetime.date`` objects are returned:
+
+.. ipython:: python
+
+   arr.to_pandas()
+
+If you want to use NumPy's ``datetime64`` dtype instead, pass
+``date_as_object=False``:
+
+.. ipython:: python
+
+   s2 = pd.Series(arr.to_pandas(date_as_object=False))
+   s2.dtype
+
+Time types
+~~~~~~~~~~
+
+TODO

python/pyarrow/array.pxi

Lines changed: 2 additions & 4 deletions
@@ -343,10 +343,8 @@ cdef class _PandasConvertible:
 
     def to_pandas(self, categories=None, bint strings_to_categorical=False,
                   bint zero_copy_only=False, bint integer_object_nulls=False,
-                  bint date_as_object=False,
-                  bint use_threads=True,
-                  bint deduplicate_objects=True,
-                  bint ignore_metadata=False):
+                  bint date_as_object=True, bint use_threads=True,
+                  bint deduplicate_objects=True, bint ignore_metadata=False):
         """
         Convert to a pandas-compatible NumPy array or DataFrame, as appropriate
