Commit febfe3d

xhochywesm authored and committed
ARROW-2680: [Python] Add documentation about type inference in Table.from_pandas
Add note about passing explicit schema
1 parent 27b869a commit febfe3d

1 file changed, +14 -1 lines changed

python/pyarrow/table.pxi

Lines changed: 14 additions & 1 deletion
@@ -924,7 +924,20 @@ cdef class Table:
     def from_pandas(cls, df, Schema schema=None, bint preserve_index=True,
                     nthreads=None, columns=None):
         """
-        Convert pandas.DataFrame to an Arrow Table
+        Convert pandas.DataFrame to an Arrow Table.
+
+        The column types in the resulting Arrow Table are inferred from the
+        dtypes of the pandas.Series in the DataFrame. In the case of non-object
+        Series, the NumPy dtype is translated to its Arrow equivalent. In the
+        case of `object`, we need to guess the datatype by looking at the
+        Python objects in this Series.
+
+        Be aware that Series of the `object` dtype don't carry enough
+        information to always lead to a meaningful Arrow type. In the case that
+        we cannot infer a type, e.g. because the DataFrame is of length 0 or
+        the Series only contains None/nan objects, the type is set to
+        null. This behavior can be avoided by constructing an explicit schema
+        and passing it to this function.
 
         Parameters
         ----------
