API: scope of the public types module #16042

Closed
@jorisvandenbossche


Currently, the public pandas.api.types module holds the following functions:

In [11]: [f for f in dir(pd.api.types) if not f.startswith('_')]
Out[11]: 
['infer_dtype',
 'is_any_int_dtype',
 'is_bool',
 'is_bool_dtype',
 'is_categorical',
 'is_categorical_dtype',
 'is_complex',
 'is_complex_dtype',
 'is_datetime64_any_dtype',
 'is_datetime64_dtype',
 'is_datetime64_ns_dtype',
 'is_datetime64tz_dtype',
 'is_datetimetz',
 'is_dict_like',
 'is_dtype_equal',
 'is_extension_type',
 'is_file_like',
 'is_float',
 'is_float_dtype',
 'is_floating_dtype',
 'is_hashable',
 'is_int64_dtype',
 'is_integer',
 'is_integer_dtype',
 'is_interval',
 'is_interval_dtype',
 'is_iterator',
 'is_list_like',
 'is_named_tuple',
 'is_number',
 'is_numeric_dtype',
 'is_object_dtype',
 'is_period',
 'is_period_dtype',
 'is_re',
 'is_re_compilable',
 'is_scalar',
 'is_sequence',
 'is_signed_integer_dtype',
 'is_sparse',
 'is_string_dtype',
 'is_timedelta64_dtype',
 'is_timedelta64_ns_dtype',
 'is_unsigned_integer_dtype',
 'pandas_dtype',
 'union_categoricals']

Two questions I would like to discuss a bit:

  • Do we need to expose all of them?
    Putting the functions here means keeping them stable, so we could also limit the module to the more pandas-specific ones. For example, is_re, is_dict_like, is_iterator, etc. are general utility functions that are not actually related to pandas specifics. Of course it can be handy for other projects to use them instead of implementing them themselves, but it increases the number of functions we have to keep stable.

  • Do we need to expose more of the pandas extension types API? (xref #16099)
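
To make the first question a bit more concrete: the generic helpers are small enough that a downstream project could approximate them with the standard library alone. The sketch below is only an illustration under that assumption. The `looks_*` names are hypothetical, and the checks are not pandas' actual implementations (pandas' own versions may be more permissive, e.g. duck-typing instead of `isinstance`).

```python
import re
from collections.abc import Iterator, Mapping


def looks_like_re(obj):
    """Hypothetical stand-in for is_re: True for a compiled regex pattern."""
    return isinstance(obj, re.Pattern)


def looks_dict_like(obj):
    """Hypothetical stand-in for is_dict_like, using the Mapping ABC.

    This is stricter than a duck-typed check: objects that merely
    provide dict-style methods without registering as a Mapping fail.
    """
    return isinstance(obj, Mapping)


def looks_like_iterator(obj):
    """Hypothetical stand-in for is_iterator: True for iterators only,
    not for iterables such as lists."""
    return isinstance(obj, Iterator)


compiled_is_re = looks_like_re(re.compile(r"\d+"))
string_is_re = looks_like_re(r"\d+")
dict_is_dict_like = looks_dict_like({"a": 1})
list_is_iterator = looks_like_iterator([1, 2, 3])
iter_is_iterator = looks_like_iterator(iter([1, 2, 3]))
```

None of this touches dtypes or pandas objects, which is the sense in which these helpers are general utilities rather than pandas-specific API.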

From a comment by @wesm (#15541 (comment)):

I've been running into these internal API / external API issues lately (e.g. needing to use DatetimeTZDtype in an external library), so it might be worth documenting the pandas < 0.19 way to get some of these APIs

It is how the dtypes will be implemented in pandas 1.x, so shouldn't we just expose this officially and put it in types? pyarrow uses the datetime tz one, and in #16015 we will probably also want to add CategoricalDtype.
Not sure whether the others (interval, period) are also needed.
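
As a rough sketch of the external-library use case mentioned above: a library that converts pandas data needs to recognise a tz-aware column and read its timezone from the dtype object. The example assumes a recent pandas release in which `DatetimeTZDtype` is importable from the top-level namespace; at the time of this issue the class was not part of the public API, which is exactly the question being raised.

```python
import pandas as pd

# A tz-aware and a tz-naive datetime Series for comparison.
aware = pd.Series(pd.date_range("2017-01-01", periods=3, tz="UTC"))
naive = pd.Series(pd.date_range("2017-01-01", periods=3))

# An external library can dispatch on the dtype class itself...
aware_is_tz = isinstance(aware.dtype, pd.DatetimeTZDtype)
naive_is_tz = isinstance(naive.dtype, pd.DatetimeTZDtype)

# ...and read the timezone from the dtype instance.
tz_name = str(aware.dtype.tz)
```

Dispatching on the class, rather than only calling a boolean check such as is_datetime64tz_dtype, is what lets the library also pull metadata like the timezone off the dtype.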
