BUG: ExtensionArray formatting of datetime-like forces nanosecond precision #33319

Open
@xhochy

When an ExtensionArray column is passed into the formatting code, we always cast it with numpy.asarray(…) and then re-enter the if-clauses in format_array. This lets us reuse the formatting code that already exists for the built-in column types, but it is problematic for columns that differ slightly from the built-in ones yet can be coerced to them.

In my current example I have a FletcherContinuousDtype(timestamp[us]) column with values that are not representable in the nanosecond range (e.g. year 3000 or year 9999, typical placeholder dates for the far future in traditional DB setups). The column is passed into ExtensionArrayFormatter, converted to an np.ndarray[datetime64[us]], and then passed into format_array again, where it ends up in the Datetime64Formatter. There, a cast to nanoseconds is forced through

    if not isinstance(values, DatetimeIndex):
        values = DatetimeIndex(values)
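
To make the range problem concrete, here is a minimal sketch using only the standard library. It assumes timestamps are stored as a signed 64-bit count of nanoseconds since 1970-01-01; the helper name fits_in_int64_nanoseconds is made up for illustration and is not part of pandas.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit counter can hold


def fits_in_int64_nanoseconds(ts: datetime) -> bool:
    """Hypothetical helper: can `ts` be stored as int64 nanoseconds since the epoch?"""
    # Whole microseconds since the epoch (exact integer arithmetic).
    micros = (ts - EPOCH) // timedelta(microseconds=1)
    # Converting microseconds to nanoseconds multiplies the count by 1000.
    return -INT64_MAX <= micros * 1000 <= INT64_MAX


# Last instant (truncated to microseconds) that still fits in int64 nanoseconds.
last_ns_instant = EPOCH + timedelta(microseconds=INT64_MAX // 1000)

print(last_ns_instant)                                       # falls in the year 2262
print(fits_in_int64_nanoseconds(datetime(2020, 1, 1)))       # True
print(fits_in_int64_nanoseconds(datetime(3000, 1, 1)))       # False
print(fits_in_int64_nanoseconds(datetime(9999, 12, 31)))     # False
```

The same values fit comfortably at microsecond resolution, which is why the data itself is valid and only the forced conversion is the problem.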

We could fix this in either of two ways:

  • The specific approach: Somehow allow non-nanosecond timestamps in the Datetime64Formatter
  • The general approach: For an ExtensionArray, the formatting should be fmt_values = [values._formatter(x) for x in self.values]
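
The general approach could look roughly like the sketch below. It is not the pandas implementation: MicrosecondTimestamps is a hypothetical stand-in for an extension array of timestamp[us] values, and format_elementwise is a made-up name. The sketch also assumes that _formatter returns a callable that renders a single scalar, so it is fetched once and then applied to each element.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterator, List

EPOCH = datetime(1970, 1, 1)


class MicrosecondTimestamps:
    """Hypothetical stand-in for an extension array holding timestamp[us] values."""

    def __init__(self, micros: List[int]) -> None:
        self._micros = list(micros)

    def __iter__(self) -> Iterator[int]:
        return iter(self._micros)

    def _formatter(self, boxed: bool = False) -> Callable[[int], str]:
        # Return a callable that renders one scalar. The value stays at
        # microsecond resolution; nothing is cast to nanoseconds.
        def fmt(us: int) -> str:
            return (EPOCH + timedelta(microseconds=us)).isoformat(sep=" ")

        return fmt


def format_elementwise(values) -> List[str]:
    """Format each element with the array's own formatter (no numpy round trip)."""
    formatter = values._formatter(boxed=True)
    return [formatter(x) for x in values]


far_future = MicrosecondTimestamps([
    (datetime(3000, 1, 1) - EPOCH) // timedelta(microseconds=1),
    (datetime(9999, 12, 31) - EPOCH) // timedelta(microseconds=1),
])
fmt_values = format_elementwise(far_future)
print(fmt_values)  # ['3000-01-01 00:00:00', '9999-12-31 00:00:00']
```

The trade-off is that element-wise formatting gives up the shared, vectorized formatting paths in exchange for letting each array type decide how its scalars are displayed.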

Labels: Bug; ExtensionArray (Extending pandas with custom dtypes or arrays.); Output-Formatting (__repr__ of pandas objects, to_string)
