ENH: DataFrame Constructions from Data Classes #37577

Closed
@daskol

Description

Is your feature request related to a problem?

I would like to construct a pandas.DataFrame from an iterable of dataclasses.dataclass instances, the same way DataFrame.from_records builds one from an iterable of tuples. The rationale is that a data class carries more type information than a general tuple or dictionary. Data classes can also be more memory efficient than tuples. This makes data classes an attractive replacement for dicts or tuples whenever the schema is known.

Describe the solution you'd like

I would like a class method .from_dataclasses that constructs a DataFrame and infers its column types from a uniform (for simplicity) sequence of data class instances. See the example below.

import pandas as pd
from dataclasses import dataclass


@dataclass
class Record:
    id: int
    name: str
    constant: float

df = pd.DataFrame.from_dataclasses([
    Record(0, 'Landau', 3.1415926),
    Record(1, 'Kapitsa', 2.718281828459045),
    Record(2, 'Bogolyubov', 6.62607015),
])

print(df.dtypes)
#  id            int64
#  name         object
#  constant    float64
#  dtype: object

In the example above, the schema of the DataFrame is inferred from the Record.__annotations__ dictionary, which contains the type information provided by the user. The API could also offer a way to validate the schema at runtime by comparing the actual type of each value with the type specified for its column.
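To make the idea concrete, here is a minimal sketch of how such a constructor might behave, written as a free function on top of the existing DataFrame.from_records. The names from_dataclasses and check_types are hypothetical helpers for illustration only and are not part of pandas. The sketch assumes a non-empty, uniform sequence and annotations that are plain built-in types (int, float, bool); anything else is left to pandas' own inference.

```python
from dataclasses import dataclass, fields

import pandas as pd


@dataclass
class Record:
    id: int
    name: str
    constant: float


def check_types(records):
    """Hypothetical runtime validation: compare each value with its annotation."""
    for row, record in enumerate(records):
        for field in fields(record):
            value = getattr(record, field.name)
            # Only plain classes are checked; generic or string annotations are skipped.
            if isinstance(field.type, type) and not isinstance(value, field.type):
                raise TypeError(
                    f"row {row}, column {field.name!r}: expected "
                    f"{field.type.__name__}, got {type(value).__name__}"
                )


def from_dataclasses(records, validate=False):
    """Hypothetical stand-in for the proposed DataFrame.from_dataclasses."""
    records = list(records)
    if not records:
        raise ValueError("at least one record is needed to infer the schema")
    if validate:
        check_types(records)
    schema = fields(type(records[0]))
    columns = [field.name for field in schema]
    rows = [tuple(getattr(record, name) for name in columns) for record in records]
    df = pd.DataFrame.from_records(rows, columns=columns)
    # Cast only the columns whose annotation is a simple numeric built-in.
    dtypes = {f.name: f.type for f in schema if f.type in (int, float, bool)}
    return df.astype(dtypes)


df = from_dataclasses(
    [
        Record(0, 'Landau', 3.1415926),
        Record(1, 'Kapitsa', 2.718281828459045),
        Record(2, 'Bogolyubov', 6.62607015),
    ],
    validate=True,
)
print(df.dtypes)
```

The validation here is deliberately strict: an int passed for a float field is rejected. A real implementation would need to decide how to treat such coercions, optional fields, and annotations stored as strings.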

API breaking implications

In general this does not break the API, but it does require Python 3.7 as the minimum version, since that is the release in which dataclasses entered the standard library.
