Description
Is your feature request related to a problem?
I would like to construct a pandas.DataFrame from an iterable of dataclasses.dataclass instances, the same way DataFrame.from_records builds one from an iterable of tuples. The rationale is that a data class carries more type information than a general tuple or dictionary. Data classes are also more memory efficient than tuples. This makes them an attractive replacement for dicts or tuples whenever the schema is known.
Describe the solution you'd like
I would like a class method .from_dataclasses that allows DataFrame construction and type inference from a uniform (for simplicity) sequence of data classes. See the example below.
import pandas as pd
from dataclasses import dataclass

@dataclass
class Record:
    id: int
    name: str
    constant: float

df = pd.DataFrame.from_dataclasses([
    Record(0, 'Landau', 3.1415926),
    Record(1, 'Kapitsa', 2.718281828459045),
    Record(2, 'Bogolyubov', 6.62607015),
])
print(df.dtypes)
# id            int64
# name         object
# constant    float64
# dtype: object
In the example above, the schema of the DataFrame is inferred from the Record.__annotations__ dictionary, which contains the type information provided by the user. The API could also offer a way to validate the schema at runtime by comparing the actual type of each value with the type specified for its column.
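To make the idea concrete, here is a minimal sketch of how such a constructor might behave, written as a free function on top of the existing DataFrame.from_records. The function name from_dataclasses, the validate flag, and the rule of casting only int, float, and bool annotations are assumptions made for illustration; this is not an existing pandas API.

```python
from dataclasses import dataclass, fields, is_dataclass

import pandas as pd


@dataclass
class Record:
    id: int
    name: str
    constant: float


def from_dataclasses(records, validate=False):
    """Hypothetical stand-in for the proposed DataFrame.from_dataclasses."""
    records = list(records)
    if not records:
        raise ValueError("at least one record is needed to infer the schema")

    cls = type(records[0])
    if not is_dataclass(cls):
        raise TypeError(f"{cls.__name__} is not a dataclass")

    # Column name -> annotated type, in declaration order.
    schema = {f.name: f.type for f in fields(cls)}

    if validate:
        for row, rec in enumerate(records):
            if type(rec) is not cls:
                raise TypeError(f"row {row}: expected {cls.__name__}")
            for name, expected in schema.items():
                value = getattr(rec, name)
                # Only plain classes are checked; string or generic
                # annotations are skipped in this sketch.
                if isinstance(expected, type) and not isinstance(value, expected):
                    raise TypeError(
                        f"row {row}, column {name!r}: expected "
                        f"{expected.__name__}, got {type(value).__name__}"
                    )

    rows = [tuple(getattr(rec, name) for name in schema) for rec in records]
    df = pd.DataFrame.from_records(rows, columns=list(schema))

    # Assumption: only annotations that map directly to a NumPy dtype are cast;
    # every other column keeps whatever dtype pandas inferred.
    dtypes = {name: tp for name, tp in schema.items() if tp in (int, float, bool)}
    return df.astype(dtypes)


df = from_dataclasses(
    [Record(0, 'Landau', 3.1415926), Record(1, 'Kapitsa', 2.718281828459045)],
    validate=True,
)
print(df.dtypes)
```

The validation here is deliberately strict: an int passed for a field annotated as float is rejected. A real implementation would probably need a more lenient policy, along with support for Optional and other typing constructs.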
API breaking implications
This should not break any existing API, but it does require Python 3.7 or later, the first version that ships the dataclasses module.