ENH: Columns formatted as "Text" in Excel are read as numbers #61539

Open
@pranay-sa

Description

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

When reading Excel files, pandas ignores Excel's "Text" cell formatting and converts text-formatted numbers (e.g., IDs, codes) to numeric types (int/float).
The affected columns then have to be converted back to strings manually, which is inefficient for large datasets and error-prone.

Feature Description

Add an option to pd.read_excel() that respects Excel's cell formatting
(e.g., dtype_from_format=True), or make this the default behavior, so that text-formatted columns are preserved as strings.

Alternative Solutions

OpenPyXL/Xlrd Engine + Format Detection
Read cell formats directly (requires manual parsing):

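from openpyxl import load_workbook

wb = load_workbook("data.xlsx", data_only=False)
sheet = wb.active
# Header values of columns whose first-row cell uses Excel's "Text" number format ("@").
# Assumes the format was applied to the whole column, header cell included.
text_columns = [
    col[0].value
    for col in sheet.iter_cols(min_row=1, max_row=1)
    if col[0].number_format == "@"
]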
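As a possible workaround until such an option exists, the format detection can be combined with the existing dtype argument of pd.read_excel(). The sketch below is only an illustration: the helper name read_excel_respecting_text is hypothetical and not part of pandas, it assumes the header is in row 1, and it looks only at the first data cell (row 2) of each column to decide whether the column is formatted as Text.

```python
import pandas as pd
from openpyxl import load_workbook


def read_excel_respecting_text(path, sheet_name=None):
    """Hypothetical helper: read a sheet, forcing Text-formatted columns to str.

    Assumes headers are in row 1 and that the number format of the first
    data cell (row 2) is representative of the whole column.
    """
    wb = load_workbook(path)
    ws = wb[sheet_name] if sheet_name is not None else wb.active

    # "@" is Excel's number format code for "Text".
    text_cols = [
        col[0].value
        for col in ws.iter_cols(min_row=1, max_row=2)
        if col[0].value is not None and col[1].number_format == "@"
    ]
    title = ws.title
    wb.close()

    # Pass the detected columns to pandas through the existing dtype argument.
    return pd.read_excel(
        path,
        sheet_name=title,
        dtype={name: str for name in text_cols},
        engine="openpyxl",
    )
```

Usage would be df = read_excel_respecting_text("data.xlsx"). The workbook is opened twice (once by openpyxl, once by pandas), so this may be slow for large files. That cost is part of why a built-in option is being requested.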

Additional Context

No response

Metadata

Assignees

No one assigned

    Labels

    Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)
