BUG: df.to_excel() with openpyxl engine doesn't use write-optimized mode, resulting in higher memory consumption

- [x] I have checked that this issue has not already been reported.

- [x] I have confirmed this bug exists on the latest version of pandas.

---


#### Problem description

`openpyxl` has fairly high memory consumption by default. To address this for write-only operations, they recommend two things:

1. Install the `lxml` package, which is more optimized than the default xml parser. 
2. Use the `write_only` mode on the workbook.

In `pandas/pandas/io/excel/_openpyxl.py` we have the [a comment](https://github.com/pandas-dev/pandas/blob/d662e9740a8cd738970cce97e341640b00cadcf8/pandas/io/excel/_openpyxl.py#L64) which states intent - `# Create workbook object with default optimized_write=True.` 

However `openpyxl` depreciated the `optimized_write` option in [version 2.4](https://openpyxl.readthedocs.io/en/stable/changes.html?highlight=optimized_write%3DTrue.#a1-2016-04-11) and now recommends setting `write_only=True` on workbook creation. Pandas requires 2.6 so the result is that writes to excel use more memory than necessary. Switching to `write_only=True` has [other ramifications](https://openpyxl.readthedocs.io/en/stable/optimized.html#write-only-mode), like only being able to write out once, so would need to be tested.

#### Proposed solution

1. Print out a warning about `lxml` being missing if writing to excel in an environment that doesn't have it installed. 
2. Use the `write_only` mode if possible. 


#### other context

Running on a GCP cloud function, I had a `df = pd.DataFrame()` of size ~150mb. `df.to_excel()` without lxml used >2gb memory, crashing the function. With lxml it used ~1.4gb memory, returning successfully but getting too close. Adding in write-only should lower that by an order of magnitude to <100mb. 


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

BUG: df.to_excel() with openpyxl engine doesn't use write-optimized mode, resulting in higher memory consumption #41681

Problem description

Proposed solution

other context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

BUG: df.to_excel() with openpyxl engine doesn't use write-optimized mode, resulting in higher memory consumption #41681

Description

Problem description

Proposed solution

other context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions