@oleibman
Collaborator

Fix #4647. The Xls file is corrupt: PhpSpreadsheet tries to extract a substring using array notation, but the index is out of bounds. PHP treats this as a warning rather than an error, so processing continues, leading to an onslaught of warning messages. We could switch to the `substr` function rather than array notation, but that seems inappropriate; it would be better to throw an exception and have the user fix the file. Opening the file posted with the issue in Excel, and responding yes when it asks whether it's okay to clean up the corruption, yields a usable file. Unfortunately, that file weighs in at 28MB, much too large for our test suite. So, no new unit tests accompany this change, but it has been tested.

Bounds checks are added to `getUint2d`, which seems to be the source of the problem in the sample file, and, for good measure, to `getInt2d` and `getInt4d`. There may be other sources of similar corruption, but we'll stick with what's in front of our nose.
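
As a rough illustration of the approach described above (a minimal sketch, not the code merged in this PR): the hypothetical helper `readUint2dChecked` reads two bytes as a little-endian unsigned integer, which is what `getUint2d` is assumed to do, and throws instead of letting PHP emit "Uninitialized string offset" warnings and carry on. The function name, the message text, and the choice of `OutOfBoundsException` are assumptions made for this example.

```php
<?php

declare(strict_types=1);

// Hypothetical helper, NOT the code from this PR: read a little-endian
// unsigned 16-bit integer from $data starting at byte offset $pos.
function readUint2dChecked(string $data, int $pos): int
{
    // Both bytes must lie inside the string. Without this check, $data[$pos]
    // on an out-of-range offset only raises a warning and processing continues.
    if ($pos < 0 || $pos + 1 >= strlen($data)) {
        throw new OutOfBoundsException(
            "Offset $pos is outside the record data; the file may be corrupt"
        );
    }

    return ord($data[$pos]) | (ord($data[$pos + 1]) << 8);
}

// Well-formed input: bytes 0x34 0x12 decode to 0x1234.
var_dump(readUint2dChecked("\x34\x12", 0) === 0x1234);

// Truncated input: the second byte is missing, so the helper throws.
try {
    readUint2dChecked("\x34", 0);
} catch (OutOfBoundsException $e) {
    echo 'Rejected: ', $e->getMessage(), PHP_EOL;
}
```

Failing fast like this gives the caller one exception to handle rather than a stream of warnings, at the cost of rejecting files that a more lenient reader might partially load.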

This is:

  • a bugfix
  • a new feature
  • refactoring
  • additional unit tests

Checklist:

  • Changes are covered by unit tests
    • Changes are covered by existing unit tests
    • New unit tests have been added
  • Code style is respected
  • Commit message explains why the change is made (see https://github.com/erlang/otp/wiki/Writing-good-commit-messages)
  • CHANGELOG.md contains a short summary of the change and a link to the pull request if applicable
  • Documentation is updated as necessary

@oleibman oleibman enabled auto-merge September 21, 2025 02:46
@oleibman oleibman added this pull request to the merge queue Sep 21, 2025
Merged via the queue into PHPOffice:master with commit b982ef9 Sep 21, 2025
13 checks passed
@oleibman oleibman deleted the issue4647 branch September 21, 2025 02:56
Development

Successfully merging this pull request may close these issues.

Importing some bad XLS file it could generate bunch of warnings "Uninitialized string offset"