-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Xlsx Namespace Handling of Drawings, RowAndColumnAttributes, MergeCells #3137
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…tes, MergeCells Fix PHPOffice#1482 (actually fix a problem recently attached to that ticket long after it closed). There were problems processing a spreadsheet generated by third party software. That spreadsheet used unexpected namespacing, and absolute paths within the zip file where relative paths were expected. Xlsx Reader handles most, but not all, of its processing in a namespace-aware manner. Two versions of a worksheet's xml are available - `$xmlSheet` is not namespace aware and `$xmlSheetNS` is aware. This was necessary in order to add namespace support in an incremental manner. The primary reason to continue to use the unaware version is the absence of test cases. In particular, drawings, row and column attributes, and merge cells continue to use the unaware version; this PR changes those to use the aware version. As noted in the summary above, a couple of new places in the handling of the those items were expecting file locations to be specified as relative paths in the zip file, but the file used absolute paths instead. Those unexpected usages are now addressed. The user reporting the new problem tried a change which effectively made all uses of `$xmlSheet` namespace aware, and that seemed helpful. It may be time eliminate its usage altogether, whether or not we have appropriate examples of unexpected namespaces to test with. I will not do that with this change, but I may add a new PR to do so after this one is merged. Remaining areas which still use the unaware version include conditional formatting (internal or external), sheet view options, sheet protection, auto filters, unparsed loaded data, data validation (internal or external), alternate content, and header/footer images. There is an interesting anomaly with the new test file. When I load it and save it, the appearance of the output file does not quite match the input. Oddly, the output file seems much better than the input - the picture no longer covers any data, for example. This is because, in particular, the output file row heights and column widths seem to match the xml, but the input file does not. For example, the xml in both files seems to indicate that row 5 should have a height of 234, which it does in the output file, but the height of that row when the input file is opened is 156. It appears that all row heights and column widths when the input file is opened are very close to 2/3 of what is expected. I will continue to research that anomaly for a few days, but I will not let it prevent me from moving forward with this PR if I don't find the explanation. Whatever that problem is, it seems distinct from the namespacing/pathing problems which the PR addresses.
Eliminate them with annotations.
No intention of acting on Scrutinizer "complexity" complain. |
oleibman
added a commit
to oleibman/PhpSpreadsheet
that referenced
this pull request
Oct 25, 2022
Fix PHPOffice#3126. A worksheet contained an image in its footer. It could be loaded and saved as another spreadsheet. However, if you tried to load and save that spreadsheet, PhpSpreadsheet would be unable to find the footer image and would therefore throw an exception. This error was introduced a long time ago, in PhpSpreadsheet 1.3.0. The apparent cause of the problem was PR PHPOffice#435, sometime around June 2018. That change was very useful, but it had problems which exposed themselves only with a third generation copy. An additional contributor to the issue at hand was PR PHPOffice#1690 (December 2020), which again exposed itself with a third generation copy. The issue from 1690 is easier to explain and deal with. It added a 'ps' suffix to printer settings resources in Xlsx Reader (to avoid name conflicts), but did not limit itself to a single addition (so subseqent generations would have multiple ps's). It also neglected to add the suffix in Reader/Xlsx/PageSetup. As for 435, it loops through all the worksheet relationships, and uses the last that it finds as the base for header/footer drawings. It has been changed to use only the relationship whose `rId` matches the worksheet's `legacyDrawingHF` `rId`. It also needs a bit extra validation to make sure a drawing exists before adding it to its array of header/footer images. It also meant that Xlsx/Writer/Rels might write an entry with the same rId twice. I have also changed the header/footer image processing to be namespace aware (see PR PHPOffice#3137).
7 tasks
oleibman
added a commit
that referenced
this pull request
Nov 2, 2022
* Generation3 Copy With Image in Footer Fix #3126. A worksheet contained an image in its footer. It could be loaded and saved as another spreadsheet. However, if you tried to load and save that spreadsheet, PhpSpreadsheet would be unable to find the footer image and would therefore throw an exception. This error was introduced a long time ago, in PhpSpreadsheet 1.3.0. The apparent cause of the problem was PR #435, sometime around June 2018. That change was very useful, but it had problems which exposed themselves only with a third generation copy. An additional contributor to the issue at hand was PR #1690 (December 2020), which again exposed itself with a third generation copy. The issue from 1690 is easier to explain and deal with. It added a 'ps' suffix to printer settings resources in Xlsx Reader (to avoid name conflicts), but did not limit itself to a single addition (so subseqent generations would have multiple ps's). It also neglected to add the suffix in Reader/Xlsx/PageSetup. As for 435, it loops through all the worksheet relationships, and uses the last that it finds as the base for header/footer drawings. It has been changed to use only the relationship whose `rId` matches the worksheet's `legacyDrawingHF` `rId`. It also needs a bit extra validation to make sure a drawing exists before adding it to its array of header/footer images. It also meant that Xlsx/Writer/Rels might write an entry with the same rId twice. I have also changed the header/footer image processing to be namespace aware (see PR #3137). * Minor Change I didn't like the way I performed one operation. * Fix Test Array index should not have been constant.
MarkBaker
added a commit
that referenced
this pull request
Dec 21, 2022
### Added - Extended flag options for the Reader `load()` and Writer `save()` methods - Apply Row/Column limits (1048576 and XFD) in ReferenceHelper [PR #3213](#3213) - Allow the creation of In-Memory Drawings from a string of binary image data, or from a stream. [PR #3157](#3157) - Xlsx Reader support for Pivot Tables [PR #2829](#2829) - Permit Date/Time Entered on Spreadsheet to be calculated as Float [Issue #1416](#1416) [PR #3121](#3121) ### Changed - Nothing ### Deprecated - Direct update of Calculation::suppressFormulaErrors is replaced with setter. - Font public static variable defaultColumnWidths replaced with constant DEFAULT_COLUMN_WIDTHS. - ExcelError public static variable errorCodes replaced with constant ERROR_CODES. - NumberFormat constant FORMAT_DATE_YYYYMMDD2 replaced with existing identical FORMAT_DATE_YYYYMMDD. ### Removed - Nothing ### Fixed - Fixed handling for `_xlws` prefixed functions from Office365 [Issue #3245](#3245) [PR #3247](#3247) - Conditionals formatting rules applied to rows/columns are removed [Issue #3184](#3184) [PR #3213](#3213) - Treat strings containing currency or accounting values as floats in Calculation Engine operations [Issue #3165](#3165) [PR #3189](#3189) - Treat strings containing percentage values as floats in Calculation Engine operations [Issue #3155](#3155) [PR #3156](#3156) and [PR #3164](#3164) - Xlsx Reader Accept Palette of Fewer than 64 Colors [Issue #3093](#3093) [PR #3096](#3096) - Use Locale-Independent Float Conversion for Xlsx Writer Custom Property [Issue #3095](#3095) [PR #3099](#3099) - Allow setting AutoFilter range on a single cell or row [Issue #3102](#3102) [PR #3111](#3111) - Xlsx Reader External Data Validations Flag Missing [Issue #2677](#2677) [PR #3078](#3078) - Reduces extra memory usage on `__destruct()` calls [PR #3092](#3092) - Additional properties for Trendlines [Issue #3011](#3011) [PR #3028](#3028) - Calculation suppressFormulaErrors fix [Issue #1531](#1531) [PR #3092](#3092) - Permit Date/Time Entered on Spreadsheet to be Calculated as Float [Issue #1416](#1416) [PR #3121](#3121) - Incorrect Handling of Data Validation Formula Containing Ampersand [Issue #3145](#3145) [PR #3146](#3146) - Xlsx Namespace Handling of Drawings, RowAndColumnAttributes, MergeCells [Issue #3138](#3138) [PR #3136](#3137) - Generation3 Copy With Image in Footer [Issue #3126](#3126) [PR #3140](#3140) - MATCH Function Problems with Int/Float Compare and Wildcards [Issue #3141](#3141) [PR #3142](#3142) - Fix ODS Read Filter on number-columns-repeated cell [Issue #3148](#3148) [PR #3149](#3149) - Problems Formatting Very Small and Very Large Numbers [Issue #3128](#3128) [PR #3152](#3152) - XlsxWrite preserve line styles for y-axis, not just x-axis [PR #3163](#3163) - Xlsx Namespace Handling of Drawings, RowAndColumnAttributes, MergeCells [Issue #3138](#3138) [PR #3137](#3137) - More Detail for Cyclic Error Messages [Issue #3169](#3169) [PR #3170](#3170) - Improved Documentation for Deprecations - many PRs [Issue #3162](#3162)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix #3138 (originally attached to long-closed #1482). There were problems processing a spreadsheet generated by third party software. That spreadsheet used unexpected namespacing, and absolute paths within the zip file where relative paths were expected.
Xlsx Reader handles most, but not all, of its processing in a namespace-aware manner. Two versions of a worksheet's xml are available -
$xmlSheet
is not namespace aware and$xmlSheetNS
is aware. This was necessary in order to add namespace support in an incremental manner. The primary reason to continue to use the unaware version is the absence of test cases. In particular, drawings, row and column attributes, and merge cells continue to use the unaware version; this PR changes those to use the aware version.As noted in the summary above, a couple of new places in the handling of the those items were expecting file locations to be specified as relative paths in the zip file, but the file used absolute paths instead. Those unexpected usages are now addressed.
The user reporting the new problem tried a change which effectively made all uses of
$xmlSheet
namespace aware, and that seemed helpful. It may be time eliminate its usage altogether, whether or not we have appropriate examples of unexpected namespaces to test with. I will not do that with this change, but I may add a new PR to do so after this one is merged. Remaining areas which still use the unaware version include conditional formatting (internal or external), sheet view options, sheet protection, auto filters, unparsed loaded data, data validation (internal or external), alternate content, and header/footer images.There is an interesting anomaly with the new test file. When I load it and save it, the appearance of the output file does not quite match the input. Oddly, the output file seems much better than the input - the picture no longer covers any data, for example. This is because, in particular, the output file row heights and column widths seem to match the xml, but the input file does not. For example, the xml in both files seems to indicate that row 5 should have a height of 234, which it does in the output file, but the height of that row when the input file is opened is 156. It appears that all row heights and column widths when the input file is opened are very close to 2/3 of what is expected. I will continue to research that anomaly for a few days, but I will not let it prevent me from moving forward with this PR if I don't find the explanation. Whatever that problem is, it seems distinct from the namespacing/pathing problems which the PR addresses.
This is:
Checklist:
Why this change is needed?
Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.