-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Fix BIFF8 compatibility issues in XLS writer #4686
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix BIFF8 compatibility issues in XLS writer #4686
Conversation
The XLS writer incorrectly used 1-based column indices in the BIFF8 DIMENSIONS record, violating the Microsoft Excel Binary File Format specification which requires 0-based indices. This bug caused an extra empty column to appear when converting XLS files to other formats (e.g., CSV). Changes: - Modified column index initialization to subtract 1 from the result of Coordinate::columnIndexFromString() to convert from 1-based to 0-based indexing - Updated COLINFO loop to use the corrected 0-based lastColumnIndex Per BIFF8 specification: - colMic (first column) must be 0-based - colMac (column after last column) must be 0-based Example: For columns A-G (7 columns): - Before: colMic=1, colMac=8 (incorrect) - After: colMic=0, colMac=7 (correct) Fixes PHPOffice#4682
Old PHPExcel always wrote 0x0000 for both zoom_factor_page_break and zoom_factor_normal in the WINDOW2 record. PhpSpreadsheet changed this to write actual zoom values, breaking byte-identical compatibility. This restores the Excel 5 (BIFF8) behavior by setting both zoom factors to 0x0000, matching old PHPExcel output exactly. Related to PHPOffice#4682
PhpSpreadsheet writes the PageLayoutView record (0x088B), but old PHPExcel did not include this record in Excel 5 (BIFF8) format. This removes the writePageLayoutView() call to match old PHPExcel behavior and reduce file size by 20 bytes per sheet. Related to PHPOffice#4682
Old PHPExcel used inverted protection logic and always wrote 0x7FFF for the options bitmask (all protection features disabled except sheet protection itself). PhpSpreadsheet changed this to dynamic calculation resulting in different values (typically 0x4403), breaking compatibility. This restores the Excel 5 (BIFF8) behavior by setting options to 0x7FFF to match old PHPExcel output exactly. Related to PHPOffice#4682
PhpSpreadsheet writes CFHEADER and CFRULE records for conditional formatting. Old PHPExcel did not write these records in Excel 5 (BIFF8) format. This removes the writeConditionalFormatting() call to match old PHPExcel behavior and avoid file size inflation. Related to PHPOffice#4682
PhpSpreadsheet generates COLINFO records only for used columns, but old PHPExcel always generated 256 COLINFO records (columns 0-255) regardless of actual usage. This causes structural incompatibility where files have different numbers of COLINFO records, breaking byte-identical compatibility and potentially causing issues with legacy readers. This changes the COLINFO generation to always output 256 records (0-255) to match old PHPExcel behavior exactly. Related to PHPOffice#4682
|
Thank you for your effort thus far. Here are my initial impressions. Up front, I will tell you that compatibility with PHPExcel is not a priority for us. It is more than 6 years behind our current code base, and is therefore that far behind on major and minor fixes.
Future 1. Same question as always. Future 2. OLE Container. I don't know much about OLE, so I'm willing to hear more about this, but subject to the usual questions. In summary, I would suggest creating a separate PR for (1). We can continue discussing the other items in this PR if you wish. |
|
Hi @oleibman, Thank you for the feedback! I've created a separate PR for the DIMENSIONS fix as you suggested: This PR focuses solely on the 0-based column indices issue and includes the real-world context about legacy Excel parser compatibility. Regarding the other commits, I understand your concerns about PHPExcel compatibility not being a priority. Unfortunately, I don't have the bandwidth to provide additional I really appreciate your time reviewing this and the guidance on splitting out the DIMENSIONS fix! Thanks again! |
|
PR #4687, which addresses your main problem, is now merged. I do not think it is likely that we will address any of the other issues, so I am closing this PR now. Feel free to open new issues, bearing in mind that we will address cases where PhpSpreadsheet is generating incorrect results, but we will not address cases just because they don't happen to match PHPExcel. |
Fix BIFF8 compatibility issues in XLS writer
Summary
This PR fixes 6 BIFF8 (Excel 5 Binary File Format) compatibility issues in the XLS writer to restore compatibility with old PHPExcel Excel5 writer behavior.
Each fix is in a separate commit for easy cherry-picking.
Commits (Each Can Be Cherry-Picked Independently)
1. Fix DIMENSIONS Record - 0-based Column Indices⚠️ CRITICAL
Commit:
0c18ab184Problem: DIMENSIONS record incorrectly used 1-based column indices instead of 0-based as required by BIFF8 specification.
Impact: Extra empty column appears when converting XLS to CSV.
Example: For columns A-G (7 columns):
colMic=1, colMac=8❌colMic=0, colMac=7✅Test Result:
2. Fix WINDOW2 Record - Zoom Factors
Commit:
df12805bbProblem: PhpSpreadsheet wrote non-zero zoom factors. Old PHPExcel always wrote 0x0000.
Fix: Set both
zoom_factor_page_breakandzoom_factor_normalto0x0000.3. Remove PageLayoutView Record
Commit:
9ec328ff6Problem: PhpSpreadsheet writes PageLayoutView record (0x088B). Old PHPExcel did not.
Fix: Removed
writePageLayoutView()call to match Excel 5 format.Benefit: Reduces file size by 20 bytes per sheet.
4. Fix SHEETPROTECTION Options Bitmask
Commit:
eadade52dProblem: PhpSpreadsheet uses dynamic calculation (typically 0x4403). Old PHPExcel used inverted logic (0x7FFF).
Fix: Set
options = 0x7FFFto match old PHPExcel behavior.5. Remove ConditionalFormatting Records
Commit:
82f272868Problem: PhpSpreadsheet writes CFHEADER/CFRULE records. Old PHPExcel did not.
Fix: Removed
writeConditionalFormatting()call to match Excel 5 format.Note: 4 tests fail because they expect CF records. Old PHPExcel didn't write these. Tests may need updating or skipping if this commit is accepted.
6. Fix COLINFO Records - Generate 256 Entries
Commit:
6f7f5e406Problem: PhpSpreadsheet generates COLINFO only for used columns. Old PHPExcel generated 256 (0-255) always.
Fix: Changed COLINFO loop to always generate 256 records.
Impact: Structural compatibility with old PHPExcel files.
Test Results
✅ DIMENSIONS Fix Verified
Custom test confirms DIMENSIONS record now correctly uses 0-based indices.
4 Failing Tests (All ConditionalFormatting-related):
ConditionalFontColorTest::testConditionalFontColorConditionalLimitsTest::testLimitsConditionalUnionTest::testConditionalUnionConditionalUnionTest::testIntersectionRangeReason: These tests expect ConditionalFormatting records, but old PHPExcel didn't write them. If commit #5 is accepted, these tests should be updated to not expect CF records in Excel 5 format.
BIFF8 Specification References
colMic: First column index (0-based)colMac: Last column index + 1 (0-based)Backwards Compatibility
These changes restore Excel 5 (BIFF8) format compatibility. Applications expecting the current behavior may need adjustments, but this aligns with:
Future Work - Achieving 100% Backwards Compatibility
This PR addresses all Worksheet-level BIFF8 compatibility issues. However, for 100% byte-identical compatibility with old PHPExcel Excel5 files, two additional components require fixes:
1. Shared String Table (SST) - Workbook Level
The
writeSharedStringsTable()method inWriter/Xls/Workbook.phpneeds to match old PHPExcel's exact SST encoding and chunking logic. This affects string storage and CONTINUE record handling.2. OLE Container & Timestamps - File Level
The OLE Root structure needs to match old PHPExcel's timestamp handling and container organization for deterministic file generation.
These changes are more invasive (affecting private methods and file structure) and will be proposed in a separate PR if this one is accepted. Together, all changes enable 100%
byte-identical output with old PHPExcel, which is critical for:
A working reference implementation demonstrating 100% compatibility is available and has been tested with files up to 377 KB / 20,000+ BIFF records.
Fixes #4682