@patricklnz patricklnz commented Feb 19, 2024

Changes and Information

Please briefly list the changes (main added features, changed items, or corrected bugs) made:

  • For newer pandas versions, the calamine engine (which is faster) is used instead of openpyxl

If need be, add additional information and what the reviewer should look out for in particular:

  • read_excel should now work for pandas versions both below and above 2.2. I have already done some testing, but I think this should be verified before merging into main.
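
The engine choice described above could be sketched roughly as follows. This is not the code from this PR; `select_excel_engine` is a hypothetical helper name, and the sketch assumes the cut-off is pandas 2.2 (the release that added `engine="calamine"` to `read_excel`) and that the `python-calamine` package is installed when that engine is selected.

```python
import re


def select_excel_engine(pandas_version: str) -> str:
    """Pick a read_excel engine name from a pandas version string.

    Hypothetical helper: returns "calamine" for pandas >= 2.2 and
    "openpyxl" for anything older.
    """
    match = re.match(r"(\d+)\.(\d+)", pandas_version)
    if match is None:
        # Unknown version format: fall back to the long-supported engine.
        return "openpyxl"
    major, minor = int(match.group(1)), int(match.group(2))
    return "calamine" if (major, minor) >= (2, 2) else "openpyxl"


# Typical use (assumes pandas is imported as pd):
#   df = pd.read_excel(path, engine=select_excel_engine(pd.__version__))
```

Comparing `(major, minor)` as a tuple avoids the string-comparison pitfall where "2.10" would sort before "2.2".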

Merge Request - Guideline Checklist

Please check our git workflow. Use the draft feature if the Pull Request is not yet ready to review.

Checks by code author

  • Every addressed issue is linked (use the "Closes #ISSUE" keyword below)
  • New code adheres to coding guidelines
  • No large data files have been added (files should not exceed 100 KB in total, avoid PDFs, Word docs, etc.)
  • Tests are added for new functionality and a local test run was successful (with and without OpenMP)
  • Appropriate documentation for new functionality has been added (Doxygen in the code and Markdown files if necessary)
  • Proper attention to licenses, especially no new third-party software with conflicting license has been added
  • (For ABM development) Checked benchmark results and ran and posted a local test above from before and after development to ensure performance is monitored.

Checks by code reviewer(s)

  • Corresponding issue(s) is/are linked and addressed
  • Code is clean of development artifacts (no deactivated or commented code lines, no debugging printouts, etc.)
  • Appropriate unit tests have been added, CI passes, code coverage and performance are acceptable (did not decrease)
  • No large data files added in the whole history of commits (files should not exceed 100 KB in total, avoid PDFs, Word docs, etc.)
  • On merge, add 2-5 lines with the changes (main added features, changed items, or corrected bugs) to the merge-commit-message. This can be taken from the briefly-list-the-changes above (best case) or the separate commit messages (worst case).

Closes #910

@patricklnz patricklnz self-assigned this Feb 19, 2024
@annawendler annawendler left a comment

Nice!

When executing getTestingData I get the following errors in line 55:
With openpyxl: zipfile.BadZipFile: File is not a zip file
With calamine: python_calamine.CalamineError: Cannot detect file format

Can you check if this has to do with the excel engine or if this is another issue (and open a new issue in that case)?

@patricklnz

> Nice!
>
> When executing getTestingData I get the following errors in line 55:
> With openpyxl: zipfile.BadZipFile: File is not a zip file
> With calamine: python_calamine.CalamineError: Cannot detect file format
>
> Can you check if this has to do with the excel engine or if this is another issue (and open a new issue in that case)?

I opened a new issue, #981, and added error handling.
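
For illustration only, a check of this kind might look like the sketch below. It is not the error handling added in this PR, and `ensure_xlsx_payload` is a hypothetical name. It assumes the two errors above mean the file being read is not actually an .xlsx workbook (which is a zip container); the real cause is tracked in #981.

```python
import io
import zipfile


def ensure_xlsx_payload(data: bytes, source: str = "<unknown source>") -> bytes:
    """Return data unchanged if it is a zip container, else raise ValueError.

    Hypothetical pre-check: .xlsx files are zip archives, so a payload that
    is not a zip file cannot be a valid .xlsx workbook. Legacy .xls files
    are not zip-based and would be rejected by this check as well.
    """
    if not zipfile.is_zipfile(io.BytesIO(data)):
        preview = data[:40]
        raise ValueError(
            f"Payload from {source} is not a zip-based .xlsx file; "
            f"first bytes: {preview!r}"
        )
    return data
```

Running such a check before calling `pandas.read_excel` would give the same message regardless of engine, instead of `zipfile.BadZipFile` with openpyxl and `python_calamine.CalamineError` with calamine.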

@patricklnz patricklnz requested a review from annawendler April 2, 2024 11:23
@annawendler annawendler merged commit 8ddb863 into main Apr 2, 2024
@annawendler annawendler deleted the 910-excel-engines branch April 2, 2024 11:44
Development

Successfully merging this pull request may close these issues.

Handle engines for pandas.read_excel
