fix(python): Handle current position of file objects #17543

Merged (2 commits) on Jul 12, 2024


@ruihe774 ruihe774 commented Jul 10, 2024

#17315 introduced a regression: when a file-like object was passed to the read_* functions, its current stream position was not respected. This PR fixes that by invalidating the read buffer of file-like objects and using the stream position as the offset when mmapping. This allows, for example, reading a table file with a custom header:

with open("table") as f:
    schema = parse_custom_header(f.readline())
    pl.read_csv(f, schema=schema, ...)

If the file is mmapped by the reading functions (i.e. in get_reader_bytes()), the stream position of the file is not updated after reading. This PR documents this behavior. Given that reading functions usually consume all remaining file content and file objects are seldom used again after reading, IMO not updating the stream position is not a big problem.
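To illustrate the two points above (honoring the current stream position, and the position staying put after an mmap-based read), here is a minimal standard-library sketch. It is not the Polars implementation: `read_remaining_via_mmap` is a hypothetical helper, and the file name and contents are made up for the demonstration. Because `mmap` offsets must be aligned to the allocation granularity, the sketch maps the whole file and slices from the position rather than passing the position as the offset.

```python
import mmap
import os
import tempfile


def read_remaining_via_mmap(f):
    """Hypothetical helper: return the bytes from f's current position to EOF.

    The data is read through a memory map, so f's own stream position
    is never advanced.
    """
    start = f.tell()
    size = os.fstat(f.fileno()).st_size
    if start >= size:
        # Nothing left to read (this also avoids mapping an empty file).
        return b""
    # mmap offsets must be a multiple of mmap.ALLOCATIONGRANULARITY, so this
    # sketch maps the whole file and slices from the stream position instead.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        return mapped[start:]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "table")
    with open(path, "wb") as out:
        out.write(b"# custom header\na,b\n1,2\n")

    with open(path, "rb") as f:
        header = f.readline()              # consume the custom header line
        position_before = f.tell()
        body = read_remaining_via_mmap(f)  # starts after the header
        position_after = f.tell()          # unchanged by the mmap read
```

In this sketch `body` holds only the content after the header line, and `position_after` equals `position_before`, which mirrors the documented caveat that an mmapped read does not move the stream position.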

github-actions bot added the fix and python labels Jul 10, 2024
ruihe774 force-pushed the fix-buffer branch 2 times, most recently from adc94c0 to 7ad5b41, July 10, 2024 07:44
ruihe774 marked this pull request as ready for review July 10, 2024 07:52
ruihe774 force-pushed the fix-buffer branch 3 times, most recently from 23a0036 to 699d64e, July 10, 2024 12:22

codecov bot commented Jul 10, 2024

Codecov Report

Attention: Patch coverage is 95.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 80.49%. Comparing base (daf2e49) to head (2cf1a23).
Report is 2 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| py-polars/src/file.rs | 92.85% | 1 Missing ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #17543   +/-   ##
=======================================
  Coverage   80.48%   80.49%           
=======================================
  Files        1483     1483           
  Lines      195118   195196   +78     
  Branches     2778     2778           
=======================================
+ Hits       157039   157115   +76     
- Misses      37568    37570    +2     
  Partials      511      511           


@ritchie46 ritchie46 merged commit 7e527e9 into pola-rs:main Jul 12, 2024
27 checks passed