fix: Support reading empty parquet files #18392

Merged: 4 commits, Aug 27, 2024

corwinjoy
Contributor

Fix #13457

Add support for reading empty parquet files.
I'm new to this project, so I could use help with the unit tests that I added to verify correct behavior. I'm not sure whether they are in the correct place, and I'm having trouble getting make test to pick up the new tests.

github-actions bot added the fix, python, and rust labels on Aug 27, 2024
@adamreeve
Contributor

I'm not sure whether they are in the correct place, and I'm having trouble getting make test to pick up the new tests.

make test will skip tests that are marked with write_disk by default due to the pytest config:

"-m not slow and not write_disk and not release and not docs and not hypothesis and not benchmark and not ci_only",

You can run all tests from the py-polars directory with make test-all, or run pytest directly and only include the write_disk tests with something like:

make build
../.venv/bin/pytest -m 'write_disk'

To run the Rust tests you can run make test from the crates directory.
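
To illustrate why the default marker expression quoted above deselects tests marked with write_disk, here is a minimal sketch in plain Python. It is a simplified model of how a boolean marker expression can be evaluated against a test's markers, not pytest's actual implementation; the helper name is_selected is hypothetical.

```python
import re

# The default expression quoted from the pytest config above.
DEFAULT_EXPR = (
    "not slow and not write_disk and not release and not docs "
    "and not hypothesis and not benchmark and not ci_only"
)

_KEYWORDS = {"and", "or", "not"}


def is_selected(expr: str, markers: set) -> bool:
    """Simplified model (hypothetical helper, not pytest's code).

    Each marker name in the expression is treated as True when the
    test carries that marker and False otherwise; the expression is
    then evaluated as an ordinary boolean expression.
    """
    names = set(re.findall(r"[A-Za-z_]\w*", expr)) - _KEYWORDS
    env = {name: name in markers for name in names}
    # No builtins are exposed; only the marker names are visible.
    return bool(eval(expr, {"__builtins__": {}}, env))


if __name__ == "__main__":
    # A test marked write_disk is deselected by the default expression.
    print(is_selected(DEFAULT_EXPR, {"write_disk"}))
    # An unmarked test is selected.
    print(is_selected(DEFAULT_EXPR, set()))
    # Passing -m 'write_disk' selects only the marked test.
    print(is_selected("write_disk", {"write_disk"}))
```

Under this model, a test carrying the write_disk marker makes the "not write_disk" term false, so the whole default expression is false and the test is skipped, which matches the behavior described above.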

@adamreeve
Contributor

I don't know if this is still the case, but I've previously had feedback that Python tests are preferred over Rust tests because building Rust tests increases CI times (#13139 (comment)). So having only the Python tests may be enough.


codecov bot commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 75.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 79.80%. Comparing base (d6703c4) to head (9642aec).
Report is 25 commits behind head on main.

Files Patch % Lines
...arquet/src/parquet/schema/io_thrift/from_thrift.rs 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #18392      +/-   ##
==========================================
- Coverage   79.87%   79.80%   -0.07%     
==========================================
  Files        1496     1497       +1     
  Lines      200281   200428     +147     
  Branches     2841     2844       +3     
==========================================
- Hits       159966   159956      -10     
- Misses      39790    39947     +157     
  Partials      525      525              


Review comment on py-polars/tests/unit/io/test_lazy_parquet.py (outdated, resolved)
Review comment on crates/polars-io/src/parquet/write/writer.rs (outdated, resolved)
@coastalwhite
Collaborator

Some minor nits, but nothing that should stop this from being merged; I can fix them immediately afterward. @ritchie46

@ritchie46
Member

@coastalwhite, it is fine to take the PR from here and apply those nits.

ritchie46 merged commit d12131a into pola-rs:main on Aug 27, 2024
26 checks passed
@corwinjoy
Contributor Author

Thanks, everyone, for being so helpful to a newcomer to this project! Much appreciated. Thanks to @coastalwhite for the fixup and to everyone for the prompt review. You are awesome!

Labels: fix, python, rust
Linked issue: can write empty parquet but not read
4 participants