Implement _cp_file for Zonal Buckets & support custom flush interval #746

Open
suni72 wants to merge 13 commits into fsspec:main from ankitaluthra1:zonal-fix-1

@suni72 suni72 commented Jan 19, 2026

Key Changes:

  • Features:
    • Implemented _cp_file logic for Zonal buckets.
    • Added support for custom flush interval for zonal writes.
  • Fixes & Maintenance:
    • Fixed aaow initialization logic in ZonalFile.
    • Updated imports and dependencies to align with the latest python-storage SDK.
    • Improved logging (byte tracking) and updated method descriptions/warnings.
  • CI & Tests:
    • Updated CI configuration to use the latest python-storage.
    • Increased ulimit in tests to prevent "too many open files" errors.
    • Configured separate regional buckets for zonal copy tests.
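
For the "too many open files" item, one way to raise the limit from inside a Python test session is sketched below. This is only an illustration: the PR may instead set `ulimit -n` in the CI configuration, and the helper name `raise_open_file_limit` and the target value are assumptions, not taken from the change itself. It relies on the standard-library `resource` module, which is available on Unix only.

```python
import resource


def raise_open_file_limit(target: int = 4096) -> int:
    """Raise the soft open-file limit toward `target` and return the new soft limit.

    Hypothetical helper: the soft limit is never lowered and never pushed
    past the hard limit, so an unprivileged process can call it safely.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # An unlimited soft limit cannot be improved on.
    if soft == resource.RLIM_INFINITY:
        return soft

    # Cap the request at the hard limit unless the hard limit is unlimited.
    cap = target if hard == resource.RLIM_INFINITY else min(target, hard)

    if cap > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (cap, hard))
        soft = cap
    return soft
```

Called once at session start (for example from a session-scoped pytest fixture), this would apply to every test that opens many file handles or sockets.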

@suni72 suni72 marked this pull request as draft January 19, 2026 10:31
@suni72 suni72 force-pushed the zonal-fix-1 branch 7 times, most recently from 6b68697 to 2cd21fb Compare January 26, 2026 17:50
codecov-commenter commented Jan 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.06%. Comparing base (6e4efb8) to head (1903c34).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #746      +/-   ##
==========================================
+ Coverage   75.41%   76.06%   +0.65%     
==========================================
  Files          19       19              
  Lines        2859     2879      +20     
==========================================
+ Hits         2156     2190      +34     
+ Misses        703      689      -14     

@suni72 suni72 force-pushed the zonal-fix-1 branch 5 times, most recently from 140e1c5 to da12664 Compare January 30, 2026 14:47
@suni72 suni72 changed the title Update ZonalFile commit to log warning instead of raising error for No-op cases Implement _cp_file for Zonal Buckets & support custom flush interval Jan 30, 2026
@suni72 suni72 marked this pull request as ready for review January 30, 2026 15:28
@ankitaluthra1 (Collaborator)

/gcbrun

@ankitaluthra1 (Collaborator)

/gcbrun

suni72 and others added 2 commits February 6, 2026 14:13
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@suni72 suni72 marked this pull request as ready for review February 6, 2026 09:14
if block_size is None:
    block_size = self.default_block_size
# If we are using the generic default (user didn't override it),
# switch to the Zonal-optimized default for Zonal buckets.
Collaborator

Why does the Zonal default need to differ from the regional one?

Contributor Author

The Python SDK uses a 16 MB flush interval. I believe that value was chosen to maximize throughput, so we should align with it.
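
To make the behaviour under discussion concrete, here is a minimal standalone sketch of the default-selection logic described in the diff comment. It is not the PR's actual code: the function name `resolve_block_size`, the `is_zonal` flag, and both constants are assumptions for illustration. The 16 MB figure comes from the reply above, and the 5 MiB generic default is assumed.

```python
# Assumed values for illustration only.
GENERIC_DEFAULT_BLOCK_SIZE = 5 * 2**20   # assumed generic default (5 MiB)
ZONAL_DEFAULT_BLOCK_SIZE = 16 * 2**20    # 16 MB, matching the SDK flush interval


def resolve_block_size(block_size, default_block_size, is_zonal):
    """Pick the effective block size for a write.

    Hypothetical helper mirroring the diff comment: fall back to the
    filesystem default when nothing is passed, then switch to the
    Zonal-optimized default only if the generic default is still in use.
    """
    if block_size is None:
        block_size = default_block_size

    # An explicit override (any value other than the generic default) wins.
    if is_zonal and block_size == GENERIC_DEFAULT_BLOCK_SIZE:
        block_size = ZONAL_DEFAULT_BLOCK_SIZE
    return block_size
```

One consequence of comparing against the generic default is that a caller who explicitly passes exactly that value for a Zonal bucket is indistinguishable from one who passed nothing, so they would also get 16 MB in this sketch.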


3 participants