Remove tmp_dir only if the output dir is not in s3 #320

Merged
merged 3 commits into from
Jun 12, 2023

Conversation

erezzarum
Contributor

When not running locally, do not remove the tmp_dir explicitly. On remote storage it disappears on its own once the last shard file is removed from it, so a separate removal fails because the directory no longer exists.

Fixes #319
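
To illustrate why the explicit removal can fail, here is a toy model of a flat object store. It is only a sketch: `FlatObjectStore`, its methods, and the key names are hypothetical and are not part of this project or of any S3 client library.

```python
class FlatObjectStore:
    """Toy object store: a flat key -> bytes mapping with no directory entries."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def delete(self, key: str) -> None:
        if key not in self._objects:
            raise FileNotFoundError(key)
        del self._objects[key]

    def prefix_exists(self, prefix: str) -> bool:
        # A "directory" exists only while at least one key lives under it.
        normalized = prefix.rstrip("/") + "/"
        return any(key.startswith(normalized) for key in self._objects)

    def remove_prefix(self, prefix: str) -> None:
        # Delete every object under the prefix; fail if there is none.
        normalized = prefix.rstrip("/") + "/"
        keys = [key for key in self._objects if key.startswith(normalized)]
        if not keys:
            raise FileNotFoundError(prefix)
        for key in keys:
            del self._objects[key]


store = FlatObjectStore()
tmp_dir = "bucket/out/_tmp"  # hypothetical path
store.put(f"{tmp_dir}/shard_0", b"data")
store.delete(f"{tmp_dir}/shard_0")  # last shard file removed
print(store.prefix_exists(tmp_dir))  # False: the "directory" is already gone
```

In this model, calling `store.remove_prefix(tmp_dir)` at this point raises `FileNotFoundError`, which is the same kind of failure reported in the linked issue.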

@rom1504
Owner

rom1504 commented Jun 11, 2023

the problem is that s3 has no concept of folders.
That may not be true for other file systems, so let's skip this removal only for s3
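
A minimal sketch of that suggestion, using only the standard library. The helper names `is_s3_path` and `cleanup_tmp_dir`, the set of schemes, and the use of `shutil` for the non-S3 branch are assumptions made for illustration; the merged change may be structured differently.

```python
import os
import shutil
from urllib.parse import urlparse

# Assumed set of S3-style URL schemes; adjust to whatever the project accepts.
S3_SCHEMES = {"s3", "s3a", "s3n"}


def is_s3_path(path: str) -> bool:
    """Return True when the path uses an S3-style URL scheme."""
    return urlparse(path).scheme.lower() in S3_SCHEMES


def cleanup_tmp_dir(output_folder: str, tmp_dir: str) -> bool:
    """Remove tmp_dir unless output_folder is on S3.

    Returns True when the removal step ran, False when it was skipped.
    """
    if is_s3_path(output_folder):
        # No folders on S3: the prefix vanished with its last shard file.
        return False
    if os.path.isdir(tmp_dir):
        shutil.rmtree(tmp_dir)
    return True
```

Other object stores (for example a `gs://` path) are deliberately not treated as S3 here, so the removal still runs for them, matching the narrow scope proposed above.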

@erezzarum
Contributor Author

What about other object storage (gcs, r2, etc.)?
I have no way to test this at the moment

@rom1504
Owner

rom1504 commented Jun 11, 2023

exactly. That's why I'm suggesting to disable this removal only for s3

rom1504 changed the title from "Remove tmp_dir only if we are local" to "Remove tmp_dir only if the output dir is not in s3" Jun 12, 2023
rom1504 merged commit ba85691 into rom1504:main Jun 12, 2023
Development

Successfully merging this pull request may close these issues.

Temp dir removal - FileNotFoundError: ['mybucket/data/tests/test_1000_parquet/5/_tmp']