Support gcs caching for parallel processing #113
814b06f to 59bb496
Would you mind adding some text to your commit message explaining what problem this addresses, and how? I'm assuming there's an issue where the GCS client and bucket are not being sent from one process to another, but it seems like we could solve that by adding some custom pickling behavior. But maybe this caching is also intended to speed up tests?
Also, if this fixes any of the tests, should we remove some |
This change fixes the gcs persistence tests and I already removed |
Oops, somehow I missed that -- sorry!
Whoops, I probably lost my original commit message with the explanation here 😬. I'll add one. Yeah, the issue is that pickling client objects is not supported by the Google Cloud Storage library. This was raised by someone before, and the authors decided not to support client pickling; they just added an error message for anyone who tries to pickle one. I tried the workaround of setting I can also see why the caching is tripping you up a bit. I'll add some explanation as a code comment too, but I'm caching buckets since that's what the |
59bb496 to 1a2d180
As we discussed this morning, I changed it to use custom pickling logic that recreates the client objects when deserializing the |
GCS caching is broken for parallel processing because GCS client objects cannot be pickled, even with cloudpickle. This change stops attempting to pickle those GCS objects and instead recreates them in the subprocess.
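For readers following along, here is a minimal sketch of the general technique described in the commit message: drop the unpicklable handles in `__getstate__` and rebuild them in `__setstate__`. It is not the code from this PR. `FakeClient` and `GcsCache` are hypothetical names, and `FakeClient` is only a stand-in that refuses pickling so the example runs without the Google Cloud Storage library or credentials.

```python
import pickle


class FakeClient:
    """Hypothetical stand-in for a storage client that refuses to be pickled."""

    def __getstate__(self):
        raise pickle.PicklingError("Pickling client objects is not supported.")

    def bucket(self, name):
        # A real client would return a bucket handle; a tuple is enough here.
        return ("bucket", name)


class GcsCache:
    """Hypothetical cache object that holds a client and a bucket handle."""

    def __init__(self, bucket_name):
        self.bucket_name = bucket_name
        self._init_client()

    def _init_client(self):
        # Build the handles from plain, picklable configuration.
        self._client = FakeClient()
        self._bucket = self._client.bucket(self.bucket_name)

    def __getstate__(self):
        # Leave the client and bucket out of the pickled state.
        state = self.__dict__.copy()
        del state["_client"]
        del state["_bucket"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then recreate the handles.
        self.__dict__.update(state)
        self._init_client()


cache = GcsCache("my-bucket")
restored = pickle.loads(pickle.dumps(cache))
```

With this shape, only `bucket_name` crosses the process boundary, and each subprocess ends up with its own freshly created client and bucket handle.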
1a2d180 to 8f9aab6
LGTM!