BigQuery Job objects not pickle-able #5866

Closed
max-sixty opened this issue Aug 29, 2018 · 7 comments
Labels: api: bigquery, type: feature request

Comments

@max-sixty

@tseaver suggested I start a new issue after adding to a closed issue: #3191

BigQuery Job Objects are not pickle-able:


In [16]: job = client.copy_table(source_table,source_table)

In [17]: job
Out[17]: <google.cloud.bigquery.job.CopyJob at 0x1163a3320>

In [18]: pickle.dumps(job)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-18-0c1255e2d696> in <module>()
----> 1 pickle.dumps(job)

AttributeError: Can't pickle local object 'if_exception_type.<locals>.if_exception_type_predicate'

I'm guessing that's because it represents some sort of future? IMO it would be fine if __getstate__ deleted the polling state and left a reference that could be checked.
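
A minimal sketch of that idea, using a hypothetical FakeJob class rather than the real BigQuery job classes. The if_exception_type function below is a stand-in that only mimics the name in the traceback; the attribute name _retry_predicate is an assumption made for illustration. It shows that a nested function cannot be pickled, and that __getstate__ can drop it while keeping the job ID.

```python
import pickle


def if_exception_type(*exception_types):
    """Stand-in predicate factory that returns a local (nested) function."""

    def if_exception_type_predicate(exception):
        return isinstance(exception, exception_types)

    return if_exception_type_predicate


class FakeJob:
    """Hypothetical job-like object; not the real google.cloud.bigquery class."""

    def __init__(self, job_id):
        self.job_id = job_id
        # Nested functions are pickled by qualified name, and a name that
        # contains "<locals>" cannot be looked up again, so pickling fails.
        self._retry_predicate = if_exception_type(ConnectionError)

    def __getstate__(self):
        state = self.__dict__.copy()
        # Drop the unpicklable helper; keep the reference that identifies the job.
        del state["_retry_predicate"]
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Rebuild the helper on load so the restored object is usable again.
        self._retry_predicate = if_exception_type(ConnectionError)


# Pickling the nested function directly fails.
try:
    pickle.dumps(if_exception_type(ValueError))
except (AttributeError, pickle.PicklingError) as exc:
    print("cannot pickle:", exc)

# Pickling the object works because __getstate__ removed the helper.
restored = pickle.loads(pickle.dumps(FakeJob("job-123")))
print(restored.job_id)
```
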

@tseaver suggests:

As a workaround, you could pickle the result of job._to_api_repr, and then reconstitute it using client.job_from_resource.

which works. This could also be done by __getstate__ itself.
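
The general shape of that workaround, sketched with a hypothetical SketchJob class instead of the real client so that it runs without credentials. The class, its attributes, and the from_api_repr classmethod are assumptions for illustration; only the idea of pickling a plain-dict representation and rebuilding from it comes from the suggestion above.

```python
import copy
import pickle


class SketchJob:
    """Hypothetical job wrapper; names are illustrative, not the library's API."""

    def __init__(self, job_id, configuration):
        self._properties = {
            "jobReference": {"jobId": job_id},
            "configuration": configuration,
        }
        # Placeholder for live state (client, poller) that should not be pickled.
        self._poller = lambda: None

    def to_api_repr(self):
        # Return a deep copy so callers cannot mutate the internal state.
        return copy.deepcopy(self._properties)

    @classmethod
    def from_api_repr(cls, resource):
        return cls(
            resource["jobReference"]["jobId"],
            copy.deepcopy(resource["configuration"]),
        )


job = SketchJob("job-123", {"copy": {"writeDisposition": "WRITE_EMPTY"}})

# Only the plain-dict representation is pickled, never the job object itself.
payload = pickle.dumps(job.to_api_repr())
rebuilt = SketchJob.from_api_repr(pickle.loads(payload))
print(rebuilt.to_api_repr() == job.to_api_repr())
```
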

Thanks

@tseaver added the labels type: feature request and api: bigquery on Aug 29, 2018
@c1dc-candidate-22

@tseaver CopyJob doesn't seem to have to_api_repr. Can you tell me which class the job you are referring to belongs to?

@tseaver
Contributor

tseaver commented Oct 3, 2018

@c1dc-candidate-22 Hmm, looking at the code, the method I had in mind isn't called to_api_repr: it is CopyJob._build_resource. ISTM that renaming it to_api_repr would be reasonable (of course leaving behind an alias for backward compatibility). @shollyman, @tswast WDYT?

@shollyman
Contributor

This is where I fully admit I have not fully internalized this veneer yet, so I'm happy to defer to others. _build_resource seems to revolve around ensuring a viable resource suitable for calling the backend for insertion (e.g. solely the configuration part of a job representation), whereas the goal here is the ability to serialize the whole representation of job: configuration, jobreference, statistics etc.

Exposing the full representation seems fine, but it's not clear that aliasing is the right thing to do here without further reading on my part. Perhaps @tswast can offer more insight?

@tswast
Contributor

tswast commented Oct 4, 2018

Job is the one set of resources that we didn't fully refactor during the 1.0 rewrite to follow the to_api_repr / from_api_repr pattern. I agree that having these methods is desirable.

It'll be a little bit tricky since Job mutates itself with the private reload() function. It'd be good to remove some of the fancy logic that Job has for building its internal state and use the _properties dictionary directly, as we do for other resources.

@wilberh

wilberh commented Oct 5, 2018

Maybe the reason for the error is the data format used is not Python-specific?
Also, don't forget to "import pickle" and "import client".

@tseaver
Contributor

tseaver commented Oct 5, 2018

@tswast If _build_resource as it stands today were renamed to_api_repr, that would satisfy the OP's use case.

@tswast
Contributor

tswast commented Oct 8, 2018

Yes, it looks like _build_resource has no side effects. We can rename it to to_api_repr.

tseaver added a commit that referenced this issue Oct 9, 2018
Leave '_build_resource' behind as a backward-compatibility alias.

Closes #5866.
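
For readers unfamiliar with the pattern, a rename that leaves a backward-compatibility alias can be as small as the sketch below. SketchCopyJob and its resource layout are hypothetical and do not reproduce the referenced commit; only the two method names come from the thread.

```python
class SketchCopyJob:
    """Hypothetical class showing a rename with a backward-compatible alias."""

    def __init__(self, destination_table):
        self._destination_table = destination_table

    def to_api_repr(self):
        """Build the resource dict under the new public name."""
        return {
            "configuration": {
                "copy": {"destinationTable": self._destination_table},
            }
        }

    # Old private name kept as an alias so existing callers keep working.
    _build_resource = to_api_repr


job = SketchCopyJob("my_dataset.my_table")
print(job._build_resource() == job.to_api_repr())
```

One caveat of a plain class-level alias: a subclass that overrides to_api_repr does not change what _build_resource calls, so the old name would need its own override or a delegating wrapper in that case.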