BigQuery IO Source is not Exporting to GCS as written in documentation #19174

Open
kennknowles opened this issue Jun 3, 2022 · 1 comment

Comments

@kennknowles
Member

I checked the Beam code and found that Dataflow queries BigQuery and retrieves the result using pagination [1]. As we understand it, this means there is no parallelism when reading a BigQuery table, which contradicts what the documentation tells us [2].
 
Is this some kind of work in progress? I'm filing this as a bug because the documentation says the read goes through GCS, whereas the code uses NativeSourceReader, which yields data row by row as an iterator.
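
To illustrate the concern, here is a minimal sketch of why a paginated read is sequential. It is not Beam's or Dataflow's actual code: `fetch_page`, `read_rows`, `ROWS`, and `PAGE_SIZE` are hypothetical names, and the in-memory list only stands in for a query result. Each request needs the token returned with the previous page, so a single reader has to walk the pages in order.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical in-memory "table" standing in for a query result.
ROWS: List[Dict[str, int]] = [{"id": i} for i in range(7)]
PAGE_SIZE = 3


def fetch_page(page_token: Optional[int]) -> Tuple[List[Dict[str, int]], Optional[int]]:
    """Hypothetical paginated API: return one page plus the token for the next page."""
    start = page_token or 0
    end = start + PAGE_SIZE
    # No token is returned once the last page has been served.
    next_token = end if end < len(ROWS) else None
    return ROWS[start:end], next_token


def read_rows() -> Iterator[Dict[str, int]]:
    """Yield rows one at a time; every request depends on the previous page's token."""
    token: Optional[int] = None
    while True:
        page, token = fetch_page(token)
        yield from page
        if token is None:
            break


if __name__ == "__main__":
    for row in read_rows():
        print(row)
```

An export-based read, by contrast, would first write the table out as files that several workers could then read independently, which is the behaviour the documentation appears to describe.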
 
[1] 


[2] 
The main and side inputs are implemented differently. Reading a BigQuery table

Imported from Jira BEAM-5352. Original Jira may contain additional context.
Reported by: rendybjunior.

@rendybjunior

Thanks for migrating the issue.
