How to follow progress of query download when stdout is not available #1654

Open
@JakeSummers

Issue Summary

I would like to follow progress of query downloads.

Currently I am doing:

    query_result: QueryJob = client.query(query)

    df = query_result.result().to_dataframe(
        progress_bar_type="tqdm"
    )

But this only supports stdout.

I would like to have some kind of mechanism to follow progress when stdout is not available.

Possible Solution 1 - Logs

A minimal solution could be to add a log statement into the code.

Maybe this would work:

Line 1819 of google/cloud/bigquery/table.py

        try:
            progress_bar = get_progress_bar(
                progress_bar_type, "Downloading", self.total_rows, "rows"
            )

            record_batches = []
            for record_batch in self.to_arrow_iterable(
                bqstorage_client=bqstorage_client
            ):
                record_batches.append(record_batch)
                
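                # NEW LINE: assumes a module-level stdlib logger named `logger`,
                # which takes %-style arguments rather than arbitrary keywords.
                # progress_bar may be None, so the total comes from self.total_rows.
                logger.debug(
                    "Downloaded %d rows (total: %s)",
                    record_batch.num_rows,
                    self.total_rows,
                )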
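
To make the logging idea concrete without patching the library, here is a minimal sketch of the same loop as a standalone generator. The helper name `log_download_progress` is hypothetical (it is not part of google-cloud-bigquery), and it only assumes each batch exposes `num_rows`, as the record batches in the snippet above do. It logs a cumulative count rather than a per-batch count.

```python
import logging
from typing import Iterable, Iterator, Optional, Protocol, TypeVar


class HasNumRows(Protocol):
    """Anything with a num_rows attribute, e.g. a pyarrow.RecordBatch."""

    num_rows: int


B = TypeVar("B", bound=HasNumRows)

logger = logging.getLogger(__name__)


def log_download_progress(
    batches: Iterable[B], total_rows: Optional[int] = None
) -> Iterator[B]:
    """Yield each batch unchanged, logging the cumulative row count."""
    completed = 0
    for batch in batches:
        completed += batch.num_rows
        logger.debug(
            "Downloaded %d of %s rows",
            completed,
            total_rows if total_rows is not None else "unknown",
        )
        yield batch
```

As a possible workaround today, the batches from `to_arrow_iterable()` on the query result could be passed through this generator together with `total_rows`, then assembled with pyarrow. That would probably not reproduce everything `to_dataframe()` does (for example dtype handling), so treat it as a starting point only.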

Possible Solution 2 - Callback

A better solution would be to add a callback parameter to to_dataframe, like this:

    def log_progress(completed_items: int, total_items: int) -> None:
        # This lets me do whatever I want here :)
        # Assumes a stdlib logger, which takes %-style arguments.
        logger.debug("Downloaded %d of %d items", completed_items, total_items)

    query_result: QueryJob = client.query(query)

    df = query_result.result().to_dataframe(
        progress_callback=log_progress
    )
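
Until something like `progress_callback` exists (it is the proposed parameter, not a current one), the callback idea can be approximated outside the library. The sketch below wraps any iterable of chunks and calls a user-supplied function with the cumulative count. The name `with_progress_callback` and its `size_of` parameter are hypothetical and chosen for illustration.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")

# Signature mirrors the callback proposed above: (completed, total).
ProgressCallback = Callable[[int, Optional[int]], None]


def with_progress_callback(
    chunks: Iterable[T],
    total_items: Optional[int],
    progress_callback: ProgressCallback,
    size_of: Callable[[T], int] = len,
) -> Iterator[T]:
    """Yield chunks unchanged, reporting cumulative progress after each one."""
    completed = 0
    for chunk in chunks:
        completed += size_of(chunk)
        progress_callback(completed, total_items)
        yield chunk
```

For example, assuming `rows = query_result.result()`, the DataFrame chunks from `rows.to_dataframe_iterable()` could be wrapped with `with_progress_callback(..., rows.total_rows, log_progress)` and combined with `pandas.concat`. `len()` of a DataFrame is its row count, so the default `size_of` fits that case. An empty result would need separate handling, since `pandas.concat` rejects an empty list.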

Labels

api: bigquery (issues related to the googleapis/python-bigquery API)
priority: p3 (desirable enhancement or fix; may not be included in next release)
type: feature request ("nice-to-have" improvement, new feature or different behavior or design)
