This repository was archived by the owner on Sep 3, 2022. It is now read-only.

Diagnose, benchmark and provide guidance for loading large dataframe from BigQuery #329

Open
@Di-Ku

Description

Customer query (via Tahir F.)

Do we have any kind of benchmarks / recommendations for the GCE set-up for the amount of data that would be brought into a pandas dataframe?
His question is as follows:
From my perspective, could you advise me on the appropriate GCE spec?
We upgraded the GCE spec, and it seems to use only 2% of the CPU, but it still takes 5 minutes to handle 500,000 rows of data in pandas.

Do you have any ideas for improving the performance of Datalab?
Is this related to a network or disk issue?
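
As a possible first diagnostic step, the wall-clock time of the load can be compared with the CPU time it consumes. If CPU time is a small fraction of wall time (consistent with the 2% CPU figure reported above), the load is probably waiting on something other than computation, such as network or disk. The following is only a minimal sketch using the Python standard library. `profile_load` and `fake_load` are hypothetical names introduced for illustration and are not part of Datalab or any BigQuery client; `fake_load` simulates an I/O-bound load with a sleep and would be replaced by whatever call actually populates the dataframe.

```python
import time


def profile_load(load_fn):
    """Run load_fn once and report row count, wall-clock time and CPU time.

    load_fn is any zero-argument callable returning an object that supports
    len() (for example a pandas DataFrame or a list of rows).
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of this process; excludes sleep and I/O waits
    result = load_fn()
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    rows = len(result)
    return {
        "rows": rows,
        "wall_s": wall_s,
        "cpu_s": cpu_s,
        # Fraction of elapsed time spent on the CPU; a low value suggests
        # the load is waiting on network or disk rather than computing.
        "cpu_fraction": cpu_s / wall_s if wall_s > 0 else 0.0,
        "rows_per_s": rows / wall_s if wall_s > 0 else 0.0,
    }


def fake_load():
    # Hypothetical stand-in for the real load: the sleep imitates time
    # spent waiting on a remote service, then a small result is built.
    time.sleep(0.2)
    return [(i, i * 2) for i in range(1000)]


stats = profile_load(fake_load)
print(stats)
```

Running the same measurement on different machine types, and with different row counts, would show whether a larger GCE spec changes `rows_per_s` at all. If it does not, and `cpu_fraction` stays low, the bottleneck is unlikely to be CPU.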
