This repository was archived by the owner on Sep 3, 2022. It is now read-only.

Commit 880c3c7

Use http Keep-Alive, else BigQuery queries are ~seconds slower than necessary
- Before (without Keep-Alive): ~3-7s for BigQuery `select 3` with an already cached result
- After (with Keep-Alive): ~1.5-3s
- Query sends these 6 http requests and runtime appears to be dominated by network RTT:

```
http: method[POST], url[https://www.googleapis.com/bigquery/v2/projects/foo-bar/jobs/], body[{"kind": "bigquery#job", "configuration": {"priority": "INTERACTIVE", "query": {"allowLargeResults": false, "useLegacySql": false, "useQueryCache": true, "query": "select 3", "userDefinedFunctionResources": []}, "dryRun": false}}]
http: method[GET], url[https://www.googleapis.com/bigquery/v2/projects/foo-bar/queries/job_sIi1HRqfyHkRB5_MkfWFjtZ60XM?startIndex=0&timeoutMs=30000&maxResults=0], body[None]
http: method[GET], url[https://www.googleapis.com/bigquery/v2/projects/foo-bar/jobs/job_sIi1HRqfyHkRB5_MkfWFjtZ60XM], body[None]
http: method[GET], url[https://www.googleapis.com/bigquery/v2/projects/foo-bar/datasets/_2f96775300d8858559d2bd23c05bad0392345e30/tables/anon921947a4e6645dc2b34411c365f9a45e0895d5a4], body[None]
http: method[GET], url[https://www.googleapis.com/bigquery/v2/projects/foo-bar/datasets/_2f96775300d8858559d2bd23c05bad0392345e30/tables/anon921947a4e6645dc2b34411c365f9a45e0895d5a4], body[None]
http: method[GET], url[https://www.googleapis.com/bigquery/v2/projects/foo-bar/datasets/_2f96775300d8858559d2bd23c05bad0392345e30/tables/anon921947a4e6645dc2b34411c365f9a45e0895d5a4/data?maxResults=25], body[None]
```
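To illustrate why reusing one connection matters for a run of sequential requests like the six above, here is a minimal sketch. It is not the code from this commit: it assumes Python 3 and uses only the standard library (`http.client` and `http.server`) rather than `httplib2`, and it talks to a throwaway local server rather than BigQuery. The names `CountingHandler`, `fetch_with_new_connections`, `fetch_with_shared_connection`, and `count_connections` are hypothetical helpers introduced for the example. It counts how many TCP connections the server accepts in each mode; every extra connection costs at least one additional round trip (more with TLS), which is the overhead Keep-Alive avoids.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical local stand-in for a remote API; counts TCP connections."""

    # Speaking HTTP/1.1 lets the server keep a connection open between requests.
    protocol_version = "HTTP/1.1"
    connections = 0
    _lock = threading.Lock()

    def setup(self):
        # setup() runs once per accepted connection, not once per request.
        super().setup()
        with CountingHandler._lock:
            CountingHandler.connections += 1

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        # An explicit Content-Length is needed for the connection to be reused.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def fetch_with_new_connections(host, port, n):
    """Open a fresh connection per request (the pattern before this commit)."""
    for _ in range(n):
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", "/")
        conn.getresponse().read()
        conn.close()


def fetch_with_shared_connection(host, port, n):
    """Send every request over one connection (Keep-Alive)."""
    conn = http.client.HTTPConnection(host, port)
    for _ in range(n):
        conn.request("GET", "/")
        conn.getresponse().read()  # drain the body before the next request
    conn.close()


def count_connections(fetch, n=6):
    """Run `fetch` against a local server and return the connections it accepted."""
    CountingHandler.connections = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        fetch("127.0.0.1", server.server_address[1], n)
    finally:
        server.shutdown()
        server.server_close()
    return CountingHandler.connections


if __name__ == "__main__":
    print("new connection per request:", count_connections(fetch_with_new_connections))
    print("one shared connection:     ", count_connections(fetch_with_shared_connection))
```

With six requests, the first mode should report six accepted connections and the second only one. Against a local server the time saved is negligible; the measured gain in this commit comes from skipping repeated connection setup over a real network.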
1 parent 15abddf commit 880c3c7

1 file changed (+12, -9 lines changed)

datalab/utils/_http.py

Lines changed: 12 additions & 9 deletions
@@ -53,6 +53,16 @@ class Http(object):
   """A helper class for making HTTP requests.
   """
 
+  # Reuse one Http object across requests to take advantage of Keep-Alive, e.g.
+  # for BigQuery queries that requires at least ~5 sequential http requests.
+  #
+  # TODO(nikhilko):
+  # SSL cert validation seemingly fails, and workarounds are not amenable
+  # to implementing in library code. So configure the Http object to skip
+  # doing so, in the interim.
+  http = httplib2.Http()
+  http.disable_ssl_certificate_validation = True
+
   def __init__(self):
     pass
 
@@ -109,15 +119,8 @@ def request(url, args=None, data=None, headers=None, method=None,
     if method is None:
       method = 'GET'
 
-    # Create an Http object to issue requests. Associate the credentials
-    # with it if specified to perform authorization.
-    #
-    # TODO(nikhilko):
-    # SSL cert validation seemingly fails, and workarounds are not amenable
-    # to implementing in library code. So configure the Http object to skip
-    # doing so, in the interim.
-    http = httplib2.Http()
-    http.disable_ssl_certificate_validation = True
+    # Authorize with credentials if given.
+    http = Http.http
     if credentials is not None:
       http = credentials.authorize(http)
     if stats is not None:
