BigQuery: 'test_dbapi_w_standard_sql_types' systest flakes w/ 500 #6136

@tseaver

Description
From: https://circleci.com/gh/GoogleCloudPlatform/google-cloud-python/8467 (3.6 system tests), started at 2018-09-27T18:16:14Z:

_________________ TestBigQuery.test_dbapi_w_standard_sql_types _________________

self = <google.cloud.bigquery.dbapi.cursor.Cursor object at 0x7fe117016748>
operation = 'SELECT 1', parameters = None, job_id = None

    def execute(self, operation, parameters=None, job_id=None):
        """Prepare and execute a database operation.
    
            .. note::
                When setting query parameters, values which are "text"
                (``unicode`` in Python2, ``str`` in Python3) will use
                the 'STRING' BigQuery type. Values which are "bytes" (``str`` in
                Python2, ``bytes`` in Python3), will use using the 'BYTES' type.
    
                A `~datetime.datetime` parameter without timezone information uses
                the 'DATETIME' BigQuery type (example: Global Pi Day Celebration
                March 14, 2017 at 1:59pm). A `~datetime.datetime` parameter with
                timezone information uses the 'TIMESTAMP' BigQuery type (example:
                a wedding on April 29, 2011 at 11am, British Summer Time).
    
                For more information about BigQuery data types, see:
                https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types
    
                ``STRUCT``/``RECORD`` and ``REPEATED`` query parameters are not
                yet supported. See:
                https://github.com/GoogleCloudPlatform/google-cloud-python/issues/3524
    
            :type operation: str
            :param operation: A Google BigQuery query string.
    
            :type parameters: Mapping[str, Any] or Sequence[Any]
            :param parameters:
                (Optional) dictionary or sequence of parameter values.
    
            :type job_id: str
            :param job_id: (Optional) The job_id to use. If not set, a job ID
                is generated at random.
            """
        self._query_data = None
        self._query_job = None
        client = self.connection._client
    
        # The DB-API uses the pyformat formatting, since the way BigQuery does
        # query parameters was not one of the standard options. Convert both
        # the query and the parameters to the format expected by the client
        # libraries.
        formatted_operation = _format_operation(
            operation, parameters=parameters)
        query_parameters = _helpers.to_query_parameters(parameters)
    
        config = job.QueryJobConfig()
        config.query_parameters = query_parameters
        config.use_legacy_sql = False
        self._query_job = client.query(
            formatted_operation, job_config=config, job_id=job_id)
    
        # Wait for the query to finish.
        try:
>           self._query_job.result()

google/cloud/bigquery/dbapi/cursor.py:155: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <google.cloud.bigquery.job.QueryJob object at 0x7fe0c61eefd0>
timeout = None, retry = <google.api_core.retry.Retry object at 0x7fe11720cb38>

    def result(self, timeout=None, retry=DEFAULT_RETRY):
        """Start the job and wait for it to complete and get the result.
    
            :type timeout: float
            :param timeout:
                How long (in seconds) to wait for job to complete before raising
                a :class:`concurrent.futures.TimeoutError`.
    
            :type retry: :class:`google.api_core.retry.Retry`
            :param retry: (Optional) How to retry the call that retrieves rows.
    
            :rtype: :class:`~google.cloud.bigquery.table.RowIterator`
            :returns:
                Iterator of row data :class:`~google.cloud.bigquery.table.Row`-s.
                During each page, the iterator will have the ``total_rows``
                attribute set, which counts the total number of rows **in the
                result set** (this is distinct from the total number of rows in
                the current page: ``iterator.page.num_items``).
    
            :raises:
                :class:`~google.cloud.exceptions.GoogleCloudError` if the job
                failed or :class:`concurrent.futures.TimeoutError` if the job did
                not complete in the given timeout.
            """
>       super(QueryJob, self).result(timeout=timeout)

google/cloud/bigquery/job.py:2685: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <google.cloud.bigquery.job.QueryJob object at 0x7fe0c61eefd0>
timeout = None

    def result(self, timeout=None):
        """Start the job and wait for it to complete and get the result.
    
            :type timeout: float
            :param timeout:
                How long (in seconds) to wait for job to complete before raising
                a :class:`concurrent.futures.TimeoutError`.
    
            :rtype: _AsyncJob
            :returns: This instance.
    
            :raises:
                :class:`~google.cloud.exceptions.GoogleCloudError` if the job
                failed or :class:`concurrent.futures.TimeoutError` if the job did
                not complete in the given timeout.
            """
        if self.state is None:
            self._begin()
        # TODO: modify PollingFuture so it can pass a retry argument to done().
>       return super(_AsyncJob, self).result(timeout=timeout)

google/cloud/bigquery/job.py:697: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <google.cloud.bigquery.job.QueryJob object at 0x7fe0c61eefd0>
timeout = None

    def result(self, timeout=None):
        """Get the result of the operation, blocking if necessary.
    
            Args:
                timeout (int):
                    How long (in seconds) to wait for the operation to complete.
                    If None, wait indefinitely.
    
            Returns:
                google.protobuf.Message: The Operation's result.
    
            Raises:
                google.api_core.GoogleAPICallError: If the operation errors or if
                    the timeout is reached before the operation completes.
            """
        self._blocking_poll(timeout=timeout)
    
        if self._exception is not None:
            # pylint: disable=raising-bad-type
            # Pylint doesn't recognize that this is valid in this case.
>           raise self._exception
E           google.api_core.exceptions.InternalServerError: 500 Error encountered during execution. Retrying may solve the problem.

../api_core/google/api_core/future/polling.py:120: InternalServerError

During handling of the above exception, another exception occurred:

self = <tests.system.TestBigQuery testMethod=test_dbapi_w_standard_sql_types>

    def test_dbapi_w_standard_sql_types(self):
        examples = self._generate_standard_sql_types_examples()
        for example in examples:
>           Config.CURSOR.execute(example['sql'])

tests/system.py:1138: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <google.cloud.bigquery.dbapi.cursor.Cursor object at 0x7fe117016748>
operation = 'SELECT 1', parameters = None, job_id = None

    def execute(self, operation, parameters=None, job_id=None):
        """Prepare and execute a database operation.
    
            .. note::
                When setting query parameters, values which are "text"
                (``unicode`` in Python2, ``str`` in Python3) will use
                the 'STRING' BigQuery type. Values which are "bytes" (``str`` in
                Python2, ``bytes`` in Python3), will use using the 'BYTES' type.
    
                A `~datetime.datetime` parameter without timezone information uses
                the 'DATETIME' BigQuery type (example: Global Pi Day Celebration
                March 14, 2017 at 1:59pm). A `~datetime.datetime` parameter with
                timezone information uses the 'TIMESTAMP' BigQuery type (example:
                a wedding on April 29, 2011 at 11am, British Summer Time).
    
                For more information about BigQuery data types, see:
                https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types
    
                ``STRUCT``/``RECORD`` and ``REPEATED`` query parameters are not
                yet supported. See:
                https://github.com/GoogleCloudPlatform/google-cloud-python/issues/3524
    
            :type operation: str
            :param operation: A Google BigQuery query string.
    
            :type parameters: Mapping[str, Any] or Sequence[Any]
            :param parameters:
                (Optional) dictionary or sequence of parameter values.
    
            :type job_id: str
            :param job_id: (Optional) The job_id to use. If not set, a job ID
                is generated at random.
            """
        self._query_data = None
        self._query_job = None
        client = self.connection._client
    
        # The DB-API uses the pyformat formatting, since the way BigQuery does
        # query parameters was not one of the standard options. Convert both
        # the query and the parameters to the format expected by the client
        # libraries.
        formatted_operation = _format_operation(
            operation, parameters=parameters)
        query_parameters = _helpers.to_query_parameters(parameters)
    
        config = job.QueryJobConfig()
        config.query_parameters = query_parameters
        config.use_legacy_sql = False
        self._query_job = client.query(
            formatted_operation, job_config=config, job_id=job_id)
    
        # Wait for the query to finish.
        try:
            self._query_job.result()
        except google.cloud.exceptions.GoogleCloudError:
>           raise exceptions.DatabaseError(self._query_job.errors)
E           google.cloud.bigquery.dbapi.exceptions.DatabaseError: [{'reason': 'backendError', 'message': 'Error encountered during execution. Retrying may solve the problem.'}]

google/cloud/bigquery/dbapi/cursor.py:157: DatabaseError
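For context on the code path in the traceback: the comment in `execute` notes that the DB-API layer accepts pyformat placeholders and converts them to BigQuery's native named-parameter syntax before submitting the job. A minimal sketch of that conversion, assuming only named `%(name)s` placeholders (the real `_format_operation` also handles positional `%s` and escaping):

```python
import re

def format_operation(operation, parameters=None):
    """Simplified sketch: rewrite pyformat named placeholders like
    %(name)s into BigQuery's @name named-parameter syntax."""
    if not parameters:
        return operation
    return re.sub(r"%\((\w+)\)s", r"@\1", operation)

print(format_operation("SELECT %(num)s + 1", {"num": 4}))
# -> SELECT @num + 1
```

The parameter *values* are converted separately (by `_helpers.to_query_parameters` in the traceback) into typed `QueryParameter` objects attached to the `QueryJobConfig`.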

Similar failures for other 3.6 systests:

  • test_query_iter
  • test_query_many_columns
  • test_query_w_query_params
  • test_query_w_standard_sql_types

And for 2.7 snippets:

  • test_extract_table_json
  • test_client_query
  • test_client_query_destination_table_cmek
  • test_client_query_relax_column
  • test_client_query_w_named_params
  • test_client_query_w_positional_params
  • test_client_query_w_array_params
  • test_client_query_w_struct_params
  • test_query_external_gcs_permanent_table
  • test_query_external_sheets_temporary_table

Interestingly, the 2.7 systest and 3.6 snippet runs completed without errors.
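Since the `backendError` payload itself says "Retrying may solve the problem", the system tests could wrap `Cursor.execute` in a bounded retry on transient failures. A hedged sketch of that idea; the `DatabaseError` class here is a local stand-in for `google.cloud.bigquery.dbapi.exceptions.DatabaseError`, and the flaky backend is simulated:

```python
import time

class DatabaseError(Exception):
    """Stand-in for google.cloud.bigquery.dbapi.exceptions.DatabaseError."""

def execute_with_retry(execute, sql, attempts=3, delay=0.0):
    """Retry a DB-API execute call on transient DatabaseError, giving up
    after `attempts` tries and re-raising the last error."""
    for attempt in range(attempts):
        try:
            return execute(sql)
        except DatabaseError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

# Simulated flaky backend: fails twice with a backendError, then succeeds.
calls = {"n": 0}
def flaky_execute(sql):
    calls["n"] += 1
    if calls["n"] < 3:
        raise DatabaseError([{"reason": "backendError"}])
    return "ok"

print(execute_with_retry(flaky_execute, "SELECT 1"))  # succeeds on the third try
```

An alternative would be to teach the library's own retry path about this case, e.g. a `google.api_core.retry.Retry` predicate that treats 500 `backendError` as retryable, rather than retrying at the test level.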

Metadata

Labels

  • api: bigquery (Issues related to the BigQuery API)
  • flaky
  • testing
  • type: process (A process-related concern; may include testing, release, or the like)
