Skip to content

Let's consider improvements to how internal errors are handled in the Pbench Server #3124

Open
@portante

Description

@portante

Summary suggestions

TBD

Initial Discussion

This is a very long description due to the nature of the stack traces discussed here. The referenced code can be found in branch https://github.com/portante/pbench/tree/PBENCH-944, the basis for PR #3081.

There are two "Tracebacks" being discussed, 1 and 2, which are in separate sections below. Both were taking from log messages emitted by the Pbench Server code

For both stack traces the core problem was a database schema that did not match the code. An attempt to delete a Dataset object failed because the DELETING state was not in the stored database schema.

This issue is NOT about the particular base exception, but IS about the Pbench Server code and its error handling / logging code paths.

Both stack traces are logged via the log_exception() method of the Flask() "app" as called by the handle_exception() method of the Flask object when an unhandled exception occurs.

The comments that follow in this issue make reference to this description.

Some Details

  1. The Flask().wsgi_app() calls Flask().full_dispatch_request() at line 2525 of flask/app.py
  2. The Flask().full_dispatch_request() calls Flask().dispatch_request() at line 1820 of flask/app.py
    • Line 1820 is the outer-most line of the stack trace because the try/except block of Flask().wsgi_app() caught the exception, handled it by calling Flask().handle_exception(), which in turn logged it by calling Flask().log_exception() which finally logged the error at line 1741 (see log_exception 1741 in the first line of each log message preceding the tracebacks below).

Traceback 1:

2022-12-20T16:45:23.046216 ERROR 118 140434560441216 pbench.server.api app log_exception 1741 -- Exception on /api/v1/datasets/delete/22a4bc5748b920c6ce271eb68f08d91c [POST]
Traceback (most recent call last):
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.InvalidTextRepresentation: invalid input value for enum states: "DELETING"
LINE 1: UPDATE datasets SET state='DELETING', transition='2022-12-20...
                                  ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/pbench-server/lib/pbench/server/database/models/datasets.py", line 665, in update
    Database.db_session.commit()
  File "<string>", line 2, in commit
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 1451, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 3383, in flush
    self._flush(objects)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 3523, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 3483, in _flush
    flush_context.execute()
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 630, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 237, in save_obj
    _emit_update_statements(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 1001, in _emit_update_statements
    c = connection._execute_20(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1631, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1498, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1862, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 2043, in _handle_dbapi_exception
    util.raise_(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input value for enum states: "DELETING"
LINE 1: UPDATE datasets SET state='DELETING', transition='2022-12-20...
                                  ^

[SQL: UPDATE datasets SET state=%(state)s, transition=%(transition)s WHERE datasets.id = %(datasets_id)s]
[parameters: {'state': 'DELETING', 'transition': datetime.datetime(2022, 12, 20, 16, 45, 23, 37775), 'datasets_id': 4}]
(Background on this error at: https://sqlalche.me/e/14/9h9h)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/pbench-server/lib/pbench/server/api/resources/query_apis/__init__.py", line 756, in _post
    report = self._analyze_bulk(results)
  File "/opt/pbench-server/lib/pbench/server/api/resources/query_apis/__init__.py", line 670, in _analyze_bulk
    for ok, response in results:
  File "/home/pbench/.local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 319, in streaming_bulk
    for bulk_data, bulk_actions in _chunk_actions(
  File "/home/pbench/.local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 155, in _chunk_actions
    for action, data in actions:
  File "/opt/pbench-server/lib/pbench/server/api/resources/query_apis/datasets_delete.py", line 71, in generate_actions
    dataset.advance(States.DELETING)
  File "/opt/pbench-server/lib/pbench/server/database/models/datasets.py", line 641, in advance
    self.update()
  File "/opt/pbench-server/lib/pbench/server/database/models/datasets.py", line 667, in update
    self.logger.error("Can't update {} in DB", str(self))
  File "/opt/pbench-server/lib/pbench/server/database/models/datasets.py", line 614, in __str__
    return f"({self.owner_id})|{self.name}"
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 481, in __get__
    return self.impl.get(state, dict_)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 941, in get
    value = self._fire_loader_callables(state, key, passive)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/attributes.py", line 972, in _fire_loader_callables
    return state._load_expired(state, passive)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/state.py", line 710, in _load_expired
    self.manager.expired_attribute_loader(self, toload, passive)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/loading.py", line 1451, in load_scalar_attributes
    result = load_on_ident(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/loading.py", line 407, in load_on_ident
    return load_on_pk_identity(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/loading.py", line 530, in load_on_pk_identity
    session.execute(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 1711, in execute
    conn = self._connection_for_bind(bind)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 1552, in _connection_for_bind
    return self._transaction._connection_for_bind(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 721, in _connection_for_bind
    self._assert_active()
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 601, in _assert_active
    raise sa_exc.PendingRollbackError(
sqlalchemy.exc.PendingRollbackError: This Session's transaction has been rolled back due to a previous exception during flush. To begin a new transaction with this Session, first issue Session.rollback(). Original exception was: (psycopg2.errors.InvalidTextRepresentation) invalid input value for enum states: "DELETING"
LINE 1: UPDATE datasets SET state='DELETING', transition='2022-12-20...
                                  ^

[SQL: UPDATE datasets SET state=%(state)s, transition=%(transition)s WHERE datasets.id = %(datasets_id)s]
[parameters: {'state': 'DELETING', 'transition': datetime.datetime(2022, 12, 20, 16, 45, 23, 37775), 'datasets_id': 4}]
(Background on this error at: https://sqlalche.me/e/14/9h9h) (Background on this error at: https://sqlalche.me/e/14/7s2a)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/pbench-server/lib/pbench/server/api/resources/__init__.py", line 1637, in _dispatch
    response = execute(params, request, {"auditing": auditing})
  File "/opt/pbench-server/lib/pbench/server/api/resources/query_apis/__init__.py", line 758, in _post
    raise APIInternalError("Unexpected backend error") from e
pbench.server.api.resources.APIInternalError: Internal Pbench Server Error: log reference d839a90f-5137-4cd4-bf23-7744f13f489b

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pbench/.local/lib/python3.9/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/pbench/.local/lib/python3.9/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask_restful/__init__.py", line 467, in wrapper
    resp = resource(*args, **kwargs)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask/views.py", line 107, in view
    return current_app.ensure_sync(self.dispatch_request)(**kwargs)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask_restful/__init__.py", line 582, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask_httpauth.py", line 172, in decorated
    return self.ensure_sync(f)(*args, **kwargs)
  File "/opt/pbench-server/lib/pbench/server/api/resources/__init__.py", line 1791, in post
    return self._dispatch(ApiMethod.POST, kwargs)
  File "/opt/pbench-server/lib/pbench/server/api/resources/__init__.py", line 1648, in _dispatch
    abort(e.http_status, message=str(e))
  File "/home/pbench/.local/lib/python3.9/site-packages/flask_restful/__init__.py", line 31, in abort
    original_flask_abort(http_status_code)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask/helpers.py", line 310, in abort
    current_app.aborter(code, *args, **kwargs)
  File "/home/pbench/.local/lib/python3.9/site-packages/werkzeug/exceptions.py", line 864, in __call__
    raise self.mapping[code](*args, **kwargs)
werkzeug.exceptions.InternalServerError: 500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

Traceback 2:

2022-12-20T17:54:09.977014 ERROR 1052 140645495241600 pbench.server.api app log_exception 1741 -- Exception on /api/v1/datasets/delete/22a4bc5748b920c6ce271eb68f08d91c [POST]
Traceback (most recent call last):
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.InvalidTextRepresentation: invalid input value for enum states: "DELETING"
LINE 1: UPDATE datasets SET state='DELETING', transition='2022-12-20...
                                  ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/pbench-server/lib/pbench/server/database/models/datasets.py", line 665, in update
    Database.db_session.commit()
  File "<string>", line 2, in commit
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 1451, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 3383, in flush
    self._flush(objects)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 3523, in _flush
    transaction.rollback(_capture_exception=True)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/session.py", line 3483, in _flush
    flush_context.execute()
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 630, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 237, in save_obj
    _emit_update_statements(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 1001, in _emit_update_statements
    c = connection._execute_20(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1631, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/sql/elements.py", line 325, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1498, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1862, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 2043, in _handle_dbapi_exception
    util.raise_(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib64/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (psycopg2.errors.InvalidTextRepresentation) invalid input value for enum states: "DELETING"
LINE 1: UPDATE datasets SET state='DELETING', transition='2022-12-20...
                                  ^

[SQL: UPDATE datasets SET state=%(state)s, transition=%(transition)s WHERE datasets.id = %(datasets_id)s]
[parameters: {'state': 'DELETING', 'transition': datetime.datetime(2022, 12, 20, 17, 54, 9, 969114), 'datasets_id': 4}]
(Background on this error at: https://sqlalche.me/e/14/9h9h)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/pbench-server/lib/pbench/server/api/resources/query_apis/__init__.py", line 756, in _post
    report = self._analyze_bulk(results)
  File "/opt/pbench-server/lib/pbench/server/api/resources/query_apis/__init__.py", line 670, in _analyze_bulk
    for ok, response in results:
  File "/home/pbench/.local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 319, in streaming_bulk
    for bulk_data, bulk_actions in _chunk_actions(
  File "/home/pbench/.local/lib/python3.9/site-packages/elasticsearch/helpers/actions.py", line 155, in _chunk_actions
    for action, data in actions:
  File "/opt/pbench-server/lib/pbench/server/api/resources/query_apis/datasets_delete.py", line 71, in generate_actions
    dataset.advance(States.DELETING)
  File "/opt/pbench-server/lib/pbench/server/database/models/datasets.py", line 641, in advance
    self.update()
  File "/opt/pbench-server/lib/pbench/server/database/models/datasets.py", line 669, in update
    raise DatasetSqlError("updating", name=self.name)
pbench.server.database.models.datasets.DatasetSqlError: Error updating dataset name=fio_rw_2018.02.01T22.40.57

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/pbench-server/lib/pbench/server/api/resources/__init__.py", line 1637, in _dispatch
    response = execute(params, request, {"auditing": auditing})
  File "/opt/pbench-server/lib/pbench/server/api/resources/query_apis/__init__.py", line 758, in _post
    raise APIInternalError("Unexpected backend error") from e
pbench.server.api.resources.APIInternalError: Internal Pbench Server Error: log reference 69ee6b38-cdbd-4dd4-b1c5-499f46aa95cc

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/pbench/.local/lib/python3.9/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/pbench/.local/lib/python3.9/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask_restful/__init__.py", line 467, in wrapper
    resp = resource(*args, **kwargs)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask/views.py", line 107, in view
    return current_app.ensure_sync(self.dispatch_request)(**kwargs)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask_restful/__init__.py", line 582, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask_httpauth.py", line 172, in decorated
    return self.ensure_sync(f)(*args, **kwargs)
  File "/opt/pbench-server/lib/pbench/server/api/resources/__init__.py", line 1791, in post
    return self._dispatch(ApiMethod.POST, kwargs)
  File "/opt/pbench-server/lib/pbench/server/api/resources/__init__.py", line 1648, in _dispatch
    abort(e.http_status, message=str(e))
  File "/home/pbench/.local/lib/python3.9/site-packages/flask_restful/__init__.py", line 31, in abort
    original_flask_abort(http_status_code)
  File "/home/pbench/.local/lib/python3.9/site-packages/flask/helpers.py", line 310, in abort
    current_app.aborter(code, *args, **kwargs)
  File "/home/pbench/.local/lib/python3.9/site-packages/werkzeug/exceptions.py", line 864, in __call__
    raise self.mapping[code](*args, **kwargs)
werkzeug.exceptions.InternalServerError: 500 Internal Server Error: The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

Status

To Do

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions