Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Intermittent Twitcher/Magpie error when caching on (Magpie version 3.14.0) #466

Closed
tlvu opened this issue Sep 17, 2021 · 3 comments
Closed
Assignees
Labels
acl Features or issues related to access resolution. bug Problem, error, or invalid behaviour cache Features or issues related to caching.

Comments

@tlvu
Copy link
Contributor

tlvu commented Sep 17, 2021

Describe the bug
Many random hard to reproduce errors with several services behind Twitcher/Magpie when caching is turned on for Twitcher.

With caching not working, this issue bird-house/twitcher#97 might have to be re-open?

To Reproduce
Steps to reproduce the behavior:

  1. PAVICS at tag https://github.com/bird-house/birdhouse-deploy/tree/1.14.2 (PR re-enable caching feature of twitcher bird-house/birdhouse-deploy#188)
  2. pavics_thredds.ipynb, stress-test.ipynb in pavics-sdi repo failure
  3. Result

===Problem 1===

11:31:17  _____ pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb::Cell 4 _____
11:31:17  Notebook cell execution failed
11:31:17  Cell 4: Cell execution caused an exception
11:31:17  
11:31:17  Input:
11:31:17  # NBVAL_IGNORE_OUTPUT
11:31:17  
11:31:17  AUTH_USR = os.getenv("TEST_MAGPIE_AUTHTEST_USERNAME", "authtest")
11:31:17  AUTH_PWD = os.getenv("TEST_MAGPIE_AUTHTEST_PASSWORD", "authtest1234")
11:31:17  
11:31:17  # Open session
11:31:17  with requests.Session() as session:
11:31:17      session.auth = MagpieAuth(f"https://{PAVICS_HOST}/magpie", AUTH_USR, AUTH_PWD)
11:31:17      # Open a PyDAP data store and pass it to xarray
11:31:17      store = xr.backends.PydapDataStore.open(SECURED_URL, session=session)
11:31:17      ds = xr.open_dataset(store, decode_cf=False)  # Attributes are problematic with this file.
11:31:17  ds
11:31:17  
11:31:17  Traceback:
11:31:17  
11:31:17  ---------------------------------------------------------------------------
11:31:17  HTTPError                                 Traceback (most recent call last)
11:31:17  /tmp/ipykernel_405/1184836852.py in <module>
11:31:17        9     # Open a PyDAP data store and pass it to xarray
11:31:17       10     store = xr.backends.PydapDataStore.open(SECURED_URL, session=session)
11:31:17  ---> 11     ds = xr.open_dataset(store, decode_cf=False)  # Attributes are problematic with this file.
11:31:17       12 ds
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/backends/api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
11:31:17      499         drop_variables=drop_variables,
11:31:17      500         **decoders,
11:31:17  --> 501         **kwargs,
11:31:17      502     )
11:31:17      503     ds = _dataset_from_backend_dataset(
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/backends/store.py in open_dataset(self, store, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta)
11:31:17       37         )
11:31:17       38 
11:31:17  ---> 39         ds = Dataset(vars, attrs=attrs)
11:31:17       40         ds = ds.set_coords(coord_names.intersection(vars))
11:31:17       41         ds.set_close(store.close)
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/dataset.py in __init__(self, data_vars, coords, attrs)
11:31:17      753 
11:31:17      754         variables, coord_names, dims, indexes, _ = merge_data_and_coords(
11:31:17  --> 755             data_vars, coords, compat="broadcast_equals"
11:31:17      756         )
11:31:17      757 
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/merge.py in merge_data_and_coords(data, coords, compat, join)
11:31:17      483     indexes = dict(_extract_indexes_from_coords(coords))
11:31:17      484     return merge_core(
11:31:17  --> 485         objects, compat, join, explicit_coords=explicit_coords, indexes=indexes
11:31:17      486     )
11:31:17      487 
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/merge.py in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
11:31:17      630         coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
11:31:17      631     )
11:31:17  --> 632     collected = collect_variables_and_indexes(aligned)
11:31:17      633 
11:31:17      634     prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/merge.py in collect_variables_and_indexes(list_of_mappings)
11:31:17      292                 append_all(coords, indexes)
11:31:17      293 
11:31:17  --> 294             variable = as_variable(variable, name=name)
11:31:17      295             if variable.dims == (name,):
11:31:17      296                 variable = variable.to_index_variable()
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/variable.py in as_variable(obj, name)
11:31:17      161                 "conflict with the coordinates used to label dimensions."
11:31:17      162             )
11:31:17  --> 163         obj = obj.to_index_variable()
11:31:17      164 
11:31:17      165     return obj
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/variable.py in to_index_variable(self)
11:31:17      532         """Return this variable as an xarray.IndexVariable"""
11:31:17      533         return IndexVariable(
11:31:17  --> 534             self.dims, self._data, self._attrs, encoding=self._encoding, fastpath=True
11:31:17      535         )
11:31:17      536 
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/variable.py in __init__(self, dims, data, attrs, encoding, fastpath)
11:31:17     2573         # Unlike in Variable, always eagerly load values into memory
11:31:17     2574         if not isinstance(self._data, PandasIndex):
11:31:17  -> 2575             self._data = PandasIndex(self._data)
11:31:17     2576 
11:31:17     2577     def __dask_tokenize__(self):
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/indexes.py in __init__(self, array, dtype, coord_name)
11:31:17      151         super().__init__(coord_name)
11:31:17      152 
11:31:17  --> 153         self.array = utils.safe_cast_to_index(array)
11:31:17      154 
11:31:17      155         if dtype is None:
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/utils.py in safe_cast_to_index(array)
11:31:17      116         if hasattr(array, "dtype") and array.dtype.kind == "O":
11:31:17      117             kwargs["dtype"] = object
11:31:17  --> 118         index = pd.Index(np.asarray(array), **kwargs)
11:31:17      119     return _maybe_cast_to_cftimeindex(index)
11:31:17      120 
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
11:31:17      417     def __array__(self, dtype=None):
11:31:17      418         array = as_indexable(self.array)
11:31:17  --> 419         return np.asarray(array[self.key], dtype=None)
11:31:17      420 
11:31:17      421     def transpose(self, order):
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/conventions.py in __getitem__(self, key)
11:31:17       60 
11:31:17       61     def __getitem__(self, key):
11:31:17  ---> 62         return np.asarray(self.array[key], dtype=self.dtype)
11:31:17       63 
11:31:17       64 
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/indexing.py in __array__(self, dtype)
11:31:17      417     def __array__(self, dtype=None):
11:31:17      418         array = as_indexable(self.array)
11:31:17  --> 419         return np.asarray(array[self.key], dtype=None)
11:31:17      420 
11:31:17      421     def transpose(self, order):
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/backends/pydap_.py in __getitem__(self, key)
11:31:17       38     def __getitem__(self, key):
11:31:17       39         return indexing.explicit_indexing_adapter(
11:31:17  ---> 40             key, self.shape, indexing.IndexingSupport.BASIC, self._getitem
11:31:17       41         )
11:31:17       42 
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/core/indexing.py in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
11:31:17      708     """
11:31:17      709     raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
11:31:17  --> 710     result = raw_indexing_method(raw_key.tuple)
11:31:17      711     if numpy_indices.tuple:
11:31:17      712         # index the loaded np.ndarray
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/backends/pydap_.py in _getitem(self, key)
11:31:17       45         # downloading coordinate data twice
11:31:17       46         array = getattr(self.array, "array", self.array)
11:31:17  ---> 47         result = robust_getitem(array, key, catch=ValueError)
11:31:17       48         # in some cases, pydap doesn't squeeze axes automatically like numpy
11:31:17       49         axis = tuple(n for n, k in enumerate(key) if isinstance(k, integer_types))
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/backends/common.py in robust_getitem(array, key, catch, max_retries, initial_delay)
11:31:17       63     for n in range(max_retries + 1):
11:31:17       64         try:
11:31:17  ---> 65             return array[key]
11:31:17       66         except catch:
11:31:17       67             if n == max_retries:
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/model.py in __getitem__(self, index)
11:31:17      318     def __getitem__(self, index):
11:31:17      319         out = copy.copy(self)
11:31:17  --> 320         out.data = self._get_data_index(index)
11:31:17      321         return out
11:31:17      322 
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/model.py in _get_data_index(self, index)
11:31:17      347             return np.vectorize(decode_np_strings)(self._data[index])
11:31:17      348         else:
11:31:17  --> 349             return self._data[index]
11:31:17      350 
11:31:17      351     def _get_data(self):
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/handlers/dap.py in __getitem__(self, index)
11:31:17      140         logger.info("Fetching URL: %s" % url)
11:31:17      141         r = GET(url, self.application, self.session, timeout=self.timeout)
11:31:17  --> 142         raise_for_status(r)
11:31:17      143         dds, data = r.body.split(b'\nData:\n', 1)
11:31:17      144         dds = dds.decode(r.content_encoding or 'ascii')
11:31:17  
11:31:17  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/net.py in raise_for_status(response)
11:31:17       37             detail=response.status+'\n'+response.text,
11:31:17       38             headers=response.headers,
11:31:17  ---> 39             comment=response.body
11:31:17       40         )
11:31:17       41 
11:31:17  
11:31:17  HTTPError: 500 Internal Server Error
11:31:17  <?xml version="1.0" encoding="utf-8"?>
11:31:17  <ExceptionReport version="1.0.0"
11:31:17      xmlns="http://www.opengis.net/ows/1.1"
11:31:17      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
11:31:17      xsi:schemaLocation="http://www.opengis.net/ows/1.1 http://schemas.opengis.net/ows/1.1.0/owsExceptionReport.xsd">
11:31:17      <Exception exceptionCode="NoApplicableCode" locator="NoApplicableCode">
11:31:17          <ExceptionText>Instance &lt;Directory at 0x7f7bf8ac8f90&gt; is not bound to a Session; attribute refresh operation cannot proceed (Background on this error at: http://sqlalche.me/e/13/bhk3)</ExceptionText>
11:31:17      </Exception>
11:31:17  </ExceptionReport>

Matching twitcher server-side logs:

2021-09-09 15:24:02,563 ERROR [TWITCHER:33][waitress-0] unknown error
Traceback (most recent call last):
  File "/opt/birdhouse/src/twitcher/twitcher/tweens.py", line 27, in ows_security_tween
    security.check_request(request)
  File "/opt/local/src/magpie/magpie/adapter/magpieowssecurity.py", line 163, in check_request
    has_permission = authz_policy.permits(service_impl, principals, permission_requested)
  File "/usr/local/lib/python3.7/site-packages/pyramid/authorization.py", line 74, in permits
    acl = location.__acl__
  File "/opt/local/src/magpie/magpie/services.py", line 175, in __acl__
    acl = self._get_acl_cached(*cache_keys)
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 601, in cached
    return cache[0].get_value(cache_key, createfunc=go)
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 322, in get
    return self._get_value(key, **kw).get_value()
  File "/usr/local/lib/python3.7/site-packages/beaker/container.py", line 380, in get_value
    v = self.createfunc()
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 597, in go
    return func(*args, **kwargs)
  File "/opt/local/src/magpie/magpie/services.py", line 213, in _get_acl_cached
    return self._get_acl(user, resource, permissions, allow_match=is_target)
  File "/opt/local/src/magpie/magpie/services.py", line 223, in _get_acl
    permissions = self.effective_permissions(user, resource, permissions, allow_match)
  File "/opt/local/src/magpie/magpie/services.py", line 328, in effective_permissions
    cur_res_perms = ResourceService.perms_for_user(resource, user, db_session=db_session)
  File "/usr/local/lib/python3.7/site-packages/ziggurat_foundations/models/services/resource.py", line 86, in perms_for_user
    if instance.owner_user_id == user.id:
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/attributes.py", line 294, in __get__
    return self.impl.get(instance_state(instance), dict_)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/attributes.py", line 725, in get\
    value = state._load_expired(state, passive)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/state.py", line 652, in _load_expired
    self.manager.deferred_scalar_loader(self, toload)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/loading.py", line 944, in load_scalar_attributes
    "attribute refresh operation cannot proceed" % (state_str(state))
sqlalchemy.orm.exc.DetachedInstanceError: Instance <Directory at 0x7f7bf8ac8f90> is not bound to a Session; attribute refresh operation cannot proceed (Background on this error at: http://sqlalche.me/e/13/bhk3)

===Problem 2===

This access denied error should never happend, no logs on twitcher server side

20:43:35  _____ pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb::Cell 4 _____
20:43:35  Notebook cell execution failed
20:43:35  Cell 4: Cell execution caused an exception
20:43:35  
20:43:35  Input:
20:43:35  # NBVAL_IGNORE_OUTPUT
20:43:35  
20:43:35  AUTH_USR = os.getenv("TEST_MAGPIE_AUTHTEST_USERNAME", "authtest")
20:43:35  AUTH_PWD = os.getenv("TEST_MAGPIE_AUTHTEST_PASSWORD", "authtest1234")
20:43:35  
20:43:35  # Open session
20:43:35  with requests.Session() as session:
20:43:35      session.auth = MagpieAuth(f"https://{PAVICS_HOST}/magpie", AUTH_USR, AUTH_PWD)
20:43:35      # Open a PyDAP data store and pass it to xarray
20:43:35      store = xr.backends.PydapDataStore.open(SECURED_URL, session=session)
20:43:35      ds = xr.open_dataset(store, decode_cf=False)  # Attributes are problematic with this file.
20:43:35  ds
20:43:35  
20:43:35  Traceback:
20:43:35  
20:43:35  ---------------------------------------------------------------------------
20:43:35  HTTPError                                 Traceback (most recent call last)
20:43:35  <ipython-input-5-fad2ffe5f696> in <module>
20:43:35        8     session.auth = MagpieAuth(f"https://{PAVICS_HOST}/magpie", AUTH_USR, AUTH_PWD)
20:43:35        9     # Open a PyDAP data store and pass it to xarray
20:43:35  ---> 10     store = xr.backends.PydapDataStore.open(SECURED_URL, session=session)
20:43:35       11     ds = xr.open_dataset(store, decode_cf=False)  # Attributes are problematic with this file.
20:43:35       12 ds
20:43:35  
20:43:35  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/backends/pydap_.py in open(cls, url, session)
20:43:35       91     def open(cls, url, session=None):
20:43:35       92 
20:43:35  ---> 93         ds = pydap.client.open_url(url, session=session)
20:43:35       94         return cls(ds)
20:43:35       95 
20:43:35  
20:43:35  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/client.py in open_url(url, application, session, output_grid, timeout)
20:43:35       65     """
20:43:35       66     dataset = DAPHandler(url, application, session, output_grid,
20:43:35  ---> 67                          timeout).dataset
20:43:35       68 
20:43:35       69     # attach server-side functions
20:43:35  
20:43:35  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/handlers/dap.py in __init__(self, url, application, session, output_grid, timeout)
20:43:35       52         ddsurl = urlunsplit((scheme, netloc, path + '.dds', query, fragment))
20:43:35       53         r = GET(ddsurl, application, session, timeout=timeout)
20:43:35  ---> 54         raise_for_status(r)
20:43:35       55         if not r.charset:
20:43:35       56             r.charset = 'ascii'
20:43:35  
20:43:35  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/net.py in raise_for_status(response)
20:43:35       37             detail=response.status+'\n'+response.text,
20:43:35       38             headers=response.headers,
20:43:35  ---> 39             comment=response.body
20:43:35       40         )
20:43:35       41 
20:43:35  
20:43:35  HTTPError: 403 Forbidden
20:43:35  <?xml version="1.0" encoding="utf-8"?>
20:43:35  <ExceptionReport version="1.0.0"
20:43:35      xmlns="http://www.opengis.net/ows/1.1"
20:43:35      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
20:43:35      xsi:schemaLocation="http://www.opengis.net/ows/1.1 http://schemas.opengis.net/ows/1.1.0/owsExceptionReport.xsd">
20:43:35      <Exception exceptionCode="NoApplicableCode" locator="AccessForbidden">
20:43:35          <ExceptionText>Not authorized to access this resource. User does not meet required permissions.</ExceptionText>
20:43:35      </Exception>
20:43:35  </ExceptionReport>

===Problem 3===

408 Error in stress-test.ipynb reproduced on CRIM Jenkins as well https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/master/570/console (all errors in stress-test.ipynb was gone for 3 weeks when caching was turned off, see #433).

===Problem 4===

"Remote end closed connection without response" without any errors in Twitcher logs:

14:31:47  ___ pavics-sdi-master/docs/source/notebooks/subset-user-input.ipynb::Cell 14 ___
14:31:47  Notebook cell execution failed
14:31:47  Cell 14: Cell execution caused an exception
14:31:47  
14:31:47  Input:
14:31:47  lon0 = float(bounds.minx)
14:31:47  lon1 = float(bounds.maxx)
14:31:47  lat0 = float(bounds.miny)
14:31:47  lat1 = float(bounds.maxy)
14:31:47  
14:31:47  result_tasmin = finch.subset_bbox(
14:31:47      resource=data,
14:31:47      variable="tasmin",
14:31:47      lon0=lon0,
14:31:47      lon1=lon1,
14:31:47      lat0=lat0,
14:31:47      lat1=lat1,
14:31:47      start_date="1958-01-01",
14:31:47      end_date="1958-12-31",
14:31:47  )
14:31:47  
14:31:47  Traceback:
14:31:47  
14:31:47  ---------------------------------------------------------------------------
14:31:47  ServiceException                          Traceback (most recent call last)
14:31:47  /tmp/ipykernel_626/857013960.py in <module>
14:31:47       12     lat1=lat1,
14:31:47       13     start_date="1958-01-01",
14:31:47  ---> 14     end_date="1958-12-31",
14:31:47       15 )
14:31:47  
14:31:47  </opt/conda/envs/birdy/lib/python3.7/site-packages/birdy/client/base.py-335> in subset_bbox(self, resource, lon0, lon1, lat0, lat1, start_date, end_date, variable)
14:31:47  
14:31:47  /opt/conda/envs/birdy/lib/python3.7/site-packages/birdy/client/base.py in _execute(self, pid, **kwargs)
14:31:47      371                 output=wps_outputs,
14:31:47      372                 mode=mode,
14:31:47  --> 373                 lineage=self._lineage,
14:31:47      374             )
14:31:47      375 
14:31:47  
14:31:47  /opt/conda/envs/birdy/lib/python3.7/site-packages/owslib/wps.py in execute(self, identifier, inputs, output, mode, lineage, request, response)
14:31:47      357         # submit the request to the live server
14:31:47      358         if response is None:
14:31:47  --> 359             response = execution.submitRequest(request)
14:31:47      360         else:
14:31:47      361             response = etree.fromstring(response)
14:31:47  
14:31:47  /opt/conda/envs/birdy/lib/python3.7/site-packages/owslib/wps.py in submitRequest(self, request)
14:31:47      920         reader = WPSExecuteReader(verbose=self.verbose, timeout=self.timeout, auth=self.auth)
14:31:47      921         response = reader.readFromUrl(
14:31:47  --> 922             self.url, request, method='Post', headers=self.headers)
14:31:47      923         self.response = response
14:31:47      924         return response
14:31:47  
14:31:47  /opt/conda/envs/birdy/lib/python3.7/site-packages/owslib/wps.py in readFromUrl(self, url, data, method, username, password, headers, verify, cert)
14:31:47      601 
14:31:47      602         return self._readFromUrl(url, data, self.timeout, method, username=username, password=password,
14:31:47  --> 603                                  headers=headers, verify=verify, cert=cert)
14:31:47      604 
14:31:47      605 
14:31:47  
14:31:47  /opt/conda/envs/birdy/lib/python3.7/site-packages/owslib/wps.py in _readFromUrl(self, url, data, timeout, method, username, password, headers, verify, cert)
14:31:47      513             u = openURL(url, data, method='Post',
14:31:47      514                         username=self.auth.username, password=self.auth.password,
14:31:47  --> 515                         headers=headers, verify=self.auth.verify, cert=self.auth.cert, timeout=timeout)
14:31:47      516             return etree.fromstring(u.read())
14:31:47      517 
14:31:47  
14:31:47  /opt/conda/envs/birdy/lib/python3.7/site-packages/owslib/util.py in openURL(url_base, data, method, cookies, username, password, timeout, headers, verify, cert, auth)
14:31:47      209 
14:31:47      210     if req.status_code in [400, 401]:
14:31:47  --> 211         raise ServiceException(req.text)
14:31:47      212 
14:31:47      213     if req.status_code in [404, 500, 502, 503, 504]:    # add more if needed
14:31:47  
14:31:47  ServiceException: <?xml version="1.0" encoding="utf-8"?>
14:31:47  <ExceptionReport version="1.0.0"
14:31:47      xmlns="http://www.opengis.net/ows/1.1"
14:31:47      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
14:31:47      xsi:schemaLocation="http://www.opengis.net/ows/1.1 http://schemas.opengis.net/ows/1.1.0/owsExceptionReport.xsd">
14:31:47      <Exception exceptionCode="NoApplicableCode" locator="NotAcceptable">
14:31:47          <ExceptionText>Request failed: (&#x27;Connection aborted.&#x27;, RemoteDisconnected(&#x27;Remote end closed connection without response&#x27;))</ExceptionText>
14:31:47      </Exception>
14:31:47  </ExceptionReport>
  1. Tried many ways to enhanced reproduciblity without any giving better results
  • Tried to run the test right after Twitcher/Magpie restart
  • Tried to have 2 concurrents runs at once
  • Tried to have back to back runs
  • Tried to wait 20 - 30 mins between runs
@fmigneault
Copy link
Collaborator

Test with Magpie Auth targetting Thredds
https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/master/notebooks-auth/test_thredds.ipynb

Stress test to loop requests:
https://github.com/Ouranosinc/PAVICS-e2e-workflow-tests/blob/master/notebooks/stress-tests.ipynb

The Magpie authentication/session creation step from test_thredds would need to be added in order for requests within stress-test to employ the same resolution and resources (THREDDS dirs/file).

@tlvu
Copy link
Contributor Author

tlvu commented Oct 7, 2021

Another stacktrace when reverting to Waitress with newer Magpie + debug bird-house/birdhouse-deploy#197

Traceback (most recent call last):
  File "/opt/birdhouse/src/twitcher/twitcher/tweens.py", line 27, in ows_security_tween
    security.check_request(request)
  File "/opt/local/src/magpie/magpie/adapter/magpieowssecurity.py", line 171, in check_request
    has_permission = authz_policy.permits(service_impl, principals, permission_requested)
  File "/usr/local/lib/python3.7/site-packages/pyramid/authorization.py", line 74, in permits
    acl = location.__acl__
  File "/opt/local/src/magpie/magpie/services.py", line 183, in __acl__
    acl = self._get_acl_cached(*cache_keys)
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 601, in cached
    return cache[0].get_value(cache_key, createfunc=go)
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 322, in get
    return self._get_value(key, **kw).get_value()
  File "/usr/local/lib/python3.7/site-packages/beaker/container.py", line 380, in get_value
    v = self.createfunc()
  File "/usr/local/lib/python3.7/site-packages/beaker/cache.py", line 597, in go
    return func(*args, **kwargs)
  File "/opt/local/src/magpie/magpie/services.py", line 220, in _get_acl_cached
    user = self.user_requested()
  File "/opt/local/src/magpie/magpie/services.py", line 157, in user_requested
    user = UserService.by_user_name(anonymous, db_session=self.request.db)
  File "/usr/local/lib/python3.7/site-packages/ziggurat_foundations/models/services/user.py", line 330, in by_user_name
    return query.first()
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3429, in first
    ret = list(self[0:1])
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3203, in __getitem__
    return list(res)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3557, in _execute_and_instances
    querycontext, self._connection_from_session, close_with_result=True
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3572, in _get_bind_args
    mapper=self._bind_mapper(), clause=querycontext.statement, **kw
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3550, in _connection_from_session
    conn = self.session.connection(**kw)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1145, in connection
    execution_options=execution_options,
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 1151, in _connection_for_bind
    engine, execution_options
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 458, in _connection_for_bind
    self.session.dispatch.after_begin(self.session, self, conn)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
    fn(*args, **kw)
  File "/usr/local/lib/python3.7/site-packages/zope/sqlalchemy/datamanager.py", line 269, in after_begin
    session, self.initial_state, self.transaction_manager, self.keep_session
  File "/usr/local/lib/python3.7/site-packages/zope/sqlalchemy/datamanager.py", line 234, in join_transaction
    session, initial_state, transaction_manager, keep_session=keep_session
  File "/usr/local/lib/python3.7/site-packages/zope/sqlalchemy/datamanager.py", line 89, in __init__
    transaction_manager.get().join(self)
  File "/usr/local/lib/python3.7/site-packages/transaction/_manager.py", line 90, in get
    raise NoTransaction()
transaction.interfaces.NoTransaction

Full logs
twitcher-cache-error-with-waitress.txt

@tlvu
Copy link
Contributor Author

tlvu commented Oct 7, 2021

Same stacktrace above with gunicorn as well, matching this notebook error:

12:55:07  _____ pavics-sdi-master/docs/source/notebooks/pavics_thredds.ipynb::Cell 4 _____
12:55:07  Notebook cell execution failed
12:55:07  Cell 4: Cell execution caused an exception
12:55:07  
12:55:07  Input:
12:55:07  # NBVAL_IGNORE_OUTPUT
12:55:07  
12:55:07  AUTH_USR = os.getenv("TEST_MAGPIE_AUTHTEST_USERNAME", "authtest")
12:55:07  AUTH_PWD = os.getenv("TEST_MAGPIE_AUTHTEST_PASSWORD", "authtest1234")
12:55:07  
12:55:07  # Open session
12:55:07  with requests.Session() as session:
12:55:07      session.auth = MagpieAuth(f"https://{PAVICS_HOST}/magpie", AUTH_USR, AUTH_PWD)
12:55:07      # Open a PyDAP data store and pass it to xarray
12:55:07      store = xr.backends.PydapDataStore.open(SECURED_URL, session=session)
12:55:07      ds = xr.open_dataset(store, decode_cf=False)  # Attributes are problematic with this file.
12:55:07  ds
12:55:07  
12:55:07  Traceback:
12:55:07  
12:55:07  ---------------------------------------------------------------------------
12:55:07  HTTPError                                 Traceback (most recent call last)
12:55:07  /tmp/ipykernel_429/1184836852.py in <module>
12:55:07        8     session.auth = MagpieAuth(f"https://{PAVICS_HOST}/magpie", AUTH_USR, AUTH_PWD)
12:55:07        9     # Open a PyDAP data store and pass it to xarray
12:55:07  ---> 10     store = xr.backends.PydapDataStore.open(SECURED_URL, session=session)
12:55:07       11     ds = xr.open_dataset(store, decode_cf=False)  # Attributes are problematic with this file.
12:55:07       12 ds
12:55:07  
12:55:07  /opt/conda/envs/birdy/lib/python3.7/site-packages/xarray/backends/pydap_.py in open(cls, url, session)
12:55:07       91     def open(cls, url, session=None):
12:55:07       92 
12:55:07  ---> 93         ds = pydap.client.open_url(url, session=session)
12:55:07       94         return cls(ds)
12:55:07       95 
12:55:07  
12:55:07  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/client.py in open_url(url, application, session, output_grid, timeout)
12:55:07       65     """
12:55:07       66     dataset = DAPHandler(url, application, session, output_grid,
12:55:07  ---> 67                          timeout).dataset
12:55:07       68 
12:55:07       69     # attach server-side functions
12:55:07  
12:55:07  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/handlers/dap.py in __init__(self, url, application, session, output_grid, timeout)
12:55:07       52         ddsurl = urlunsplit((scheme, netloc, path + '.dds', query, fragment))
12:55:07       53         r = GET(ddsurl, application, session, timeout=timeout)
12:55:07  ---> 54         raise_for_status(r)
12:55:07       55         if not r.charset:
12:55:07       56             r.charset = 'ascii'
12:55:07  
12:55:07  /opt/conda/envs/birdy/lib/python3.7/site-packages/pydap/net.py in raise_for_status(response)
12:55:07       37             detail=response.status+'\n'+response.text,
12:55:07       38             headers=response.headers,
12:55:07  ---> 39             comment=response.body
12:55:07       40         )
12:55:07       41 
12:55:07  
12:55:07  HTTPError: 500 Internal Server Error
12:55:07  <?xml version="1.0" encoding="utf-8"?>
12:55:07  <ExceptionReport version="1.0.0"
12:55:07      xmlns="http://www.opengis.net/ows/1.1"
12:55:07      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
12:55:07      xsi:schemaLocation="http://www.opengis.net/ows/1.1 http://schemas.opengis.net/ows/1.1.0/owsExceptionReport.xsd">
12:55:07      <Exception exceptionCode="NoApplicableCode" locator="NoApplicableCode">
12:55:07          <ExceptionText>Unknown Error</ExceptionText>
12:55:07      </Exception>
12:55:07  </ExceptionReport>

Plus this run has a new bunch of 408 in stress-test.ipynb and I even got them on CRIM's Jenkins https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/master/606/console

Full log
twitcher-cache-error-with-gunicorn.txt.gz

@fmigneault I hope you have all the debug infos you need in the 2 logs I posted. We still have problem with both waitress and gunicorn.

Maybe this only happen in a full PAVICS env. You should try to reproduce on your side in a full PAVICs env.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acl Features or issues related to access resolution. bug Problem, error, or invalid behaviour cache Features or issues related to caching.
Projects
None yet
Development

No branches or pull requests

2 participants