Handling "Ragged Arrays" #765

Open
whs92 opened this issue May 4, 2023 · 1 comment
whs92 (Member) commented May 4, 2023

We have a detector that produces, for each event, a variable-length array of coordinates of detected blobs.

We are trying to read that data through an EPICS waveform record, which reports a different number of elements read (NORD) for each point within a plan. The ophyd device exposes it as an EpicsSignalRO component. Running scans works fine, but accessing the data through databroker fails because the array length varies from point to point within the dataset.
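One possible stopgap, sketched here in plain Python rather than with ophyd or EPICS, is to pad every reading to a fixed maximum length and store the true element count alongside it, much as NORD does for the waveform. The function name `pad_readings`, the `max_len` value, and the sample data are all hypothetical, and this is an illustration of the idea rather than a recommended fix:

```python
import math


def pad_readings(readings, max_len, fill=math.nan):
    """Pad each variable-length reading to max_len and record its true length."""
    padded, counts = [], []
    for reading in readings:
        if len(reading) > max_len:
            raise ValueError(
                f"reading of length {len(reading)} exceeds max_len={max_len}"
            )
        counts.append(len(reading))
        padded.append(list(reading) + [fill] * (max_len - len(reading)))
    return padded, counts


# Hypothetical per-event blob coordinates with different lengths.
events = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]]
padded, counts = pad_readings(events, max_len=4)

# Every row now has length 4, so the rows share one fixed shape,
# and counts says how many leading elements of each row are real data.
```

The cost of this approach is wasted storage and the need to carry the counts through every downstream analysis step, which is why native ragged-array support is the more attractive long-term answer.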

Running databroker v2 through Tiled gives the error below. The error is similar for databroker using intake.

The database is backed by MongoDB.

db[-1].primary.read() yields:

$ python3
Python 3.8.10 (default, Mar 13 2023, 10:26:41) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tiled.client import from_profile
>>> c = from_profile("metrixstiled")
OBJECT CACHE: Will use up to 5_006_456_832 bytes (15% of total physical RAM)
>>> run =c['40fe8cf']
>>> run.metadata['start']
{'uid': '40fe8cf4-8182-4a2a-80b8-ccb94d4b76ad', 'time': 1683034023.4697332, 'scan_id': 183, 'hostname': 'metrixs-ctrl1', 'plan_type': 'generator', 'plan_name': 'count', 'detectors': ['blobs'], 'num_points': 5, 'num_intervals': 4, 'plan_args': {'detectors': ["blobs(prefix='METRIXS:BLOBS:', name='blobs', read_attrs=['filename', 'readback'], configuration_attrs=[])"], 'num': 5}, 'command_elog': 'count([blobs], 5)', 'hints': {'dimensions': [[['time'], 'primary']]}}
>>> run.primary.read()
Traceback (most recent call last):
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/anyio/streams/memory.py", line 94, in receive
    return self.receive_nowait()
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/anyio/streams/memory.py", line 89, in receive_nowait
    raise WouldBlock
anyio.WouldBlock

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 78, in call_next
    message = await recv_stream.receive()
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/anyio/streams/memory.py", line 114, in receive
    raise EndOfStream
anyio.EndOfStream

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/databroker/client.py", line 197, in read
    return self["data"].read(*args, **kwargs)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/client/xarray.py", line 113, in read
    super()
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/client/xarray.py", line 104, in read
    data_vars, coords = self._build_arrays(variables, optimize_wide_table)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/client/xarray.py", line 95, in _build_arrays
    data_vars[name] = (array_client.dims, array_client.read())
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/client/array.py", line 256, in read
    return super().read(slice).compute()
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/dask/base.py", line 314, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/dask/base.py", line 599, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/dask/local.py", line 511, in get_async
    raise_exception(exc, tb)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/dask/local.py", line 319, in reraise
    raise exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/client/array.py", line 93, in _get_block
    content = self.context.get_content(
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/client/context.py", line 521, in get_content
    response = self.http_client.send(request, stream=stream)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/httpx/_client.py", line 901, in send
    response = self._send_handling_auth(
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/httpx/_client.py", line 929, in _send_handling_auth
    response = self._send_handling_redirects(
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/httpx/_client.py", line 966, in _send_handling_redirects
    response = self._send_single_request(request)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/httpx/_client.py", line 1002, in _send_single_request
    response = transport.handle_request(request)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/client/_testclient.py", line 351, in handle_request
    raise exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/client/_testclient.py", line 348, in handle_request
    portal.call(self.app, scope, receive, send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/anyio/from_thread.py", line 283, in call
    return cast(T_Retval, self.start_task_soon(func, *args).result())
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/anyio/from_thread.py", line 219, in _call_func
    retval = await retval
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/fastapi/applications.py", line 276, in __call__
    await super().__call__(scope, receive, send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/errors.py", line 184, in __call__
    raise exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/server/app.py", line 753, in capture_metrics_prometheus
    response = await call_next(request)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/server/app.py", line 707, in capture_metrics
    response = await call_next(request)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/server/app.py", line 652, in set_cookies
    response = await call_next(request)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/server/app.py", line 643, in client_compatibility_check
    response = await call_next(request)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 108, in __call__
    response = await self.dispatch_func(request, call_next)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/server/app.py", line 603, in double_submit_cookie_csrf_protection
    response = await call_next(request)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 84, in call_next
    raise app_exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 70, in coro
    await self.app(scope, receive_or_disconnect, send_no_error)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/server/compression.py", line 27, in __call__
    await responder(scope, receive, send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/server/compression.py", line 48, in __call__
    await self.app(scope, receive, self.send_compressed)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/cors.py", line 84, in __call__
    await self.app(scope, receive, send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
    raise exc
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
    raise e
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
    await self.app(scope, receive, send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/routing.py", line 66, in app
    response = await func(request)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/fastapi/routing.py", line 237, in app
    raw_response = await run_endpoint_function(
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/fastapi/routing.py", line 165, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
    return await anyio.to_thread.run_sync(func, *args)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/tiled/server/router.py", line 388, in array_block
    array = entry.read_block(block, slice=slice)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/databroker/mongo_normalized.py", line 492, in read_block
    return self._dataset_adapter.read_block(self._field, block, slice=slice)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/databroker/mongo_normalized.py", line 689, in read_block
    raw_array = self.get_columns([variable], slices=slices)[variable]
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/databroker/mongo_normalized.py", line 780, in get_columns
    to_stack = self._inner_get_columns(tuple(keys), min_seq_num, max_seq_num)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/databroker/mongo_normalized.py", line 912, in _inner_get_columns
    populate_columns((key,), min_, max_)
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/databroker/mongo_normalized.py", line 857, in populate_columns
    validated_column = list(
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/databroker/mongo_normalized.py", line 859, in <lambda>
    lambda item: _validate_shape(
  File "/hzb/metrixs/bluesky_databroker2/tiled_dev/.venv/lib/python3.8/site-packages/databroker/mongo_normalized.py", line 2072, in _validate_shape
    raise BadShapeMetadata(
databroker.mongo_normalized.BadShapeMetadata: For data key blobs shape (58,) does not match expected shape (54,).
>>> 
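
The final exception indicates that one expected shape, `(54,)`, is being compared against the actual shape of an event, `(58,)`. The kind of check that produces such a mismatch can be sketched as follows. This is not databroker's `_validate_shape`; the function `find_shape_mismatches` is hypothetical, and apart from 54 and 58 the event lengths are made up:

```python
def find_shape_mismatches(expected_shape, event_lengths):
    """Return (event_index, actual_shape) for each 1-D event whose shape differs."""
    expected = tuple(expected_shape)
    return [
        (index, (length,))
        for index, length in enumerate(event_lengths)
        if (length,) != expected
    ]


# Only 54 and 58 come from the error message; the rest is illustrative.
mismatches = find_shape_mismatches((54,), [54, 58, 54])
print(mismatches)  # [(1, (58,))]
```

With a single fixed shape per data key, any event whose length differs from it fails a check of this kind.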

Here are the package versions installed in this test environment:
tiled_error_freeze.txt

danielballan (Member) commented

At a Scientific Python Developer Summit this week, I sat down with the lead developer of Awkward Array, and we charted a path for serving ragged data through Tiled.

Some discussion is needed on how to fit this into bluesky's data model, but this is a really good step.
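
For readers unfamiliar with how ragged data can be stored without padding: a common layout, and the one Awkward Array's list types are built on, is a single flat content buffer plus an offsets array marking where each row starts and ends. The sketch below shows that layout in plain Python only. It does not use the Awkward Array or Tiled APIs, and the helper names `to_offsets` and `get_row` are hypothetical:

```python
def to_offsets(ragged):
    """Flatten a list of variable-length rows into (content, offsets)."""
    content, offsets = [], [0]
    for row in ragged:
        content.extend(row)
        offsets.append(len(content))
    return content, offsets


def get_row(content, offsets, i):
    """Recover row i from the flat buffer using its start and stop offsets."""
    return content[offsets[i]:offsets[i + 1]]


# Hypothetical blob coordinates for three events, one of them empty.
ragged = [[1.0, 2.0, 3.0], [], [4.0, 5.0]]
content, offsets = to_offsets(ragged)
```

Both `content` and `offsets` are ordinary one-dimensional arrays, so each can be stored and served with a fixed dtype even though the rows they describe have different lengths.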
