Genetics data IO performance stats/doc #437
Some doc for fsspec:
This is a partial output of the method profile on the code above (https://github.com/pystatgen/sgkit/issues/437#issue-784385179):
Unsurprisingly we get numpy methods; what is interesting, and can't be seen here, are the sporadic spikes in CPU usage. Looking at the network IO:
There are also spikes of network IO, which could suggest that we should look into smoothing out the network buffering. This reminded me of some stats I once stumbled upon in the Beam codebase (here):
So Beam uses a 16 MiB read buffer. But it also uses Google's storage API client (python-storage), whilst fsspec calls the GCS API directly via its GCS implementation, gcsfs. There are definitely implementation differences between the two, so the consequences of the buffer size might be different.
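As a hedged sketch (not necessarily how the benchmark above was configured), a Beam-like 16 MiB read-ahead buffer can be requested when opening a remote file via gcsfs/fsspec; the bucket/object path below is a placeholder:

```python
import gcsfs

# Try a 16 MiB read-ahead block, mirroring Beam's buffer size
# (the gcsfs default block size is smaller).
fs = gcsfs.GCSFileSystem()
with fs.open("gs://foobar/data.zarr/foo/0.0", mode="rb", block_size=16 * 1024 * 1024) as f:
    data = f.read()
```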
In the original issue I tried an fsspec block size of 16 MiB, but the results looked very similar to the 5 MiB block size, which was somewhat suspicious. It turns out that the zarr data ("gs://foobar/data.zarr") is saved as a large number of small files, for example for the ...
From S3 best practices.
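To check whether a given store suffers from the small-files problem mentioned above, here is a minimal sketch (the store path and array name are placeholders) that lists the chunk objects of one array and summarizes their sizes:

```python
import fsspec

# Placeholder store/array; use "file" instead of "gs" for a local store.
fs = fsspec.filesystem("gs")
infos = fs.ls("gs://foobar/data.zarr/foo", detail=True)
# Skip zarr metadata objects (.zarray/.zattrs), keep the chunk objects.
sizes = [i["size"] for i in infos if not i["name"].rsplit("/", 1)[-1].startswith(".z")]
print(f"{len(sizes)} chunk objects, min={min(sizes)}, max={max(sizes)}, "
      f"mean={sum(sizes) // len(sizes)} bytes")
```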
From cloud performant reading of netcdf4/hdf5 data using the zarr library. Interesting/related issue: "File Chunk Store" zarr-python#556.
To validate the impact of the chunk size, let's rechunk the data. Performance graph:
There is a cost to it though, visible in the memory usage: notice that a compressed chunk of 17 MiB translates to 2 GiB in memory (vs 2 MiB -> 230 MiB), which bumps the memory usage roughly from 6 GB to 40 GB. We could look into investigating the cost of reading zarr data in smaller chunks (than it was written with). This is a profile of the computation with the zarr chunks above. Notice:
Say we have a random float zarr array, and we retrieve a single element:

```python
import zarr

zs = zarr.open("/tmp/zarr", mode="r")
print(zs.foo[0, 0])
```

Zarr will:
So overall, the lesson from this exercise is that slicing zarr in a way that doesn't align with the native chunks is inefficient as a common step in pipelines. Zarr issues for partial chunk reads:
Highlights:
Some experiments that incorporate xarray + zarr + dask.

TLDR:
Long story: say we have a dataset:

```python
import fsspec
import numpy as np
import xarray as xr

fsmap = fsspec.get_mapper("/tmp/zarr")
ar = np.random.random((10_000, 100_000))
xar = xr.Dataset(data_vars={"foo": (("x", "y"), ar)})
xar.foo.encoding["chunks"] = (10_000, 10_000)
xar.to_zarr(fsmap)
```

This gives us a zarr store with one 2d array stored as 10 chunks of (10_000, 10_000). Now, say we read this back via:

```python
xar_back = xr.open_zarr(fsmap)
```

The Dask graph looks like this, and here is an example of one zarr task definition:
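(As a supplementary sketch, assuming the dataset above, one way to inspect the dask chunking and graph size that open_zarr produced; this is not the task definition referenced above:)

```python
# Chunking chosen by open_zarr (should mirror the native zarr chunks).
print(xar_back.foo.data.chunks)
# Number of tasks in the underlying dask graph.
print(len(xar_back.foo.data.dask))
```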
Notes:
The question now is: how can we force async zarr multiget via dask?
```python
xar_back = xr.open_zarr(fsmap).chunk({"y": 20_000, "x": 10_000})
```

Notes:
Passing the chunks to open_zarr directly instead:

```python
xar_back = xr.open_zarr(fsmap, chunks={"y": 20_000, "x": 10_000})
```

we get the following, and an example of one zarr task definition:
Notes:
Another interesting question is: how would dask react to chunking that doesn't align with the zarr native chunks?

```python
xar_back = xr.open_zarr(fsmap, chunks={"y": 12_500, "x": 10_000})
```

Notice that 12_500 does not align with the native zarr chunk size of 10_000 along y. When we run the code above, xarray tells us:
Let's see the graph, and an example of two zarr task definitions:
Notes:
TLDR thus far:
Another discussion that came up is the memory impact on performance, given that larger chunks will require more memory. AFAIU, by default in a dask cluster:
In GCP, default VMs are divided into classes of machines like “standard” or “highmem”; in each class the memory-to-CPU ratio is constant, for ...
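As a back-of-the-envelope illustration of why that ratio matters for chunk sizing (using the 16-core / 64 GB test VM described below and the in-memory chunk sizes observed above):

```python
# Memory head-room per dask worker thread vs. decompressed chunk sizes
# (2 MiB compressed -> ~230 MiB in memory, 17 MiB compressed -> ~2 GiB).
total_gb, threads = 64, 16
per_thread_gb = total_gb / threads  # ~4 GB per worker thread
for chunk_gb in (0.23, 2.0):
    print(f"a {chunk_gb:g} GB chunk uses {chunk_gb / per_thread_gb:.0%} "
          f"of a thread's {per_thread_gb:g} GB share")
```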
A lot to digest here, thanks for the great work @ravwojdyla!
And as a way to connect these with #390, I would personally be very curious to see the #390 GWAS benchmark with:
Based on the performance tests done above, here are some high level guidelines for dask performance experiments (this is a starting point, we might find a better home for this later, and potentially have someone from Dask review them):
```python
import dask as dk
from dask.distributed import Client


def get_dask_cluster(n_workers=1, threads_per_worker=None):
    # Don't let the nanny terminate workers that exceed their memory limit.
    dk.config.set({"distributed.worker.memory.terminate": False})
    # Disable the target/spill thresholds (no spilling to disk),
    # and only pause work when a worker hits 90% of its memory limit.
    workers_kwargs = {"memory_target_fraction": False,
                      "memory_spill_fraction": False,
                      "memory_pause_fraction": .9}
    return Client(n_workers=n_workers,
                  threads_per_worker=threads_per_worker,
                  **workers_kwargs)
```
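For example, a hypothetical invocation mirroring the single-node setup used in these experiments:

```python
# One worker process with 16 threads, matching the 16-core test VM.
client = get_dask_cluster(n_workers=1, threads_per_worker=16)
print(client)
```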
TODO:
This is a dump of some of the performance experiments. It's part of a larger issue of performance setup and best practices for dask/sgkit and genetic data. The goal is to share the findings and continue the discussion.
Where not otherwise stated, the test machine is a GCE VM with 16 cores, 64 GB of memory, and 400 SPD (SSD persistent disk). The Dask cluster is single-node and process-based. If the data is read from GCS, the bucket is in the same region as the VM:
Specs/libs
The issue with suboptimal saturation was originally reported for this code:
With local input, performance graph:
It's pretty clear the cores are well saturated. I also measured the GIL: it was held for 13% of the time and waited on for 2.1%, with each worker thread (16 threads) holding it for 0.7% and waiting for 0.1% of the time.
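The GIL numbers here and below were presumably produced with something like the gil_load package; a minimal sketch of that kind of measurement (assuming gil_load's init/start/stop/get API, which may differ across versions, and a placeholder run_benchmark workload):

```python
import gil_load  # assumed profiling package, not confirmed to be what was used here

gil_load.init()          # must be called before the threads of interest start
gil_load.start()
run_benchmark()          # placeholder for the dask workload being profiled
gil_load.stop()
stats = gil_load.get()   # held/wait fractions, overall and per thread
print(gil_load.format(stats))
```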
For GCS input (via fsspec):
GIL summary: the GIL was held for 18% of the time and waited on for 3.8%, with each worker thread (16 threads) holding it for 0.6% and waiting for 0.2% of the time, and with one thread holding the GIL for 6.5% and waiting 1.6% of the time.
It's clear that the CPU usage is lower and not fully saturated, and the GIL wait time is up a bit (with a concerning spike in one thread). With remote/fsspec input we have the overhead of data decryption and potential network IO overhead (though it doesn't seem like we hit network limits).