| 1 | +.. only:: doctest |
| 2 | + |
| 3 | + >>> import shutil |
| 4 | + >>> shutil.rmtree('data', ignore_errors=True) |
| 5 | + >>> |
| 6 | + >>> import numpy as np |
| 7 | + >>> np.random.seed(0) |
| 8 | + |
| 9 | +Quickstart |
| 10 | +========== |
| 11 | + |
| 12 | +Welcome to the Zarr-Python Quickstart guide! This page will help you get up and running with |
| 13 | +the Zarr library in Python to efficiently manage and analyze multi-dimensional arrays. |
| 14 | + |
| 15 | +Zarr is a powerful library for storage of n-dimensional arrays, supporting chunking, |
| 16 | +compression, and various backends, making it a versatile choice for scientific and |
| 17 | +large-scale data. |
| 18 | + |
| 19 | +Installation |
| 20 | +------------ |
| 21 | + |
| 22 | +Zarr requires Python 3.11 or higher. You can install it via ``pip``: |
| 23 | + |
| 24 | +.. code-block:: bash |
| 25 | + |
| 26 | + pip install zarr |
| 27 | + |
| 28 | +or ``conda``: |
| 29 | + |
| 30 | +.. code-block:: bash |
| 31 | + |
| 32 | + conda install --channel conda-forge zarr |
| 33 | + |
| 34 | +Creating an Array |
| 35 | +----------------- |
| 36 | + |
| 37 | +To get started, you can create a simple Zarr array:: |
| 38 | + |
| 39 | + >>> import zarr |
| 40 | + >>> import numpy as np |
| 41 | + >>> |
| 42 | + >>> # Create a 2D Zarr array |
| 43 | + >>> z = zarr.create_array( |
| 44 | + ... store="data/example-1.zarr", |
| 45 | + ... shape=(100, 100), |
| 46 | + ... chunks=(10, 10), |
| 47 | + ... dtype="f4" |
| 48 | + ... ) |
| 49 | + >>> |
| 50 | + >>> # Assign data to the array |
| 51 | + >>> z[:, :] = np.random.random((100, 100)) |
| 52 | + >>> z.info |
| 53 | + Type : Array |
| 54 | + Zarr format : 3 |
| 55 | + Data type : DataType.float32 |
| 56 | + Shape : (100, 100) |
| 57 | + Chunk shape : (10, 10) |
| 58 | + Order : C |
| 59 | + Read-only : False |
| 60 | + Store type : LocalStore |
| 61 | + Codecs : [{'endian': <Endian.little: 'little'>}, {'level': 0, 'checksum': False}] |
| 62 | + No. bytes : 40000 (39.1K) |
| 63 | + |
| 64 | +Here, we created a 2D array of shape ``(100, 100)``, chunked into blocks of |
| 65 | +``(10, 10)``, and filled it with random floating-point data. This array was |
| 66 | +written to a ``LocalStore`` in the ``data/example-1.zarr`` directory. |
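The figures reported by ``z.info`` can be checked by hand. The following is a minimal sketch using only the standard library; it assumes nothing beyond the shape, chunk shape, and ``f4`` item size from the example above, and the variable names are illustrative rather than part of the Zarr API.

```python
import math

shape = (100, 100)  # array shape from the example above
chunks = (10, 10)   # chunk shape from the example above
itemsize = 4        # bytes per element for dtype "f4" (32-bit float)

# Number of chunks along each dimension, rounding up for partial chunks.
chunk_grid = tuple(math.ceil(s / c) for s, c in zip(shape, chunks))
n_chunks = math.prod(chunk_grid)

# Uncompressed size of one chunk and of the whole array, in bytes.
chunk_nbytes = math.prod(chunks) * itemsize
array_nbytes = math.prod(shape) * itemsize

print(chunk_grid)    # (10, 10)
print(n_chunks)      # 100
print(chunk_nbytes)  # 400
print(array_nbytes)  # 40000
```

The last value matches the ``No. bytes`` entry (40000) in the ``z.info`` output.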
| 67 | + |
| 68 | +Compression and Filters |
| 69 | +~~~~~~~~~~~~~~~~~~~~~~~ |
| 70 | + |
| 71 | +Zarr supports data compression and filters. For example, to use Blosc compression:: |
| 72 | + |
| 73 | + >>> z = zarr.create_array( |
| 74 | + ... "data/example-3.zarr", |
| 75 | + ... mode="w", shape=(100, 100), |
| 76 | + ... chunks=(10, 10), dtype="f4", |
| 77 | + ... compressor=zarr.codecs.BloscCodec(cname="zstd", clevel=3, shuffle=zarr.codecs.BloscShuffle.SHUFFLE) |
| 78 | + ... ) |
| 79 | + >>> z[:, :] = np.random.random((100, 100)) |
| 80 | + >>> |
| 81 | + >>> z.info |
| 82 | + Type : Array |
| 83 | + Zarr format : 3 |
| 84 | + Data type : DataType.float32 |
| 85 | + Shape : (100, 100) |
| 86 | + Chunk shape : (10, 10) |
| 87 | + Order : C |
| 88 | + Read-only : False |
| 89 | + Store type : LocalStore |
| 90 | + Codecs : [{'endian': <Endian.little: 'little'>}, {'level': 0, 'checksum': False}] |
| 91 | + No. bytes : 40000 (39.1K) |
| 92 | + |
| 93 | +This compresses the data with Blosc, using Zstandard (``zstd``) as the internal compressor at level 3, with byte shuffling enabled for better compression. |
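To see what a byte shuffle does, the sketch below reorders the bytes of some 32-bit floats so that all first bytes come first, then all second bytes, and so on. It is an illustration only: ``byte_shuffle`` and ``byte_unshuffle`` are hypothetical helpers, the input data is made up, and the standard library's ``zlib`` stands in for Blosc and Zstandard, so the printed sizes will not match what Zarr produces.

```python
import struct
import zlib

# 1000 smoothly varying 32-bit floats, packed little-endian (illustrative data).
values = [i * 0.001 for i in range(1000)]
raw = struct.pack("<1000f", *values)


def byte_shuffle(buf: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    return b"".join(buf[i::itemsize] for i in range(itemsize))


def byte_unshuffle(buf: bytes, itemsize: int) -> bytes:
    """Invert byte_shuffle, restoring the original element layout."""
    n = len(buf) // itemsize
    out = bytearray(len(buf))
    for i in range(itemsize):
        out[i::itemsize] = buf[i * n:(i + 1) * n]
    return bytes(out)


shuffled = byte_shuffle(raw, 4)
assert byte_unshuffle(shuffled, 4) == raw  # the reordering is lossless

# zlib is a stand-in compressor here, not the codec used by Zarr.
print("plain   :", len(zlib.compress(raw, 3)))
print("shuffled:", len(zlib.compress(shuffled, 3)))
```

Shuffling places bytes that tend to be similar (for example, the exponent bytes of neighbouring floats) next to each other, which typically gives the compressor longer runs to work with.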
| 94 | + |
| 95 | +Hierarchical Groups |
| 96 | +------------------- |
| 97 | + |
| 98 | +Zarr allows you to create hierarchical groups, similar to directories:: |
| 99 | + |
| 100 | + >>> # Create nested groups and add arrays |
| 101 | + >>> root = zarr.group("data/example-2.zarr") |
| 102 | + >>> foo = root.create_group(name="foo") |
| 103 | + >>> bar = root.create_array( |
| 104 | + ... name="bar", shape=(100, 10), chunks=(10, 10) |
| 105 | + ... ) |
| 106 | + >>> spam = foo.create_array(name="spam", shape=(10,), dtype="i4") |
| 107 | + >>> |
| 108 | + >>> # Assign values |
| 109 | + >>> bar[:, :] = np.random.random((100, 10)) |
| 110 | + >>> spam[:] = np.arange(10) |
| 111 | + >>> |
| 112 | + >>> # print the hierarchy |
| 113 | + >>> root.tree() |
| 114 | + / |
| 115 | + └── foo |
| 116 | + └── spam (10,) int32 |
| 117 | + <BLANKLINE> |
| 118 | + |
| 119 | +This creates a root group containing a subgroup, ``foo`` (which holds the array ``spam``), and an array, ``bar``. |
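The directory analogy also describes how the hierarchy is laid out in the store. As a rough sketch, assuming Zarr format 3 (as reported by ``z.info`` earlier), each group or array keeps its metadata in a ``zarr.json`` document under its own path. The ``metadata_key`` helper below is hypothetical and only illustrates the naming scheme; it does not read or write any store.

```python
from pathlib import PurePosixPath


def metadata_key(node_path: str) -> str:
    """Return the store key of a node's metadata document (Zarr format 3)."""
    return str(PurePosixPath(node_path) / "zarr.json")


# The root group is the empty path; the other nodes come from the example above.
for node in ["", "foo", "bar", "foo/spam"]:
    print(f"/{node:<9} -> {metadata_key(node)}")
```

With a ``LocalStore``, these keys are file paths relative to the store's root directory, here ``data/example-2.zarr``.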
| 120 | + |
| 121 | +Persistent Storage |
| 122 | +------------------ |
| 123 | + |
| 124 | +Zarr supports persistent storage to disk or cloud-compatible backends. While the examples above |
| 125 | +used a :class:`zarr.storage.LocalStore`, a number of other storage options are available. |
| 126 | + |
| 127 | +Zarr integrates seamlessly with cloud object storage such as Amazon S3 and Google Cloud Storage |
| 128 | +using external libraries like `s3fs <https://s3fs.readthedocs.io>`_ or |
| 129 | +`gcsfs <https://gcsfs.readthedocs.io>`_:: |
| 130 | + |
| 131 | + >>> import s3fs # doctest: +SKIP |
| 132 | + >>> |
| 133 | + >>> z = zarr.create_array("s3://example-bucket/foo", mode="w", shape=(100, 100), chunks=(10, 10)) # doctest: +SKIP |
| 134 | + >>> z[:, :] = np.random.random((100, 100)) # doctest: +SKIP |
| 135 | + |
| 136 | +A single-file store can also be created using the :class:`zarr.storage.ZipStore`:: |
| 137 | + |
| 138 | + >>> # Store the array in a ZIP file |
| 139 | + >>> store = zarr.storage.ZipStore("data/example-3.zip", mode='w') |
| 140 | + >>> |
| 141 | + >>> z = zarr.create_array( |
| 142 | + ... store=store, |
| 143 | + ... mode="w", |
| 144 | + ... shape=(100, 100), |
| 145 | + ... chunks=(10, 10), |
| 146 | + ... dtype="f4" |
| 147 | + ... ) |
| 148 | + >>> |
| 149 | + >>> # write to the array |
| 150 | + >>> z[:, :] = np.random.random((100, 100)) |
| 151 | + >>> |
| 152 | + >>> # the ZipStore must be explicitly closed |
| 153 | + >>> store.close() |
| 154 | + |
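The need to close the store follows from how ZIP archives work: the central directory that indexes the entries is only written when the archive is closed. The sketch below shows this with the standard library's ``zipfile`` module alone, on the assumption that ``ZipStore`` behaves the same way. The file name and the placeholder entry are made up for illustration and are not real Zarr metadata.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.zip")

    zf = zipfile.ZipFile(path, mode="w")
    zf.writestr("zarr.json", "{}")  # placeholder entry, not real Zarr metadata

    # The central directory has not been written yet, so the file is not
    # recognised as a ZIP archive.
    valid_before_close = zipfile.is_zipfile(path)

    zf.close()  # writes the central directory
    valid_after_close = zipfile.is_zipfile(path)

    # Once closed, the archive can be reopened and its entries listed.
    with zipfile.ZipFile(path) as archive:
        names = archive.namelist()

print(valid_before_close)  # False
print(valid_after_close)   # True
print(names)               # ['zarr.json']
```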
| 155 | +To open an existing array from a ZIP file:: |
| 156 | + |
| 157 | + >>> # Open the ZipStore in read-only mode |
| 158 | + >>> store = zarr.storage.ZipStore("data/example-3.zip", read_only=True) |
| 159 | + >>> |
| 160 | + >>> z = zarr.open_array(store, mode='r') |
| 161 | + >>> |
| 162 | + >>> # read the data as a NumPy Array |
| 163 | + >>> z[:] |
| 164 | + array([[0.66734236, 0.15667458, 0.98720884, ..., 0.36229587, 0.67443246, |
| 165 | + 0.34315267], |
| 166 | + [0.65787303, 0.9544212 , 0.4830079 , ..., 0.33097172, 0.60423803, |
| 167 | + 0.45621237], |
| 168 | + [0.27632037, 0.9947008 , 0.42434934, ..., 0.94860053, 0.6226942 , |
| 169 | + 0.6386924 ], |
| 170 | + ..., |
| 171 | + [0.12854576, 0.934397 , 0.19524333, ..., 0.11838563, 0.4967675 , |
| 172 | + 0.43074256], |
| 173 | + [0.82029045, 0.4671437 , 0.8090906 , ..., 0.7814118 , 0.42650765, |
| 174 | + 0.95929915], |
| 175 | + [0.4335856 , 0.7565437 , 0.7828931 , ..., 0.48119593, 0.66220033, |
| 176 | + 0.6652362 ]], shape=(100, 100), dtype=float32) |
| 177 | + |
| 178 | +Read more about Zarr's storage options in the :ref:`User Guide <user-guide-storage>`. |
| 179 | + |
| 180 | +Next Steps |
| 181 | +---------- |
| 182 | + |
| 183 | +Now that you're familiar with the basics, explore the following resources: |
| 184 | + |
| 185 | +- `User Guide <user-guide>`_ |
| 186 | +- `API Reference <api>`_ |