Commit 2fb2788

Merge pull request #10860 from ethereum/clarifyStorageLayout
Clarify storage layout.
2 parents d4ce896 + 61b5e8e commit 2fb2788

1 file changed (+73 -27 lines changed)

docs/internals/layout_in_storage.rst

Lines changed: 73 additions & 27 deletions
@@ -6,32 +6,43 @@ Layout of State Variables in Storage
 
 .. _storage-inplace-encoding:
 
-Statically-sized variables (everything except mapping and dynamically-sized
-array types) are laid out contiguously in storage starting from position ``0``.
+State variables of contracts are stored in storage in a compact way such
+that multiple values sometimes use the same storage slot.
+Except for dynamically-sized arrays and mappings (see below), data is stored
+contiguously item after item starting with the first state variable,
+which is stored in slot ``0``. For each variable,
+a size in bytes is determined according to its type.
 Multiple, contiguous items that need less than 32 bytes are packed into a single
 storage slot if possible, according to the following rules:
 
 - The first item in a storage slot is stored lower-order aligned.
-- Elementary types use only as many bytes as are necessary to store them.
-- If an elementary type does not fit the remaining part of a storage slot, it is moved to the next storage slot.
-- Structs and array data always start a new slot and occupy whole slots
-  (but items inside a struct or array are packed tightly according to these rules).
+- Value types use only as many bytes as are necessary to store them.
+- If a value type does not fit the remaining part of a storage slot, it is stored in the next storage slot.
+- Structs and array data always start a new slot and their items are packed tightly according to these rules.
+- Items following struct or array data always start a new storage slot.
 
 For contracts that use inheritance, the ordering of state variables is determined by the
 C3-linearized order of contracts starting with the most base-ward contract. If allowed
 by the above rules, state variables from different contracts do share the same storage slot.
 
-The elements of structs and arrays are stored after each other, just as if they were given explicitly.
+The elements of structs and arrays are stored after each other, just as if they were given
+as individual values.
 
 .. warning::
     When using elements that are smaller than 32 bytes, your contract's gas usage may be higher.
     This is because the EVM operates on 32 bytes at a time. Therefore, if the element is smaller
     than that, the EVM must use more operations in order to reduce the size of the element from 32
     bytes to the desired size.
 
-    It is only beneficial to use reduced-size arguments if you are dealing with storage values
+    It might be beneficial to use reduced-size types if you are dealing with storage values
     because the compiler will pack multiple elements into one storage slot, and thus, combine
-    multiple reads or writes into a single operation. When dealing with function arguments or memory
+    multiple reads or writes into a single operation.
+    If you are not reading or writing all the values in a slot at the same time, this can
+    have the opposite effect, though: When one value is written to a multi-value storage
+    slot, the storage slot has to be read first and then
+    combined with the new value such that other data in the same slot is not destroyed.
+
+    When dealing with function arguments or memory
     values, there is no inherent benefit because the compiler does not pack these values.
 
     Finally, in order to allow the EVM to optimize for this, ensure that you try to order your
@@ -53,48 +64,83 @@ Mappings and Dynamic Arrays
 
 .. _storage-hashed-encoding:
 
-Due to their unpredictable size, mapping and dynamically-sized array types use a Keccak-256 hash
-computation to find the starting position of the value or the array data.
-These starting positions are always full stack slots.
+Due to their unpredictable size, mappings and dynamically-sized array types cannot be stored
+"in between" the state variables preceding and following them.
+Instead, they are considered to occupy only 32 bytes with regards to the
+:ref:`rules above <storage-inplace-encoding>` and the elements they contain are stored starting at a different
+storage slot that is computed using a Keccak-256 hash.
 
-The mapping or the dynamic array itself occupies a slot in storage at some position ``p``
-according to the above rule (or by recursively applying this rule for
-mappings of mappings or arrays of arrays). For dynamic arrays,
+Assume the storage location of the mapping or array ends up being a slot ``p``
+after applying :ref:`the storage layout rules <storage-inplace-encoding>`.
+For dynamic arrays,
 this slot stores the number of elements in the array (byte arrays and
 strings are an exception, see :ref:`below <bytes-and-string>`).
-For mappings, the slot is unused (but it is needed so that two equal mappings after each other will use a different
-hash distribution). Array data is located at ``keccak256(p)`` and the value corresponding to a mapping key
-``k`` is located at ``keccak256(k . p)`` where ``.`` is concatenation. If the value is again a
-non-elementary type, the positions are found by adding an offset of ``keccak256(k . p)``.
+For mappings, the slot stays empty, but it is still needed to ensure that even if there are
+two mappings next to each other, their content ends up at different storage locations.
+
+Array data is located starting at ``keccak256(p)`` and it is laid out in the same way as
+statically-sized array data would: One element after the other, potentially sharing
+storage slots if the elements are not longer than 16 bytes. Dynamic arrays of dynamic arrays apply this
+rule recursively. The location of element ``x[i][j]``, where the type of ``x`` is ``uint24[][]``, is
+computed as follows (again, assuming ``x`` itself is stored at slot ``p``):
+The slot is ``keccak256(keccak256(p) + i) + floor(j / floor(256 / 24))`` and
+the element can be obtained from the slot data ``v`` using ``(v >> ((j % floor(256 / 24)) * 24)) & type(uint24).max``.
+
+The value corresponding to a mapping key ``k`` is located at ``keccak256(h(k) . p)``
+where ``.`` is concatenation and ``h`` is a function that is applied to the key depending on its type:
+
+- for value types, ``h`` pads the value to 32 bytes in the same way as when storing the value in memory.
+- for strings and byte arrays, ``h`` computes the ``keccak256`` hash of the unpadded data.
 
-So for the following contract snippet
-the position of ``data[4][9].b`` is at ``keccak256(uint256(9) . keccak256(uint256(4) . uint256(1))) + 1``::
+If the mapping value is a
+non-value type, the computed slot marks the start of the data. If the value is of struct type,
+for example, you have to add an offset corresponding to the struct member to reach the member.
 
+As an example, consider the following contract:
+
+::
 
     // SPDX-License-Identifier: GPL-3.0
     pragma solidity >=0.4.0 <0.9.0;
 
 
     contract C {
-        struct S { uint a; uint b; }
+        struct S { uint16 a; uint16 b; uint256 c; }
         uint x;
         mapping(uint => mapping(uint => S)) data;
     }
 
+Let us compute the storage location of ``data[4][9].c``.
+The position of the mapping itself is ``1`` (the variable ``x`` with 32 bytes precedes it).
+This means ``data[4]`` is stored at ``keccak256(uint256(4) . uint256(1))``. The type of ``data[4]`` is
+again a mapping and the data for ``data[4][9]`` starts at slot
+``keccak256(uint256(9) . keccak256(uint256(4) . uint256(1)))``.
+The slot offset of the member ``c`` inside the struct ``S`` is ``1`` because ``a`` and ``b`` are packed
+in a single slot. This means the slot for
+``data[4][9].c`` is ``keccak256(uint256(9) . keccak256(uint256(4) . uint256(1))) + 1``.
+The type of the value is ``uint256``, so it uses a single slot.
+
+
 .. _bytes-and-string:
 
 ``bytes`` and ``string``
 ------------------------
 
-``bytes`` and ``string`` are encoded identically. For short byte arrays, they store their data in the same
-slot where the length is also stored. In particular: if the data is at most ``31`` bytes long, it is stored
-in the higher-order bytes (left aligned) and the lowest-order byte stores ``length * 2``.
-For byte arrays that store data which is ``32`` or more bytes long, the main slot stores ``length * 2 + 1`` and the data is
-stored as usual in ``keccak256(slot)``. This means that you can distinguish a short array from a long array
+``bytes`` and ``string`` are encoded identically.
+In general, the encoding is similar to ``byte1[]``, in the sense that there is a slot for the array itself and
+a data area that is computed using a ``keccak256`` hash of that slot's position.
+However, for short values (shorter than 32 bytes) the array elements are stored together with the length in the same slot.
+
+In particular: if the data is at most ``31`` bytes long, the elements are stored
+in the higher-order bytes (left aligned) and the lowest-order byte stores the value ``length * 2``.
+For byte arrays that store data which is ``32`` or more bytes long, the main slot ``p`` stores ``length * 2 + 1`` and the data is
+stored as usual in ``keccak256(p)``. This means that you can distinguish a short array from a long array
 by checking if the lowest bit is set: short (not set) and long (set).
 
 .. note::
     Handling invalidly encoded slots is currently not supported but may be added in the future.
+    If you are compiling via the experimental IR-based compiler pipeline, reading an invalidly encoded
+    slot results in a ``Panic(0x22)`` error.
 
 JSON Output
 ===========
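
Reader's note on the ``uint24[][]`` formula added in this commit: the in-slot arithmetic can be checked with a small Python sketch. This is an illustration only, not part of the commit. The helper names (``slot_offset``, ``extract``, ``pack``) are hypothetical, and the hashed part of the address, ``keccak256(keccak256(p) + i)``, is deliberately left out because Python's standard library has no Keccak-256 (``hashlib.sha3_256`` is the NIST SHA-3 variant and gives different digests).

```python
ELEMENT_BITS = 24                      # width of one uint24 element
PER_SLOT = 256 // ELEMENT_BITS         # floor(256 / 24): elements per 256-bit slot
MASK = (1 << ELEMENT_BITS) - 1         # corresponds to type(uint24).max


def slot_offset(j):
    """Offset of the slot holding element j, relative to the start of the array data."""
    return j // PER_SLOT


def extract(v, j):
    """Read element j from the slot value v that contains it."""
    return (v >> ((j % PER_SLOT) * ELEMENT_BITS)) & MASK


def pack(elements):
    """Hypothetical helper: pack up to PER_SLOT values into one slot value,
    first element lower-order aligned, as the packing rules describe."""
    if len(elements) > PER_SLOT:
        raise ValueError("too many elements for a single slot")
    v = 0
    for position, value in enumerate(elements):
        v |= (value & MASK) << (position * ELEMENT_BITS)
    return v
```

With these definitions, ``extract(pack([1, 2, 3]), 2)`` reads back the third element, and ``slot_offset(12)`` shows that element 12 lives one slot past the start of the inner array's data.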

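
Reader's note on the ``bytes`` and ``string`` rule in the diff: the short/long distinction can be sketched in Python as below. This is a hedged illustration, not part of the commit. The function names are hypothetical, the slot is modelled as a 32-byte big-endian value (so index 31 is the lowest-order byte), and raising ``ValueError`` for an invalid short encoding is this sketch's own choice, not compiler behaviour.

```python
def encode_short(data):
    """Hypothetical helper: build the main slot for data of at most 31 bytes.
    Data is left aligned; the lowest-order byte stores length * 2."""
    if len(data) > 31:
        raise ValueError("short encoding holds at most 31 bytes")
    return data.ljust(31, b"\x00") + bytes([len(data) * 2])


def decode_main_slot(slot_value):
    """Classify the main slot of a bytes/string variable.

    Returns ("short", length, data) when the lowest bit is not set,
    or ("long", length, None) when it is set; long data lives elsewhere.
    """
    if len(slot_value) != 32:
        raise ValueError("a storage slot is exactly 32 bytes")
    as_int = int.from_bytes(slot_value, "big")
    if as_int & 1 == 0:
        length = slot_value[31] // 2       # lowest-order byte holds length * 2
        if length > 31:
            raise ValueError("invalidly encoded short slot")
        return ("short", length, slot_value[:length])
    length = (as_int - 1) // 2             # slot holds length * 2 + 1
    return ("long", length, None)
```

For a long value the sketch only recovers the length; locating the data at ``keccak256(p)`` would need a Keccak-256 implementation, which is outside the standard library.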