Merged
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -24,7 +24,7 @@ dependencies = [
"py_arkworks_bls12381==0.3.8",
"py_ecc==8.0.0",
"pycryptodome==3.23.0",
"remerkleable==0.1.28",
"remerkleable @ git+https://github.com/ethereum/remerkleable@643e8e3d1d80a34f61d4b1e32a46e38ad7e57a18",
"ruamel.yaml==0.18.14",
"setuptools==80.9.0",
"trie==3.1.0",
50 changes: 36 additions & 14 deletions ssz/simple-serialize.md
@@ -17,7 +17,7 @@
- [`boolean`](#boolean)
- [`Bitvector[N]`](#bitvectorn)
- [`Bitlist[N]`](#bitlistn)
- [Vectors, containers, lists](#vectors-containers-lists)
- [Vectors, containers, lists, progressive lists](#vectors-containers-lists-progressive-lists)
- [Union](#union)
- [Deserialization](#deserialization)
- [Merkleization](#merkleization)
@@ -58,6 +58,9 @@
- **list**: ordered variable-length homogeneous collection, limited to `N`
values
- notation `List[type, N]`, e.g. `List[uint64, N]`
- **progressive list** _[EIP-7916, currently unused]_: ordered variable-length
homogeneous collection, without limit
- notation `ProgressiveList[type]`, e.g. `ProgressiveList[uint64]`
- **bitvector**: ordered fixed-length collection of `boolean` values, with `N`
bits
- notation `Bitvector[N]`
@@ -75,9 +78,9 @@ efficiencies.

### Variable-size and fixed-size

We recursively define "variable-size" types to be lists, unions, `Bitlist` and
all types that contain a variable-size type. All other types are said to be
"fixed-size".
We recursively define "variable-size" types to be lists, progressive lists,
unions, `Bitlist` and all types that contain a variable-size type. All other
types are said to be "fixed-size".

### Byte

@@ -91,6 +94,7 @@ For convenience we alias:
- `bit` to `boolean`
- `BytesN` and `ByteVector[N]` to `Vector[byte, N]` (this is *not* a basic type)
- `ByteList[N]` to `List[byte, N]`
- `ProgressiveByteList` to `ProgressiveList[byte]`

Aliases are semantically equivalent to their underlying type and therefore share
canonical representations both in SSZ and in related formats.
@@ -108,6 +112,7 @@ Assuming a helper function `default(type)` which returns the default value for
| `Vector[type, N]` | `[default(type)] * N` |
| `Bitvector[N]` | `[False] * N` |
| `List[type, N]` | `[]` |
| `ProgressiveList[type]` | `[]` |
| `Bitlist[N]` | `[]` |
| `Union[type_0, type_1, ...]` | `default(type_0)` |

@@ -168,7 +173,7 @@ array[len(value) // 8] |= 1 << (len(value) % 8)
return bytes(array)
```

### Vectors, containers, lists
### Vectors, containers, lists, progressive lists

```python
# Recursively serialize
@@ -227,15 +232,16 @@ deserialization of basic objects is easy, and from there we can find a simple
recursive algorithm for all fixed-size objects. For variable-size objects we
have to do one of the following depending on what kind of object it is:

- Vector/list of a variable-size object: The serialized data will start with
offsets of all the serialized objects (`BYTES_PER_LENGTH_OFFSET` bytes each).
- Vector/list/progressive list of a variable-size object: The serialized data
will start with offsets of all the serialized objects
(`BYTES_PER_LENGTH_OFFSET` bytes each).
- Using the first offset, we can compute the length of the list (divide by
`BYTES_PER_LENGTH_OFFSET`), as it gives us the total number of bytes in the
offset data.
- The size of each object in the vector/list can be inferred from the
difference of two offsets. To get the size of the last object, the total
number of bytes has to be known (it is not generally possible to deserialize
an SSZ object of unknown length)
- The size of each object in the vector/list/progressive list can be inferred
from the difference of two offsets. To get the size of the last object, the
total number of bytes has to be known (it is not generally possible to
deserialize an SSZ object of unknown length)
- Containers follow the same principles as vectors, with the difference that
there may be fixed-size objects in a container as well. This means the
`fixed_parts` data will contain offsets as well as fixed-size objects.
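The offset rules above can be illustrated with a minimal sketch. It is not the spec's or remerkleable's implementation: the helper name `split_variable_size_elements` is hypothetical, and `BYTES_PER_LENGTH_OFFSET = 4` is assumed from the SSZ specification rather than from this diff. It splits the serialization of a vector/list/progressive list of variable-size objects into one byte string per element, using the total length to bound the last element.

```python
BYTES_PER_LENGTH_OFFSET = 4  # assumed value, taken from the SSZ spec


def split_variable_size_elements(data: bytes) -> list[bytes]:
    """Hypothetical helper: split a serialized sequence of variable-size
    elements into the raw bytes of each element."""
    if len(data) == 0:
        return []  # an empty list serializes to zero bytes

    # The first offset equals the total size of the offset table.
    first_offset = int.from_bytes(data[:BYTES_PER_LENGTH_OFFSET], "little")
    if (
        first_offset == 0
        or first_offset % BYTES_PER_LENGTH_OFFSET != 0
        or first_offset > len(data)
    ):
        raise ValueError("invalid first offset")
    count = first_offset // BYTES_PER_LENGTH_OFFSET

    offsets = [
        int.from_bytes(
            data[i * BYTES_PER_LENGTH_OFFSET : (i + 1) * BYTES_PER_LENGTH_OFFSET],
            "little",
        )
        for i in range(count)
    ]
    # The total number of bytes is needed to size the last element.
    offsets.append(len(data))

    parts = []
    for start, end in zip(offsets, offsets[1:]):
        if start > end:
            raise ValueError("offsets must be non-decreasing")
        parts.append(data[start:end])
    return parts
```

Each returned byte string would then be deserialized recursively as the element type.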
@@ -299,6 +305,16 @@ We first define helper functions:
- Then, merkleize the chunks (empty input is padded to 1 zero chunk):
- If `1` chunk: the root is the chunk itself.
- If `> 1` chunks: merkleize as binary tree.
- `merkleize_progressive(chunks, num_leaves=1)`: Given ordered
`BYTES_PER_CHUNK`-byte chunks:
- The merkleization depends on the number of input chunks and is defined
recursively:
- If `len(chunks) == 0`: the root is a zero value, `Bytes32()`.
- Otherwise: compute the root using `hash(a, b)`
- `a`: Recursively merkleize chunks beyond `num_leaves` using
`merkleize_progressive(chunks[num_leaves:], num_leaves * 4)`.
- `b`: Merkleize the first up to `num_leaves` chunks as a binary tree
using `merkleize(chunks[:num_leaves], num_leaves)`.
- `mix_in_length`: Given a Merkle root `root` and a length `length` (`"uint256"`
little-endian serialization) return `hash(root + length)`.
- `mix_in_selector`: Given a Merkle root `root` and a type selector `selector`
@@ -313,12 +329,16 @@ recursively:
bitvector.
- `mix_in_length(merkleize(pack(value), limit=chunk_count(type)), len(value))`
if `value` is a list of basic objects.
- `mix_in_length(merkleize_progressive(pack(value)), len(value))` if `value` is
a progressive list of basic objects.
- `mix_in_length(merkleize(pack_bits(value), limit=chunk_count(type)), len(value))`
if `value` is a bitlist.
- `merkleize([hash_tree_root(element) for element in value])` if `value` is a
vector of composite objects or a container.
- `mix_in_length(merkleize([hash_tree_root(element) for element in value], limit=chunk_count(type)), len(value))`
if `value` is a list of composite objects.
- `mix_in_length(merkleize_progressive([hash_tree_root(element) for element in value]), len(value))`
if `value` is a progressive list of composite objects.
- `mix_in_selector(hash_tree_root(value.value), value.selector)` if `value` is
of union type, and `value.value` is not `None`
- `mix_in_selector(Bytes32(), 0)` if `value` is of union type, and `value.value`
@@ -362,6 +382,8 @@ value. Parsers may ignore additional JSON fields.
| `Bitvector[N]` | hex-byte-string | `"0x1122"` |
| `List[type, N]` | array | `[element, ...]` |
| `List[byte, N]` | hex-byte-string | `"0x1122"` |
| `ProgressiveList[type]` | array | `[element, ...]` |
| `ProgressiveList[byte]` | hex-byte-string | `"0x1122"` |
| `Bitlist[N]` | hex-byte-string | `"0x1122"` |
| `Union[type_0, type_1, ...]` | selector-object | `{ "selector": number, "data": type_N }` |

@@ -372,9 +394,9 @@ Aliases are encoded as their underlying type.
`hex-byte-string` is a `0x`-prefixed hex encoding of byte data, as it would
appear in an SSZ stream.

`List` and `Vector` of `byte` (and aliases thereof) are encoded as
`hex-byte-string`. `Bitlist` and `Bitvector` similarly map their SSZ-byte
encodings to a `hex-byte-string`.
`List`, `ProgressiveList`, and `Vector` of `byte` (and aliases thereof) are
encoded as `hex-byte-string`. `Bitlist` and `Bitvector` similarly map their
SSZ-byte encodings to a `hex-byte-string`.

`Union` is encoded as an object with a `selector` and `data` field, where the
contents of `data` change according to the selector.
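The `merkleize_progressive` helper added in the `ssz/simple-serialize.md` changes above can be sketched in a few lines. This is an illustrative reading of the recursive definition, not the remerkleable implementation. It assumes `hash` is SHA-256 over the concatenated inputs and that chunks are 32 bytes; `hash_pair` and `ZERO_CHUNK` are names introduced here. The `merkleize` shown is a naive stand-in that pads to a power-of-two `limit`.

```python
from hashlib import sha256

ZERO_CHUNK = b"\x00" * 32  # stands in for Bytes32()


def hash_pair(a: bytes, b: bytes) -> bytes:
    # Assumption: the spec's hash(a, b) is SHA-256 of a followed by b.
    return sha256(a + b).digest()


def merkleize(chunks: list[bytes], limit: int) -> bytes:
    # Naive binary-tree merkleization; `limit` must be a power of two.
    if len(chunks) > limit:
        raise ValueError("too many chunks for limit")
    layer = list(chunks) + [ZERO_CHUNK] * (limit - len(chunks))
    while len(layer) > 1:
        layer = [hash_pair(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


def merkleize_progressive(chunks: list[bytes], num_leaves: int = 1) -> bytes:
    if len(chunks) == 0:
        return ZERO_CHUNK
    # a: everything beyond the first `num_leaves` chunks, in a 4x larger subtree.
    a = merkleize_progressive(chunks[num_leaves:], num_leaves * 4)
    # b: the first up to `num_leaves` chunks as a fixed-size binary tree.
    b = merkleize(chunks[:num_leaves], num_leaves)
    return hash_pair(a, b)
```

Under this reading, each recursion level adds a subtree four times larger than the previous one, so the tree depth grows with the number of chunks rather than with a declared limit.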
3 changes: 2 additions & 1 deletion tests/core/pyspec/eth2spec/debug/decode.py
@@ -7,6 +7,7 @@
ByteVector,
Container,
List,
ProgressiveList,
uint,
Union,
Vector,
@@ -17,7 +18,7 @@
def decode(data: Any, typ):
if issubclass(typ, uint | boolean):
return typ(data)
elif issubclass(typ, List | Vector):
elif issubclass(typ, List | ProgressiveList | Vector):
return typ(decode(element, typ.element_cls()) for element in data)
elif issubclass(typ, ByteVector):
return typ(bytes.fromhex(data[2:]))
3 changes: 2 additions & 1 deletion tests/core/pyspec/eth2spec/debug/encode.py
@@ -5,6 +5,7 @@
boolean,
Container,
List,
ProgressiveList,
uint,
Union,
Vector,
@@ -23,7 +24,7 @@ def encode(value, include_hash_tree_roots=False):
return "0x" + serialize(value).hex()
elif isinstance(value, list): # normal python lists
return [encode(element, include_hash_tree_roots) for element in value]
elif isinstance(value, List | Vector):
elif isinstance(value, List | ProgressiveList | Vector):
return [encode(element, include_hash_tree_roots) for element in value]
elif isinstance(value, bytes): # bytes, ByteList, ByteVector
return "0x" + value.hex()
18 changes: 11 additions & 7 deletions tests/core/pyspec/eth2spec/debug/random_value.py
@@ -10,6 +10,7 @@
ByteVector,
Container,
List,
ProgressiveList,
uint,
Union,
Vector,
@@ -102,19 +103,22 @@ def get_random_ssz_object(
get_random_ssz_object(rng, elem_type, max_bytes_length, max_list_length, mode, chaos)
for _ in range(typ.vector_length())
)
elif issubclass(typ, List) or issubclass(typ, Bitlist):
length = rng.randint(0, min(typ.limit(), max_list_length))
elif issubclass(typ, List) or issubclass(typ, ProgressiveList) or issubclass(typ, Bitlist):
limit = max_list_length
# SSZ imposes a hard limit on lists, we can't put in more than that
if not issubclass(typ, ProgressiveList) and typ.limit() < limit:
limit = typ.limit()

length = rng.randint(0, limit)
if mode == RandomizationMode.mode_one_count:
length = 1
elif mode == RandomizationMode.mode_max_count:
length = max_list_length
length = limit
elif mode == RandomizationMode.mode_nil_count:
length = 0

# SSZ imposes a hard limit on lists, we can't put in more than that
length = min(length, typ.limit())

elem_type = typ.element_cls() if issubclass(typ, List) else boolean
elem_type = boolean if issubclass(typ, Bitlist) else typ.element_cls()
max_list_length = 1 << (max_list_length.bit_length() >> 1)
return typ(
get_random_ssz_object(rng, elem_type, max_bytes_length, max_list_length, mode, chaos)
for _ in range(length)
1 change: 1 addition & 0 deletions tests/core/pyspec/eth2spec/utils/ssz/ssz_typing.py
@@ -25,6 +25,7 @@
)
from remerkleable.complex import Container, List, Vector
from remerkleable.core import BasicView, Path, View
from remerkleable.progressive import ProgressiveList
from remerkleable.union import Union

Bytes20 = ByteVector[20] # type: ignore
22 changes: 22 additions & 0 deletions tests/formats/ssz_generic/README.md
@@ -16,6 +16,9 @@ into a SSZ type:
- List
- `basic_list` *not supported yet*
- `complex_list` *not supported yet*
- ProgressiveList
- `basic_progressive_list`
- `complex_progressive_list` *not supported yet*
- Bitfields
- `bitvector`
- `bitlist`
@@ -105,6 +108,18 @@ Data:
{length}: an unsigned integer
```

### `basic_progressive_list`

```
Template:

proglist_{element type}

Data:

{element type}: bool, uint8, uint16, uint32, uint64, uint128, uint256
```

### `bitlist`

```
@@ -193,6 +208,13 @@ class ComplexTestStruct(Container):
G: Vector[VarTestStruct, 2]


class ProgressiveTestStruct(Container):
A: ProgressiveList[byte]
B: ProgressiveList[uint64]
C: ProgressiveList[SmallTestStruct]
D: ProgressiveList[ProgressiveList[VarTestStruct]]


class BitsStruct(Container):
A: Bitlist[5]
B: Bitvector[2]
3 changes: 3 additions & 0 deletions tests/generators/runners/ssz_generic.py
@@ -4,6 +4,7 @@
from eth2spec.test.helpers.constants import PHASE0

from .ssz_generic_cases import (
ssz_basic_progressive_list,
ssz_basic_vector,
ssz_bitlist,
ssz_bitvector,
@@ -15,6 +16,8 @@

def get_test_cases() -> Iterable[TestCase]:
test_case_fns = [
("basic_progressive_list", "valid", ssz_basic_progressive_list.valid_cases),
("basic_progressive_list", "invalid", ssz_basic_progressive_list.invalid_cases),
("basic_vector", "valid", ssz_basic_vector.valid_cases),
("basic_vector", "invalid", ssz_basic_vector.invalid_cases),
("bitlist", "valid", ssz_bitlist.valid_cases),
103 changes: 103 additions & 0 deletions tests/generators/runners/ssz_generic_cases/ssz_basic_progressive_list.py
@@ -0,0 +1,103 @@
from random import Random

from eth2spec.debug.random_value import get_random_ssz_object, RandomizationMode
from eth2spec.utils.ssz.ssz_impl import serialize
from eth2spec.utils.ssz.ssz_typing import (
BasicView,
boolean,
ProgressiveList,
uint8,
uint16,
uint32,
uint64,
uint128,
uint256,
)

from .ssz_boolean import INVALID_BOOL_CASES
from .ssz_test_case import invalid_test_case, valid_test_case


def progressive_list_case_fn(
rng: Random, mode: RandomizationMode, elem_type: type[BasicView], length: int
):
return get_random_ssz_object(
rng,
ProgressiveList[elem_type],
max_bytes_length=length * 8,
max_list_length=length,
mode=mode,
chaos=False,
)


BASIC_TYPES: dict[str, type[BasicView]] = {
"bool": boolean,
"uint8": uint8,
"uint16": uint16,
"uint32": uint32,
"uint64": uint64,
"uint128": uint128,
"uint256": uint256,
}


def valid_cases():
rng = Random(1234)
for name, typ in BASIC_TYPES.items():
random_modes = [RandomizationMode.mode_zero, RandomizationMode.mode_max]
if name != "bool":
random_modes.append(RandomizationMode.mode_random)
for length in [0, 1, 2, 3, 4, 5, 8, 20, 21, 22, 85, 86, 341, 342, 1365, 1366]:
for mode in random_modes:
yield (
f"proglist_{name}_{mode.to_name()}_{length}",
valid_test_case(
lambda rng=rng, mode=mode, typ=typ, length=length: progressive_list_case_fn(
rng, mode, typ, length
)
),
)


def invalid_cases():
rng = Random(1234)
for name, typ in BASIC_TYPES.items():
random_modes = [RandomizationMode.mode_zero, RandomizationMode.mode_max]
if name != "bool":
random_modes.append(RandomizationMode.mode_random)
for length in [0, 1, 2, 3, 4, 5, 8, 20, 21, 22, 85, 86, 341, 342, 1365, 1366]:
for mode in random_modes:
if name == "bool":
for description, data in INVALID_BOOL_CASES:
yield (
f"proglist_{name}_{length}_{mode.to_name()}_{description}",
invalid_test_case(
lambda rng=rng,
mode=mode,
typ=typ,
length=length,
data=data: serialize(
progressive_list_case_fn(rng, mode, typ, length)
)[:-1]
+ data
),
)
if typ.type_byte_length() > 1:
yield (
f"proglist_{name}_{length}_{mode.to_name()}_one_byte_less",
invalid_test_case(
lambda rng=rng, mode=mode, typ=typ, length=length: serialize(
progressive_list_case_fn(rng, mode, typ, length)
)[:-1]
),
)
yield (
f"proglist_{name}_{length}_{mode.to_name()}_one_byte_more",
invalid_test_case(
lambda rng=rng, mode=mode, typ=typ, length=length: serialize(
progressive_list_case_fn(rng, mode, typ, length)
)
+ serialize(progressive_list_case_fn(rng, mode, uint8, 1))
),
)
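The `length` values used in `valid_cases` and `invalid_cases` above (`..., 4, 5, ..., 20, 21, 22, 85, 86, 341, 342, 1365, 1366`) appear to straddle the cumulative chunk capacities of the progressive tree, where each level holds four times as many leaves as the previous one. That interpretation is an assumption, not something the diff states. The hypothetical helper below only computes those capacities.

```python
def progressive_capacities(levels: int) -> list[int]:
    """Cumulative number of chunks that fit in the first `levels` subtrees
    of a progressive tree (subtree sizes 1, 4, 16, ...)."""
    capacities = []
    total = 0
    num_leaves = 1
    for _ in range(levels):
        total += num_leaves
        capacities.append(total)
        num_leaves *= 4  # each level is four times larger
    return capacities


print(progressive_capacities(6))  # [1, 5, 21, 85, 341, 1365]
```

Note that these capacities count 32-byte chunks. For element types smaller than 32 bytes, several elements are packed into one chunk, so a list length only coincides with a chunk boundary for `uint256`.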