10_compact_size_unsigned_integers.md
+17 -14 (17 additions & 14 deletions)
Original file line number
Diff line number
Diff line change
@@ -1,13 +1,12 @@
1
1
# Compact Size Unsigned Integers
2
2
3
-
We'll talk more about the Segwit soft fork and how the transaction format changed later on in this course.
4
3
For now, we're going to assume the transactions we're decoding are serialized according to the legacy, pre-segwit format.
5
-
This means the next field after the version will be the number of inputs.
4
+
We'll talk more about the Segwit soft fork and how the transaction format changed later on in this course.
6
5
7
6
If you have read [Mastering Bitcoin 3rd Edition, Chapter 6](https://github.com/bitcoinbook/bitcoinbook/blob/develop/ch06_transactions.adoc#length-of-transaction-input-list), you'll remember that the next byte represents the length of the transaction input list encoded as a compactSize unsigned integer.
8
7
The first byte of the compactSize integer indicates how many bytes to read to determine the number of inputs.
9
-
For example, if the length is less than 253, then the next byte is simply interpreted as an unsigned 8-bit integer (the `u8` data type in Rust).
10
-
If the length is greater than 252 and less than 2^16, then we would expect to see the byte `fd` (or the integer 253) followed by two additional bytes interpreted as a `u16` integer, etc.
8
+
If the length is less than 253, then the next byte is simply interpreted as an unsigned 8-bit integer (the `u8` data type in Rust).
9
+
If the length is greater than 252 and less than 2^16, then we would expect to see the byte `fd` (or the integer 253) followed by two additional bytes interpreted as a `u16` integer, and so on.
11
10
This is the table we can use as reference:
12
11
13
12
| Value | Bytes Used | Format |
@@ -17,17 +16,22 @@ This is the table we can use as reference:
17
16
| >= `0x10000` && <= `0xffffffff`| 5 |`0xfe` followed by the number as `uint32_t`|
18
17
| >= `0x100000000` && <= `0xffffffffffffffff`| 9 |`0xff` followed by the number as `uint64_t`|
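To make the table concrete, here is a minimal sketch of the reverse direction: turning a number into its compactSize bytes. The function name `encode_compact_size` is hypothetical and is not part of the course code. The sketch assumes the multi-byte values are serialized in little-endian order, as elsewhere in Bitcoin's transaction format.

```rust
// Hypothetical helper for illustration only: encodes `value` as a
// compactSize unsigned integer, following the reference table.
fn encode_compact_size(value: u64) -> Vec<u8> {
    if value <= 0xfc {
        // Small values fit in a single byte with no prefix.
        vec![value as u8]
    } else if value <= 0xffff {
        // Prefix 0xfd, then the value as a little-endian u16 (3 bytes total).
        let mut out = vec![0xfd];
        out.extend_from_slice(&(value as u16).to_le_bytes());
        out
    } else if value <= 0xffff_ffff {
        // Prefix 0xfe, then the value as a little-endian u32 (5 bytes total).
        let mut out = vec![0xfe];
        out.extend_from_slice(&(value as u32).to_le_bytes());
        out
    } else {
        // Prefix 0xff, then the value as a little-endian u64 (9 bytes total).
        let mut out = vec![0xff];
        out.extend_from_slice(&value.to_le_bytes());
        out
    }
}
```

For example, `encode_compact_size(515)` yields the three bytes `fd 03 02`, since 515 is `0x0203` and the two value bytes are stored least-significant byte first.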
19
18
20
-
So let's write a function to read a compactSize unsigned integer.
21
-
Let's think about this a bit.
22
-
What kind of argument do we want to accept? And what should the return type be? Take a moment to fill out the function signature and come back.
19
+
Let's write a function to read a compactSize unsigned integer.
20
+
What kind of argument do we want to accept?
21
+
And what should the return type be?
22
+
Take a moment to fill out the function signature and come back.
23
23
24
24
<hr/>
25
25
26
26
For the argument type, we have to remember that we're still passing around the same mutable reference to the slice so that we can keep reading it and moving the pointer.
27
27
So we'll keep the same argument type as in the `read_version` function.
28
28
29
-
Now, what should the return type be? Well, the input length can be an 8-bit, 16-bit, 32-bit, or 64-bit unsigned integer. So if we need to specify just one type for the length, let's choose the largest one, as it can hold any of the other possibilities.
1. The `0..253` syntax is a [range type](https://doc.rust-lang.org/std/ops/struct.Range.html#), which has a method called `contains` to check if a value is in the given range. The upper bound is excluded, so `0..253` covers `0` through `252`.
64
68
2. The number of bytes read matches the integer type.
65
69
For example, 2 bytes give us a `u16` type, and 4 bytes give us a `u32` type.
@@ -68,7 +72,6 @@ We can convert between primitive types in Rust using the [`as` keyword](https://
68
72
4. Notice how there are no semicolons at the end of the final line in each branch, such as `u32::from_le_bytes(buffer) as u64`.
69
73
This is the equivalent of returning that value from the function. We could also write it as `return u32::from_le_bytes(buffer) as u64;` but an implicit return without a semicolon is more idiomatic.
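Putting these notes together, a minimal sketch of the if/else version might look like the following. The names `read_compact_size` and `transaction_bytes` are assumptions made for illustration, and the course's actual code may differ. Errors are simply unwrapped here to keep the sketch short.

```rust
use std::io::Read;

// Hypothetical sketch: reads a compactSize integer from the front of the
// slice and advances the slice past the bytes it consumed.
fn read_compact_size(transaction_bytes: &mut &[u8]) -> u64 {
    // The first byte is either the value itself or a prefix marker.
    let mut compact_size = [0_u8; 1];
    transaction_bytes.read_exact(&mut compact_size).unwrap();

    if (0..253_u8).contains(&compact_size[0]) {
        // Values 0 through 252 are stored directly in the first byte.
        compact_size[0] as u64
    } else if compact_size[0] == 253 {
        // 0xfd: the next 2 bytes are a little-endian u16.
        let mut buffer = [0_u8; 2];
        transaction_bytes.read_exact(&mut buffer).unwrap();
        u16::from_le_bytes(buffer) as u64
    } else if compact_size[0] == 254 {
        // 0xfe: the next 4 bytes are a little-endian u32.
        let mut buffer = [0_u8; 4];
        transaction_bytes.read_exact(&mut buffer).unwrap();
        u32::from_le_bytes(buffer) as u64
    } else {
        // 0xff: the next 8 bytes are a little-endian u64.
        let mut buffer = [0_u8; 8];
        transaction_bytes.read_exact(&mut buffer).unwrap();
        u64::from_le_bytes(buffer)
    }
}
```

Because `Read` is implemented for `&[u8]`, each `read_exact` call moves the slice forward, so the caller can keep reading the following fields from the same reference.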
70
74
71
-
72
75
We're going to make one more change.
73
76
While standard if/else statements work fine, Rust also provides pattern matching via the `match` keyword. This is a good opportunity to use it, as it is common in Rust codebases.
74
77
https://doc.rust-lang.org/book/ch06-02-match.html
@@ -104,7 +107,7 @@ Take a moment to get familiar with the syntax.
104
107
Each of the arms has a pattern to match followed by `=>` and then some code to return for that given pattern.
105
108
106
109
We sometimes see an arm with the underscore symbol (`_`) as the pattern to match.
107
-
This represents a catchall pattern that will capture any value not already covered by the previous arms.
110
+
This represents a catch-all (wildcard) pattern that will capture any value not already covered by the previous arms.
108
111
However, in our case, this is not needed since the previous arms are exhaustive and capture all the possible scenarios.
109
112
Remember a `u8` can only have a value between `0` and `255`.
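As a small illustration of that exhaustiveness, the sketch below matches on a `u8` without a `_` arm and still compiles, because the four arms together cover every value from `0` to `255`. The function name `extra_bytes_to_read` is hypothetical and exists only for this example.

```rust
// Hypothetical helper for illustration only: given the first byte of a
// compactSize integer, returns how many additional bytes hold the value.
fn extra_bytes_to_read(first_byte: u8) -> usize {
    match first_byte {
        // 0 through 252: the first byte is the value itself.
        0..=252 => 0,
        // 0xfd: a u16 follows.
        253 => 2,
        // 0xfe: a u32 follows.
        254 => 4,
        // 0xff: a u64 follows.
        255 => 8,
        // No `_` arm is needed: every possible u8 is covered above.
    }
}
```

If one of the arms were removed, the compiler would reject the `match` as non-exhaustive, which is a helpful safety net when the set of cases changes.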
110
113
@@ -123,7 +126,7 @@ fn main() {
123
126
}
124
127
```
125
128
126
-
And if we run this, it should print the following to the terminal:
129
+
When we run this, it should print the following to the terminal:
127
130
128
131
```shell
129
132
Version: 1
@@ -134,7 +137,7 @@ Pretty neat! We're making good progress.
134
137
But even though our code compiles, how can we be sure we've written it correctly and that this function will return the appropriate number of inputs for different transactions?
135
138
We want to test it with different arguments and ensure it is returning the correct compactSize.
136
139
We can do this with unit testing.
137
-
So let's look into setting up our first unit test in the next section.
140
+
Let's look into setting up our first unit test in the next section.
138
141
139
142
### Quiz
140
143
* How do nodes know whether the transaction is a legacy or a segwit transaction as they read it?