Commit 4b23961

Merge commit 'refs/pull/114/head' of github.com:bitcoindevkit/bitcoindevkit.org
2 parents c36ebaa + f608e9e commit 4b23961

1 file changed

+329
-0
lines changed

@@ -0,0 +1,329 @@
---
title: "Improving coin selection in BDK"
description: "A brief description of the work done in the coin selection module in BDK during Summer of Bitcoin 2022"
date: "2022-08-17"
tags: ["coin selection", "BDK", "development", "summer of bitcoin"]
authors:
  - "César Alvarez Vallero"
hidden: true
draft: false
---

As a project designed to be used as a build tool in wallet development, one of
the main things that BDK provides is the coin selection module. The purpose of
the module is to select the group of UTXOs to use as inputs for the transaction.
When you coin select, you must consider cost, size and traceability.

- What are those costs?

  Principally, fees determined by the satisfaction size required by each of the
  inputs. But the costs are also related to the change outputs generated.
  Change outputs are not part of the inputs, but they must be considered during
  coin selection because they increase the size, and therefore the fee, of the
  transaction and will be used in future transactions as inputs.
  For example, if you always create change outputs when you have some excess
  after coin selecting, you'll probably end up with very small UTXOs. The
  smaller the UTXO, the greater the proportion of its value spent on fees to
  use it, depending on the fee rate.

- What do we mean by "size" considerations?

  Here we are not referring to the size in bytes of the transaction, as that is
  addressed by the associated fees.
  Here, "size" is the number of new UTXOs created by each transaction. It has a
  direct impact on the size of the UTXO set maintained by each node.

- What is this traceability thing?

  Certain companies sell services whose purpose is to link addresses with their
  owners, harming the fungibility of some bitcoins and attacking the privacy of
  the users.
  There are some things that coin selection can do to make privacy leaks
  harder. For example, not creating change outputs, avoiding mixing UTXOs
  belonging to different owned addresses in the same transaction, or spending
  all the related UTXOs together.
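
To put numbers on the cost point above, here is a minimal sketch of how the fee
needed to spend a UTXO relates to its value. The function names are
hypothetical (they are not part of BDK), and fee rates are simplified to whole
satoshis per virtual byte.

```rust
/// Fee in satoshis needed to spend an input of `input_vbytes` virtual bytes
/// at `fee_rate` (sat/vB). Hypothetical helper, for illustration only.
fn input_fee(input_vbytes: u64, fee_rate: u64) -> u64 {
    input_vbytes * fee_rate
}

/// Fraction of a UTXO's value that is consumed by the fee paid to spend it.
fn fee_proportion(value: u64, input_vbytes: u64, fee_rate: u64) -> f64 {
    input_fee(input_vbytes, fee_rate) as f64 / value as f64
}

/// Value left after paying for the input itself. It becomes negative when
/// spending the UTXO costs more than the UTXO is worth.
fn effective_value(value: u64, input_vbytes: u64, fee_rate: u64) -> i64 {
    value as i64 - input_fee(input_vbytes, fee_rate) as i64
}
```

Assuming an input of about 68 vbytes (roughly a P2WPKH input) and a fee rate of
10 sat/vB, spending costs 680 satoshis: 0.68% of a 100,000 satoshi UTXO, but
68% of a 1,000 satoshi one, while a 500 satoshi UTXO has a negative effective
value.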

Besides the algorithm you use to coin select, which can target some of the
things described above, other code changes also have implications for them. The
following sections describe some of those changes and why they were made or
could be added.

## Waste

One of my project changes for the `coin_selection` module is the addition of
the `Waste` metric, and its use to optimize the coin selection in relation to
the fee costs.

Waste is a metric introduced by the Branch and Bound (BnB) algorithm as part of
its bounding procedure. Later, it was included as a high-level function to
compare different coin selection algorithms in Bitcoin Core.

### How does it work?

We can describe waste as the sum of two values: creation cost and timing cost.

Timing cost is the cost associated with the difference between the current fee
rate and some long-term fee rate used as a threshold to consolidate UTXOs. It
can be negative if the current fee rate is lower than the long-term fee rate,
or zero if they are equal.

Creation cost is the cost associated with the surplus of coins beyond the
transaction amount and transaction fees. It can take the form of a change
output or excess fees paid to the miner.
Change cost derives from the cost of adding the extra output to the transaction
and spending it in the future.
Excess happens when there is no change, and the surplus of coins is spent as
part of the fees to the miner.

The creation cost can be zero if there is a perfect match as a result of the
coin selection algorithm.

So, waste can be zero or negative if the creation cost is zero and the timing
cost is less than or equal to zero.
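
The definition above can be condensed into a small calculation. The following
is only a sketch under simplifying assumptions, not the code of
`Waste::calculate` in BDK or of Bitcoin Core: the `Selection` and `Surplus`
types and the `waste` function are hypothetical, and fee rates are whole
satoshis per virtual byte.

```rust
/// What happens with the coins left over after paying the target and the fees.
/// Hypothetical type, for illustration only.
enum Surplus {
    /// A change output is created; `cost` is the fee to create it now plus
    /// the estimated fee to spend it in the future.
    Change { cost: i64 },
    /// No change output; the leftover amount is given to the miner.
    Excess(i64),
}

/// Hypothetical summary of a finished coin selection.
struct Selection {
    /// Total size of the selected inputs, in virtual bytes.
    input_vbytes: i64,
    /// How the surplus of the selection is handled.
    surplus: Surplus,
}

/// Waste = timing cost + creation cost.
fn waste(selection: &Selection, fee_rate: i64, long_term_fee_rate: i64) -> i64 {
    // Negative when spending the inputs now is cheaper than at the long-term rate.
    let timing_cost = selection.input_vbytes * (fee_rate - long_term_fee_rate);
    let creation_cost = match selection.surplus {
        Surplus::Change { cost } => cost,
        Surplus::Excess(amount) => amount,
    };
    timing_cost + creation_cost
}
```

With 136 vbytes of inputs, a perfect match (`Surplus::Excess(0)`) and a current
fee rate of 5 sat/vB against a long-term rate of 10 sat/vB, the waste is
negative: spending those inputs now is cheaper than spending them later.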

You can read about the technical details in [bdk PR 558](https://github.com/bitcoindevkit/bdk/pull/558). Comments and suggestions are
welcome!

But, while developing the proposal, some requirements arose that had to be
resolved first. Let's talk about them.

### What has been done

Waste is closely related to the creation of change or the dropping of it as
fees. Formerly, whether your selection would produce change or not was decided
inside the `create_tx` function. From the perspective of the Waste metric, that
was problematic. How do you score a coin selection based on `Waste` if you
don't know yet whether it will create change?

The problem had been pointed out before, in [this issue](https://github.com/bitcoindevkit/bdk/issues/147).

[bdk PR 630](https://github.com/bitcoindevkit/bdk/pull/630), merged in [release 0.21.0](https://github.com/bitcoindevkit/bdk/releases/tag/v0.21.0), moved change creation to the
`coin_selection` module. It introduced several changes:

- the enum `Excess`.
- the function `decide_change`.
- a new field in `CoinSelectionResult` to hold the `Excess` produced while coin
  selecting.

We hope to have chosen meaningful names for all these new additions, but let's
explain them in depth.

Formerly, when you needed to create change inside `create_tx`, you had to get
the weight of the change output, compute its fees and, jointly with the overall
fee amount and the outgoing amount, subtract them from the amount of the
selected UTXOs. Then you had to decide whether the amount of that output should
be considered dust, and throw the remaining amount to fees in that case.
Otherwise, you added an extra output to the output list and added its fee to
the fee amount.
Also, there was the case when you wanted to sweep all the funds associated with
an address, but the amount created a dust output. In that situation, the dust
value of the output and the amount available after deducting the fees were
necessary to report an informative error to the user.

In general, the idea was to compute all those values inside `coin_selection`
but keep the decision logic where it was meaningful, that is, inside
`create_tx`.

Those considerations ended up with an enum, `Excess`, with two struct variants
that differentiate the cases mentioned above and carry all the information
needed to act in each of them.

```rust
/// Remaining amount after performing coin selection
pub enum Excess {
    /// It's not possible to create spendable output from excess using the current drain output
    NoChange {
        /// Threshold to consider amount as dust for this particular change script_pubkey
        dust_threshold: u64,
        /// Exceeding amount of current selection over outgoing value and fee costs
        remaining_amount: u64,
        /// The calculated fee for the drain TxOut with the selected script_pubkey
        change_fee: u64,
    },
    /// It's possible to create spendable output from excess using the current drain output
    Change {
        /// Effective amount available to create change after deducting the change output fee
        amount: u64,
        /// The deducted change output fee
        fee: u64,
    },
}
```

The function `decide_change` was created to build `Excess`. This function
requires the remaining amount after coin selection, the script that will be
used to create the output and the fee rate targeted by the user.

```rust
/// Decide if change can be created
///
/// - `remaining_amount`: the amount in which the selected coins exceed the target amount
/// - `fee_rate`: required fee rate for the current selection
/// - `drain_script`: script to consider change creation
pub fn decide_change(remaining_amount: u64, fee_rate: FeeRate, drain_script: &Script) -> Excess {
    // drain_output_len = size(len(script_pubkey)) + len(script_pubkey) + size(output_value)
    let drain_output_len = serialize(drain_script).len() + 8usize;
    let change_fee = fee_rate.fee_vb(drain_output_len);
    let drain_val = remaining_amount.saturating_sub(change_fee);

    if drain_val.is_dust(drain_script) {
        let dust_threshold = drain_script.dust_value().as_sat();
        Excess::NoChange {
            dust_threshold,
            change_fee,
            remaining_amount,
        }
    } else {
        Excess::Change {
            amount: drain_val,
            fee: change_fee,
        }
    }
}
```

To pass this new value to `Wallet::create_tx` and make decisions based on it,
the field `excess` was added to `CoinSelectionResult`, and the `coin_select`
method of each algorithm was adapted to compute this value, using
`decide_change` after performing the coin selection.

```rust
/// Result of a successful coin selection
pub struct CoinSelectionResult {
    /// List of outputs selected for use as inputs
    pub selected: Vec<Utxo>,
    /// Total fee amount for the selected utxos in satoshis
    pub fee_amount: u64,
    /// Remaining amount after deducting fees and outgoing outputs
    pub excess: Excess,
}
```
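
To illustrate how a caller could act on each variant, here is a simplified
sketch of the decision logic described above. It is not the actual code of
`Wallet::create_tx`: the `Excess` enum is copied locally so the example
compiles on its own, and `Finalized`, `SketchError` and `apply_excess` are
hypothetical names.

```rust
/// Local copy of the `Excess` enum shown above.
enum Excess {
    NoChange { dust_threshold: u64, remaining_amount: u64, change_fee: u64 },
    Change { amount: u64, fee: u64 },
}

/// Hypothetical outcome of applying the excess to a transaction draft.
#[derive(Debug, PartialEq)]
struct Finalized {
    /// Total fee paid by the transaction, in satoshis.
    fee_amount: u64,
    /// Value of the change output, if one is created.
    change_value: Option<u64>,
}

/// Hypothetical error for a sweep whose only output would be dust.
#[derive(Debug, PartialEq)]
enum SketchError {
    DustDrain { needed: u64, available: u64 },
}

fn apply_excess(
    fee_amount: u64,
    has_recipients: bool,
    excess: &Excess,
) -> Result<Finalized, SketchError> {
    match *excess {
        Excess::NoChange { dust_threshold, remaining_amount, change_fee } => {
            if !has_recipients {
                // Sweeping: the drain output would be the only output, and it
                // is dust, so report what was needed and what was available.
                return Err(SketchError::DustDrain {
                    needed: dust_threshold,
                    available: remaining_amount.saturating_sub(change_fee),
                });
            }
            // No change output: the whole surplus goes to the miner.
            Ok(Finalized { fee_amount: fee_amount + remaining_amount, change_value: None })
        }
        // Change output: pay for the extra output and keep the rest as change.
        Excess::Change { amount, fee } => {
            Ok(Finalized { fee_amount: fee_amount + fee, change_value: Some(amount) })
        }
    }
}
```

The point of the design is visible here: all the numbers arrive precomputed
inside `Excess`, and the caller only decides what to do with them.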

### Work in progress

The work to integrate the `Waste::calculate` method with the
`CoinSelectionAlgorithm` implementations and the `decide_change` function
remains unresolved.

A step towards that goal would be the removal of the `Database` generic
parameter from the `CoinSelectionAlgorithm` trait. There isn't a clear way to
do it, as you may guess from this
[issue](https://github.com/bitcoindevkit/bdk/issues/281).
The only algorithm currently using the database features is
`OldestFirstCoinSelection`.
There is a proposal to fix this problem by removing the need for a database
trait altogether, so, in the meantime, we could move the generic from the
trait to `OldestFirstCoinSelection`, to avoid doing work that will probably
be discarded in the future.

Another step in that direction is a proposal to add a
`CoinSelectionAlgorithm::process_and_select_coins` wrapper to the coin
selection module, which will join together preprocessing and validation of the
UTXOs, coin selection, the decision to create change and the calculation of
waste in the same function. The idea is to create a real pipeline to build a
`CoinSelectionResult`.

In addition, the function will allow the separation of the algorithms
`BranchAndBound` and `SingleRandomDraw` from each other, which were put
together only because the former depends on the latter as a fallback method.
That dependence will not be broken, but the possibility of using
`SingleRandomDraw` through BDK will be enabled, expanding the flexibility of
the library.

As a bonus, this function will save some parts of the code from unnecessary
information, avoid code duplication (and all the things associated with it) and
provide a simple interface to integrate your custom algorithms with all the
other functionalities of the BDK library, enhancing them through the new change
primitives and the computation of `Waste`.

You can start reviewing [bdk PR 727](https://github.com/bitcoindevkit/bdk/pull/727) right now!

## Further Improvements

Besides the `Waste` metric, there are other changes that could improve the
current state of the coin selection module in BDK, which will impact the
privacy and the flexibility provided by it.

### Privacy

In Bitcoin Core, the term `Output Group` is associated with a structure that
joins all the UTXOs belonging to a certain ScriptPubKey, up to a specified
threshold. The idea behind this is to reduce the address footprint in the
blockchain, reducing traceability and improving privacy.
In BDK, `OutputGroup`s are merely a way to attach metadata to UTXOs. But this
structure can be improved to something like what there is in Bitcoin Core, by
transforming the weighted UTXO into a vector of them and adding a new field or
parameter to control the number of UTXOs stored in the vector.
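
As a rough illustration of that idea, the sketch below groups UTXOs by script
and caps the number of entries per group. It mirrors neither Bitcoin Core's
implementation nor BDK's current `OutputGroup`: the `Utxo` and `OutputGroup`
types, the `group_by_script` function and the `max_entries` parameter are all
hypothetical, and scripts are represented as raw bytes.

```rust
use std::collections::BTreeMap;

/// Hypothetical minimal UTXO: the script it pays to and its value in satoshis.
#[derive(Debug, Clone, PartialEq)]
struct Utxo {
    script_pubkey: Vec<u8>,
    value: u64,
}

/// Hypothetical group of UTXOs that share the same script.
#[derive(Debug, PartialEq)]
struct OutputGroup {
    script_pubkey: Vec<u8>,
    utxos: Vec<Utxo>,
}

impl OutputGroup {
    /// Total value of the group, in satoshis.
    fn value(&self) -> u64 {
        self.utxos.iter().map(|utxo| utxo.value).sum()
    }
}

/// Group UTXOs by script, splitting groups that exceed `max_entries`.
fn group_by_script(utxos: Vec<Utxo>, max_entries: usize) -> Vec<OutputGroup> {
    assert!(max_entries > 0, "a group must hold at least one UTXO");

    // Collect the UTXOs of each script, keeping their original order.
    let mut by_script: BTreeMap<Vec<u8>, Vec<Utxo>> = BTreeMap::new();
    for utxo in utxos {
        by_script.entry(utxo.script_pubkey.clone()).or_default().push(utxo);
    }

    // Emit one group per chunk of at most `max_entries` UTXOs.
    let mut groups = Vec::new();
    for (script_pubkey, utxos) in by_script {
        for chunk in utxos.chunks(max_entries) {
            groups.push(OutputGroup {
                script_pubkey: script_pubkey.clone(),
                utxos: chunk.to_vec(),
            });
        }
    }
    groups
}
```

A coin selection algorithm would then choose among groups instead of single
UTXOs, so the coins received on one script tend to be spent together.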

### Flexibility

A further tweak in the UTXO structure could be the transition to traits, which
define the minimal properties required by the algorithms to select the
underlying UTXOs.
The hope is that anyone can define new algorithms consuming any form of UTXO
wrapper that they can imagine, as long as they follow the behavior specified by
those primitive traits.
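
A minimal sketch of what such a trait could look like follows. Nothing here
exists in BDK: `SelectableUtxo`, `SimpleUtxo` and `largest_first` are
hypothetical names, fee rates are whole satoshis per virtual byte, and the
target ignores the fixed overhead of the transaction.

```rust
/// Hypothetical minimal interface a coin selection algorithm needs from a UTXO.
trait SelectableUtxo {
    /// Value of the UTXO, in satoshis.
    fn value(&self) -> u64;
    /// Virtual bytes added to the transaction when spending this UTXO.
    fn satisfaction_vbytes(&self) -> u64;
    /// Value left after paying the fee to spend the UTXO (may be negative).
    fn effective_value(&self, fee_rate: u64) -> i64 {
        self.value() as i64 - (self.satisfaction_vbytes() * fee_rate) as i64
    }
}

/// Largest-first selection over any type implementing the trait.
/// Returns `None` when the candidates cannot cover `target`.
fn largest_first<U: SelectableUtxo>(
    mut candidates: Vec<U>,
    target: u64,
    fee_rate: u64,
) -> Option<Vec<U>> {
    // Highest effective value first.
    candidates.sort_by_key(|utxo| std::cmp::Reverse(utxo.effective_value(fee_rate)));

    let mut selected = Vec::new();
    let mut total: i64 = 0;
    for utxo in candidates {
        if total >= target as i64 {
            break;
        }
        let effective_value = utxo.effective_value(fee_rate);
        if effective_value <= 0 {
            // Every remaining candidate costs more to spend than it is worth.
            break;
        }
        total += effective_value;
        selected.push(utxo);
    }

    if total >= target as i64 { Some(selected) } else { None }
}

/// One possible wrapper; any other type can implement the trait as well.
struct SimpleUtxo {
    value: u64,
    vbytes: u64,
}

impl SelectableUtxo for SimpleUtxo {
    fn value(&self) -> u64 {
        self.value
    }
    fn satisfaction_vbytes(&self) -> u64 {
        self.vbytes
    }
}
```

Because the algorithm only depends on the trait, a wallet could pass its own
UTXO wrapper (for example, one that carries labels or confirmation data)
without changing the selection code.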
267+
268+
Also, there is a major architectural change proposal called `bdk_core` that
269+
will refactor a lot of sections of BDK to improve its modularity and
270+
flexibility. If you want to know more, you can read the
271+
[blog post](https://bitcoindevkit.org/blog/bdk-core-pt1/) about it or dig
272+
directly into its [prototype](https://github.com/LLFourn/bdk_core_staging).
273+
274+
## Conclusion
275+
276+
A lot of work is coming to the coin selection module of BDK.
277+
Adding the `Waste` metric will be a great step in the improvement of the coin
278+
selection features of the kit, and we hope to find new ways to measure the
279+
selection capabilities. We are open to new ideas!
280+
The new changes range from refactorings to enhancements. It's not hard to find
281+
something to do in the project, as long as you spend some time figuring out how
282+
the thing works. Hopefully, these new changes will make this task easier. And
283+
we are ready to help anyone who needs it.
284+
If you would like to improve something, request a new feature or discuss how
285+
you would use BDK in your personal project, join us on
286+
[Discord](https://discord.gg/dstn4dQ).
287+
288+
## Acknowledgements
289+
290+
Special thanks to my mentor [Daniela Brozzoni](https://github.com/danielabrozzoni) for the support and help provided
291+
during the development of the above work, and to [Steve Myers](https://github.com/notmandatory),
292+
for the final review of this article.
293+
294+
Thanks to all BDK contributors for their reviews and comments and thanks to the
295+
Bitcoin community for the open source work that made this an enjoyable learning
296+
experience.
297+
298+
Finally, thanks to the [Summer of Bitcoin](https://www.summerofbitcoin.org/) organizers, sponsors and speakers for
299+
the wonderful initiative, and all the guide provided.

## References

### About coin selection considerations
- Jameson Lopp. "The Challenges of Optimizing Unspent Output Selection".
  _Cypherpunk Cogitations_.
  [https://blog.lopp.net/the-challenges-of-optimizing-unspent-output-selection/](https://blog.lopp.net/the-challenges-of-optimizing-unspent-output-selection/)

### About Waste metric
- Murch. "What is the Waste Metric?". _Murch ado about nothing_.
  [https://murch.one/posts/waste-metric/](https://murch.one/posts/waste-metric/)
- Andrew Chow. "wallet: Decide which coin selection solution to use based on
  waste metric". _Bitcoin Core_. [https://github.com/bitcoin/bitcoin/pull/22009](https://github.com/bitcoin/bitcoin/pull/22009)
- Bitcoin Core PR Review Club. "Decide which coin selection solution to use
  based on waste metric". _Bitcoin Core PR Review Club_.
  [https://bitcoincore.reviews/22009](https://bitcoincore.reviews/22009)

### About improving privacy in coin selection
- Josie Baker. "wallet: avoid mixing different OutputTypes during coin selection".
  _Bitcoin Core_. [https://github.com/bitcoin/bitcoin/pull/24584](https://github.com/bitcoin/bitcoin/pull/24584)
- Bitcoin Core PR Review Club. "Increase OUTPUT_GROUP_MAX_ENTRIES to 100".
  _Bitcoin Core PR Review Club_. [https://bitcoincore.reviews/18418](https://bitcoincore.reviews/18418)
- Bitcoin Core PR Review Club. "Avoid mixing different `OutputTypes` during
  coin selection". _Bitcoin Core PR Review Club_.
  [https://bitcoincore.reviews/24584](https://bitcoincore.reviews/24584)

### About `bdk_core`
- Lloyd Fournier. "bdk_core: a new architecture for the Bitcoin Dev Kit".
  _bitcoindevkit blog_. [https://bitcoindevkit.org/blog/bdk-core-pt1/](https://bitcoindevkit.org/blog/bdk-core-pt1/)