Commit 5523004

Add article about changes in coin selection module
This blog post summarizes the work done during the Summer of Bitcoin initiative in the coin selection module of BDK and the ideas projected for its future.
1 parent 38788cc commit 5523004

1 file changed: 330 additions, 0 deletions

---
title: "Improving coin selection in BDK"
description: "A brief description of the work done in the coin selection module in BDK during Summer of Bitcoin 2022"
date: "2022-08-17"
tags: ["coin_selection", "BDK", "development", "summer_of_bitcoin"]
authors:
  - "César Alvarez Vallero"
hidden: true
draft: false
---

BDK is designed as a building block for wallet development, and one of the
main things it provides is the coin selection module. The purpose of the
module is to select the group of UTXOs to use as inputs for a transaction.
When selecting coins you must consider cost, size and traceability.

- What are those costs?

Principally fees, which are determined by the satisfaction size required by
each of the inputs. But costs are also related to the change outputs generated.
Change outputs are not part of the inputs, but they must be considered during
coin selection because they affect the fees paid by the transaction and will be
used as inputs in future transactions.
For example, if you create a change output whenever you have some excess after
coin selection, you'll probably end up with very small UTXOs. The smaller the
UTXO, the greater the proportion of its value that is spent on fees to use it,
depending on the fee rate.
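To make the last point concrete, here is a minimal sketch of how the relative cost of spending a UTXO grows as its value shrinks. The input size (about 68 vbytes, typical for a P2WPKH input) and the fee rate of 10 sat/vB are assumptions chosen for illustration, and `spend_cost_ratio` is a hypothetical helper, not part of BDK.

```rust
/// Fraction of a UTXO's value that is consumed by the fee needed to spend it.
///
/// `input_vbytes` is the virtual size the input adds to the transaction.
fn spend_cost_ratio(value_sat: u64, input_vbytes: u64, fee_rate_sat_per_vb: u64) -> f64 {
    let spend_fee = input_vbytes * fee_rate_sat_per_vb;
    spend_fee as f64 / value_sat as f64
}

fn main() {
    // Assumption: roughly 68 vbytes to spend a P2WPKH input.
    const INPUT_VBYTES: u64 = 68;
    // Assumption: an illustrative fee rate of 10 sat/vB.
    const FEE_RATE: u64 = 10;

    for value_sat in [1_000u64, 10_000, 1_000_000] {
        let ratio = spend_cost_ratio(value_sat, INPUT_VBYTES, FEE_RATE);
        println!(
            "{} sat UTXO: {:.2}% of its value goes to fees",
            value_sat,
            ratio * 100.0
        );
    }
}
```

Under these assumptions, the same 680 sat fee is a large share of a 1,000 sat UTXO and a negligible share of a 1,000,000 sat one.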

- What do we mean by "size" considerations?

Here we are not referring to the size in bytes of the transaction, as that is
already addressed by the associated fees.
Here, "size" is the number of new UTXOs created by each transaction. It has a
direct impact on the size of the UTXO set maintained by each node.

- What is this traceability thing?

Certain companies sell services whose purpose is to link addresses with their
owners, harming the fungibility of some bitcoins and attacking the privacy of
the users.
There are some things that coin selection can do to make privacy leaks
harder. For example, not creating change outputs, avoiding mixing UTXOs
belonging to different owned addresses in the same transaction, or spending
all the UTXOs related to an address together.

Besides the algorithm you use to coin select, which can target some of the
things described above, other code changes also have implications for them. The
following sections describe some of those changes and why they have been
made or could be added.

## Waste

One of the changes planned for the `coin_selection` module is the addition
of the `Waste` metric, and its use to optimize coin selection with respect
to fee costs.

Waste is a metric introduced by the BnB algorithm as part of its bounding
procedure. Later, it was included in Bitcoin Core as a high-level function used
to compare the results of different coin selection algorithms.

### How does it work?

We can describe waste as the sum of two values: creation cost and timing cost.

Timing cost is the cost associated with the difference between the current fee
rate and some long-term fee rate used as a threshold to consolidate UTXOs. It
can be negative if the current fee rate is cheaper than the long-term fee rate
or zero if they are equal.

Creation cost is the cost associated with the surplus of coins beyond the
transaction amount and transaction fees. It can take the form of a change
output or of excess fees paid to the miner.
Change cost derives from the cost of adding the extra output to the transaction
and spending it in the future.
Excess happens when there is no change, and the surplus of coins is spent as
part of the fees to the miner.

The creation cost can be zero if the coin selection algorithm finds a perfect
match.

So, waste can be zero or negative if the creation cost is zero and the timing
cost is less than or equal to zero.
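The description above can be sketched as a small function. This is only an illustration of the arithmetic, not the `Waste::calculate` method proposed in the PR: the `CreationCost` enum, the `waste` function and the use of integer sat/vB fee rates are assumptions made for this example.

```rust
/// How the surplus of a selection is handled (hypothetical type for this sketch).
enum CreationCost {
    /// The selection matches the target exactly: nothing is left over.
    PerfectMatch,
    /// A change output is created: cost of adding it now plus spending it later.
    Change { cost_sat: i64 },
    /// No change output: the surplus is dropped to the miner as extra fees.
    Excess { amount_sat: i64 },
}

/// Waste = timing cost + creation cost.
fn waste(
    input_vbytes: i64,
    fee_rate: i64,           // current fee rate, in sat/vB
    long_term_fee_rate: i64, // consolidation threshold, in sat/vB
    creation: CreationCost,
) -> i64 {
    // Negative when the current fee rate is below the long-term fee rate.
    let timing_cost = input_vbytes * (fee_rate - long_term_fee_rate);
    let creation_cost = match creation {
        CreationCost::PerfectMatch => 0,
        CreationCost::Change { cost_sat } => cost_sat,
        CreationCost::Excess { amount_sat } => amount_sat,
    };
    timing_cost + creation_cost
}

fn main() {
    // Cheap fees and a perfect match: spending now is better than waiting.
    println!("{}", waste(272, 5, 10, CreationCost::PerfectMatch));
    // Expensive fees plus a change output: both terms add up.
    println!("{}", waste(272, 20, 10, CreationCost::Change { cost_sat: 500 }));
    // Fee rates are equal, so only the excess dropped to fees counts.
    println!("{}", waste(272, 10, 10, CreationCost::Excess { amount_sat: 150 }));
}
```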

You can read about its technical aspects in its [PR](https://github.com/bitcoindevkit/bdk/pull/558). Comments and suggestions are
welcome!

But, while developing the proposal, some prerequisites arose that had to be
resolved first. Let's talk about them.

### What has been done

Waste is closely related to the creation of change or dropping it as fees.
Formerly, whether your selection would produce change or not was decided
inside the `create_tx` function. From the perspective of the Waste metric, that
was problematic. How can you score a coin selection based on `Waste` if you
don't yet know whether it will create change?

The problem had been pointed out before, in [this issue](https://github.com/bitcoindevkit/bdk/issues/147).

A [PR](https://github.com/bitcoindevkit/bdk/pull/630) merged in the last
release moved change creation to the `coin_selection` module. It introduced
several changes:

- the enum `Excess`.
- the function `decide_change`.
- a new field in `CoinSelectionResult` to hold the `Excess` produced while coin
  selecting.

We hope to have chosen meaningful names for all these new additions, but let's
explain them in depth.

Formerly, when you needed to create change inside `create_tx`, you had to get
the weight of the change output, compute its fee and subtract it, together with
the overall fee amount and the outgoing amount, from the amount of the selected
UTXOs. Then you had to decide whether the resulting amount should be considered
dust and, in that case, throw it to fees. Otherwise, you added an extra output
to the output list and added its fee to the fee amount.
Also, there was the case when you wanted to sweep all the funds associated with
an address, but the amount created a dust output. In that situation, the dust
value of the output and the amount available after deducting the fees were
necessary to report an informative error to the user.

In general, the idea was to compute all those values inside `coin_selection`
but keep the decision logic where it was meaningful, that is, inside
`create_tx`.

Those considerations ended up with an enum, `Excess`, with two struct variants
that differentiate the cases mentioned above and carry all the information
needed to act in each one of them.

```rust
/// Remaining amount after performing coin selection
pub enum Excess {
    /// It's not possible to create spendable output from excess using the current drain output
    NoChange {
        /// Threshold to consider amount as dust for this particular change script_pubkey
        dust_threshold: u64,
        /// Exceeding amount of current selection over outgoing value and fee costs
        remaining_amount: u64,
        /// The calculated fee for the drain TxOut with the selected script_pubkey
        change_fee: u64,
    },
    /// It's possible to create spendable output from excess using the current drain output
    Change {
        /// Effective amount available to create change after deducting the change output fee
        amount: u64,
        /// The deducted change output fee
        fee: u64,
    },
}
```

The function `decide_change` was created to build `Excess`. This function
requires the remaining amount after coin selection, the script that will be
used to create the output and the fee rate targeted by the user.

```rust
/// Decide if change can be created
///
/// - `remaining_amount`: the amount in which the selected coins exceed the target amount
/// - `fee_rate`: required fee rate for the current selection
/// - `drain_script`: script to consider change creation
pub fn decide_change(remaining_amount: u64, fee_rate: FeeRate, drain_script: &Script) -> Excess {
    // drain_output_len = size(len(script_pubkey)) + len(script_pubkey) + size(output_value)
    let drain_output_len = serialize(drain_script).len() + 8usize;
    let change_fee = fee_rate.fee_vb(drain_output_len);
    let drain_val = remaining_amount.saturating_sub(change_fee);

    if drain_val.is_dust(drain_script) {
        let dust_threshold = drain_script.dust_value().as_sat();
        Excess::NoChange {
            dust_threshold,
            change_fee,
            remaining_amount,
        }
    } else {
        Excess::Change {
            amount: drain_val,
            fee: change_fee,
        }
    }
}
```
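To see the decision with concrete numbers, the following is a simplified, dependency-free sketch of the same logic. It replaces `FeeRate` and `Script` with plain integers, so `decide_change_sketch` and its parameters are hypothetical stand-ins, and the example values (a 31-byte P2WPKH output, a 294 sat dust threshold and a fee rate of 2 sat/vB) are assumptions for illustration.

```rust
/// Local copy of the enum shape, so the sketch is self-contained.
#[derive(Debug, PartialEq)]
enum Excess {
    NoChange {
        dust_threshold: u64,
        remaining_amount: u64,
        change_fee: u64,
    },
    Change {
        amount: u64,
        fee: u64,
    },
}

/// Simplified stand-in for `decide_change`: the serialized output length and
/// the dust threshold are passed in instead of being derived from a script.
fn decide_change_sketch(
    remaining_amount: u64,
    fee_rate_sat_per_vb: u64,
    drain_output_len: u64,
    dust_threshold: u64,
) -> Excess {
    let change_fee = fee_rate_sat_per_vb * drain_output_len;
    // Never underflows: a remaining amount below the fee yields zero.
    let drain_val = remaining_amount.saturating_sub(change_fee);

    if drain_val < dust_threshold {
        Excess::NoChange {
            dust_threshold,
            remaining_amount,
            change_fee,
        }
    } else {
        Excess::Change {
            amount: drain_val,
            fee: change_fee,
        }
    }
}

fn main() {
    // 1_000 sat left over: 62 sat pay for the change output, 938 sat become change.
    println!("{:?}", decide_change_sketch(1_000, 2, 31, 294));
    // 300 sat left over: 238 sat would be dust, so no change is created.
    println!("{:?}", decide_change_sketch(300, 2, 31, 294));
}
```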

To pass this new value to `Wallet::create_tx` and make decisions based on it,
the field `excess` was added to `CoinSelectionResult`, and the `coin_select`
method of each algorithm was adapted to compute this value, using
`decide_change` after performing the coin selection.

```rust
/// Result of a successful coin selection
pub struct CoinSelectionResult {
    /// List of outputs selected for use as inputs
    pub selected: Vec<Utxo>,
    /// Total fee amount for the selected utxos in satoshis
    pub fee_amount: u64,
    /// Remaining amount after deducing fees and outgoing outputs
    pub excess: Excess,
}
```
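As a rough illustration of how a caller might act on the `excess` field, the sketch below folds it into the final fee and an optional change value. The `settle_excess` helper is hypothetical and only mirrors the two cases described above; the real logic lives inside `create_tx` and handles more situations, such as reporting an error when sweeping produces dust.

```rust
/// Local copy of the enum shape, so the sketch is self-contained.
#[allow(dead_code)]
enum Excess {
    NoChange {
        dust_threshold: u64,
        remaining_amount: u64,
        change_fee: u64,
    },
    Change {
        amount: u64,
        fee: u64,
    },
}

/// Returns the final fee and, if change is created, the change output value.
fn settle_excess(fee_amount: u64, excess: &Excess) -> (u64, Option<u64>) {
    match excess {
        // No change output: the whole remaining amount is dropped to fees.
        Excess::NoChange {
            remaining_amount, ..
        } => (fee_amount + remaining_amount, None),
        // Change output: pay its fee and keep the rest as change.
        Excess::Change { amount, fee } => (fee_amount + fee, Some(*amount)),
    }
}

fn main() {
    let with_change = Excess::Change { amount: 938, fee: 62 };
    let (fee, change) = settle_excess(500, &with_change);
    println!("fee: {} sat, change: {:?}", fee, change);
}
```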

### Work in progress

There is still the unresolved need to integrate the `Waste::calculate`
method with the `CoinSelectionAlgorithm` implementations and the
`decide_change` function.

The first step in that direction would be the removal of the Database generic
parameter from the `CoinSelectionAlgorithm` trait. There isn't a clear way to
do it, as you may guess from this
[issue](https://github.com/bitcoindevkit/bdk/issues/281).
The only algorithm currently using the database features is
`OldestFirstCoinSelection`.
There is a proposal to fix this problem by removing the need for a database
trait altogether, so, in the meantime, we could move the generic from the
trait to `OldestFirstCoinSelection`, to avoid doing work that will probably
be discarded in the future.

Another step in that direction is a proposal to add a
`CoinSelectionAlgorithm::get_selection` wrapper to the coin selection module,
which will join together preprocessing and validation of the UTXOs, coin
selection, the decision to create change and the calculation of waste in the
same function. The idea is to create a real pipeline to build a
`CoinSelectionResult`.

In addition, the function will allow the separation of the algorithms
`BranchAndBound` and `SingleRandomDraw` from each other, which were put
together only because the former depends on the latter as a fallback
method.
That dependence will not be broken, but the possibility of using
`SingleRandomDraw` on its own through BDK will be left open, expanding
the flexibility of the library.

As a bonus, this function will strip unnecessary information from some parts of
the code, avoid code duplication (and all the problems associated with it) and
provide a simple interface to integrate your custom algorithms with all the
other functionality of the BDK library, enhancing them through the new change
primitives and the computation of Waste.
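The pipeline idea can be pictured with a toy sketch. Everything here is hypothetical and heavily simplified: the `SelectCoins` trait, `LargestFirst`, `Selection` and `get_selection_sketch` are names invented for this example, UTXOs are plain values, fees are ignored, and the score only counts the excess dropped to fees. The proposed `get_selection` in the PR may look quite different.

```rust
/// Hypothetical, simplified algorithm interface: pick values that cover `target`.
trait SelectCoins {
    fn select(&self, candidates: &[u64], target: u64) -> Option<Vec<u64>>;
}

/// Toy algorithm: take the largest values first until the target is covered.
struct LargestFirst;

impl SelectCoins for LargestFirst {
    fn select(&self, candidates: &[u64], target: u64) -> Option<Vec<u64>> {
        let mut sorted = candidates.to_vec();
        sorted.sort_unstable_by(|a, b| b.cmp(a));
        let mut selected = Vec::new();
        let mut total = 0u64;
        for value in sorted {
            if total >= target {
                break;
            }
            total += value;
            selected.push(value);
        }
        if total >= target {
            Some(selected)
        } else {
            None
        }
    }
}

#[derive(Debug, PartialEq)]
struct Selection {
    selected: Vec<u64>,
    change: Option<u64>,
    /// Placeholder score: only the excess dropped to fees is counted here.
    excess_waste: u64,
}

/// One entry point that runs every stage for any algorithm.
fn get_selection_sketch<A: SelectCoins>(
    algorithm: &A,
    utxos: &[u64],
    target: u64,
    dust_threshold: u64,
) -> Option<Selection> {
    // 1. Preprocessing: drop candidates that cannot contribute.
    let candidates: Vec<u64> = utxos.iter().copied().filter(|v| *v > 0).collect();
    // 2. Coin selection, delegated to the chosen algorithm.
    let selected = algorithm.select(&candidates, target)?;
    // 3. Change decision: create change only if it is not dust.
    let remaining = selected.iter().sum::<u64>() - target;
    let change = if remaining >= dust_threshold {
        Some(remaining)
    } else {
        None
    };
    // 4. Scoring: without change, the remaining amount is lost to fees.
    let excess_waste = if change.is_some() { 0 } else { remaining };
    Some(Selection {
        selected,
        change,
        excess_waste,
    })
}

fn main() {
    let result = get_selection_sketch(&LargestFirst, &[0, 500, 2_000, 1_000], 2_400, 294);
    println!("{:?}", result);
}
```

Because the stages live in one function, a custom algorithm only has to implement the selection step to get the other stages for free.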

You can start reviewing the [PR](https://github.com/bitcoindevkit/bdk/pull/727) right now!

## Further Improvements

Besides the `Waste` metric, there are other changes that could improve the
current state of the coin selection module in BDK, which will impact the
privacy and the flexibility provided by it.

### Privacy

In Bitcoin Core, the term `Output Group` is associated with a structure that
joins all the UTXOs belonging to a certain ScriptPubKey, up to a specified
threshold. The idea behind this is to reduce the address footprint in the
blockchain, reducing traceability and improving privacy.
In BDK, OutputGroups are merely a way to attach metadata to UTXOs. But this
structure can be improved to something like what exists in Bitcoin Core, by
turning the weighted UTXO into a vector of them and adding a new field or
parameter to limit the number of UTXOs stored in the vector.
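A minimal sketch of that grouping idea follows. The `SimpleUtxo` and `OutputGroup` types and the `group_by_script` function are hypothetical simplifications written for this example (scripts are raw bytes and weights are omitted); they are not the existing BDK or Bitcoin Core structures.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a wallet UTXO.
#[derive(Debug, Clone, PartialEq)]
struct SimpleUtxo {
    script_pubkey: Vec<u8>,
    value: u64,
}

/// UTXOs that share a script and are meant to be spent together.
#[derive(Debug, PartialEq)]
struct OutputGroup {
    utxos: Vec<SimpleUtxo>,
}

impl OutputGroup {
    fn value(&self) -> u64 {
        self.utxos.iter().map(|u| u.value).sum()
    }
}

/// Group UTXOs by script, splitting any group larger than `max_entries`.
fn group_by_script(utxos: Vec<SimpleUtxo>, max_entries: usize) -> Vec<OutputGroup> {
    assert!(max_entries > 0, "max_entries must be positive");
    let mut by_script: BTreeMap<Vec<u8>, Vec<SimpleUtxo>> = BTreeMap::new();
    for utxo in utxos {
        by_script
            .entry(utxo.script_pubkey.clone())
            .or_default()
            .push(utxo);
    }
    by_script
        .into_values()
        .flat_map(|same_script| {
            same_script
                .chunks(max_entries)
                .map(|chunk| OutputGroup {
                    utxos: chunk.to_vec(),
                })
                .collect::<Vec<_>>()
        })
        .collect()
}

fn main() {
    let utxos = vec![
        SimpleUtxo { script_pubkey: vec![0xaa], value: 1 },
        SimpleUtxo { script_pubkey: vec![0xbb], value: 10 },
        SimpleUtxo { script_pubkey: vec![0xaa], value: 2 },
        SimpleUtxo { script_pubkey: vec![0xaa], value: 3 },
    ];
    for group in group_by_script(utxos, 2) {
        println!("{} UTXOs worth {} sat", group.utxos.len(), group.value());
    }
}
```

A coin selection algorithm would then pick whole groups instead of single UTXOs, so coins received on the same script tend to be spent in the same transaction.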

### Flexibility

A further tweak to the UTXO structure could be the transition to traits that
define the minimal properties required by the algorithms to select the
underlying UTXOs.
The hope is that anyone can define new algorithms consuming any form of UTXO
wrapper they can imagine, as long as it follows the behavior specified by
those primitive traits.
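One way to picture this is a small trait that exposes only what a selection algorithm needs. The `Candidate` trait, the `LabeledUtxo` wrapper and `select_largest_first` below are hypothetical names for this sketch; the traits that BDK may eventually define could differ.

```rust
/// Minimal properties an algorithm needs from a UTXO (hypothetical trait).
trait Candidate {
    /// Value of the UTXO in satoshis.
    fn value(&self) -> u64;
    /// Weight added to the transaction when spending it.
    fn weight(&self) -> u64;
}

/// Generic largest-first selection over anything implementing `Candidate`.
/// Returns the picked candidates and their total weight.
fn select_largest_first<'a, C: Candidate>(
    candidates: &'a [C],
    target: u64,
) -> Option<(Vec<&'a C>, u64)> {
    let mut sorted: Vec<&C> = candidates.iter().collect();
    sorted.sort_by(|a, b| b.value().cmp(&a.value()));

    let mut picked = Vec::new();
    let (mut total_value, mut total_weight) = (0u64, 0u64);
    for candidate in sorted {
        if total_value >= target {
            break;
        }
        total_value += candidate.value();
        total_weight += candidate.weight();
        picked.push(candidate);
    }
    if total_value >= target {
        Some((picked, total_weight))
    } else {
        None
    }
}

/// A user-defined wrapper carrying extra metadata the algorithm never sees.
struct LabeledUtxo {
    label: &'static str,
    value: u64,
    weight: u64,
}

impl Candidate for LabeledUtxo {
    fn value(&self) -> u64 {
        self.value
    }
    fn weight(&self) -> u64 {
        self.weight
    }
}

fn main() {
    let wallet = vec![
        LabeledUtxo { label: "salary", value: 700, weight: 272 },
        LabeledUtxo { label: "tips", value: 100, weight: 272 },
        LabeledUtxo { label: "refund", value: 300, weight: 272 },
    ];
    if let Some((picked, weight)) = select_largest_first(&wallet, 900) {
        let labels: Vec<&str> = picked.iter().map(|u| u.label).collect();
        println!("picked {:?} with total weight {}", labels, weight);
    }
}
```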

Also, there is a major architectural change proposal called `bdk_core` that
will refactor many sections of BDK to improve its modularity and
flexibility. If you want to know more, you can read the
[blog post](https://bitcoindevkit.org/blog/bdk-core-pt1/) about it or dig
directly into its [prototype](https://github.com/LLFourn/bdk_core_staging).

## Conclusion

A lot of work is coming to the coin selection module of BDK.
Adding the Waste metric would be a great step in improving the coin
selection features of the kit, and we hope to find new ways to measure the
selection capabilities. We are open to new ideas!
The new changes range from refactorings to enhancements. It's not hard to find
something to do in the project, as long as you spend some time figuring out how
things work. Hopefully, these new changes will make this task easier. And
we are ready to help if you need it.
If you would like to improve something, request a new feature or discuss how
you would use BDK in your personal project, join us on
[Discord](https://discord.gg/dstn4dQ).

## Acknowledgements

Special thanks to my mentor Daniela Brozzoni for the support and help provided
during the development of the above work.

Thanks to all BDK contributors for their reviews and comments and thanks to the
Bitcoin community for the open source work that made this an enjoyable learning
experience.

Finally, thanks to the Summer of Bitcoin organizers, sponsors and speakers for
the wonderful initiative, and all the guidance provided.

## References

### About coin selection considerations
- Jameson Lopp. "The Challenges of Optimizing Unspent Output Selection".
  _Cypherpunk Cogitations_.
  [https://blog.lopp.net/the-challenges-of-optimizing-unspent-output-selection/](https://blog.lopp.net/the-challenges-of-optimizing-unspent-output-selection/)

### About Waste metric
- Murch. "What is the Waste Metric?". _Murch ado about nothing_.
  [https://murch.one/posts/waste-metric/](https://murch.one/posts/waste-metric/)
- Andrew Chow. "wallet: Decide which coin selection solution to use based on
  waste metric". _Bitcoin Core_. [https://github.com/bitcoin/bitcoin/pull/22009](https://github.com/bitcoin/bitcoin/pull/22009)
- Bitcoin Core PR Review Club. "Decide which coin selection solution to use
  based on waste metric". _Bitcoin Core PR Review Club_.
  [https://bitcoincore.reviews/22009](https://bitcoincore.reviews/22009)

### About improving privacy in coin selection
- Josi Bake. "wallet: avoid mixing different OutputTypes during coin selection".
  _Bitcoin Core_. [https://github.com/bitcoin/bitcoin/pull/24584](https://github.com/bitcoin/bitcoin/pull/24584)
- Bitcoin Core PR Review Club. "Increase OUTPUT_GROUP_MAX_ENTRIES to 100".
  _Bitcoin Core PR Review Club_. [https://bitcoincore.reviews/18418](https://bitcoincore.reviews/18418)
- Bitcoin Core PR Review Club. "Avoid mixing different `OutputTypes` during
  coin selection". _Bitcoin Core PR Review Club_.
  [https://bitcoincore.reviews/24584](https://bitcoincore.reviews/24584)

### About `bdk_core`
- Lloyd Fournier. "bdk_core: a new architecture for the Bitcoin Dev Kit".
  _bitcoindevkit blog_. [https://bitcoindevkit.org/blog/bdk-core-pt1/](https://bitcoindevkit.org/blog/bdk-core-pt1/)