|
| 1 | +--- |
| 2 | +title: "Improving coin selection in BDK" |
| 3 | +description: "A brief description of the work done in the coin selection module in BDK during Summer of Bitcoin 2022" |
| 4 | +date: "2022-08-17" |
| 5 | +tags: ["coin selection", "BDK", "development", "summer of bitcoin"] |
| 6 | +authors: |
| 7 | + - "César Alvarez Vallero" |
| 8 | +hidden: true |
| 9 | +draft: false |
| 10 | +--- |
| 11 | + |
| 12 | +As a project designed to be used as a build tool in wallet development, one of |
| 13 | +the main things that BDK provides is the coin selection module. The purpose of |
| 14 | +the module is to select the group of UTXOs to use as inputs for the transaction. |
| 15 | +When you coin select, you must consider cost, size and traceability. |
| 16 | + |
| 17 | +- What are those costs? |
| 18 | + |
| 19 | + Principally fees determined by the satisfaction size required by each of the |
| 20 | + inputs. But the costs are also related to the change outputs generated. |
| 21 | + Change outputs are not part of the inputs, but they must be considered during |
| 22 | + coin selection because they affect the fee rate of the transaction and will |
| 23 | + be used in future transactions as inputs. |
| 24 | + For example, if you always create change outputs when you have some excess |
| 25 | + after coin selecting, you'll probably end up with very small UTXOs. The |
| 26 | + smaller the UTXO, the greater the proportion of fees spent to use that UTXO, |
| 27 | + depending on the fee rate. |
| 28 | + |
| 29 | +- What do we mean by "size" considerations? |
| 30 | + |
| 31 | + Here we are not referring to the size in MB of the transaction, as that is |
| 32 | + addressed by the associated fees. |
| 33 | + Here, "size" is the number of new UTXOs created by each transaction. It has a |
| 34 | + direct impact on the size of the UTXO set maintained by each node. |
| 35 | + |
| 36 | +- What is this traceability thing? |
| 37 | + |
| 38 | + Certain companies sell services whose purpose is to link addresses with their |
| 39 | + owners, harming the fungibility of some bitcoins and attacking the privacy of |
| 40 | + the users. |
| 41 | + There are some things that coin selection can do to make leaking private |
| 42 | + information harder. For example, not creating change outputs, avoiding mixing |
| 43 | + UTXOs belonging to different owned addresses in the same transaction, or |
| 44 | + spending all the related UTXOs together. |
| 45 | + |
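To put a number on the cost point above, here is a minimal sketch of how the share of a UTXO's value consumed by the fee to spend it grows as the UTXO shrinks. The 68 vbyte input size (roughly a P2WPKH input) and the helper name `spend_cost_ratio` are assumptions made for illustration; this is not BDK code.

```rust
/// Approximate virtual size of a P2WPKH input, in vbytes (assumed for illustration).
const ASSUMED_INPUT_VBYTES: u64 = 68;

/// Fraction of a UTXO's value consumed by the fee needed to spend it.
/// The fee rate is expressed in satoshis per vbyte.
/// Hypothetical helper, not part of BDK.
fn spend_cost_ratio(utxo_value_sat: u64, fee_rate_sat_per_vb: u64) -> f64 {
    let input_fee = ASSUMED_INPUT_VBYTES * fee_rate_sat_per_vb;
    input_fee as f64 / utxo_value_sat as f64
}
```

Under these assumptions, at 10 sat/vB a 100,000 sat UTXO loses about 0.68% of its value to the fee for spending it, a 1,000 sat UTXO loses 68%, and a 500 sat UTXO costs more to spend than it is worth.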
| 46 | +Besides the algorithm you use to coin select, which can target some of the |
| 47 | +things described above, other code changes also have implications for them. The |
| 48 | +following section will describe some of those changes and why they have been |
| 49 | +done or could be added. |
| 50 | + |
| 51 | +## Waste |
| 52 | + |
| 53 | +One of my project changes for the `coin_selection` module is the addition of |
| 54 | +the `Waste` metric, and its use to optimize the coin selection in relation to |
| 55 | +the fee costs. |
| 56 | + |
| 57 | +Waste is a metric introduced by the BnB algorithm as part of its bounding |
| 58 | +procedure. Later, it was included in Bitcoin Core as a high-level function to |
| 59 | +compare the results of different coin selection algorithms. |
| 60 | + |
| 61 | +### How does it work? |
| 62 | + |
| 63 | +We can describe waste as the sum of two values: creation cost and timing cost. |
| 64 | + |
| 65 | +Timing cost is the cost of spending the selected inputs now, at the current fee |
| 66 | +rate, instead of later at a long-term fee rate used as a threshold to |
| 67 | +consolidate UTXOs. It is negative if the current fee rate is lower than the |
| 68 | +long-term fee rate, and zero if they are equal. |
| 69 | + |
| 70 | +Creation cost is the cost associated with the surplus of coins besides the |
| 71 | +transaction amount and transaction fees. It can happen in the form of a change |
| 72 | +output or excessive fees paid to the miner. |
| 73 | +Change cost derives from the cost of adding the extra output to the transaction |
| 74 | +and spending it in the future. |
| 75 | +Excess happens when there is no change, and the surplus of coins is spent as |
| 76 | +part of the fees to the miner. |
| 77 | + |
| 78 | +The creation cost can be zero if there is a perfect match as a result of the |
| 79 | +coin selection algorithm. |
| 80 | + |
| 81 | +So, waste can be zero or negative if the creation cost is zero and the timing |
| 82 | +cost is less than or equal to zero. |
| 83 | + |
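The definition above can be sketched in a few lines. This is a simplified illustration and not the code from the PR: the `Surplus` enum, the `waste` function, the integer fee rates in sat/vB and the input sizes in vbytes are all assumptions chosen to keep the example self-contained.

```rust
/// What happens to the surplus left after paying the target and the fees
/// (illustrative type, not the BDK `Excess` enum).
enum Surplus {
    /// A change output is created. `change_cost` is the cost of creating it
    /// now plus the cost of spending it later, in satoshis.
    Change { change_cost: i64 },
    /// No change output is created. The surplus goes to the miner as extra fee.
    DroppedToFees { excess: i64 },
}

/// Waste = timing cost + creation cost, in satoshis.
/// Fee rates are in sat/vB and input sizes in vbytes (assumed units).
fn waste(
    input_vbytes: &[i64],
    fee_rate: i64,
    long_term_fee_rate: i64,
    surplus: &Surplus,
) -> i64 {
    // Timing cost: extra (or saved) fee from spending the inputs now
    // instead of at the long-term fee rate.
    let total_vbytes: i64 = input_vbytes.iter().sum();
    let timing_cost = total_vbytes * (fee_rate - long_term_fee_rate);

    // Creation cost: either the cost of the change output or the excess
    // dropped to fees.
    let creation_cost = match surplus {
        Surplus::Change { change_cost } => *change_cost,
        Surplus::DroppedToFees { excess } => *excess,
    };

    timing_cost + creation_cost
}
```

For example, two 68 vbyte inputs spent at 5 sat/vB with a long-term fee rate of 10 sat/vB and a perfect match (zero excess) give a waste of -680 sat, which is the negative case described above.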
| 84 | +You can read about the technical details in [bdk PR 558](https://github.com/bitcoindevkit/bdk/pull/558). Comments and suggestions are |
| 85 | +welcome! |
| 86 | + |
| 87 | +But while developing the proposal, some prerequisites arose that had to be |
| 88 | +resolved first. Let's talk about them. |
| 89 | + |
| 90 | +### What has been done |
| 91 | + |
| 92 | +Waste is closely related to whether change is created or dropped to fees. |
| 93 | +Formerly, whether your selection would produce change or not was decided |
| 94 | +inside the `create_tx` function. From the perspective of the Waste metric, that |
| 95 | +was problematic. How do you score a coin selection based on `Waste` if you don't |
| 96 | +know yet whether it will create change? |
| 97 | + |
| 98 | +The problem had been pointed out before, in [this issue](https://github.com/bitcoindevkit/bdk/issues/147). |
| 99 | + |
| 100 | +The [bdk PR 630](https://github.com/bitcoindevkit/bdk/pull/630) merged in [release 0.21.0](https://github.com/bitcoindevkit/bdk/releases/tag/v0.21.0) moved change creation to the |
| 101 | +`coin_selection` module. It introduced several changes: |
| 102 | + |
| 103 | +- the enum `Excess`. |
| 104 | +- the function `decide_change`. |
| 105 | +- a new field in `CoinSelectionResult` to hold the `Excess` produced while coin |
| 106 | +selecting. |
| 107 | + |
| 108 | +We hope to have chosen meaningful names for all these new additions, but let's |
| 109 | +explain them in depth. |
| 110 | + |
| 111 | +Formerly, when you needed to create change inside `create_tx`, you had to get the |
| 112 | +weight of the change output, compute its fee and, jointly with the overall |
| 113 | +fee amount and the outgoing amount, subtract them from the remaining amount of |
| 114 | +the selected UTXOs, then decide whether the amount of that output should be |
| 115 | +considered dust, and throw the remaining amount to fees in that case. Otherwise, |
| 116 | +you added an extra output to the output list and added its fee to the fee amount. |
| 117 | +Also, there was the case when you wanted to sweep all the funds associated with |
| 118 | +an address, but the amount created a dust output. In that situation, the dust |
| 119 | +value of the output and the amount available after deducting the fees were |
| 120 | +necessary to report an informative error to the user. |
| 121 | + |
| 122 | +In general, the idea was to compute all those values inside `coin_selection` |
| 123 | +but keep the decision logic where it was meaningful, that is, inside |
| 124 | +`create_tx`. |
| 125 | + |
| 126 | +Those considerations led to an enum, `Excess`, with two struct variants |
| 127 | +that distinguish the cases mentioned above and carry all the information |
| 128 | +needed to act in each of them. |
| 129 | + |
| 130 | +```rust |
| 131 | +/// Remaining amount after performing coin selection |
| 132 | +pub enum Excess { |
| 133 | + /// It's not possible to create spendable output from excess using the current drain output |
| 134 | + NoChange { |
| 135 | + /// Threshold to consider amount as dust for this particular change script_pubkey |
| 136 | + dust_threshold: u64, |
| 137 | + /// Exceeding amount of current selection over outgoing value and fee costs |
| 138 | + remaining_amount: u64, |
| 139 | + /// The calculated fee for the drain TxOut with the selected script_pubkey |
| 140 | + change_fee: u64, |
| 141 | + }, |
| 142 | + /// It's possible to create spendable output from excess using the current drain output |
| 143 | + Change { |
| 144 | + /// Effective amount available to create change after deducting the change output fee |
| 145 | + amount: u64, |
| 146 | + /// The deducted change output fee |
| 147 | + fee: u64, |
| 148 | + }, |
| 149 | +} |
| 150 | +``` |
| 151 | + |
| 152 | +The function `decide_change` was created to build `Excess`. This function |
| 153 | +requires the remaining amount after coin selection, the script that will be |
| 154 | +used to create the output and the fee rate targeted by the user. |
| 155 | + |
| 156 | +```rust |
| 157 | +/// Decide if change can be created |
| 158 | +/// |
| 159 | +/// - `remaining_amount`: the amount in which the selected coins exceed the target amount |
| 160 | +/// - `fee_rate`: required fee rate for the current selection |
| 161 | +/// - `drain_script`: script to consider change creation |
| 162 | +pub fn decide_change(remaining_amount: u64, fee_rate: FeeRate, drain_script: &Script) -> Excess { |
| 163 | + // drain_output_len = size(len(script_pubkey)) + len(script_pubkey) + size(output_value) |
| 164 | + let drain_output_len = serialize(drain_script).len() + 8usize; |
| 165 | + let change_fee = fee_rate.fee_vb(drain_output_len); |
| 166 | + let drain_val = remaining_amount.saturating_sub(change_fee); |
| 167 | + |
| 168 | + if drain_val.is_dust(drain_script) { |
| 169 | + let dust_threshold = drain_script.dust_value().as_sat(); |
| 170 | + Excess::NoChange { |
| 171 | + dust_threshold, |
| 172 | + change_fee, |
| 173 | + remaining_amount, |
| 174 | + } |
| 175 | + } else { |
| 176 | + Excess::Change { |
| 177 | + amount: drain_val, |
| 178 | + fee: change_fee, |
| 179 | + } |
| 180 | + } |
| 181 | +} |
| 182 | +``` |
| 183 | + |
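To see the two outcomes with concrete numbers, the following self-contained sketch mirrors the logic of `decide_change` using plain integers. The names `SimpleExcess` and `decide_change_simple`, the integer fee rate in sat/vB, and the script length and dust threshold passed as parameters are simplifications assumed for illustration; the real function derives these values from `FeeRate` and `Script`.

```rust
/// Simplified stand-in for the `Excess` enum (illustrative only).
#[derive(Debug, PartialEq)]
enum SimpleExcess {
    NoChange {
        dust_threshold: u64,
        remaining_amount: u64,
        change_fee: u64,
    },
    Change {
        amount: u64,
        fee: u64,
    },
}

/// Decide whether the remaining amount can fund a change output.
/// Hypothetical helper that mimics `decide_change` without BDK types.
fn decide_change_simple(
    remaining_amount: u64,
    fee_rate_sat_per_vb: u64,
    drain_script_len: u64,
    dust_threshold: u64,
) -> SimpleExcess {
    // Output size = 8 bytes of value + 1 byte of script length prefix + the script.
    // A one-byte length prefix is assumed, which holds for scripts under 253 bytes.
    let drain_output_len = 8 + 1 + drain_script_len;
    let change_fee = fee_rate_sat_per_vb * drain_output_len;
    let drain_val = remaining_amount.saturating_sub(change_fee);

    if drain_val < dust_threshold {
        // The change output would be dust, so no change is created.
        SimpleExcess::NoChange {
            dust_threshold,
            remaining_amount,
            change_fee,
        }
    } else {
        SimpleExcess::Change {
            amount: drain_val,
            fee: change_fee,
        }
    }
}
```

Assuming a 22 byte P2WPKH script and a dust threshold of 294 sat, a remaining amount of 1,000 sat at 1 sat/vB yields `Change { amount: 969, fee: 31 }`, while a remaining amount of 300 sat yields `NoChange`, because 269 sat is below the threshold.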
| 184 | +To pass this new value to `Wallet::create_tx` and make decisions based on it, |
| 185 | +the field `excess` was added to the `CoinSelectionResult`, and the |
| 186 | +`coin_select` methods of each algorithm were adapted to compute this value, |
| 187 | +using `decide_change` after performing the coin selection. |
| 188 | + |
| 189 | +```rust |
| 190 | +/// Result of a successful coin selection |
| 191 | +pub struct CoinSelectionResult { |
| 192 | + /// List of outputs selected for use as inputs |
| 193 | + pub selected: Vec<Utxo>, |
| 194 | + /// Total fee amount for the selected utxos in satoshis |
| 195 | + pub fee_amount: u64, |
| 196 | + /// Remaining amount after deducing fees and outgoing outputs |
| 197 | + pub excess: Excess, |
| 198 | +} |
| 199 | +``` |
| 200 | + |
| 201 | + |
| 202 | +### Work in progress |
| 203 | + |
| 204 | +The work to integrate the `Waste::calculate` method with the |
| 205 | +`CoinSelectionAlgorithm` implementations and the `decide_change` function |
| 206 | +remains unresolved. |
| 207 | + |
| 208 | +A step towards that goal would be the removal of the Database generic parameter |
| 209 | +from the `CoinSelectionAlgorithm` trait. There isn't a clear way to do it, as |
| 210 | +you may guess from this |
| 211 | +[issue](https://github.com/bitcoindevkit/bdk/issues/281). |
| 212 | +The only algorithm currently using the database features is |
| 213 | +`OldestFirstCoinSelection`. |
| 214 | +There is a proposal to fix this problem by removing the need for a database |
| 215 | +trait altogether, so, in the meantime, we could move the generic from the |
| 216 | +trait to the `OldestFirstCoinSelection`, to avoid doing work that will probably |
| 217 | +be discarded in the future. |
| 218 | + |
| 219 | +Another step in that direction is a proposal to add a |
| 220 | +`CoinSelectionAlgorithm::process_and_select_coins` wrapper to the coin |
| 221 | +selection module, which will join together preprocessing and validation of the |
| 222 | +UTXOs, coin selection, the decision to create change and the calculation of waste |
| 223 | +in the same function. The idea is to create a real pipeline to build a |
| 224 | +`CoinSelectionResult`. |
| 225 | + |
| 226 | +In addition, the function will allow the separation of the algorithms |
| 227 | +`BranchAndBound` and `SingleRandomDraw` from each other, which were put |
| 228 | +together only because the former depends on the latter as a fallback |
| 229 | +method. |
| 230 | +That dependence will not be broken, but it will become possible to use |
| 231 | +`SingleRandomDraw` directly through BDK, expanding the flexibility of |
| 232 | +the library. |
| 233 | + |
| 234 | +As a bonus, this function will free some parts of the code from handling |
| 235 | +information they don't need, avoid code duplication (and all the problems |
| 236 | +associated with it) and provide a simple interface to integrate your custom |
| 237 | +algorithms with the rest of the BDK library, enhancing them through the new |
| 238 | +change primitives and the computation of `Waste`. |
| 239 | + |
| 240 | +You can start reviewing [bdk PR 727](https://github.com/bitcoindevkit/bdk/pull/727) right now! |
| 241 | + |
| 242 | +## Further Improvements |
| 243 | + |
| 244 | +Besides the `Waste` metric, there are other changes that could improve the |
| 245 | +current state of the coin selection module in BDK, which will impact the |
| 246 | +privacy and the flexibility provided by it. |
| 247 | + |
| 248 | +### Privacy |
| 249 | + |
| 250 | +In Bitcoin Core, the term `Output Group` is associated with a structure that |
| 251 | +joins all the UTXOs belonging to a certain ScriptPubKey, up to a specified |
| 252 | +threshold. The idea behind this is to reduce the address footprint in the |
| 253 | +blockchain, reducing traceability and improving privacy. |
| 254 | +In BDK, OutputGroups are merely a way to attach metadata to UTXOs. But this |
| 255 | +structure can be improved to something like what exists in Bitcoin Core, by |
| 256 | +transforming the weighted UTXO into a vector of them and adding a new field or |
| 257 | +parameter to control the number of UTXOs stored in the vector. |
| 258 | + |
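The grouping idea can be sketched as follows: UTXOs are bucketed by script pubkey, and each bucket is split into groups of at most `max_entries` coins. The types `Utxo` and `OutputGroup` and the function `group_by_script` are hypothetical, written only to illustrate the idea; they are neither BDK's nor Bitcoin Core's actual implementation.

```rust
use std::collections::BTreeMap;

/// Minimal UTXO representation for this example (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Utxo {
    script_pubkey: Vec<u8>,
    value: u64,
}

/// A set of UTXOs that share a script pubkey and are selected together.
#[derive(Debug, PartialEq)]
struct OutputGroup {
    utxos: Vec<Utxo>,
}

impl OutputGroup {
    /// Total value of the group in satoshis.
    fn value(&self) -> u64 {
        self.utxos.iter().map(|u| u.value).sum()
    }
}

/// Group UTXOs by script pubkey, with at most `max_entries` UTXOs per group.
fn group_by_script(utxos: Vec<Utxo>, max_entries: usize) -> Vec<OutputGroup> {
    assert!(max_entries > 0, "max_entries must be at least 1");

    // Bucket the UTXOs by script pubkey. A BTreeMap keeps the output order deterministic.
    let mut buckets: BTreeMap<Vec<u8>, Vec<Utxo>> = BTreeMap::new();
    for utxo in utxos {
        buckets.entry(utxo.script_pubkey.clone()).or_default().push(utxo);
    }

    // Split every bucket into groups no larger than the threshold.
    buckets
        .into_values()
        .flat_map(|bucket| {
            bucket
                .chunks(max_entries)
                .map(|chunk| OutputGroup { utxos: chunk.to_vec() })
                .collect::<Vec<_>>()
        })
        .collect()
}
```

A coin selection algorithm would then pick whole groups instead of single UTXOs, so coins received on the same script tend to be spent together.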
| 259 | +### Flexibility |
| 260 | + |
| 261 | +A further tweak in the UTXO structure could be the transition to traits, which |
| 262 | +define the minimal properties required by the algorithms to select the |
| 263 | +underlying UTXOs. |
| 264 | +The hope is that anyone can define new algorithms consuming any form of UTXO |
| 265 | +wrapper they can imagine, as long as the wrappers follow the behavior specified |
| 266 | +by those primitive traits. |
| 267 | + |
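As a rough illustration of the trait-based approach, the sketch below defines a minimal trait and a largest-first selection that works with any type implementing it. The trait `SelectableUtxo`, the function `select_largest_first`, the struct `PlainCoin` and the integer fee rate in sat/vB are all hypothetical names and simplifications; nothing here is an existing BDK interface.

```rust
use std::cmp::Reverse;

/// Minimal properties a coin must expose to be selectable (hypothetical trait).
trait SelectableUtxo {
    /// Value of the coin in satoshis.
    fn value(&self) -> u64;
    /// Size the coin adds to the transaction when spent, in vbytes.
    fn input_vbytes(&self) -> u64;
    /// Value left after paying the fee to spend this coin at `fee_rate` sat/vB.
    fn effective_value(&self, fee_rate: u64) -> u64 {
        self.value().saturating_sub(fee_rate * self.input_vbytes())
    }
}

/// Pick coins by descending effective value until `target` is covered.
/// Returns `None` if the candidates cannot cover the target.
fn select_largest_first<U: SelectableUtxo>(
    mut candidates: Vec<U>,
    target: u64,
    fee_rate: u64,
) -> Option<Vec<U>> {
    candidates.sort_by_key(|u| Reverse(u.effective_value(fee_rate)));

    let mut selected = Vec::new();
    let mut total = 0u64;
    for utxo in candidates {
        if total >= target {
            break;
        }
        total += utxo.effective_value(fee_rate);
        selected.push(utxo);
    }

    if total >= target {
        Some(selected)
    } else {
        None
    }
}

/// One possible UTXO wrapper. Any other type can implement the trait too.
struct PlainCoin {
    value: u64,
    input_vbytes: u64,
}

impl SelectableUtxo for PlainCoin {
    fn value(&self) -> u64 {
        self.value
    }
    fn input_vbytes(&self) -> u64 {
        self.input_vbytes
    }
}
```

The algorithm only depends on the trait, so a wallet could feed it its own UTXO wrapper (for example one carrying labels or confirmation data) without changing the selection code.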
| 268 | +Also, there is a major architectural change proposal called `bdk_core` that |
| 269 | +will refactor a lot of sections of BDK to improve its modularity and |
| 270 | +flexibility. If you want to know more, you can read the |
| 271 | +[blog post](https://bitcoindevkit.org/blog/bdk-core-pt1/) about it or dig |
| 272 | +directly into its [prototype](https://github.com/LLFourn/bdk_core_staging). |
| 273 | + |
| 274 | +## Conclusion |
| 275 | + |
| 276 | +A lot of work is coming to the coin selection module of BDK. |
| 277 | +Adding the `Waste` metric will be a great step in the improvement of the coin |
| 278 | +selection features of the kit, and we hope to find new ways to measure the |
| 279 | +selection capabilities. We are open to new ideas! |
| 280 | +The new changes range from refactorings to enhancements. It's not hard to find |
| 281 | +something to do in the project, as long as you spend some time figuring out how |
| 282 | +the thing works. Hopefully, these new changes will make this task easier. And |
| 283 | +we are ready to help anyone who needs it. |
| 284 | +If you would like to improve something, request a new feature or discuss how |
| 285 | +you would use BDK in your personal project, join us on |
| 286 | +[Discord](https://discord.gg/dstn4dQ). |
| 287 | + |
| 288 | +## Acknowledgements |
| 289 | + |
| 290 | +Special thanks to my mentor [Daniela Brozzoni](https://github.com/danielabrozzoni) for the support and help provided |
| 291 | +during the development of the above work, and to [Steve Myers](https://github.com/notmandatory), |
| 292 | +for the final review of this article. |
| 293 | + |
| 294 | +Thanks to all BDK contributors for their reviews and comments and thanks to the |
| 295 | +Bitcoin community for the open source work that made this an enjoyable learning |
| 296 | +experience. |
| 297 | + |
| 298 | +Finally, thanks to the [Summer of Bitcoin](https://www.summerofbitcoin.org/) organizers, sponsors and speakers for |
| 299 | +the wonderful initiative, and all the guidance provided. |
| 300 | + |
| 301 | +## References |
| 302 | + |
| 303 | +### About coin selection considerations |
| 304 | +- Jameson Lopp. "The Challenges of Optimizing Unspent Output Selection" |
| 305 | + _Cypherpunk Cogitations_. |
| 306 | + [https://blog.lopp.net/the-challenges-of-optimizing-unspent-output-selection/](https://blog.lopp.net/the-challenges-of-optimizing-unspent-output-selection/) |
| 307 | + |
| 308 | +### About Waste metric |
| 309 | +- Murch. "What is the Waste Metric?" _Murch ado about nothing_. |
| 310 | + [https://murch.one/posts/waste-metric/](https://murch.one/posts/waste-metric/) |
| 311 | +- Andrew Chow. "wallet: Decide which coin selection solution to use based on |
| 312 | + waste metric" _Bitcoin Core_. [https://github.com/bitcoin/bitcoin/pull/22009](https://github.com/bitcoin/bitcoin/pull/22009) |
| 313 | +- Bitcoin Core PR Review Club. "Decide which coin selection solution to use |
| 314 | + based on waste metric". _Bitcoin Core PR Review Club_. |
| 315 | + [https://bitcoincore.reviews/22009](https://bitcoincore.reviews/22009) |
| 316 | + |
| 317 | +### About improving privacy in coin selection |
| 318 | +- Josie Baker. "wallet: avoid mixing different OutputTypes during coin selection" |
| 319 | + _Bitcoin Core_. [https://github.com/bitcoin/bitcoin/pull/24584](https://github.com/bitcoin/bitcoin/pull/24584) |
| 320 | +- Bitcoin Core PR Review Club. "Increase OUTPUT_GROUP_MAX_ENTRIES to 100" |
| 321 | + _Bitcoin Core PR Review Club_. [https://bitcoincore.reviews/18418](https://bitcoincore.reviews/18418) |
| 322 | +- Bitcoin Core PR Review Club. "Avoid mixing different `OutputTypes` during |
| 323 | + coin selection" _Bitcoin Core PR Review Club_. |
| 324 | + [https://bitcoincore.reviews/24584](https://bitcoincore.reviews/24584) |
| 325 | + |
| 326 | +### About `bdk_core` |
| 327 | +- Lloyd Fournier. "bdk_core: a new architecture for the Bitcoin Dev Kit". |
| 328 | + _bitcoindevkit blog_. [https://bitcoindevkit.org/blog/bdk-core-pt1/](https://bitcoindevkit.org/blog/bdk-core-pt1/) |
| 329 | + |