|
1 |
| -# The SLG solver |
| 1 | +# The On-Demand SLG solver |
2 | 2 |
|
3 |
| -TODO: <https://github.com/rust-lang-nursery/chalk/blob/master/chalk-engine/src/README.md> |
| 3 | +Given a set of program clauses (provided by our [lowering rules][lowering]) |
| 4 | +and a query, we need to return the result of the query and the value of any |
| 5 | +type variables we can determine. This is the job of the solver. |
| 6 | + |
| 7 | +For example, `exists<T> { Vec<T>: FromIterator<u32> }` has one solution, so |
| 8 | +its result is `Unique; substitution [?T := u32]`. A solution also comes with |
| 9 | +a set of region constraints, which we'll ignore in this introduction. |
| 10 | + |
| 11 | +[lowering]: ./lowering-rules.html |
| 12 | + |
| 13 | +## Goals of the Solver |
| 14 | + |
| 15 | +### On demand |
| 16 | + |
| 17 | +There are often many, or even infinitely many, solutions to a query. For |
| 18 | +example, say we want to prove that `exists<T> { Vec<T>: Debug }` for _some_ |
| 19 | +type `?T`. Our solver should be capable of yielding one answer at a time, say |
| 20 | +`?T = u32`, then `?T = i32`, and so on, rather than iterating over every type |
| 21 | +in the type system. If we need more answers, we can request more until we are |
| 22 | +done. This is similar to how Prolog works. |
| 23 | + |
| 24 | +*See also: [The traditional, interactive Prolog query][pq]* |
| 25 | + |
| 26 | +[pq]: ./canonical-queries.html#the-traditional-interactive-prolog-query |
| 27 | + |
| 28 | +### Breadth-first |
| 29 | + |
| 30 | +`Vec<?T>: Debug` is true if `?T: Debug`. This leads to a cycle: `[Vec<u32>, |
| 31 | +Vec<Vec<u32>>, Vec<Vec<Vec<u32>>>]`, and so on all implement `Debug`. Our |
| 32 | +solver ought to be breadth first and consider answers like `[Vec<u32>: Debug, |
| 33 | +Vec<i32>: Debug, ...]` before it recurses, or we may never find the answer |
| 34 | +we're looking for. |
| 35 | + |
| 36 | +### Cachable |
| 37 | + |
| 38 | +To speed up compilation, we need to cache results, including partial results |
| 39 | +left over from past solver queries. |
| 40 | + |
| 41 | +## Description of how it works |
| 42 | + |
| 43 | +The basis of the solver is the [`Forest`] type. A *forest* stores a |
| 44 | +collection of *tables* as well as a *stack*. Each *table* represents |
| 45 | +the stored results of a particular query that is being performed, as |
| 46 | +well as the various *strands*, which are basically suspended |
| 47 | +computations that may be used to find more answers. Tables are |
| 48 | +interdependent: solving one query may require solving others. |
| 49 | + |
| 50 | +[`Forest`]: https://rust-lang-nursery.github.io/chalk/doc/chalk_engine/forest/struct.Forest.html |
| 51 | + |
| 52 | +### Walkthrough |
| 53 | + |
| 54 | +Perhaps the easiest way to explain how the solver works is to walk |
| 55 | +through an example. Let's imagine that we have the following program: |
| 56 | + |
| 57 | +```rust,ignore |
| 58 | +trait Debug { } |
| 59 | +
|
| 60 | +struct u32 { } |
| 61 | +impl Debug for u32 { } |
| 62 | +
|
| 63 | +struct Rc<T> { } |
| 64 | +impl<T: Debug> Debug for Rc<T> { } |
| 65 | +
|
| 66 | +struct Vec<T> { } |
| 67 | +impl<T: Debug> Debug for Vec<T> { } |
| 68 | +``` |
| 69 | + |
| 70 | +Now imagine that we want to find answers for the query `exists<T> { Rc<T>: |
| 71 | +Debug }`. The first step would be to u-canonicalize this query; this is the |
| 72 | +act of giving canonical names to all the unbound inference variables based on |
| 73 | +the order of their left-most appearance, as well as canonicalizing the |
| 74 | +universes of any universally bound names (e.g., the `T` in `forall<T> { ... |
| 75 | +}`). In this case, there are no universally bound names, but the canonical |
| 76 | +form Q of the query might look something like: |
| 77 | + |
| 78 | +```text |
| 79 | +Rc<?0>: Debug |
| 80 | +``` |
| 81 | + |
| 82 | +where `?0` is a variable in the root universe U0. We would then go and |
| 83 | +look for a table with this canonical query as the key: since the forest is |
| 84 | +empty, this lookup will fail, and we will create a new table T0, |
| 85 | +corresponding to the u-canonical goal Q. |
| 86 | + |
| 87 | +**Ignoring negative reasoning and regions.** To start, we'll ignore |
| 88 | +the possibility of negative goals like `not { Foo }`. We'll phase them |
| 89 | +in later, as they bring several complications. |
| 90 | + |
| 91 | +**Creating a table.** When we first create a table, we also initialize |
| 92 | +it with a set of *initial strands*. A "strand" is kind of like a |
| 93 | +"thread" for the solver: it contains a particular way to produce an |
| 94 | +answer. The initial set of strands for a goal like `Rc<?0>: Debug` |
| 95 | +(i.e., a "domain goal") is determined by looking for *clauses* in the |
| 96 | +environment. In Rust, these clauses derive from impls, but also from |
| 97 | +where-clauses that are in scope. In the case of our example, there |
| 98 | +would be three clauses, each coming from the program. Using a |
| 99 | +Prolog-like notation, these look like: |
| 100 | + |
| 101 | +```text |
| 102 | +(u32: Debug). |
| 103 | +(Rc<T>: Debug) :- (T: Debug). |
| 104 | +(Vec<T>: Debug) :- (T: Debug). |
| 105 | +``` |
| 106 | + |
| 107 | +To create our initial strands, then, we will try to apply each of |
| 108 | +these clauses to our goal of `Rc<?0>: Debug`. The first and third |
| 109 | +clauses are inapplicable because `u32` and `Vec<?0>` cannot be unified |
| 110 | +with `Rc<?0>`. The second clause, however, will work. |
| 111 | + |
| 112 | +**What is a strand?** Let's talk a bit more about what a strand *is*. In the code, a strand |
| 113 | +is the combination of an inference table, an _X-clause_, and (possibly) |
| 114 | +a selected subgoal from that X-clause. But what is an X-clause |
| 115 | +([`ExClause`], in the code)? An X-clause pulls together a few things: |
| 116 | + |
| 117 | +- The current state of the goal we are trying to prove; |
| 118 | +- A set of subgoals that have yet to be proven; |
| 119 | +- There are also a few things we're ignoring for now: |
| 120 | + - delayed literals, region constraints |
| 121 | + |
| 122 | +The general form of an X-clause is written much like a Prolog clause, |
| 123 | +but with somewhat different semantics. Since we're ignoring delayed |
| 124 | +literals and region constraints, an X-clause just looks like this: |
| 125 | + |
| 126 | +```text |
| 127 | +G :- L |
| 128 | +``` |
| 129 | + |
| 130 | +where G is a goal and L is a set of subgoals that must be proven. |
| 131 | +(The L stands for *literal* -- when we address negative reasoning, a |
| 132 | +literal will be either a positive or negative subgoal.) The idea is |
| 133 | +that if we are able to prove L then the goal G can be considered true. |
| 134 | + |
| 135 | +In the case of our example, we would wind up creating one strand, with |
| 136 | +an X-clause like so: |
| 137 | + |
| 138 | +```text |
| 139 | +(Rc<?T>: Debug) :- (?T: Debug) |
| 140 | +``` |
| 141 | + |
| 142 | +Here, the `?T` refers to one of the inference variables created in the |
| 143 | +inference table that accompanies the strand. (I'll use named variables |
| 144 | +to refer to inference variables, and numbered variables like `?0` to |
| 145 | +refer to variables in a canonicalized goal; in the code, however, they |
| 146 | +are both represented with an index.) |
| 147 | + |
| 148 | +For each strand, we also optionally store a *selected subgoal*. This |
| 149 | +is the subgoal after the turnstile (`:-`) that we are currently trying |
| 150 | +to prove in this strand. Initally, when a strand is first created, |
| 151 | +there is no selected subgoal. |
| 152 | + |
| 153 | +[`ExClause`]: https://rust-lang-nursery.github.io/chalk/doc/chalk_engine/struct.ExClause.html |
| 154 | + |
| 155 | +**Activating a strand.** Now that we have created the table T0 and |
| 156 | +initialized it with strands, we have to actually try and produce an answer. |
| 157 | +We do this by invoking the [`ensure_root_answer`] operation on the table: |
| 158 | +specifically, we say `ensure_root_answer(T0, A0)`, meaning "ensure that there |
| 159 | +is a 0th answer A0 to query T0". |
| 160 | + |
| 161 | +Remember that tables store not only strands, but also a vector of cached |
| 162 | +answers. The first thing that [`ensure_root_answer`] does is to check whether |
| 163 | +answer A0 is in this vector. If so, we can just return immediately. In this |
| 164 | +case, the vector will be empty, and hence that does not apply (this becomes |
| 165 | +important for cyclic checks later on). |
| 166 | + |
| 167 | +When there is no cached answer, [`ensure_root_answer`] will try to produce one. |
| 168 | +It does this by selecting a strand from the set of active strands -- the |
| 169 | +strands are stored in a `VecDeque` and hence processed in a round-robin |
| 170 | +fashion. Right now, we have only one strand, storing the following X-clause |
| 171 | +with no selected subgoal: |
| 172 | + |
| 173 | +```text |
| 174 | +(Rc<?T>: Debug) :- (?T: Debug) |
| 175 | +``` |
| 176 | + |
| 177 | +When we activate the strand, we see that we have no selected subgoal, |
| 178 | +and so we first pick one of the subgoals to process. Here, there is only |
| 179 | +one (`?T: Debug`), so that becomes the selected subgoal, changing |
| 180 | +the state of the strand to: |
| 181 | + |
| 182 | +```text |
| 183 | +(Rc<?T>: Debug) :- selected(?T: Debug, A0) |
| 184 | +``` |
| 185 | + |
| 186 | +Here, we write `selected(L, An)` to indicate that (a) the literal `L` |
| 187 | +is the selected subgoal and (b) which answer `An` we are looking for. We |
| 188 | +start out looking for `A0`. |
| 189 | + |
| 190 | +[`ensure_root_answer`]: https://rust-lang-nursery.github.io/chalk/doc/chalk_engine/forest/struct.Forest.html#method.ensure_root_answer |
| 191 | + |
| 192 | +**Processing the selected subgoal.** Next, we have to try and find an |
| 193 | +answer to this selected goal. To do that, we will u-canonicalize it |
| 194 | +and try to find an associated table. In this case, the u-canonical |
| 195 | +form of the subgoal is `?0: Debug`: we don't have a table yet for |
| 196 | +that, so we can create a new one, T1. As before, we'll initialize T1 |
| 197 | +with strands. In this case, there will be three strands, because all |
| 198 | +the program clauses are potentially applicable. Those three strands |
| 199 | +will be: |
| 200 | + |
| 201 | +- `(u32: Debug) :-`, derived from the program clause `(u32: Debug).`. |
| 202 | + - Note: This strand has no subgoals. |
| 203 | +- `(Vec<?U>: Debug) :- (?U: Debug)`, derived from the `Vec` impl. |
| 204 | +- `(Rc<?U>: Debug) :- (?U: Debug)`, derived from the `Rc` impl. |
| 205 | + |
| 206 | +We can thus summarize the state of the whole forest at this point as |
| 207 | +follows: |
| 208 | + |
| 209 | +```text |
| 210 | +Table T0 [Rc<?0>: Debug] |
| 211 | + Strands: |
| 212 | + (Rc<?T>: Debug) :- selected(?T: Debug, A0) |
| 213 | + |
| 214 | +Table T1 [?0: Debug] |
| 215 | + Strands: |
| 216 | + (u32: Debug) :- |
| 217 | + (Vec<?U>: Debug) :- (?U: Debug) |
| 218 | + (Rc<?V>: Debug) :- (?V: Debug) |
| 219 | +``` |
| 220 | + |
| 221 | +**Delegation between tables.** Now that the active strand from T0 has |
| 222 | +created the table T1, it can try to extract an answer. It does this |
| 223 | +via that same `ensure_answer` operation we saw before. In this case, |
| 224 | +the strand would invoke `ensure_answer(T1, A0)`, since we will start |
| 225 | +with the first answer. This will cause T1 to activate its first |
| 226 | +strand, `u32: Debug :-`. |
| 227 | + |
| 228 | +This strand is somewhat special: it has no subgoals at all. This means |
| 229 | +that the goal is proven. We can therefore add `u32: Debug` to the set |
| 230 | +of *answers* for our table, calling it answer A0 (it is the first |
| 231 | +answer). The strand is then removed from the list of strands. |
| 232 | + |
| 233 | +The state of table T1 is therefore: |
| 234 | + |
| 235 | +```text |
| 236 | +Table T1 [?0: Debug] |
| 237 | + Answers: |
| 238 | + A0 = [?0 = u32] |
| 239 | + Strand: |
| 240 | + (Vec<?U>: Debug) :- (?U: Debug) |
| 241 | + (Rc<?V>: Debug) :- (?V: Debug) |
| 242 | +``` |
| 243 | + |
| 244 | +Note that I am writing out the answer A0 as a substitution that can be |
| 245 | +applied to the table goal; actually, in the code, the goals for each |
| 246 | +X-clause are also represented as substitutions, but in this exposition |
| 247 | +I've chosen to write them as full goals, following [NFTD]. |
| 248 | + |
| 249 | +[NFTD]: ./bibliography.html#slg |
| 250 | + |
| 251 | +Since we now have an answer, `ensure_answer(T1, A0)` will return `Ok` |
| 252 | +to the table T0, indicating that answer A0 is available. T0 now has |
| 253 | +the job of incorporating that result into its active strand. It does |
| 254 | +this in two ways. First, it creates a new strand that is looking for |
| 255 | +the next possible answer of T1. Next, it incorpoates the answer from |
| 256 | +A0 and removes the subgoal. The resulting state of table T0 is: |
| 257 | + |
| 258 | +```text |
| 259 | +Table T0 [Rc<?0>: Debug] |
| 260 | + Strands: |
| 261 | + (Rc<?T>: Debug) :- selected(?T: Debug, A1) |
| 262 | + (Rc<u32>: Debug) :- |
| 263 | +``` |
| 264 | + |
| 265 | +We then immediately activate the strand that incorporated the answer |
| 266 | +(the `Rc<u32>: Debug` one). In this case, that strand has no further |
| 267 | +subgoals, so it becomes an answer to the table T0. This answer can |
| 268 | +then be returned up to our caller, and the whole forest goes quiescent |
| 269 | +at this point (remember, we only do enough work to generate *one* |
| 270 | +answer). The ending state of the forest at this point will be: |
| 271 | + |
| 272 | +```text |
| 273 | +Table T0 [Rc<?0>: Debug] |
| 274 | + Answer: |
| 275 | + A0 = [?0 = u32] |
| 276 | + Strands: |
| 277 | + (Rc<?T>: Debug) :- selected(?T: Debug, A1) |
| 278 | +
|
| 279 | +Table T1 [?0: Debug] |
| 280 | + Answers: |
| 281 | + A0 = [?0 = u32] |
| 282 | + Strand: |
| 283 | + (Vec<?U>: Debug) :- (?U: Debug) |
| 284 | + (Rc<?V>: Debug) :- (?V: Debug) |
| 285 | +``` |
| 286 | + |
| 287 | +Here you can see how the forest captures both the answers we have |
| 288 | +created thus far *and* the strands that will let us try to produce |
| 289 | +more answers later on. |
| 290 | + |
| 291 | +## See also |
| 292 | + |
| 293 | +- [chalk_solve README][readme], which contains links to papers used and |
| 294 | + acronyms referenced in the code |
| 295 | +- This section is a lightly adapted version of the blog post [An on-demand |
| 296 | + SLG solver for chalk][slg-blog] |
| 297 | +- [Negative Reasoning in Chalk][negative-reasoning-blog] explains the need |
| 298 | + for negative reasoning, but not how the SLG solver does it |
| 299 | + |
| 300 | +[readme]: https://github.com/rust-lang-nursery/chalk/blob/239e4ae4e69b2785b5f99e0f2b41fc16b0b4e65e/chalk-engine/src/README.md |
| 301 | +[slg-blog]: http://smallcultfollowing.com/babysteps/blog/2018/01/31/an-on-demand-slg-solver-for-chalk/ |
| 302 | +[negative-reasoning-blog]: http://aturon.github.io/blog/2017/04/24/negative-chalk/ |
0 commit comments