|
| 1 | +# MIR definition and pass system |
| 2 | + |
| 3 | +This file contains the definition of the MIR datatypes along with the |
| 4 | +various types for the "MIR Pass" system, which lets you easily |
| 5 | +register and define new MIR transformations and analyses. |
| 6 | + |
| 7 | +Most of the code that operates on MIR can be found in the |
| 8 | +`librustc_mir` crate or other crates. The code found here in |
| 9 | +`librustc` is just the datatype definitions, alonging the functions |
| 10 | +which operate on MIR to be placed everywhere else. |
| 11 | + |
| 12 | +## MIR Data Types and visitor |
| 13 | + |
| 14 | +The main MIR data type is `rustc::mir::Mir`, defined in `mod.rs`. |
| 15 | +There is also the MIR visitor (in `visit.rs`) which allows you to walk |
| 16 | +the MIR and override what actions will be taken at various points (you |
| 17 | +can visit in either shared or mutable mode; the latter allows changing |
| 18 | +the MIR in place). Finally `traverse.rs` contains various traversal |
| 19 | +routines for visiting the MIR CFG in [different standard orders][traversal] |
| 20 | +(e.g. pre-order, reverse post-order, and so forth). |
| 21 | + |
| 22 | +[traversal]: https://en.wikipedia.org/wiki/Tree_traversal |
| 23 | + |
| 24 | +## MIR pass suites and their integration into the query system |
| 25 | + |
| 26 | +As a MIR *consumer*, you are expected to use one of the queries that |
| 27 | +returns a "final MIR". As of the time of this writing, there is only |
| 28 | +one: `optimized_mir(def_id)`, but more are expected to come in the |
| 29 | +future. For foreign def-ids, we simply read the MIR from the other |
| 30 | +crate's metadata. But for local query, this query will construct the |
| 31 | +MIR and then iteratively optimize it by putting it through various |
| 32 | +pipeline stages. This section describes those pipeline stages and how |
| 33 | +you can extend them. |
| 34 | + |
| 35 | +Here is a diagram showing the various MIR queries involved in producing |
| 36 | +the final `optimized_mir()` for a single def-id `D`. The arrows here |
| 37 | +indicate how data flows from query to query. |
| 38 | + |
| 39 | +``` |
| 40 | +mir_build(D) |
| 41 | + -> mir_pass((0,0,D)) ---+ each suite consists of many passes |
| 42 | + -> ... | |
| 43 | + -> mir_pass((0,N,D)) | |
| 44 | + -> mir_suite((0,D)) ---+ ---+ there are several suites |
| 45 | + -> ... | |
| 46 | + -> mir_suite((M,D)) ---+ |
| 47 | + -> mir_optimized(D) |
| 48 | +``` |
| 49 | + |
| 50 | +The MIR transformation pipeline is organized into **suites**. When |
| 51 | +you ask for `mir_optimized(D)`, it will turn around and request the |
| 52 | +result from the final **suite** of MIR passes |
| 53 | +(`mir_suite((M,D))`). This will in turn (eventually) trigger the MIR |
| 54 | +to be build and then passes through each of the optimization suites. |
| 55 | +Each suite internally triggers one query for each of its passes |
| 56 | +(`mir_pass(...)`). |
| 57 | + |
| 58 | +The reason for the suites is that they represent points in the MIR |
| 59 | +transformation pipeline where other bits of code are interested in |
| 60 | +observing. For example, the `MIR_CONST` suite defines the point where |
| 61 | +analysis for constant rvalues and expressions can take |
| 62 | +place. `MIR_OPTIMIZED` naturally represents the point where we |
| 63 | +actually generate machine code. Nobody should ever request the result |
| 64 | +of an individual *pass*, at least outside of the transformation |
| 65 | +pipeline: this allows us to add passes into the appropriate suite |
| 66 | +without having to modify anything else in the compiler. |
| 67 | + |
| 68 | +### Stealing |
| 69 | + |
| 70 | +Each of these intermediate queries yields up a `&'tcx |
| 71 | +Steal<Mir<'tcx>>`, allocated using `tcx.alloc_steal_mir()`. This |
| 72 | +indicates that the result may be **stolen** by the next pass -- this |
| 73 | +is an optimization to avoid cloning the MIR. Attempting to use a |
| 74 | +stolen result will cause a panic in the compiler. Therefore, it is |
| 75 | +important that you not read directly from these intermediate queries |
| 76 | +except as part of the MIR processing pipeline. |
| 77 | + |
| 78 | +Because of this stealing mechanism, some care must also be taken to |
| 79 | +ensure that, before the MIR at a particular phase in the processing |
| 80 | +pipeline is stolen, anyone who may want to read from it has already |
| 81 | +done so. Sometimes this requires **forcing** queries |
| 82 | +(`ty::queries::foo::force(...)`) during an optimization pass -- this |
| 83 | +will force a query to execute even though you don't directly require |
| 84 | +its result. The query can then read the MIR it needs, and -- once it |
| 85 | +is complete -- you can steal it. |
| 86 | + |
| 87 | +As an example, consider MIR const qualification. It wants to read the |
| 88 | +result produced by the `MIR_CONST` suite. However, that result will be |
| 89 | +**stolen** by the first pass in the next suite (that pass performs |
| 90 | +const promotion): |
| 91 | + |
| 92 | +``` |
| 93 | +mir_suite((MIR_CONST,D)) --read-by--> mir_const_qualif(D) |
| 94 | + | |
| 95 | + stolen-by |
| 96 | + | |
| 97 | + v |
| 98 | +mir_pass((MIR_VALIDATED,0,D)) |
| 99 | +``` |
| 100 | + |
| 101 | +Therefore, the const promotion pass (the `mir_pass()` in the diagram) |
| 102 | +will **force** `mir_const_qualif` before it actually steals, thus |
| 103 | +ensuring that the reads have already happened (and the final result is |
| 104 | +cached). |
| 105 | + |
| 106 | +### Implementing and registering a pass |
| 107 | + |
| 108 | +To create a new MIR pass, you have to implement one of the MIR pass |
| 109 | +traits. There are several traits, and you want to pick the most |
| 110 | +specific one that applies to your pass. They are described here in |
| 111 | +order of preference. Once you have implemented a trait for your type |
| 112 | +`Foo`, you then have to insert `Foo` into one of the suites; this is |
| 113 | +done in `librustc_driver/driver.rs` by invoking `push_pass()` with the |
| 114 | +appropriate suite. |
| 115 | + |
| 116 | +**The `MirPass` trait.** For the most part, a MIR pass works by taking |
| 117 | +as input the MIR for a single function and mutating it imperatively to |
| 118 | +perform an optimization. To write such a pass, you can implement the |
| 119 | +`MirPass` trait, which has a single callback that takes an `&mut Mir`. |
| 120 | + |
| 121 | +**The `DefIdPass` trait.** When a `MirPass` trait is executed, the |
| 122 | +system will automatically steal the result of the previous pass and |
| 123 | +supply it to you. (See the section on queries and stealing below.) |
| 124 | +Sometimes you don't want to steal the result of the previous pass |
| 125 | +right away. In such cases, you can define a `DefIdPass`, which simply |
| 126 | +gets a callback and lets you decide when to steal the previous result. |
| 127 | + |
| 128 | +**The `Pass` trait.** The most primitive but flexible trait is `Pass`. |
| 129 | +Unlike the other pass types, it returns a `Multi` result, which means |
| 130 | +it scan be used for interprocedural passes which mutate more than one |
| 131 | +MIR at a time (e.g., `inline`). |
| 132 | + |
| 133 | +### The MIR Context |
| 134 | + |
| 135 | +All of the passes when invoked take a `MirCtxt` object. This contains |
| 136 | +various methods to find out (e.g.) the current pass suite and pass |
| 137 | +index, the def-id you are operating on, and so forth. You can also |
| 138 | +access the MIR for the current def-id using `read_previous_mir()`; the |
| 139 | +"previous" refers to the fact that this will be the MIR that was |
| 140 | +output by the previous pass. Finally, you can `steal_previous_mir()` |
| 141 | +to steal the output of the current pass (in which case you get |
| 142 | +ownership of the MIR). |
0 commit comments