|
| 1 | +# Assumptions mechanism |
| 2 | +We have a public API that might help both us and external users to interact with UTBot. |
| 3 | +This document contains detailed instructions about how to use `assume` methods of `UtMock` class and description |
| 4 | +of the problems we encountered during the implementation. |
| 5 | + |
| 6 | +## Brief description |
| 7 | +This section contains short explanations of the meaning for mentioned functions and examples of usage. |
| 8 | + |
| 9 | +### UtMock.assume(predicate) |
| 10 | + |
| 11 | +`assume` is a method that gives the opportunity for users to say to the symbolic virtual machine that |
| 12 | +instructions of the MUT (Method Under Test) encountered after this instruction satisfy the given predicate. |
| 13 | +It is natively understandable concept and the closest analog from Java is the `assert` function. |
| 14 | +When the virtual machine during the analysis encounters such instruction, it drops all the branches in the |
| 15 | +control flow graph violating the predicate. |
| 16 | + |
| 17 | +Examples: |
| 18 | +```java |
| 19 | +int foo(int x) { |
| 20 | + // Here `x` is unbounded symbolic variable. |
| 21 | + // It can be any value from [MIN_VALUE..MAX_VALUE] range. |
| 22 | + |
| 23 | + UtMock.assume(x > 0); |
| 24 | + // Now engine will adjust the range to (0..MAX_VALUE]. |
| 25 | + |
| 26 | + if (x <= 0) { |
| 27 | + throw new RuntimeException("Unreachable exception"); |
| 28 | + } else { |
| 29 | + return 0; |
| 30 | + } |
| 31 | +} |
| 32 | + |
| 33 | +// A function that removes all the branches with a null, empty or unsorted list. |
| 34 | +public List<Integer> sortedList(List<Integer> a) { |
| 35 | + // An invariant that the list is non-null, non-empty and sorted |
| 36 | + UtMock.assume(a != null); |
| 37 | + UtMock.assume(a.size() > 0); |
| 38 | + UtMock.assume(a.get(0) != null); |
| 39 | + |
| 40 | + for (int i = 0; i < a.size() - 1; i++) { |
| 41 | + Integer element = a.get(i + 1) |
| 42 | + UtMock.assume(element != null); |
| 43 | + UtMock.assume(a.get(i) <= element); |
| 44 | + } |
| 45 | + |
| 46 | + return a; |
| 47 | +} |
| 48 | +``` |
| 49 | + |
| 50 | +Thus, `assume` is a mechanism to provide some known properties of the program to the symbolic engine. |
| 51 | + |
| 52 | +### UtMock.assumeOrExecuteConcretely(predicate) |
| 53 | +It is a similar concept to the `assume` with the only difference: it does not drop |
| 54 | +paths violating the predicate, but execute them concretely from the moment of the |
| 55 | +first encountered controversy. Let's take a look at the example below: |
| 56 | + |
| 57 | +```java |
| 58 | +int foo(List<Integer> list) { |
| 59 | + // Let's assume that we have small lists only |
| 60 | + UtMock.assume(list.size() <= 10); |
| 61 | + |
| 62 | + if (list.size() > 10) { |
| 63 | + throw new RuntimeException("Unreachable branch"); |
| 64 | + } else { |
| 65 | + return 0; |
| 66 | + } |
| 67 | +} |
| 68 | +``` |
| 69 | +Here we decided to take into consideration lists with sizes less or equal to 10 to improve performance, and, therefore, lost |
| 70 | +the branch with possible exception. Let's change `assume` to `assumeOrExecuteConcretely`. |
| 71 | +```java |
| 72 | +int foo(List<Integer> list) { |
| 73 | + // Let's assume that we have small lists only |
| 74 | + UtMock.assumeOrExecuteConcretely(list.size() <= 10); |
| 75 | + |
| 76 | + if (list.size() > 10) { |
| 77 | + throw new RuntimeException("Now we will cover this branch"); |
| 78 | + } else { |
| 79 | + return 0; |
| 80 | + } |
| 81 | +} |
| 82 | +``` |
| 83 | +Now we will cover both branches of the MUT. How did we do it? |
| 84 | +At the moment we processed `if` condition we got conflicting constraints: `list.size <= 10` and |
| 85 | +`list.size > 10`. In contrast to the example with `assume` method usage, here we know that |
| 86 | +we got conflict with the provided predicate and we stop symbolic analysis, remove this assumption from |
| 87 | +the path constraints, resolve the MUT's parameters and run the MUT using concrete execution. |
| 88 | + |
| 89 | +Thus, `assumeOrExecuteConcretely` is a way to provide to the engine information about the program |
| 90 | +you'd like to take into account, but if it is impossible, the engine will run the MUT concretely trying to |
| 91 | +cover at least something after the encountered controversy. |
| 92 | + |
| 93 | +## Implementation |
| 94 | +Implementation of the `assume` does not have anything interesting -- |
| 95 | +we add the predicate into path hard-constraints, and it eventually removes violating |
| 96 | +paths from consideration. |
| 97 | + |
| 98 | +Processing a predicate passed as argument in `assumeOrExecuteConcretely` is more tricky. |
| 99 | +Due to the way we work with the solver, it cannot be added to the path constraints |
| 100 | +directly. We treat hard PC as hypothesis and add them to the solver directly, that deprives us |
| 101 | +opportunity to calculate unsat-core to check whether the predicate was a part of it. |
| 102 | + |
| 103 | +Because of it, we introduced an additional type of path constraints. Now we have three of them: |
| 104 | +hard, soft and assumptions. |
| 105 | +* Hard constraints -- properties of the program that must be satisfied at any point of the program. |
| 106 | +* Soft constraints -- properties of the program we want to satisfy, but we can remove them if it is impossible. |
| 107 | +For example, it might be information that some number should be less than zero. But if it is not, we still |
| 108 | +can continue exploration of the same path without this constraint. |
| 109 | +* Assumptions -- predicates passed in the `assumeOrExecuteConcretely`. If we have a controversy between |
| 110 | +an assumption and hard constraints, we should execute the MUT concretely without violating assumption. |
| 111 | + |
| 112 | +Now, when we check if some state is satisfiable, we put hard constraints as hypothesis into the solver |
| 113 | +and check their consistency with soft constraints and assumptions. If the solver returns UNSAT status with |
| 114 | +non-empty unsat core, we remove all conflicting soft constraints from it and try again. If we have |
| 115 | +UNSAT status for the second time and assumptions in it, we remove them from the request and calculates |
| 116 | +status once again. If it is SAT, we put this state in the queue for concrete executions, otherwise -- remove the |
| 117 | +state from consideration. |
| 118 | + |
| 119 | +## Problems |
| 120 | +The main problem is that we didn't get the result we expected. We have many `assume` usages |
| 121 | +in overridden classes source code that limits their size to improve performance. Because of that, the following |
| 122 | +code does not work (we don't generate any executions for them). |
| 123 | + |
| 124 | +```java |
| 125 | +void bigList(List<Integer> list) { |
| 126 | + UtMock.assume(list != null && list.size() > MAX_LIST_SIZE); |
| 127 | +} |
| 128 | + |
| 129 | +void bigSet(Set<Integer> set) { |
| 130 | + UtMock.assume(set != null && set.size() > MAX_SET_SIZE); |
| 131 | +} |
| 132 | + |
| 133 | +void bigMap(Map<Integer> map) { |
| 134 | + UtMock.assume(map != null && map.size() > MAX_MAP_SIZE); |
| 135 | +} |
| 136 | +``` |
| 137 | +The problem in a `preconditionCheck` method of the wrappers. It contains something like that: |
| 138 | +```java |
| 139 | +private void preconditionCheck() { |
| 140 | + if (alreadyVisited(this)) { |
| 141 | + return; |
| 142 | + } |
| 143 | + ... |
| 144 | + assume(elementData.end >= 0 & elementData.end <= MAX_LIST_SIZE); |
| 145 | + ... |
| 146 | +} |
| 147 | +``` |
| 148 | +It is a method that imposes restrictions at the overridden classes provided as arguments. |
| 149 | +For `Lists` the idea works fine, we replace `assume` with `assumeOrExecuteConcretely` and |
| 150 | +find additional branches. |
| 151 | +```java |
| 152 | +private void preconditionCheck() { |
| 153 | + if (alreadyVisited(this)) { |
| 154 | + return; |
| 155 | + } |
| 156 | + ... |
| 157 | + assume(elementData.end >= 0); |
| 158 | + assumeOrExecuteConcretely(elementData.end <= MAX_LIST_SIZE); |
| 159 | + ... |
| 160 | +} |
| 161 | + |
| 162 | +// Now it works! |
| 163 | +void bigList(List<Integer> list) { |
| 164 | + UtMock.assume(list != null && list.size() > MAX_LIST_SIZE); |
| 165 | +} |
| 166 | +``` |
| 167 | + |
| 168 | +But it doesn't work for `String`, `HashSet` and `HashMap`. |
| 169 | + |
| 170 | +The problem with `String` is connected with the way we represent them. Restriction for the size is closely |
| 171 | +connected with the internal storage of chars -- we create a new arrays of chars with some size `n` using `new char[n]` instruction. |
| 172 | +It adds hard constraint that max size of the string is `n` and assumption that `n` is less that `MAX_STRING_SIZE`. |
| 173 | +Somewhere later in the code we have a condition that `String.length() > MAX_STRING_SIZE`. Unfortunately, |
| 174 | +it will cause not only controversy between the assumption and new hard constraints, but and controversy |
| 175 | +between internal array size (hard constraint) and the new-coming hard constraint, that will cause UNSAT status and |
| 176 | +we will lose this branch anyway. |
| 177 | + |
| 178 | +`HashSet` and `HashMap` is a different story. The problem there is inside of `preconditionCheck` implementation. Let's take |
| 179 | +a look at a part of it: |
| 180 | +```java |
| 181 | +assume(elementData.begin == 0); |
| 182 | +assume(elementData.end >= 0); |
| 183 | +assumeOrExecuteConcretely(elementData.end <= 3); |
| 184 | + |
| 185 | +parameter(elementData); |
| 186 | +parameter(elementData.storage); |
| 187 | +doesntThrow(); |
| 188 | + |
| 189 | +// check that all elements are distinct. |
| 190 | +for (int i = elementData.begin; i < elementData.end; i++) { |
| 191 | + E element = elementData.get(i); |
| 192 | + parameter(element); |
| 193 | + // make element address non-positive |
| 194 | + |
| 195 | + // if key is not null, check its hashCode for exception |
| 196 | + if (element != null) { |
| 197 | + element.hashCode(); |
| 198 | + } |
| 199 | + |
| 200 | + // check that there are no duplicate values |
| 201 | + // we can start with a next value, as all previous values are already processed |
| 202 | + for (int j = i + 1; j < elementData.end; j++) { |
| 203 | + // we use Objects.equals to process null case too |
| 204 | + assume(!Objects.equals(element, elementData.get(j))); |
| 205 | + } |
| 206 | +} |
| 207 | + |
| 208 | +visit(this); |
| 209 | +``` |
| 210 | +The problem happens at the first line of the cycle. We now (from the first line of the snippet) that |
| 211 | +the cycle will be from zero to three. The problem is in the `i < elementData.end` check. It produces |
| 212 | +at the fourth iteration hard constraint that `elementData.begin + 4 < elementData.end` and we have |
| 213 | +an assumption that `elementData.end <= 3`. It will cause a concrete run of the MUT in every |
| 214 | +`preconditionCheck` analysis with a constraint `elementData.end == 4`. Moreover, it still won't |
| 215 | +help us with code like `if (someHashSet.size() == 10)`, since we will never get here without hard |
| 216 | +constraint `elementData.end < 4` that came from the cycle. |
0 commit comments