| 
 | 1 | +# Assumptions mechanism  | 
 | 2 | +We have a public API that might help both us and external users to interact with UTBot.   | 
 | 3 | +This document contains detailed instructions about how to use `assume` methods of `UtMock` class and description   | 
 | 4 | +of the problems we encountered during the implementation.  | 
 | 5 | + | 
 | 6 | +## Brief description  | 
 | 7 | +This section contains short explanations of the meaning for mentioned functions and examples of usage.  | 
 | 8 | + | 
 | 9 | +### UtMock.assume(predicate)  | 
 | 10 | + | 
 | 11 | +`assume` is a method that gives the opportunity for users to say to the symbolic virtual machine that  | 
 | 12 | +instructions of the MUT (Method Under Test) encountered after this instruction satisfy the given predicate.  | 
 | 13 | +It is natively understandable concept and the closest analog from Java is the `assert` function.   | 
 | 14 | +When the virtual machine during the analysis encounters such instruction, it drops all the branches in the   | 
 | 15 | +control flow graph violating the predicate.  | 
 | 16 | + | 
 | 17 | +Examples:   | 
 | 18 | +```java  | 
 | 19 | +int foo(int x) {  | 
 | 20 | +    // Here `x` is unbounded symbolic variable.  | 
 | 21 | +    // It can be any value from [MIN_VALUE..MAX_VALUE] range.  | 
 | 22 | +          | 
 | 23 | +    UtMock.assume(x > 0);  | 
 | 24 | +    // Now engine will adjust the range to (0..MAX_VALUE].  | 
 | 25 | +          | 
 | 26 | +    if (x <= 0) {  | 
 | 27 | +        throw new RuntimeException("Unreachable exception");      | 
 | 28 | +    } else {  | 
 | 29 | +        return 0;  | 
 | 30 | +    }  | 
 | 31 | +}  | 
 | 32 | + | 
 | 33 | +// A function that removes all the branches with a null, empty or unsorted list.  | 
 | 34 | +public List<Integer> sortedList(List<Integer> a) {  | 
 | 35 | +    // An invariant that the list is non-null, non-empty and sorted  | 
 | 36 | +    UtMock.assume(a != null);  | 
 | 37 | +    UtMock.assume(a.size() > 0);  | 
 | 38 | +    UtMock.assume(a.get(0) != null);  | 
 | 39 | +      | 
 | 40 | +    for (int i = 0; i < a.size() - 1; i++) {  | 
 | 41 | +        Integer element = a.get(i + 1)  | 
 | 42 | +        UtMock.assume(element != null);  | 
 | 43 | +        UtMock.assume(a.get(i) <= element);  | 
 | 44 | +    }  | 
 | 45 | +      | 
 | 46 | +    return a;  | 
 | 47 | +}  | 
 | 48 | +```  | 
 | 49 | + | 
 | 50 | +Thus, `assume` is a mechanism to provide some known properties of the program to the symbolic engine.  | 
 | 51 | + | 
 | 52 | +### UtMock.assumeOrExecuteConcretely(predicate)  | 
 | 53 | +It is a similar concept to the `assume` with the only difference: it does not drop   | 
 | 54 | +paths violating the predicate, but execute them concretely from the moment of the   | 
 | 55 | +first encountered controversy. Let's take a look at the example below:  | 
 | 56 | + | 
 | 57 | +```java  | 
 | 58 | +int foo(List<Integer> list) {  | 
 | 59 | +    // Let's assume that we have small lists only  | 
 | 60 | +    UtMock.assume(list.size() <= 10);  | 
 | 61 | +      | 
 | 62 | +    if (list.size() > 10) {  | 
 | 63 | +        throw new RuntimeException("Unreachable branch");  | 
 | 64 | +    } else {  | 
 | 65 | +        return 0;      | 
 | 66 | +    }  | 
 | 67 | +}  | 
 | 68 | +```  | 
 | 69 | +Here we decided to take into consideration lists with sizes less or equal to 10 to improve performance, and, therefore, lost  | 
 | 70 | +the branch with possible exception. Let's change `assume` to `assumeOrExecuteConcretely`.   | 
 | 71 | +```java  | 
 | 72 | +int foo(List<Integer> list) {  | 
 | 73 | +    // Let's assume that we have small lists only  | 
 | 74 | +    UtMock.assumeOrExecuteConcretely(list.size() <= 10);  | 
 | 75 | +      | 
 | 76 | +    if (list.size() > 10) {  | 
 | 77 | +        throw new RuntimeException("Now we will cover this branch");  | 
 | 78 | +    } else {  | 
 | 79 | +        return 0;      | 
 | 80 | +    }  | 
 | 81 | +}  | 
 | 82 | +```  | 
 | 83 | +Now we will cover both branches of the MUT. How did we do it?   | 
 | 84 | +At the moment we processed `if` condition we got conflicting constraints: `list.size <= 10` and   | 
 | 85 | +`list.size > 10`. In contrast to the example with `assume` method usage, here we know that   | 
 | 86 | +we got conflict with the provided predicate and we stop symbolic analysis, remove this assumption from   | 
 | 87 | +the path constraints, resolve the MUT's parameters and run the MUT using concrete execution.  | 
 | 88 | + | 
 | 89 | +Thus, `assumeOrExecuteConcretely` is a way to provide to the engine information about the program  | 
 | 90 | +you'd like to take into account, but if it is impossible, the engine will run the MUT concretely trying to   | 
 | 91 | +cover at least something after the encountered controversy.  | 
 | 92 | + | 
 | 93 | +## Implementation  | 
 | 94 | +Implementation of the `assume` does not have anything interesting --  | 
 | 95 | +we add the predicate into path hard-constraints, and it eventually removes violating   | 
 | 96 | +paths from consideration.  | 
 | 97 | + | 
 | 98 | +Processing a predicate passed as argument in `assumeOrExecuteConcretely` is more tricky.   | 
 | 99 | +Due to the way we work with the solver, it cannot be added to the path constraints  | 
 | 100 | +directly. We treat hard PC as hypothesis and add them to the solver directly, that deprives us  | 
 | 101 | +opportunity to calculate unsat-core to check whether the predicate was a part of it.   | 
 | 102 | + | 
 | 103 | +Because of it, we introduced an additional type of path constraints. Now we have three of them:   | 
 | 104 | +hard, soft and assumptions.   | 
 | 105 | +* Hard constraints -- properties of the program that must be satisfied at any point of the program.  | 
 | 106 | +* Soft constraints -- properties of the program we want to satisfy, but we can remove them if it is impossible.  | 
 | 107 | +For example, it might be information that some number should be less than zero. But if it is not, we still  | 
 | 108 | +can continue exploration of the same path without this constraint.   | 
 | 109 | +* Assumptions -- predicates passed in the `assumeOrExecuteConcretely`. If we have a controversy between   | 
 | 110 | +an assumption and hard constraints, we should execute the MUT concretely without violating assumption.  | 
 | 111 | + | 
 | 112 | +Now, when we check if some state is satisfiable, we put hard constraints as hypothesis into the solver  | 
 | 113 | +and check their consistency with soft constraints and assumptions. If the solver returns UNSAT status with   | 
 | 114 | +non-empty unsat core, we remove all conflicting soft constraints from it and try again. If we have   | 
 | 115 | +UNSAT status for the second time and assumptions in it, we remove them from the request and calculates   | 
 | 116 | +status once again. If it is SAT, we put this state in the queue for concrete executions, otherwise -- remove the  | 
 | 117 | +state from consideration.  | 
 | 118 | + | 
 | 119 | +## Problems  | 
 | 120 | +The main problem is that we didn't get the result we expected. We have many `assume` usages   | 
 | 121 | +in overridden classes source code that limits their size to improve performance. Because of that, the following   | 
 | 122 | +code does not work (we don't generate any executions for them).  | 
 | 123 | + | 
 | 124 | +```java  | 
 | 125 | +void bigList(List<Integer> list) {  | 
 | 126 | +    UtMock.assume(list != null && list.size() > MAX_LIST_SIZE);  | 
 | 127 | +}  | 
 | 128 | + | 
 | 129 | +void bigSet(Set<Integer> set) {  | 
 | 130 | +    UtMock.assume(set != null && set.size() > MAX_SET_SIZE);  | 
 | 131 | +}  | 
 | 132 | +          | 
 | 133 | +void bigMap(Map<Integer> map) {  | 
 | 134 | +    UtMock.assume(map != null && map.size() > MAX_MAP_SIZE);  | 
 | 135 | +}  | 
 | 136 | +```  | 
 | 137 | +The problem in a `preconditionCheck` method of the wrappers. It contains something like that:  | 
 | 138 | +```java  | 
 | 139 | +private void preconditionCheck() {  | 
 | 140 | +    if (alreadyVisited(this)) {  | 
 | 141 | +        return;  | 
 | 142 | +    }  | 
 | 143 | +    ...  | 
 | 144 | +    assume(elementData.end >= 0 & elementData.end <= MAX_LIST_SIZE);  | 
 | 145 | +    ...  | 
 | 146 | +}  | 
 | 147 | +```  | 
 | 148 | +It is a method that imposes restrictions at the overridden classes provided as arguments.  | 
 | 149 | +For `Lists` the idea works fine, we replace `assume` with `assumeOrExecuteConcretely` and  | 
 | 150 | +find additional branches.   | 
 | 151 | +```java  | 
 | 152 | +private void preconditionCheck() {  | 
 | 153 | +    if (alreadyVisited(this)) {  | 
 | 154 | +        return;  | 
 | 155 | +    }  | 
 | 156 | +    ...  | 
 | 157 | +    assume(elementData.end >= 0);  | 
 | 158 | +    assumeOrExecuteConcretely(elementData.end <= MAX_LIST_SIZE);  | 
 | 159 | +    ...  | 
 | 160 | +}  | 
 | 161 | + | 
 | 162 | +// Now it works!  | 
 | 163 | +void bigList(List<Integer> list) {  | 
 | 164 | +    UtMock.assume(list != null && list.size() > MAX_LIST_SIZE);  | 
 | 165 | +}  | 
 | 166 | +```  | 
 | 167 | + | 
 | 168 | +But it doesn't work for `String`, `HashSet` and `HashMap`.  | 
 | 169 | + | 
 | 170 | +The problem with `String` is connected with the way we represent them. Restriction for the size is closely  | 
 | 171 | +connected with the internal storage of chars -- we create a new arrays of chars with some size `n` using `new char[n]` instruction.  | 
 | 172 | +It adds hard constraint that max size of the string is `n` and assumption that `n` is less that `MAX_STRING_SIZE`.  | 
 | 173 | +Somewhere later in the code we have a condition that `String.length() > MAX_STRING_SIZE`. Unfortunately,   | 
 | 174 | +it will cause not only controversy between the assumption and new hard constraints, but and controversy  | 
 | 175 | +between internal array size (hard constraint) and the new-coming hard constraint, that will cause UNSAT status and   | 
 | 176 | +we will lose this branch anyway.  | 
 | 177 | + | 
 | 178 | +`HashSet` and `HashMap` is a different story. The problem there is inside of `preconditionCheck` implementation. Let's take   | 
 | 179 | +a look at a part of it:  | 
 | 180 | +```java  | 
 | 181 | +assume(elementData.begin == 0);  | 
 | 182 | +assume(elementData.end >= 0);  | 
 | 183 | +assumeOrExecuteConcretely(elementData.end <= 3);  | 
 | 184 | + | 
 | 185 | +parameter(elementData);  | 
 | 186 | +parameter(elementData.storage);  | 
 | 187 | +doesntThrow();  | 
 | 188 | + | 
 | 189 | +// check that all elements are distinct.  | 
 | 190 | +for (int i = elementData.begin; i < elementData.end; i++) {  | 
 | 191 | +    E element = elementData.get(i);  | 
 | 192 | +    parameter(element);  | 
 | 193 | +    // make element address non-positive  | 
 | 194 | + | 
 | 195 | +    // if key is not null, check its hashCode for exception  | 
 | 196 | +    if (element != null) {  | 
 | 197 | +        element.hashCode();  | 
 | 198 | +    }  | 
 | 199 | + | 
 | 200 | +    // check that there are no duplicate values  | 
 | 201 | +    // we can start with a next value, as all previous values are already processed  | 
 | 202 | +    for (int j = i + 1; j < elementData.end; j++) {  | 
 | 203 | +        // we use Objects.equals to process null case too  | 
 | 204 | +        assume(!Objects.equals(element, elementData.get(j)));  | 
 | 205 | +    }  | 
 | 206 | +}  | 
 | 207 | + | 
 | 208 | +visit(this);  | 
 | 209 | +```  | 
 | 210 | +The problem happens at the first line of the cycle. We now (from the first line of the snippet) that   | 
 | 211 | +the cycle will be from zero to three. The problem is in the `i < elementData.end` check. It produces   | 
 | 212 | +at the fourth iteration hard constraint that `elementData.begin + 4 < elementData.end` and we have  | 
 | 213 | +an assumption that `elementData.end <= 3`. It will cause a concrete run of the MUT in every   | 
 | 214 | +`preconditionCheck` analysis with a constraint `elementData.end == 4`. Moreover, it still won't   | 
 | 215 | +help us with code like `if (someHashSet.size() == 10)`, since we will never get here without hard  | 
 | 216 | +constraint `elementData.end < 4` that came from the cycle.  | 
0 commit comments