Skip to content

Commit 5d4c4a8

Browse files
authored
Merge pull request #5333 from danpoe/docs/memory-primitives
Add documentation for memory primitives
2 parents 50154b6 + 62a4b55 commit 5d4c4a8

File tree

2 files changed

+257
-27
lines changed

2 files changed

+257
-27
lines changed
Lines changed: 216 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,216 @@
1+
\ingroup module_hidden
2+
\page memory-primitives Memory Primitives
3+
4+
This document describes the semantics and usage of memory-related and
5+
pointer-related primitives in CBMC.
6+
7+
8+
# Background
9+
10+
11+
## Memory and pointers in CBMC
12+
13+
When CBMC analyzes a program, by default it uses the architectural parameters of
14+
the platform it is running on. That is, on a 64-bit system, CBMC will treat
15+
pointers as having 64 bits. This can be changed by various options (see section
16+
"C/C++ frontend options" in the output of `cbmc --help`).
17+
18+
Memory is represented in CBMC as a set of objects. Each object represents a
19+
contiguous sequence of bytes and is identified via a numeric object ID. For
20+
example, assuming integers of width 4 and chars of with 1, a global integer
21+
variable would correspond to an object of size 4, and memory allocated via
22+
`malloc(10)` would correspond to an object of size 10.
23+
24+
A pointer then consists of two parts: the upper n bits form the object ID, and
25+
the remaining bits form the offset. The object ID part holds the ID of the
26+
object the pointer is pointing to, and the offset part holds the byte offset
27+
within that object. The offset is signed.<sup>1</sup> The null pointer is the
28+
pointer with object ID 0 and offset 0. CBMC uses 8 bits by default to represent
29+
the object ID. This can be changed via the `--object-bits <n>` option.
30+
31+
There are three primitives which directly operate on the value of a pointer:
32+
33+
- `__CPROVER_size_t __CPROVER_POINTER_OBJECT(const void *p)`
34+
- `__CPROVER_ssize_t __CPROVER_POINTER_OFFSET(const void *p)`
35+
- `_Bool __CPROVER_same_object(const void *p, const void *q)`
36+
37+
The primitive `__CPROVER_POINTER_OBJECT(p)` retrieves the object ID part of a
38+
pointer, and the primitive `__CPROVER_POINTER_OFFSET(p)` retrieves the offset
39+
part of a pointer. The `__CPROVER_same_object(p, q)` primitive simply compares
40+
the object IDs of the two given pointers. That is, it is true if and only if
41+
`__CPROVER_POINTER_OBJECT(p) == __CPROVER_POINTER_OBJECT(q)`. It is always valid
42+
to apply these three primitives to a pointer (i.e., they do not have any special
43+
preconditions).
44+
45+
## Memory Objects
46+
47+
Seeing that pointers consist of an object ID and an offset, it remains to
48+
describe how CBMC assigns object IDs to memory objects (such as local variables
49+
or malloced memory). CBMC deterministically assigns consecutive object IDs to
50+
memory objects as it encounters them. For example:
51+
52+
```C
53+
...
54+
char c;
55+
56+
char *p = &c; // object ID n
57+
char *q = malloc(10); // object ID n + 1
58+
59+
```
60+
61+
Here the pointers `p` and `q` would contain consecutive IDs in their object ID
62+
parts (as retrieved by `__CPROVER_POINTER_OBJECT()`). Due to the deterministic
63+
assignment of object IDs, bugs that can only be exposed with specific pointer
64+
values cannot be found by CBMC. For example:
65+
66+
```C
67+
char *p = malloc(1); // assume cbmc assigns object ID 0xE to the malloced memory
68+
assert(p != (char *)0x0F00000000000000);
69+
```
70+
71+
CBMC will report verification successful for this code snippet (assuming it
72+
assigns an object ID other than 0x0F to the malloced memory). However, assuming
73+
that `malloc()` could allocate memory at any address, the assertion could fail.
74+
75+
Moreover, CBMC does not reuse object IDs for malloced memory. For example:
76+
77+
```C
78+
char *p = malloc(1);
79+
free(p);
80+
char *q = malloc(1);
81+
assert(p != q);
82+
```
83+
84+
CBMC would report verification successful on this code snippet. However,
85+
assuming that `malloc()` could reuse deallocated addresses, the assertion could
86+
fail.
87+
88+
The memory objects in CBMC are independent of each other. That is, for example,
89+
when incrementing a pointer past the bounds of an object, the pointer will never
90+
point into another memory object (such as could happen when running on a real
91+
machine). To verify that pointers stay within the bounds of their pointees, the
92+
CBMC option `--pointer-overflow-check` can be used.
93+
94+
## Uninitialized pointers
95+
96+
In verification tools, uninitialized variables are typically treated as having a
97+
nondeterministic value. Programs can thus be verified on a set of potential
98+
inputs. For example:
99+
100+
```C
101+
int i;
102+
__CPROVER_assume(i < 0);
103+
int result = rectify(i);
104+
assert(result == 0);
105+
```
106+
107+
Here, the value of `i` is nondeterministically chosen from all the possible
108+
integer values, and then constrained to negative values via the assumption.
109+
In CBMC, like uninitialized integers, uninitialized pointers are treated as
110+
having a nondeterministic value. That is, the value of the pointer itself is
111+
nondeterministically chosen, though **no memory is allocated**. Therefore,
112+
pointers should be explicitely initialized to ensure that they are backed by
113+
valid memory.
114+
115+
116+
# Memory Primitives
117+
118+
In this section, we describe further memory primitives of CBMC. Above, we have
119+
already encountered the primitives `__CPROVER_POINTER_OBJECT(p)`,
120+
`__CPROVER_POINTER_OFFSET(p)`, and `__CPROVER_same_object(p, q)`. These
121+
primitives just retrieve the object ID part or offset part of a pointer, or
122+
compare the object ID parts of two pointers. It is always valid to apply these
123+
primitives to a pointer (i.e., they do not have any special preconditions).
124+
125+
In the following, we need the concept of a valid pointer. A pointer is *valid*
126+
if it points to a live memory object. That is, it points to the start or to
127+
somewhere within the sequence of bytes that makes up the memory object.
128+
129+
Conversely, a pointer is invalid if it is null, uninitialized, points to
130+
deallocated dynamic memory, points to an out-of-scope local variable, or has a
131+
value that does not point to (dynamically, automatically, or statically)
132+
allocated memory, or is out of bounds of the memory object it points to (i.e.,
133+
the memory object identified by `__CPROVER_POINTER_OBJECT(p)`).
134+
135+
The primitives below have unspecified semantics on pointers that are neither
136+
null nor valid. CBMC has an option `--pointer-primitive-check` (see section
137+
[Detecting potential misuses of memory primitives](#detecting-potential-misuses-of-memory-primitives) below)
138+
to check that pointers used in the primitives are either null or valid.
139+
140+
141+
## Retrieving the size of a memory object
142+
143+
The following primitive can be used to retrieve the size of the memory object a
144+
pointer points to:
145+
146+
- `__CPROVER_size_t __CPROVER_OBJECT_SIZE(const void *p)`
147+
148+
If `p` is the null pointer, the primitive returns 0. If `p` is valid, the
149+
primitive returns the size of the memory object the pointer points to.
150+
Otherwise, the semantics is unspecified. In particular, it is valid to apply
151+
this primitive to a pointer that points to within a memory object (i.e., not
152+
necessarily to the start). The result is the same as if the pointer would point
153+
to the start of the memory object (i.e., would have offset 0).
154+
155+
156+
## Checking if a pointer points to dynamic memory
157+
158+
The following primitive can be used to check whether a pointer points to dynamic
159+
(malloced, heap) memory:
160+
161+
- `_Bool __CPROVER_DYNAMIC_OBJECT(const void *p)`
162+
163+
If `p` is the null pointer, the primitive returns false. If `p` is valid, the
164+
primitive returns true if the pointer points to dynamically allocated memory,
165+
and false otherwise. If `p` is neither null nor valid, the semantics is
166+
unspecified. Like `__CPROVER_OBJECT_SIZE()`, it is valid to apply the primitive
167+
to pointers that point to within a memory object.
168+
169+
170+
## Checking if a memory segment has at least a given size
171+
172+
The following two primitives can be used to check whether there is a memory
173+
segment starting at the given pointer and extending for at least the given
174+
number of bytes:
175+
176+
- `_Bool __CPROVER_r_ok(const void *p, size_t size)`
177+
- `_Bool __CPROVER_w_ok(const void *p, size_t size)`
178+
179+
At present, both primitives are equivalent as all memory in CBMC is considered
180+
both readable and writeable. If `p` is the null pointer, the primitives return
181+
false. If `p` is valid, the primitives return true if the memory segment
182+
starting at the pointer has at least the given size, and false otherwise. If `p`
183+
is neither null nor valid, the semantics is unspecified. It is valid to apply
184+
the primitive to pointers that point to within a memory object. For example:
185+
186+
```C
187+
char *p = malloc(10);
188+
assert(__CPROVER_r_ok(p, 10)); // valid
189+
p += 5;
190+
assert(__CPROVER_r_ok(p, 3)); // valid
191+
assert(__CPROVER_r_ok(p, 10)); // fails
192+
```
193+
194+
# Detecting potential misuses of memory primitives
195+
196+
As described above, the primitives listed in the Memory Primitives section
197+
require a pointer that is either null or valid to have well-defined semantics.
198+
CBMC has the option `--pointer-primitive-check` to detect potential misuses of
199+
the memory primitives. It checks that the pointers that appear in the following
200+
primitives are either null or valid:
201+
202+
- `__CPROVER_POINTER_OBJECT`
203+
- `__CPROVER_POINTER_OFFSET`
204+
- `__CPROVER_same_object`
205+
- `__CPROVER_OBJECT_SIZE`
206+
- `__CPROVER_DYNAMIC_OBJECT`
207+
- `__CPROVER_r_ok`
208+
- `__CPROVER_w_ok`
209+
210+
While the first three primitives have well-defined semantics even on invalid
211+
pointers, using them on invalid pointers is usually unintended in user programs.
212+
Thus, they have been included in the `--pointer-primitive-check` option.
213+
214+
<sup>1</sup> Pointers with negative offsets never point to memory objects.
215+
Negative values are used internally to detect pointer underflows.
216+

doc/cprover-manual/api.md

Lines changed: 41 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -20,33 +20,6 @@ to the program. If the expression evaluates to false, the execution
2020
aborts without failure. More detail on the use of assumptions is in the
2121
section on [Assumptions](../modeling/assumptions/).
2222
23-
#### \_\_CPROVER\_r_ok, \_\_CPROVER\_w_ok
24-
25-
```C
26-
void __CPROVER_r_ok(const void *, size_t size);
27-
void __CPROVER_w_ok(const void *, size_t size);
28-
```
29-
30-
The function **\_\_CPROVER\_r_ok** returns true if reading the piece of
31-
memory starting at the given pointer with the given size is safe.
32-
**\_\_CPROVER\_w_ok** does the same with writing.
33-
34-
#### \_\_CPROVER\_same\_object, \_\_CPROVER\_POINTER\_OBJECT, \_\_CPROVER\_POINTER\_OFFSET, \_\_CPROVER\_DYNAMIC\_OBJECT
35-
36-
```C
37-
_Bool __CPROVER_same_object(const void *, const void *);
38-
__CPROVER_size_t __CPROVER_POINTER_OBJECT(const void *p);
39-
__CPROVER_ssize_t __CPROVER_POINTER_OFFSET(const void *p);
40-
_Bool __CPROVER_DYNAMIC_OBJECT(const void *p);
41-
```
42-
43-
The function **\_\_CPROVER\_same\_object** returns true if the two
44-
pointers given as arguments point to the same object. The function
45-
**\_\_CPROVER\_POINTER\_OFFSET** returns the offset of the given pointer
46-
relative to the base address of the object. The function
47-
**\_\_CPROVER\_DYNAMIC\_OBJECT** returns true if the pointer passed as
48-
arguments points to a dynamically allocated object.
49-
5023
#### \_\_CPROVER\_input, \_\_CPROVER\_output
5124
5225
```C
@@ -156,6 +129,47 @@ the array **dest** with the given value.
156129
157130
Uninterpreted functions are documented [here](./modeling-nondeterminism.md)).
158131
132+
### Memory-Related Functions
133+
134+
The semantics of the primitives listed in this section is described in more detail in the
135+
document about [Memory Primitives](http://cprover.diffblue.com/memory-primitives.html).
136+
137+
#### \_\_CPROVER\_POINTER\_OBJECT, \_\_CPROVER\_POINTER\_OFFSET, \_\_CPROVER\_same\_object
138+
139+
```C
140+
__CPROVER_size_t __CPROVER_POINTER_OBJECT(const void *p);
141+
__CPROVER_ssize_t __CPROVER_POINTER_OFFSET(const void *p);
142+
_Bool __CPROVER_same_object(const void *p, const void *q);
143+
```
144+
145+
The function **\_\_CPROVER\_POINTER\_OBJECT** returns the ID of the object the
146+
pointer points to. The function **\_\_CPROVER\_POINTER\_OFFSET** returns the
147+
offset of the given pointer relative to the base address of the object. The
148+
function **\_\_CPROVER\_same\_object** returns true if the two pointers given as
149+
arguments point to the same object.
150+
151+
#### \_\_CPROVER\_OBJECT\_SIZE, \_\_CPROVER\_DYNAMIC\_OBJECT, \_\_CPROVER\_r\_ok, \_\_CPROVER\_w\_ok
152+
153+
The following primitives require a pointer that is null or valid in order to
154+
have well-defined semantics in all usage cases. See the document about
155+
[Memory Primitives](http://cprover.diffblue.com/memory-primitives.html)
156+
for more details. It also includes a description of the `--pointer-primitive-check`
157+
option to verify the preconditions of the primitives.
158+
159+
```C
160+
__CPROVER_size_t __CPROVER_OBJECT_SIZE(const void *p);
161+
_Bool __CPROVER_DYNAMIC_OBJECT(const void *p);
162+
void __CPROVER_r_ok(const void *p, size_t size);
163+
void __CPROVER_w_ok(const void *p, size_t size);
164+
```
165+
166+
The function **\_\_CPROVER_\_OBJECT\_SIZE** returns the size of the object the
167+
given pointer points to. The function **\_\_CPROVER\_DYNAMIC\_OBJECT** returns
168+
true if the pointer passed as an argument points to a dynamically allocated
169+
object. The function **\_\_CPROVER\_r_ok** returns true if reading the piece of
170+
memory starting at the given pointer with the given size is safe.
171+
**\_\_CPROVER\_w_ok** does the same with writing.
172+
159173
### Predefined Types and Symbols
160174
161175
#### \_\_CPROVER\_bitvector

0 commit comments

Comments
 (0)