_Out, _Opt qualifiers
thradams committed Dec 28, 2023
1 parent 7eff3e0 commit ddf7832
Showing 10 changed files with 14,943 additions and 14,406 deletions.
196 changes: 193 additions & 3 deletions ownership.md
@@ -1,5 +1,14 @@

Last Updated 23/12/2023
Last Updated 27/12/2023

## Abstract
The objective is to statically check code and prevent all types of bugs, including memory bugs. For this task, the compiler needs information that humans typically gather from the context. For example, names like "destroy" or "init" serve as hints, along with documentation and sometimes the implementation itself.

The compiler doesn't read documentation, nor does it operate in the same way as humans. Instead, a formal means of communication with the compiler is necessary. To facilitate this, new qualifiers have been created, and new methods of communication with the compiler have been established.

In the end, we still have the same language, but with a TypeSystem++ version of C. This TypeSystem++ can be disabled, and the language remains unmodified.

The creation of these rules follows certain principles, one of which is to default to safety. In cases of uncertainty, the compiler should seek clarification. While C programmers retain the freedom to code as they wish, they must either persuade the compiler or disable analysis in specific code sections. Qualifier defaults are chosen so that the most common scenarios require the fewest annotations.
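
One way to picture "can be disabled" is the fallback branch of the bundled `ownership.h` (visible in the `src/fs.c` part of this commit): when `__STDC_OWNERSHIP__` is not defined, the lowercase qualifiers expand to nothing. The sketch below reproduces that fallback locally so it builds with an ordinary C compiler. The names `string_dup` and `demo_disable` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local stand-in for <ownership.h>: a compiler that does not define
   __STDC_OWNERSHIP__ sees the qualifier as an empty macro. */
#ifdef __STDC_OWNERSHIP__
#include <ownership.h>
#else
#define owner
#endif

/* Hypothetical helper: returns a heap copy that the caller owns. */
char * owner string_dup(const char * s)
{
    size_t n = strlen(s) + 1;
    char * owner copy = malloc(n);
    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical demo: returns 0 on success, -1 if allocation failed. */
int demo_disable(void)
{
    char * owner text = string_dup("abc");
    if (!text)
        return -1;
    assert(strcmp(text, "abc") == 0);
    free(text); /* the owner releases the memory */
    return 0;
}
```

With the qualifiers compiled away, the program is plain C; the annotations only carry meaning for a compiler that performs the ownership checks.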

## Owner Objects

@@ -326,8 +335,165 @@ c:/main.c:4:2: note: static_debug
| ^~~~~~~~~~~~
a == "uninitialized"
```


As we have just seen, the **uninitialized** state is the state of variables that are declared but not initialized.

The compiler ensures that we don't read uninitialized objects.

```c
void f1(int i);
int main() {
  int i;
  f1(i); //error: uninitialized object 'i'
}
```
The other situation where variables become **uninitialized** is when ownership is moved to function parameters. This prevents bugs like double free or use after free.
```c
#include <ownership.h>
#include <stdlib.h>
struct X {
  char * owner text;
};
void x_delete(struct X * owner p)
{
  if (p) {
    free(p->text);
    free(p);
  }
}
void f(struct X * p){}
int main() {
  struct X * owner p = malloc(sizeof(struct X));
  p->text = malloc(10);
  x_delete(p);
  f(p); //uninitialized object 'p'
}
```

When objects are moved within a local scope, the state is "moved" rather than "uninitialized." The "moved" state is similar to the "uninitialized" state. For instance, it's not possible to move an object that has already been moved.

```c
#include <ownership.h>
#include <stdlib.h>

struct X {
  char * owner text;
};

void x_delete(struct X * owner p)
{
  if (p)
  {
    free(p->text);
    free(p);
  }
}

void f(struct X * p){}

int main() {
  struct X * owner p = malloc(sizeof(struct X));
  p->text = malloc(10);

  struct X * owner p2 = 0;
  p2 = p; //MOVED

  f(p); //error: object 'p' was moved
  x_delete(p2);
}

```
The "moved" state was introduced instead of relying solely on the "uninitialized" state because static analysis benefits from having more information about local variables. "Moved" objects may, in some cases, serve as view objects. For example, in listing XX, the x object has been moved to x2, but it is safe to use x as a "view" object even after the move.
```c
#include <ownership.h>
#include <stdlib.h>
struct X {
  char * owner text;
};
int main() {
  struct X x = {0};
  struct X x2 = {0};
  x2 = x; //MOVED
  free(x2.text);
}
```

Note: The current implementation of Cake does not handle all the states necessary to ensure the safe usage of moved objects.

A common scenario where uninitialized objects are utilized is when a pointer to an uninitialized object is passed to an "init" function. This situation is addressed by the qualifier **out**.

```c
#include <ownership.h>
#include <stdlib.h>
#include <string.h>

struct X {
  char * owner text;
};

int init(out struct X *p, const char * text)
{
  p->text = strdup(text);
}

int main() {
  struct X x;
  init(&x, "text");
  free(x.text);
}
```
The "out" qualifier is valuable for both the caller and the implementation. The caller is informed that the argument must be uninitialized, and the implementation knows that it can safely overwrite the contents of the object (`p->text = strdup(text);`) without causing a memory leak.

There is no explicit "initialized" state. An object is considered initialized when its state is neither "moved" nor "uninitialized."

**Rule** By default, the parameters of a function are considered initialized. The exception is created with the **out** qualifier.

For instance, in the implementation of `set` we need to free `text` before the assignment.
```c
#include <ownership.h>
#include <stdlib.h>
#include <string.h>
struct X {
  char * owner text;
};
int init(out struct X *p, const char * text)
{
  p->text = strdup(text);
}
int set(struct X *p, const char * text)
{
  free(p->text);
  p->text = strdup(text);
}
int main() {
  struct X x;
  init(&x, "text1");
  set(&x, "text2");
  free(x.text);
}
```

**Rule** All objects passed as arguments must be initialized. The exception is when the parameter is **out** qualified.

**Rule**: We cannot pass initialized objects to **out** qualified parameters.
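
The two rules above can be read together in a small sketch. It assumes the qualifier macros fall back to empty definitions when `__STDC_OWNERSHIP__` is not defined, so it also builds with a plain C compiler; `copy_text`, `x_init`, `x_set` and `demo_out` are hypothetical names, and the commented-out call marks what the second rule is meant to reject.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef __STDC_OWNERSHIP__
#include <ownership.h>
#else
#define out
#define owner
#endif

struct X {
    char * owner text;
};

/* Hypothetical helper so the sketch does not depend on strdup. */
static char * owner copy_text(const char * s)
{
    size_t n = strlen(s) + 1;
    char * owner copy = malloc(n);
    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/* out: *p arrives uninitialized, so nothing is freed before assigning. */
void x_init(out struct X * p, const char * text)
{
    p->text = copy_text(text);
}

/* No qualifier: *p arrives initialized, so the old text is released first. */
void x_set(struct X * p, const char * text)
{
    free(p->text);
    p->text = copy_text(text);
}

/* Hypothetical demo: returns 0 when x ends up holding "text2". */
int demo_out(void)
{
    struct X x;              /* uninitialized */
    x_init(&x, "text1");     /* uninitialized object to an out parameter */
    /* x_init(&x, "text2");     would leak "text1": initialized object to out */
    x_set(&x, "text2");      /* initialized object to a plain parameter */
    int ok = x.text != NULL && strcmp(x.text, "text2") == 0;
    free(x.text);
    return ok ? 0 : 1;
}
```

Keeping `x_init` and `x_set` as separate functions is what lets each one state its precondition in the signature instead of in documentation.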

The **null** state means that owner objects are initialized and not referencing any object. Listing 13 shows a sample using owner pointers:

@@ -366,6 +532,12 @@ int main()

The **zero** state is used for non-owner objects to complement and support uninitialized checks.

**Rule** Pointer parameters are considered not-null by default.

To tell the compiler that the pointer can be null, we use the qualifier _Opt.

(Currently Cake is only doing null-checks if the -nullchecks option is passed to the compiler)
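
As a rough illustration of the default and its exception, the sketch below contrasts a parameter marked `opt` with an unmarked one. The placement of `opt` after the `*` is an assumption about the syntax, and `length_or_zero` and `length` are hypothetical functions. The macro falls back to an empty definition when `__STDC_OWNERSHIP__` is not defined, so the code also builds with a plain C compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifdef __STDC_OWNERSHIP__
#include <ownership.h>
#else
#define opt
#endif

/* s may be null (assumed qualifier placement), so it is checked before use. */
size_t length_or_zero(const char * opt s)
{
    if (s == NULL)
        return 0;
    return strlen(s);
}

/* s is not-null by default, so no check is written. */
size_t length(const char * s)
{
    return strlen(s);
}
```

The annotation moves the "may be null" information from documentation into the signature, which is where a checker can see it.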

**Listing 15 - The zero state**

```c
@@ -393,6 +565,24 @@ The **not-zero** state is used for non-owner objects to indicate the value if no
close(server_socket);
```
Similarly to `maybe-null`, `any` is an alias for `zero or not-zero`.
```c
int f();
int main() {
  int i = f();
  static_state(i, "any");
}
```
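
An alias such as `any` is easy to model if states are bit flags, which the `1 << 0` value in `enum object_state` (see `src/object.h` in this commit) suggests. The following is only a sketch under that assumption: the `DEMO_STATE_*` enumerators and `demo_state_is_within` are hypothetical and are not Cake's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags for illustration; not Cake's actual enumerators. */
enum demo_state
{
    DEMO_STATE_UNINITIALIZED = 1 << 0,
    DEMO_STATE_ZERO          = 1 << 1,
    DEMO_STATE_NOT_ZERO      = 1 << 2,

    /* An alias is the union of the states it stands for. */
    DEMO_STATE_ANY = DEMO_STATE_ZERO | DEMO_STATE_NOT_ZERO
};

/* True when 'actual' is non-empty and every state in it is allowed
   by 'expected'. */
bool demo_state_is_within(int actual, int expected)
{
    return actual != 0 && (actual & ~expected) == 0;
}
```

With this representation, an object that might be uninitialized on one path and zero on another carries both bits, and it does not satisfy `DEMO_STATE_ANY`.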

By convention, the result of a function is never an `uninitialized` object.



**Rule**: Functions never return uninitialized objects.

Now let's consider the `realloc` function.

```c
@@ -460,7 +650,7 @@ int main() {
```


**Check:** We cannot discard owner objects as showed in listing 18.
**Rule:** We cannot discard owner objects, as shown in listing 18.

**Listing 18 - owner objects cannot be discarded.**

4 changes: 4 additions & 0 deletions src/fs.c
@@ -948,12 +948,16 @@ const char* file_ownership_h =
"#define __OWNERSHIP_H__\n"
"\n"
"#ifdef __STDC_OWNERSHIP__\n"
"#define out _Out\n"
"#define opt _Opt\n"
"#define owner _Owner\n"
"#define obj_owner _Obj_owner\n"
"#define view _View\n"
"#define unchecked \"unchecked\"\n"
"\n"
"#else\n"
"#define out \n"
"#define opt \n"
"#define owner\n"
"#define obj_owner\n"
"#define view\n"
23 changes: 14 additions & 9 deletions src/lib.c
@@ -9465,12 +9465,16 @@ const char* file_ownership_h =
"#define __OWNERSHIP_H__\n"
"\n"
"#ifdef __STDC_OWNERSHIP__\n"
"#define out _Out\n"
"#define opt _Opt\n"
"#define owner _Owner\n"
"#define obj_owner _Obj_owner\n"
"#define view _View\n"
"#define unchecked \"unchecked\"\n"
"\n"
"#else\n"
"#define out \n"
"#define opt \n"
"#define owner\n"
"#define obj_owner\n"
"#define view\n"
@@ -10405,15 +10409,18 @@ void expression_evaluate_equal_not_equal(const struct expression* left,



/*
Object represents "memory" and state. Used by flow analysis
*/

//#pragma once

enum object_state
{
/*
Not used
Not applicable. The state cannot be used.
*/
OBJECT_STATE_EMPTY = 0,
OBJECT_STATE_NOT_APPLICABLE = 0,

OBJECT_STATE_UNINITIALIZED = 1 << 0,
/*
@@ -10458,6 +10465,7 @@ struct objects {
int size;
int capacity;
};

void objects_destroy(struct objects* obj_owner p);
/*
Used in flow analysis to represent the object instance
@@ -19719,7 +19727,7 @@ bool has_name(const char* name, struct object_name_list* list)

if (p_struct_or_union_specifier)
{
obj.state = OBJECT_STATE_EMPTY;
obj.state = OBJECT_STATE_NOT_APPLICABLE;

struct member_declaration* p_member_declaration =
p_struct_or_union_specifier->member_declaration_list.head;
@@ -19755,7 +19763,7 @@ bool has_name(const char* name, struct object_name_list* list)
{
struct object member_obj = { 0 };
member_obj.declarator = declarator;
member_obj.state = OBJECT_STATE_EMPTY;
member_obj.state = OBJECT_STATE_NOT_APPLICABLE;
objects_push_back(&obj.members, &member_obj);
}
else
@@ -19804,7 +19812,7 @@ bool has_name(const char* name, struct object_name_list* list)
}
else if (type_is_pointer(p_type))
{
obj.state = OBJECT_STATE_EMPTY;
obj.state = OBJECT_STATE_NOT_APPLICABLE;

if (deep < 1)
{
@@ -19824,7 +19832,7 @@ bool has_name(const char* name, struct object_name_list* list)
{
//assert(p_object->members_size == 0);
//p_object->state = flags;
obj.state = OBJECT_STATE_EMPTY;
obj.state = OBJECT_STATE_NOT_APPLICABLE;
}

return obj;
@@ -21335,9 +21343,6 @@ void visit_ctx_destroy( struct visit_ctx* obj_owner ctx);






void object_state_to_string(enum object_state e)
{
bool first = true;
8 changes: 4 additions & 4 deletions src/object.c
@@ -189,7 +189,7 @@ bool has_name(const char* name, struct object_name_list* list)

if (p_struct_or_union_specifier)
{
obj.state = OBJECT_STATE_EMPTY;
obj.state = OBJECT_STATE_NOT_APPLICABLE;

struct member_declaration* p_member_declaration =
p_struct_or_union_specifier->member_declaration_list.head;
@@ -225,7 +225,7 @@ bool has_name(const char* name, struct object_name_list* list)
{
struct object member_obj = { 0 };
member_obj.declarator = declarator;
member_obj.state = OBJECT_STATE_EMPTY;
member_obj.state = OBJECT_STATE_NOT_APPLICABLE;
objects_push_back(&obj.members, &member_obj);
}
else
@@ -274,7 +274,7 @@ bool has_name(const char* name, struct object_name_list* list)
}
else if (type_is_pointer(p_type))
{
obj.state = OBJECT_STATE_EMPTY;
obj.state = OBJECT_STATE_NOT_APPLICABLE;

if (deep < 1)
{
@@ -294,7 +294,7 @@ bool has_name(const char* name, struct object_name_list* list)
{
//assert(p_object->members_size == 0);
//p_object->state = flags;
obj.state = OBJECT_STATE_EMPTY;
obj.state = OBJECT_STATE_NOT_APPLICABLE;
}

return obj;
5 changes: 3 additions & 2 deletions src/object.h
@@ -9,9 +9,9 @@
enum object_state
{
/*
Not used
Not applicable. The state cannot be used.
*/
OBJECT_STATE_EMPTY = 0,
OBJECT_STATE_NOT_APPLICABLE = 0,

OBJECT_STATE_UNINITIALIZED = 1 << 0,
/*
@@ -56,6 +56,7 @@ struct objects {
int size;
int capacity;
};

void objects_destroy(struct objects* obj_owner p);
/*
Used in flow analysis to represent the object instance