Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bit-level undef/poison #366

Open
brianhuffman opened this issue Dec 6, 2019 · 6 comments
Open

Bit-level undef/poison #366

brianhuffman opened this issue Dec 6, 2019 · 6 comments
Labels
llvm poison Issues touching on the LLVM concept of poison

Comments

@brianhuffman
Copy link
Contributor

We might want to extend the LLVM memory model to allow tracking of undefinedness and/or poison values with bit-level granularity.

There are some perfectly reasonable C programs that we can't verify with Crucible/LLVM right now because the memory model is too strict regarding uninitialized memory. For example, when a C function zero-initializes a bit field of a struct, the compiled LLVM implements it as a load, an and with a mask, and then a store. Currently this program would be rejected by Crucible/LLVM because the load reads uninitialized memory and fails.

This would be a lot of work to implement. Currently we represent the result of LLVM memory loads using the Partial type, which includes a (symbolic) bitvector along with a single definedness predicate; after each load we immediately assert that the definedness predicate is true. We would need to change this to a definedness bitmask, and we would also need to defer any assertions of well-definedness to some later time. We would need to define how undef/poison is propagated by all the bitvector operations. We would also have to model stores of partially-defined bitvectors to memory.

Another challenge is that there isn't an agreed-upon standard semantics for undefinedness/poison in LLVM, so we'd have to figure out a lot of details for ourselves. One risk would be that if we make our semantics too permissive, Crucible/LLVM might verify some programs that actually fail when they are compiled and run.

In summary, implementing this feature would bring both potential benefits and also risks. The goal of this ticket thread is to lay out all the risks and benefits that we can think of, so that we can make an informed decision about when/if we are ready to implement this feature.

@joehendrix
Copy link
Contributor

joehendrix commented Dec 6, 2019

Here's a small example for motivation

#include <stdint.h>

struct s {
    uint64_t x;
    uint8_t b:1;
};

struct s mk(void) {
    struct s t;
    t.x = 17;
    t.b = 1;
    return t;
}

With clang 8.0 this compiles to

%struct.s = type { i64, i8 }

; Function Attrs: noinline nounwind optnone ssp uwtable
define { i64, i8 } @mk() #0 {
  %1 = alloca %struct.s, align 8
  %2 = getelementptr inbounds %struct.s, %struct.s* %1, i32 0, i32 0
  store i64 17, i64* %2, align 8
  %3 = getelementptr inbounds %struct.s, %struct.s* %1, i32 0, i32 1
  %4 = load i8, i8* %3, align 8
  %5 = and i8 %4, -2
  %6 = or i8 %5, 1
  store i8 %6, i8* %3, align 8
  %7 = bitcast %struct.s* %1 to { i64, i8 }*
  %8 = load { i64, i8 }, { i64, i8 }* %7, align 8
  ret { i64, i8 } %8
}

The load at %4 is reading uninitialized memory, and per the LLVM language reference, this produces a "undefined value". Clang then does some bit masking in %5 on the code that at least suggests and undef 0 should be 0.

Material in the LLVM literature suggests that some of the LLVM optimizations do not treat undef as a non-deterministic choice between 0 and 1, but rather an independent value that could be optimized to either one as needed on demand. For example, in the code below %y may take the value undef rather than 0.

  %x = undef
  %y = xor %x %x

However, my understanding is that other optimization passes do treat undef as non-deterministic choice however, and this leads to inconsistencies as discussed in papers below.

Based on this, I think if we want to support undefined values, we will need a separate bitmask representing whether each is defined, and these values will need to be stored/loaded from memory. We would also need to define the semantics of what it means for a bit to be undefined in all LLVM operations.

LLVM also has a related concept called "poison". I think the one difference between the two is that and poison 0 is intended to be poison. Although some have advocated eliminating poison in favor of undef, more recently others argue for eliminating undef in favor of poison. If allocated memory becomes poison rather than undef, then clang's current behavior is incorrect.

My recommendation would be to have a very conservative model that has bit-level undef, but add checks on most operations that they are never given any undefined bits. Exceptions would be made for bitmask operations and memory reads/writes.

Related reading:

[llvm-dev] RFC: Killing undef and spreading poison
: https://lists.llvm.org/pipermail/llvm-dev/2016-October/106182.html

Taming Undefined Behavior in LLVM
: https://blog.regehr.org/archives/1496

These both have value level rather than bitlevel poison, which doesn't help with the bitfields problem. Bitlevel poison also lets bitcast form a groupoid, which seems like a good property.

@robdockins
Copy link
Contributor

Yeah, as @joehendrix mentioned, the main obstacle to this is the fact that the LLVM devs haven't come to a coherent understanding of what poison and undef actually mean, so the correctness story is pretty tricky. I'm reluctant to do a bunch of work given one understanding of the semantics of these things, only to have to redo it later. However, it seems we're encountering real examples of programs we'd like to verify that we can't, so perhaps a fairly conservative expansion of the semantics is warranted.

I think I agree with Joe that a simple bitmask for definedness would get us most of what we need. I suggest that bitwise operations would operate bitwise on definedness, and arithmetic operations would require all input bits to be defined, otherwise the result is totally undefined. Likewise, pointer operations require totally defined values. It would become UB to branch on an undefined bit. Select values on an undefined bit becomes undefined. We'd have to make case-by-case decisions about what to do with intrinsics. I think that probably keeps us from verifying any programs that LLVM considers UB, but still allows some basic patterns we're seeing in the wild. Of course, the devil is in the details, and it would be quite a bit harder to get good error reporting...

@langston-barrett
Copy link
Contributor

I think bit- vs. value-level undef and poison is in the category of things that it would be reasonable to allow an advanced user to configure. We can be strict by default and point users to documentation that covers the nuances of these decisions and how to enable or disable various checks in error messages.

@robdockins robdockins added the poison Issues touching on the LLVM concept of poison label Mar 26, 2021
@robdockins
Copy link
Contributor

Since we first opened this issue, LLVM has added the freeze instruction. https://llvm.org/docs/LangRef.html#freeze-instruction

The express purpose of this instruction is to stop the propagation of undef and poison values. I think this instruction first appeared in LLVM 10 or 11.

We should perhaps revisit this now, and see if consensus has arisen about these issues.

@RyanGlScott
Copy link
Contributor

LLVM 12 also introduces a dedicated poison constant, similar in spirit to the existing undef constant.

RyanGlScott added a commit that referenced this issue Sep 16, 2021
We handle `freeze` instructions in `crucible-llvm` by simply passing the
argument value unchanged. This isn't the intended semantics of `freeze`
(see https://releases.llvm.org/12.0.0/docs/LangRef.html#id323), where `undef`
or `poison` arguments should result in `freeze` returning an arbitrary value.
Unfortunately, Crucible isn't quite equipped to accurately determine if an
given value is `undef` or `poison` in all situations (see #366), so this is the
best that we can do for now.
RyanGlScott added a commit that referenced this issue Dec 5, 2021
This bumps the `llvm-pretty` submodule to bring in the following PRs:

* GaloisInc/llvm-pretty#86 (support LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty#87 (Relax the Symbol requirement to `ConstBlockAddr`)
* GaloisInc/llvm-pretty#88 (add LLVM12 poison value)

This also bumps the `llvm-pretty-bc-parser` submodule to bring in the
following PRs:

* GaloisInc/llvm-pretty-bc-parser#171 (don't consume fast math flag entry)
* GaloisInc/llvm-pretty-bc-parser#169 (make `getBitString` assumptions explicit)
* GaloisInc/llvm-pretty-bc-parser#168 (support for LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty-bc-parser#170 (remove `Aligned` from `SubWord`)
* GaloisInc/llvm-pretty-bc-parser#172 (fix BLOCKADDRESS parsing)
* GaloisInc/llvm-pretty-bc-parser#173 (parse LLVM12 poison value)
* GaloisInc/llvm-pretty-bc-parser#178 (use Seq over list in PartialModule)

Of these, the only PR that requires code changes in Crucible is
GaloisInc/llvm-pretty#88 (add LLVM12 poison value), as this adds a new
`ValPoison` constructor to `Value'`. Properly handling LLVM poison is a large
task that I do not wish to tackle here (see #366 for discussion of how one
might do this). For now, this PR does the bare minimum:

* In `Lang.Crucible.LLVM.Translation.BlockInfo.useVal`, treat `ValPoison`
  as not referencing any identifiers.
* The other two potential sites where `ValPoison` could be matched on are in
  `Lang.Crucible.LLVM.Translation.Constant.transConstant'` and
  `Lang.Crucible.LLVM.Translation.Expr.transValue`. Since we are not handling
  `ValPoison` for now, both of these functions will simply error if they
  encounter `ValPoison`.
* I have added a section to `crucible-llvm`'s `limitations.md` document which
  describes the above limitations of poison handling.
RyanGlScott added a commit that referenced this issue Dec 6, 2021
This bumps the `llvm-pretty` submodule to bring in the following PRs:

* GaloisInc/llvm-pretty#86 (support LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty#87 (Relax the Symbol requirement to `ConstBlockAddr`)
* GaloisInc/llvm-pretty#88 (add LLVM12 poison value)
* GaloisInc/llvm-pretty#90 (Support structs with bitfields in `Text.LLVM.DebugUtils`)

This also bumps the `llvm-pretty-bc-parser` submodule to bring in the
following PRs:

* GaloisInc/llvm-pretty-bc-parser#171 (don't consume fast math flag entry)
* GaloisInc/llvm-pretty-bc-parser#169 (make `getBitString` assumptions explicit)
* GaloisInc/llvm-pretty-bc-parser#168 (support for LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty-bc-parser#170 (remove `Aligned` from `SubWord`)
* GaloisInc/llvm-pretty-bc-parser#172 (fix BLOCKADDRESS parsing)
* GaloisInc/llvm-pretty-bc-parser#173 (parse LLVM12 poison value)
* GaloisInc/llvm-pretty-bc-parser#178 (use Seq over list in PartialModule)

Of these, the only PR that requires code changes in Crucible is
GaloisInc/llvm-pretty#88 (add LLVM12 poison value), as this adds a new
`ValPoison` constructor to `Value'`. Properly handling LLVM poison is a large
task that I do not wish to tackle here (see #366 for discussion of how one
might do this). For now, this PR does the bare minimum:

* In `Lang.Crucible.LLVM.Translation.BlockInfo.useVal`, treat `ValPoison`
  as not referencing any identifiers.
* The other two potential sites where `ValPoison` could be matched on are in
  `Lang.Crucible.LLVM.Translation.Constant.transConstant'` and
  `Lang.Crucible.LLVM.Translation.Expr.transValue`. Since we are not handling
  `ValPoison` for now, both of these functions will simply error if they
  encounter `ValPoison`.
* I have added a section to `crucible-llvm`'s `limitations.md` document which
  describes the above limitations of poison handling.
RyanGlScott added a commit that referenced this issue Dec 6, 2021
This bumps the `llvm-pretty` submodule to bring in the following PRs:

* GaloisInc/llvm-pretty#86 (support LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty#87 (Relax the Symbol requirement to `ConstBlockAddr`)
* GaloisInc/llvm-pretty#88 (add LLVM12 poison value)
* GaloisInc/llvm-pretty#90 (Support structs with bitfields in `Text.LLVM.DebugUtils`)

This also bumps the `llvm-pretty-bc-parser` submodule to bring in the
following PRs:

* GaloisInc/llvm-pretty-bc-parser#171 (don't consume fast math flag entry)
* GaloisInc/llvm-pretty-bc-parser#169 (make `getBitString` assumptions explicit)
* GaloisInc/llvm-pretty-bc-parser#168 (support for LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty-bc-parser#170 (remove `Aligned` from `SubWord`)
* GaloisInc/llvm-pretty-bc-parser#172 (fix BLOCKADDRESS parsing)
* GaloisInc/llvm-pretty-bc-parser#173 (parse LLVM12 poison value)
* GaloisInc/llvm-pretty-bc-parser#178 (use Seq over list in PartialModule)

Of these, the only PR that requires code changes in Crucible is
GaloisInc/llvm-pretty#88 (add LLVM12 poison value), as this adds a new
`ValPoison` constructor to `Value'`. Properly handling LLVM poison is a large
task that I do not wish to tackle here (see #366 for discussion of how one
might do this). For now, this PR does the bare minimum:

* In `Lang.Crucible.LLVM.Translation.BlockInfo.useVal`, treat `ValPoison`
  as not referencing any identifiers.
* The other two potential sites where `ValPoison` could be matched on are in
  `Lang.Crucible.LLVM.Translation.Constant.transConstant'` and
  `Lang.Crucible.LLVM.Translation.Expr.transValue`. Since we are not handling
  `ValPoison` for now, both of these functions will simply error if they
  encounter `ValPoison`.
* I have added a section to `crucible-llvm`'s `limitations.md` document which
  describes the above limitations of poison handling.
@RyanGlScott
Copy link
Contributor

More motivation for implementing this is supporting Rust programs which use enums. For example, in this program:

// test.rs
#[no_mangle]
fn test() -> Option<u64> {
    None
}

fn main() {
    test();
}

If you attempt to run this with the following crux-llvm configuration:

-- test.config
entry-point: "test"
no-compile: yes
make-executables: no

It will fail thusly:

$ ~/Software/crux-llvm-0.8/bin/crux-llvm --config test.config test.bc
[Crux] Using pointer width: 64 for file test.bc
[Crux] Simulating function test
[Crux] Attempting to prove verification conditions.
[Crux] Found counterexample for verification goal
[Crux]   internal: error: in test
[Crux]   Error during memory load
[Crux]     No previous write to this location was found
[Crux]       Attempting load at type: i64
[Crux]     Performing overall load at type: i64
[Crux]       Via pointer: (15, 0x8:[64])
[Crux]     In memory state:
[Crux]       Stack frame test
[Crux]         Allocations:
[Crux]           StackAlloc 15 0x10:[64] Mutable 8-byte-aligned internal
[Crux]         Writes:
[Crux]           Indexed chunk:
[Crux]             15 |->   memset (15, 0x0:[64]) 0x0:[8] 0x8:[64]
[Crux]       Base memory
[Crux]         Allocations:
[Crux]           GlobalAlloc 14 0x30:[64] Immutable 8-byte-aligned [global variable  ] vtable.0
[Crux]           GlobalAlloc 13 0x0:[64] Immutable 1-byte-aligned [defined function ] main
[Crux]           GlobalAlloc 12 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN4test4main17h6eb08de441479d94E
[Crux]           GlobalAlloc 11 0x0:[64] Immutable 1-byte-aligned [defined function ] test
[Crux]           GlobalAlloc 10 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN54_$LT$$LP$$RP$$u20$as$u20$std..process..Termination$GT$6report17hf7d1fb29bee70df6E
[Crux]           GlobalAlloc 9 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17hd8d9e79b229f389cE
[Crux]           GlobalAlloc 8 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN4core3ops8function6FnOnce9call_once17hcea043d42613d97fE
[Crux]           GlobalAlloc 7 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN4core3ops8function6FnOnce9call_once17h37ab5b9159465bd1E
[Crux]           GlobalAlloc 6 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN4core3ops8function6FnOnce40call_once$u7b$$u7b$vtable.shim$u7d$$u7d$17h9e3aba0b2a54756fE
[Crux]           GlobalAlloc 5 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN3std2rt10lang_start28_$u7b$$u7b$closure$u7d$$u7d$17h374838e18861e5b3E
[Crux]           GlobalAlloc 4 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN3std2rt10lang_start17h719b0a0de093a116E
[Crux]           GlobalAlloc 3 0x0:[64] Immutable 1-byte-aligned [defined function ] _ZN3std10sys_common9backtrace28__rust_begin_short_backtrace17h9afeff647bf13b0bE
[Crux]           GlobalAlloc 2 0x0:[64] Immutable 1-byte-aligned [external function] _ZN3std2rt19lang_start_internal17hf797316994c07e64E
[Crux]           GlobalAlloc 1 0x0:[64] Immutable 1-byte-aligned [external function] rust_eh_personality
[Crux]         Writes:
[Crux]           Indexed chunk:
[Crux]             14 |->   *(14, 0x0:[64]) := {base+0 = (9, 0x0:[64])
[Crux]                      ,base+8 = "\b\NUL\NUL\NUL\NUL\NUL\NUL\NUL\b\NUL\NUL\NUL\NUL\NUL\NUL\NUL"
[Crux]                      ,base+24 = (6, 0x0:[64])
[Crux]                      ,base+32 = (5, 0x0:[64])
[Crux]                      ,base+40 = (5, 0x0:[64])}
[Crux]     in context:
[Crux]       test
[Crux] Goal status:
[Crux]   Total: 1
[Crux]   Proved: 0
[Crux]   Disproved: 1
[Crux]   Incomplete: 0
[Crux]   Unknown: 0
[Crux] Overall status: Invalid.

The LLVM code for test reveals why:

; Function Attrs: nonlazybind uwtable
define dso_local { i64, i64 } @test() unnamed_addr #1 {
start:
  %0 = alloca { i64, i64 }, align 8
  store i64 0, ptr %0, align 8
  %1 = getelementptr inbounds { i64, i64 }, ptr %0, i32 0, i32 0
  %2 = load i64, ptr %1, align 8, !range !5, !noundef !4
  %3 = getelementptr inbounds { i64, i64 }, ptr %0, i32 0, i32 1
  %4 = load i64, ptr %3, align 8
  %5 = insertvalue { i64, i64 } undef, i64 %2, 0
  %6 = insertvalue { i64, i64 } %5, i64 %4, 1
  ret { i64, i64 } %6
}

At the LLVM level, the enum type Option<u64> is represented as a pair { i64, i64 }, where the first i64 is the discriminant value and the second i64 value is the payload value. The first i64 is expected to be 0 for None values and 1 for Some values. If the discriminant is 1, then the payload is whatever the u64 field of Some is; otherwise, the payload is undefined.

In the LLVM code above, we see that it stack-allocates an Option<u64> value (using alloca) and sets the discriminant value to 0 (using store) to indicate that it is a None value. Note that it does not set the payload value, however, so the payload is undefined. Then, it reads the discriminant and payload values (using load), constructs a pair out of them, and returns the resulting pair.

The %4 = load i64, ptr %3, align 8 instruction (where the payload is read) is the sticking point. crucible-llvm interprets this instruction as if it were an uninitialized stack read. According to the alloca section of the LLVM Language Reference manual, however, loading memory from an allocated pointer which hasn't been initialized isn't undefined behavior per se. Instead, it returns an undef value, which has defined semantics (as noted earlier in this issue).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm poison Issues touching on the LLVM concept of poison
Projects
None yet
Development

No branches or pull requests

5 participants