Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heapster: Translation of LLVM globals is fragile, depends on getelementptr indices being zero #1875

Closed
RyanGlScott opened this issue May 30, 2023 · 0 comments
Labels
subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm subsystem: heapster Issues specifically related to memory verification using Heapster type: bug Issues reporting bugs or unexpected/unwanted behavior

Comments

@RyanGlScott
Copy link
Contributor

Given these files:

-- Blah.sawcore
module Blah where
; test.ll

@D = global [2 x i32] [i32 27, i32 42]
@E = global i32* getelementptr inbounds ([2 x i32], [2 x i32]* @D, i64 0, i64 0)

define i32 @f() #0 {
  %1 = load i32*, i32** @E, align 8
  %2 = load i32, i32* %1, align 4
  ret i32 %2
}
// test.saw
enable_experimental;
env <- heapster_init_env_for_files "Blah.sawcore" ["test.bc"];

heapster_define_perm env "int32" " " "llvmptr 32" "exists x:bv 32.eq(llvmword(x))";

heapster_typecheck_fun env
    "f"
    "(). empty -o \
       \ ret:int32<>";

heapster_export_coq env "test_gen.v";

If you run test.saw, it will typecheck without issue:

$ llvm-as-10 test.ll -o test.bc
$ ../Haskell/saw-script/bin/saw test.saw
$ cat test_gen.v

(** Mandatory imports from saw-core-coq *)
From Coq Require Import Lists.List.
From Coq Require Import String.
From Coq Require Import Vectors.Vector.
From CryptolToCoq Require Import SAWCoreScaffolding.
From CryptolToCoq Require Import SAWCoreVectorsAsCoqVectors.
Import VectorNotations.

(** Post-preamble section specified by you *)
From CryptolToCoq Require Import SAWCorePrelude.
From CryptolToCoq Require Import SAWCoreBitvectors.
From CryptolToCoq Require Import SpecMExtra.


(** Code generated by saw-core-coq *)

Module Blah.

Definition __x22D__x22 : SAWCoreVectorsAsCoqVectors.Vec (SAWCoreVectorsAsCoqVectors.bvToNat 64 (intToBv 64 2%Z)) (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) :=
  SAWCorePrelude.genBVVecFromVec 2 (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) [intToBv 32 0x1b%Z; intToBv 32 0x2a%Z] (intToBv 32 0%Z) 64 (intToBv 64 2%Z).

Definition __x22E__x22 : SAWCoreVectorsAsCoqVectors.Vec (SAWCoreVectorsAsCoqVectors.bvToNat 64 (intToBv 64 2%Z)) (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) :=
  __x22D__x22.

Definition f__frame : @Coq.Init.Datatypes.list SpecM.LetRecType :=
  @Coq.Init.Datatypes.cons SpecM.LetRecType (SpecM.LRT_Ret (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool)) (@Coq.Init.Datatypes.nil SpecM.LetRecType).

Definition f__bodies : @SpecM.FrameTuple SAWCorePrelude.VoidEv (SAWCorePrelude.pushFunStack f__frame SAWCorePrelude.emptyFunStack) f__frame :=
  pair (@SpecM.RetS SAWCorePrelude.VoidEv (SAWCorePrelude.pushFunStack f__frame SAWCorePrelude.emptyFunStack) (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) (SAWCorePrelude.atBVVec 64 (intToBv 64 2%Z) (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) __x22E__x22 (intToBv 64 0%Z) ltac:(solveUnsafeAssertBVULt))) tt.

Definition f : @SpecM.SpecM SAWCorePrelude.VoidEv SAWCorePrelude.emptyFunStack (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) :=
  @SpecM.MultiFixS SAWCorePrelude.VoidEv SAWCorePrelude.emptyFunStack f__frame f__bodies (@SpecM.mkFrameCall f__frame 0).

End Blah.

Now, let's change the definition of @E to index into the second element of D instead of the first:

-@E = global i32* getelementptr inbounds ([2 x i32], [2 x i32]* @D, i64 0, i64 0)
+@E = global i32* getelementptr inbounds ([2 x i32], [2 x i32]* @D, i64 0, i64 1)

Now, if you run SAW, the resulting test_gen.v file will have a type error!

$ llvm-as-10 test.ll -o test.bc
$ ../Haskell/saw-script/bin/saw test.saw
$ cat test_gen.v

(** Mandatory imports from saw-core-coq *)
From Coq Require Import Lists.List.
From Coq Require Import String.
From Coq Require Import Vectors.Vector.
From CryptolToCoq Require Import SAWCoreScaffolding.
From CryptolToCoq Require Import SAWCoreVectorsAsCoqVectors.
Import VectorNotations.

(** Post-preamble section specified by you *)
From CryptolToCoq Require Import SAWCorePrelude.
From CryptolToCoq Require Import SAWCoreBitvectors.
From CryptolToCoq Require Import SpecMExtra.


(** Code generated by saw-core-coq *)

Module Blah.

Definition __x22D__x22 : SAWCoreVectorsAsCoqVectors.Vec (SAWCoreVectorsAsCoqVectors.bvToNat 64 (intToBv 64 2%Z)) (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) :=
  SAWCorePrelude.genBVVecFromVec 2 (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) [intToBv 32 0x1b%Z; intToBv 32 0x2a%Z] (intToBv 32 0%Z) 64 (intToBv 64 2%Z).

Definition f__frame : @Coq.Init.Datatypes.list SpecM.LetRecType :=
  @Coq.Init.Datatypes.cons SpecM.LetRecType (SpecM.LRT_Ret (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool)) (@Coq.Init.Datatypes.nil SpecM.LetRecType).

Definition f__bodies : @SpecM.FrameTuple SAWCorePrelude.VoidEv (SAWCorePrelude.pushFunStack f__frame SAWCorePrelude.emptyFunStack) f__frame :=
  pair (@SpecM.ErrorS SAWCorePrelude.VoidEv (SAWCorePrelude.pushFunStack f__frame SAWCorePrelude.emptyFunStack) (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) "At internal ($0 = resolveGlobal saw:llvm_memory ""E"")
  Regs: 
  Input perms: ghost_frm:llvmframe []
  Type-checking failure: LLVM_ResolveGlobal: no perms for global E

  "%string) tt.

Definition f : @SpecM.SpecM SAWCorePrelude.VoidEv SAWCorePrelude.emptyFunStack (SAWCoreVectorsAsCoqVectors.Vec 32 Coq.Init.Datatypes.bool) :=
  @SpecM.MultiFixS SAWCorePrelude.VoidEv SAWCorePrelude.emptyFunStack f__frame f__bodies (@SpecM.mkFrameCall f__frame 0).

End Blah.

In particular, here is the type error:

At internal ($0 = resolveGlobal saw:llvm_memory ""E"")
  Regs: 
  Input perms: ghost_frm:llvmframe []
  Type-checking failure: LLVM_ResolveGlobal: no perms for global E

Why is this happening? If you look how Heapster translates global variables, you will come across this peculiar piece of code:

translateLLVMGEP w (L.Array _ tp) vtrans (L.Typed _ (L.ValInteger 0) : ixs) =
translateLLVMGEP w tp vtrans ixs

This will translate a constant getelementptr expression that indexes into an array, but only if it uses the index 0. Because we changed the code above to use index 1, we reach the catch-all error case below:

translateLLVMGEP _ tp _ ixs =
traceAndZeroM ("translateLLVMGEP cannot handle arguments:\n" ++
" " ++ intercalate "," (show tp : map show ixs))

Admittedly, this example is a contrived one, as I had to hand-write LLVM bitcode to trigger it. But I don't think it's totally unreasonable either—Clang could quite conceivably optimize some array constant–heavy code into something that looks like this. For this reason, I find the current translation of array constants in Heapster somewhat uncomfortable, as we seem to be skating on thin ice.

This issue aims to find a more robust way to implement global constant translation in Heapster that avoids this issue.

@RyanGlScott RyanGlScott added type: bug Issues reporting bugs or unexpected/unwanted behavior subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm subsystem: heapster Issues specifically related to memory verification using Heapster labels May 30, 2023
@mergify mergify bot closed this as completed in f8e3ce8 Jun 2, 2023
yav pushed a commit that referenced this issue Jun 16, 2023
The previous implementation of `translateLLVMGEP` was needlessly complicated,
as it required the pointer argument to have a very particular shape to its
type.  But there's no good reason for this requirement, as the only thing that
`translateLLVMGEP` _really_ needs to check for is that all of the index
arguments are `0` (so that no pointer offset needs to be computed). I have
simplified the implementation of `translateLLVMGEP` to reflect this and
expanded its Haddocks accordingly.

A secondary benefit of this change is that we no longer match on `PtrTo` in
`translateLLVMGEP`, which will make it easier to support opaque pointers in an
upcoming commit.

Fixes #1875.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
subsystem: crucible-llvm Issues related to LLVM bitcode verification with crucible-llvm subsystem: heapster Issues specifically related to memory verification using Heapster type: bug Issues reporting bugs or unexpected/unwanted behavior
Projects
None yet
Development

No branches or pull requests

1 participant