Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support structs with bitfields in Text.LLVM.DebugUtils #90

Merged
merged 1 commit into from
Dec 6, 2021

Conversation

RyanGlScott
Copy link
Collaborator

LLVM bitcode doesn't directly record information about bitfields, but its debug information does record this information. Knowing about bitfields is important for certain applications—see, for example, GaloisInc/saw-script#1461. This changes Text.LLVM.DebugUtils such that if any of the fields in a struct have bitfields, it will record this information in the new BitfieldInfo data type.

This requires a backwards-incompatible change to the type of the Structure data constructor. In case we need to add additional fields to Structure in the future, I converted Structure's fields into a record data type, which makes it slightly easier to extend. I also did the same thing to Union for consistency (although this is not strictly necessary).

LLVM bitcode doesn't directly record information about bitfields, but its debug
information _does_ record this information. Knowing about bitfields is
important for certain applications—see, for example, GaloisInc/saw-script#1461.
This changes `Text.LLVM.DebugUtils` such that if any of the fields in a struct
have bitfields, it will record this information in the new `BitfieldInfo`
data type.

This requires a backwards-incompatible change to the type of the `Structure`
data constructor. In case we need to add additional fields to `Structure` in
the future, I converted `Structure`'s fields into a record data type, which
makes it slightly easier to extend. I also did the same thing to `Union` for
consistency (although this is not strictly necessary).
Copy link
Collaborator

@elliottt elliottt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems super reasonable, thanks for adding the thorough comments!

@elliottt elliottt merged commit ed904c6 into GaloisInc:master Dec 6, 2021
RyanGlScott added a commit to GaloisInc/crucible that referenced this pull request Dec 6, 2021
This bumps the `llvm-pretty` submodule to bring in the following PRs:

* GaloisInc/llvm-pretty#86 (support LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty#87 (Relax the Symbol requirement to `ConstBlockAddr`)
* GaloisInc/llvm-pretty#88 (add LLVM12 poison value)
* GaloisInc/llvm-pretty#90 (Support structs with bitfields in `Text.LLVM.DebugUtils`)

This also bumps the `llvm-pretty-bc-parser` submodule to bring in the
following PRs:

* GaloisInc/llvm-pretty-bc-parser#171 (don't consume fast math flag entry)
* GaloisInc/llvm-pretty-bc-parser#169 (make `getBitString` assumptions explicit)
* GaloisInc/llvm-pretty-bc-parser#168 (support for LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty-bc-parser#170 (remove `Aligned` from `SubWord`)
* GaloisInc/llvm-pretty-bc-parser#172 (fix BLOCKADDRESS parsing)
* GaloisInc/llvm-pretty-bc-parser#173 (parse LLVM12 poison value)
* GaloisInc/llvm-pretty-bc-parser#178 (use Seq over list in PartialModule)

Of these, the only PR that requires code changes in Crucible is
GaloisInc/llvm-pretty#88 (add LLVM12 poison value), as this adds a new
`ValPoison` constructor to `Value'`. Properly handling LLVM poison is a large
task that I do not wish to tackle here (see #366 for discussion of how one
might do this). For now, this PR does the bare minimum:

* In `Lang.Crucible.LLVM.Translation.BlockInfo.useVal`, treat `ValPoison`
  as not referencing any identifiers.
* The other two potential sites where `ValPoison` could be matched on are in
  `Lang.Crucible.LLVM.Translation.Constant.transConstant'` and
  `Lang.Crucible.LLVM.Translation.Expr.transValue`. Since we are not handling
  `ValPoison` for now, both of these functions will simply error if they
  encounter `ValPoison`.
* I have added a section to `crucible-llvm`'s `limitations.md` document which
  describes the above limitations of poison handling.
RyanGlScott added a commit to GaloisInc/crucible that referenced this pull request Dec 6, 2021
This bumps the `llvm-pretty` submodule to bring in the following PRs:

* GaloisInc/llvm-pretty#86 (support LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty#87 (Relax the Symbol requirement to `ConstBlockAddr`)
* GaloisInc/llvm-pretty#88 (add LLVM12 poison value)
* GaloisInc/llvm-pretty#90 (Support structs with bitfields in `Text.LLVM.DebugUtils`)

This also bumps the `llvm-pretty-bc-parser` submodule to bring in the
following PRs:

* GaloisInc/llvm-pretty-bc-parser#171 (don't consume fast math flag entry)
* GaloisInc/llvm-pretty-bc-parser#169 (make `getBitString` assumptions explicit)
* GaloisInc/llvm-pretty-bc-parser#168 (support for LLVM 11 `IsDefault` fields in metadata)
* GaloisInc/llvm-pretty-bc-parser#170 (remove `Aligned` from `SubWord`)
* GaloisInc/llvm-pretty-bc-parser#172 (fix BLOCKADDRESS parsing)
* GaloisInc/llvm-pretty-bc-parser#173 (parse LLVM12 poison value)
* GaloisInc/llvm-pretty-bc-parser#178 (use Seq over list in PartialModule)

Of these, the only PR that requires code changes in Crucible is
GaloisInc/llvm-pretty#88 (add LLVM12 poison value), as this adds a new
`ValPoison` constructor to `Value'`. Properly handling LLVM poison is a large
task that I do not wish to tackle here (see #366 for discussion of how one
might do this). For now, this PR does the bare minimum:

* In `Lang.Crucible.LLVM.Translation.BlockInfo.useVal`, treat `ValPoison`
  as not referencing any identifiers.
* The other two potential sites where `ValPoison` could be matched on are in
  `Lang.Crucible.LLVM.Translation.Constant.transConstant'` and
  `Lang.Crucible.LLVM.Translation.Expr.transValue`. Since we are not handling
  `ValPoison` for now, both of these functions will simply error if they
  encounter `ValPoison`.
* I have added a section to `crucible-llvm`'s `limitations.md` document which
  describes the above limitations of poison handling.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request Dec 8, 2021
This adds support for writing specifications that talk about bitfields in LLVM
code by way of the new `llvm_points_to_bitfield` command. Broadly speaking,
`llvm_points_to_bitfield ptr fieldName rhs` is like
`llvm_points_to (llvm_field ptr fieldName) rhs`, except that `fieldName` is
required to be the name of a field within a bitfield. The salient details are:

* LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's
  debug metadata does. Support for retrieving bitfield-related metadata was
  added to `llvm-pretty` in GaloisInc/llvm-pretty#90, so this patch bumps the
  `llvm-pretty` submodule to incorporate it. This patch also updates the
  `crucible` submodule to incorporate corresponding changes in
  GaloisInc/crucible#936.

* The `LLVMPointsTo` data type now has a new `LLVMPointsToBitfield` data
  constructor that stores all of the necessary information related to the
  `llvm_points_to_bitfield` command. As a result, the changes in this patch
  are fairly insulated from the rest of SAW, as most of the new code involves
  adding additional cases to handle `LLVMPointsToBitfield`.

* Two of the key new functions are `storePointsToBitfieldValue` and
  `matchPointsToBitfieldValue`, which implement the behavior of
  `llvm_points_to_bitfield` in pre- and post-conditions. These functions
  implement the necessary bit-twiddling to store values in and retrieve values
  out of bitfield. I have left extensive comments in each function describing
  how all of this works.

* Accompanying `llvm_points_to_bitfield` is a new set of
  `{enable,disable}_lax_loads_and_stores` command, which toggles the
  Crucible-side option of the same name. When `enable_lax_loads_and_stores` is
  on, reading from uninitialized memory will return a symbolic value rather
  than failing outright. This is essential to be able to deal with LLVM bitcode
  involving bitfields, as reading a field from a bitfield involves reading the
  entire bitfield at once, which may include parts of the struct that have not
  been initialized yet.

* There are various `test_bitfield_*` test cases under `intTests` to test
  examples of bitfield-related specifications that should and should not
  verify.

* I have also updated `saw-remote-api` and `saw-client` to handle bitfields as
  well, along with a Python-specific test case.

Fixes #1461.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request Dec 10, 2021
This adds support for writing specifications that talk about bitfields in LLVM
code by way of the new `llvm_points_to_bitfield` command. Broadly speaking,
`llvm_points_to_bitfield ptr fieldName rhs` is like
`llvm_points_to (llvm_field ptr fieldName) rhs`, except that `fieldName` is
required to be the name of a field within a bitfield. The salient details are:

* LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's
  debug metadata does. Support for retrieving bitfield-related metadata was
  added to `llvm-pretty` in GaloisInc/llvm-pretty#90, so this patch bumps the
  `llvm-pretty` submodule to incorporate it. This patch also updates the
  `crucible` submodule to incorporate corresponding changes in
  GaloisInc/crucible#936.

* The `LLVMPointsTo` data type now has a new `LLVMPointsToBitfield` data
  constructor that stores all of the necessary information related to the
  `llvm_points_to_bitfield` command. As a result, the changes in this patch
  are fairly insulated from the rest of SAW, as most of the new code involves
  adding additional cases to handle `LLVMPointsToBitfield`.

* Two of the key new functions are `storePointsToBitfieldValue` and
  `matchPointsToBitfieldValue`, which implement the behavior of
  `llvm_points_to_bitfield` in pre- and post-conditions. These functions
  implement the necessary bit-twiddling to store values in and retrieve values
  out of bitfield. I have left extensive comments in each function describing
  how all of this works.

* Accompanying `llvm_points_to_bitfield` is a new set of
  `{enable,disable}_lax_loads_and_stores` command, which toggles the
  Crucible-side option of the same name. When `enable_lax_loads_and_stores` is
  on, reading from uninitialized memory will return a symbolic value rather
  than failing outright. This is essential to be able to deal with LLVM bitcode
  involving bitfields, as reading a field from a bitfield involves reading the
  entire bitfield at once, which may include parts of the struct that have not
  been initialized yet.

* There are various `test_bitfield_*` test cases under `intTests` to test
  examples of bitfield-related specifications that should and should not
  verify.

* I have also updated `saw-remote-api` and `saw-client` to handle bitfields as
  well, along with a Python-specific test case.

Fixes #1461.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request Dec 10, 2021
This adds support for writing specifications that talk about bitfields in LLVM
code by way of the new `llvm_points_to_bitfield` command. Broadly speaking,
`llvm_points_to_bitfield ptr fieldName rhs` is like
`llvm_points_to (llvm_field ptr fieldName) rhs`, except that `fieldName` is
required to be the name of a field within a bitfield. The salient details are:

* LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's
  debug metadata does. Support for retrieving bitfield-related metadata was
  added to `llvm-pretty` in GaloisInc/llvm-pretty#90, so this patch bumps the
  `llvm-pretty` submodule to incorporate it. This patch also updates the
  `crucible` submodule to incorporate corresponding changes in
  GaloisInc/crucible#936.

* The `LLVMPointsTo` data type now has a new `LLVMPointsToBitfield` data
  constructor that stores all of the necessary information related to the
  `llvm_points_to_bitfield` command. As a result, the changes in this patch
  are fairly insulated from the rest of SAW, as most of the new code involves
  adding additional cases to handle `LLVMPointsToBitfield`.

* Two of the key new functions are `storePointsToBitfieldValue` and
  `matchPointsToBitfieldValue`, which implement the behavior of
  `llvm_points_to_bitfield` in pre- and post-conditions. These functions
  implement the necessary bit-twiddling to store values in and retrieve values
  out of bitfield. I have left extensive comments in each function describing
  how all of this works.

* Accompanying `llvm_points_to_bitfield` is a new set of
  `{enable,disable}_lax_loads_and_stores` command, which toggles the
  Crucible-side option of the same name. When `enable_lax_loads_and_stores` is
  on, reading from uninitialized memory will return a symbolic value rather
  than failing outright. This is essential to be able to deal with LLVM bitcode
  involving bitfields, as reading a field from a bitfield involves reading the
  entire bitfield at once, which may include parts of the struct that have not
  been initialized yet.

* There are various `test_bitfield_*` test cases under `intTests` to test
  examples of bitfield-related specifications that should and should not
  verify.

* I have also updated `saw-remote-api` and `saw-client` to handle bitfields as
  well, along with a Python-specific test case.

Fixes #1461.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request Dec 11, 2021
This adds support for writing specifications that talk about bitfields in LLVM
code by way of the new `llvm_points_to_bitfield` command. Broadly speaking,
`llvm_points_to_bitfield ptr fieldName rhs` is like
`llvm_points_to (llvm_field ptr fieldName) rhs`, except that `fieldName` is
required to be the name of a field within a bitfield. The salient details are:

* LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's
  debug metadata does. Support for retrieving bitfield-related metadata was
  added to `llvm-pretty` in GaloisInc/llvm-pretty#90, so this patch bumps the
  `llvm-pretty` submodule to incorporate it. This patch also updates the
  `crucible` submodule to incorporate corresponding changes in
  GaloisInc/crucible#936.

* The `LLVMPointsTo` data type now has a new `LLVMPointsToBitfield` data
  constructor that stores all of the necessary information related to the
  `llvm_points_to_bitfield` command. As a result, the changes in this patch
  are fairly insulated from the rest of SAW, as most of the new code involves
  adding additional cases to handle `LLVMPointsToBitfield`.

* Two of the key new functions are `storePointsToBitfieldValue` and
  `matchPointsToBitfieldValue`, which implement the behavior of
  `llvm_points_to_bitfield` in pre- and post-conditions. These functions
  implement the necessary bit-twiddling to store values in and retrieve values
  out of bitfield. I have left extensive comments in each function describing
  how all of this works.

* Accompanying `llvm_points_to_bitfield` is a new set of
  `{enable,disable}_lax_loads_and_stores` command, which toggles the
  Crucible-side option of the same name. When `enable_lax_loads_and_stores` is
  on, reading from uninitialized memory will return a symbolic value rather
  than failing outright. This is essential to be able to deal with LLVM bitcode
  involving bitfields, as reading a field from a bitfield involves reading the
  entire bitfield at once, which may include parts of the struct that have not
  been initialized yet.

* There are various `test_bitfield_*` test cases under `intTests` to test
  examples of bitfield-related specifications that should and should not
  verify.

* I have also updated `saw-remote-api` and `saw-client` to handle bitfields as
  well, along with a Python-specific test case.

Fixes #1461.
RyanGlScott added a commit to GaloisInc/saw-script that referenced this pull request Dec 21, 2021
This adds support for writing specifications that talk about bitfields in LLVM
code by way of the new `llvm_points_to_bitfield` command. Broadly speaking,
`llvm_points_to_bitfield ptr fieldName rhs` is like
`llvm_points_to (llvm_field ptr fieldName) rhs`, except that `fieldName` is
required to be the name of a field within a bitfield. The salient details are:

* LLVM bitcode itself does not a built-in concept of bitfields, but LLVM's
  debug metadata does. Support for retrieving bitfield-related metadata was
  added to `llvm-pretty` in GaloisInc/llvm-pretty#90, so this patch bumps the
  `llvm-pretty` submodule to incorporate it. This patch also updates the
  `crucible` submodule to incorporate corresponding changes in
  GaloisInc/crucible#936.

* The `LLVMPointsTo` data type now has a new `LLVMPointsToBitfield` data
  constructor that stores all of the necessary information related to the
  `llvm_points_to_bitfield` command. As a result, the changes in this patch
  are fairly insulated from the rest of SAW, as most of the new code involves
  adding additional cases to handle `LLVMPointsToBitfield`.

* Two of the key new functions are `storePointsToBitfieldValue` and
  `matchPointsToBitfieldValue`, which implement the behavior of
  `llvm_points_to_bitfield` in pre- and post-conditions. These functions
  implement the necessary bit-twiddling to store values in and retrieve values
  out of bitfield. I have left extensive comments in each function describing
  how all of this works.

* Accompanying `llvm_points_to_bitfield` is a new set of
  `{enable,disable}_lax_loads_and_stores` command, which toggles the
  Crucible-side option of the same name. When `enable_lax_loads_and_stores` is
  on, reading from uninitialized memory will return a symbolic value rather
  than failing outright. This is essential to be able to deal with LLVM bitcode
  involving bitfields, as reading a field from a bitfield involves reading the
  entire bitfield at once, which may include parts of the struct that have not
  been initialized yet.

* There are various `test_bitfield_*` test cases under `intTests` to test
  examples of bitfield-related specifications that should and should not
  verify.

* I have also updated `saw-remote-api` and `saw-client` to handle bitfields as
  well, along with a Python-specific test case.

Fixes #1461.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants