-
Notifications
You must be signed in to change notification settings - Fork 786
[analysis] Add a "Shared" lattice to represent shared state #6067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Current dependencies on/for this PR:
This stack of pull requests is managed by Graphite. |
518957f
to
e8e0ab4
Compare
09b5289
to
5527cd0
Compare
I followed up to here, and I got stuck. Can you maybe give a concrete example of this? As context, after reading the first paragraph I was expecting to see something like a "wrapper" for a lattice element that allows us to refer to an element without copying it. But I don't know why we need some new kind of structure here (the new chain) or even what that precisely means. |
Unfortunately we can't just have a To determine whether a block needs to be re-analyzed, we keep a sequence number alongside the pointer to the shared underlying element. When we propagate new results to the beginning of the block, we know the block needs to be re-analyzed if and only if the propagated sequence number is higher than the old sequence number. Every time the value of the shared underlying element increases, we increase the current sequence number so that any block that was analyzed based on a previous value will be re-analyzed. The sequence numbers are an important part of getting the new lattice to do what we need it to do, but they're also mathematically interesting because each successive sequence number corresponds to a higher value that the shared underlying element has taken on at some point in time. The sequence of those values through time form an increasing chain in the underlying lattice and the sequence numbers are both timestamps and indices into that chain. It's also nice to know that this |
e8e0ab4
to
f65d8a0
Compare
cc7fb30
to
b30a19c
Compare
f65d8a0
to
3e6b7dd
Compare
The analysis framework stores a separate lattice element for each basic block being analyzed to represent the program state at the beginning of the block. However, in many analyses a significant portion of program state is not flow-sensitive, so does not benefit from having a separate copy per block. For example, an analysis might track constraints on the types of locals that do not vary across blocks, so it really only needs a single copy of the constrains for each local. It would be correct to simply duplicate the state across blocks anyway, but it would not be efficient. To make it possible to share a single copy of a lattice element across basic blocks, introduce a `Shared<L>` lattice. Mathematically, this lattice represents a single ascending chain in the underlying lattice and its elements are ordered according to sequence numbers corresponding to positions in that chain. Concretely, though, the `Shared<L>` lattice only ever materializes a single, monotonically increasing element of `L` and all of its elements provide access to that shared underlying element. `Shared<L>` will let us get the benefits of having mutable shared state in the concrete implementation of analyses without losing the benefits of keeping those analyses expressible purely in terms of the monotone framework.
9a4797e
to
bb58f6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, the idea now makes sense to me, and lgtm % comments
src/analysis/lattices/shared.h
Outdated
// Internally, there is only ever a single monotonically increasing element of L | ||
// materialized. Dereferencing any element of the Shared lattice will produce | ||
// the current value of that single element of L, which is generally safe | ||
// because the current value always overapproximates the value at the time of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does "overapproximate" mean here? A "higher" value in the lattice L?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep 👍
src/analysis/lattices/shared.h
Outdated
template<Lattice L> struct Shared { | ||
class Element { | ||
const typename L::Element* val; | ||
uint32_t seq = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to see some of the comments in the PR itself added as comments here, to explain what seq
is etc.
src/analysis/lattices/shared.h
Outdated
// sequence number. The sequence numbers monotonically increase along with | ||
// `val` and serve to provide ordering between elements of this lattice. | ||
mutable typename L::Element val; | ||
mutable uint32_t seq = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be the 2nd and 3rd time we use mutable
in this entire codebase... can we avoid that?
If we can't please add a comment explaining why this dangerous pattern is necessary here.
src/analysis/lattices/shared.h
Outdated
template<Lattice L> struct Shared { | ||
class Element { | ||
const typename L::Element* val; | ||
uint32_t seq = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint32_t seq = 0; | |
Index seq = 0; |
Index seems appropriate as this number is tied to the number of basic blocks (and iterations on them), which is tied to maximum sizes in wasm files. That is, if wasm supports more then 32 bits here, we'd want this type to change, which Index
will do for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This number is not tied to the number of basic blocks. It is the number of times the shared value is increased, which can be dramatically higher than the number of basic blocks. Since analysis continues to a fixed point, it can visit basic blocks an unbounded number of times and each visit to a basic block may potentially increase the value multiple times.
Since this isn't actually an index into anything that is ever materialized, I would prefer to keep it a uint32_t for now, or possibly make it a uint64_t right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sgtm. How about using a name for it for convenience later if we need to change it, using Seq = uint32_t;
?
test/gtest/lattices.cpp
Outdated
|
||
auto zero = shared.getBottom(); | ||
|
||
auto one = zero; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am having trouble reading a line that says literally 1 == 0
...
(And later down I see 2 == 1
. By induction, all natural numbers are apparently 0
... 🤔 😆 )
How about a, b, c
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only way to increase a shared value is via a join
, so to create our one
value we have to start from zero and add one via the join on the next line. I'll clarify this in the code.
|
||
EXPECT_EQ(shared.compare(one, zero), analysis::GREATER); | ||
EXPECT_EQ(shared.compare(one, one), analysis::EQUAL); | ||
EXPECT_EQ(shared.compare(one, uno), analysis::EQUAL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure I follow, uno
had an extra join
done on it, but it was of the same value as we had before, so the sequence number was not increased, hence it is equal to one
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right 👍
…bly#6067) The analysis framework stores a separate lattice element for each basic block being analyzed to represent the program state at the beginning of the block. However, in many analyses a significant portion of program state is not flow-sensitive, so does not benefit from having a separate copy per block. For example, an analysis might track constraints on the types of locals that do not vary across blocks, so it really only needs a single copy of the constrains for each local. It would be correct to simply duplicate the state across blocks anyway, but it would not be efficient. To make it possible to share a single copy of a lattice element across basic blocks, introduce a `Shared<L>` lattice. Mathematically, this lattice represents a single ascending chain in the underlying lattice and its elements are ordered according to sequence numbers corresponding to positions in that chain. Concretely, though, the `Shared<L>` lattice only ever materializes a single, monotonically increasing element of `L` and all of its elements provide access to that shared underlying element. `Shared<L>` will let us get the benefits of having mutable shared state in the concrete implementation of analyses without losing the benefits of keeping those analyses expressible purely in terms of the monotone framework.
The analysis framework stores a separate lattice element for each basic block
being analyzed to represent the program state at the beginning of the block.
However, in many analyses a significant portion of program state is not
flow-sensitive, so does not benefit from having a separate copy per block. For
example, an analysis might track constraints on the types of locals that do not
vary across blocks, so it really only needs a single copy of the constrains for
each local. It would be correct to simply duplicate the state across blocks
anyway, but it would not be efficient.
To make it possible to share a single copy of a lattice element across basic
blocks, introduce a
Shared<L>
lattice. Mathematically, this lattice representsa single ascending chain in the underlying lattice and its elements are ordered
according to sequence numbers corresponding to positions in that chain.
Concretely, though, the
Shared<L>
lattice only ever materializes a single,monotonically increasing element of
L
and all of its elements provide accessto that shared underlying element.
Shared<L>
will let us get the benefits of having mutable shared state in theconcrete implementation of analyses without losing the benefits of keeping those
analyses expressible purely in terms of the monotone framework.