Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix expansion of signed ranges to masks #3212

Merged
merged 11 commits into from
Apr 20, 2022
Merged

Conversation

vlstill
Copy link
Contributor

@vlstill vlstill commented Apr 13, 2022

This fixes "#2800 Check whether the expansion of a range into ternary values is correct for signed numbers".

Copy link
Contributor

@jafingerhut jafingerhut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed the test programs and their -midend results output by the compiler, and they all look good to me. I have not attempted to review the C++ code changes.

@jafingerhut
Copy link
Contributor

I am not familiar enough with the error output from the failing test to understand what it means, but perhaps that is output from the Gauntlet tests looking for non-equivalent programs output by the front or midend? If so, it might actually be a bug in the Gauntlet equivalence checker for this new feature, but again, I do not know enough about its output or implementation to be able to tell. @fruffy ?

@@ -21,7 +21,7 @@
namespace P4 {

std::vector<const IR::Mask *>
DoReplaceSelectRange::rangeToMasks(const IR::Range *r) {
DoReplaceSelectRange::rangeToMasks(const IR::Range *r, int keyIndex) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's actually nicer to return a list of vectors for the result.
one in the common case, two if the range contains 0.
the keyIndex and signedIndicesToReplace make the data flow much harder to understand.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that signedIndicesToReplace does make the flow harder to understand. But this is mainly because it is indeed complex -- the indices are detected when range is expanded, but they are used in the corresponding select through which we backtrack later (making use of postorder). This needs to be done this way as at the time we encounter the select first, we don't know which values will contain ranges and when working with ranges we cannot replace a parent node (at least I don't think so, I could be wrong since I am a beginner to p4c). The only alternative to indices I see would be to remember the actual corresponding select expressions, but that in my opinion improves nothing and could lead to danger in case of some weird select with duplicated keys:

select (foo.a, foo.b, foo.a) {
   -5..5, 42, _ : ...
   _, 16, -32000 : ...
}

In this (rather pathological) case, we need to insert the bitcast only to the first occurence of foo.a not to both instances of foo.a expression in select.


As for the vector-of-vectors. I don't think this is a good idea as returning a vector of vectors implies that there can be any number of vectors, including 0 and more than 2. Therefore I think that would be bad design of the function as the type would allow too many invalid possibilities. It could make sense to return pair (isSigned, vectors) and resolve the index one level up in the postorder of mask.


For now, I have just added a comment about the indices into the class.


range_size_remaining -= match_stride;
min += match_stride;
while (range_size_remaining > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should become a separate function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have refactored the rangeToMasks function a bit since indeed it was quite long.

state start {
packet.extract(hdr.h);
transition select(hdr.h.a, hdr.h.b) {
(0 .. 7, -5 .. 4): parse;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a test that has 2 ranges containing 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I have added one more test with two signed fields (not both of them are going over 0 since I realized I did not have any test for purely negative range. I think the "going over zero" case is covered sufficiently and this test mainly covers the missing "multiple fields are signed" case.

@fruffy
Copy link
Collaborator

fruffy commented Apr 13, 2022

@jafingerhut Let me check. Considering this is a new feature I am guessing the tool does not have support for it yet.

@vlstill
Copy link
Contributor Author

vlstill commented Apr 14, 2022

@jafingerhut, @fruffy, looking at the output (not knowing what is is really :-D, but masing it on my knowledge of bit-vector SMT)

  • it seems to me this is having problems with frontend code (issue2800b-FrontEnd_0_P4V1::getV1ModelVersion.p4, issue2800b-FrontEnd_12_TypeInference.p4) which suggest the problem is with the input, not the code produced by midend
  • for hdr.h.a (model issue2800b) one of the inferred bounds is 65531 (0xfffc) which is unsigned reinterpretation of -5 which is the lower bound in the code. This suggests (together with use of bvule instead of bvsle (bit-vector unsigned/signed less then)) that the verifier is not working correctly with signed values
  • I think the condition is even more wrong (reformated & commented):
(let ((a!1 (ite  ; if
      (and (bvule #xfffb ((_ extract 31 16) extract_hdr))        ; 0xfffb <= extract_hdr[31:16]    (a)
           (= ((_ extract 31 19) extract_hdr) #b0000000000000)   ; and extract_hdr[31:19] == 0     (b)
           (bvule ((_ extract 18 16) extract_hdr) #b100)         ; and extract_hdr[18:16] <= 4     (c)
           (= ((_ extract 15 3) extract_hdr) #b0000000000000))   ; and extract_hdr[15:3] == 0      (d)
      #xffff ; then the result is 0xffff (that is -1, goign through 'parse' state)
      ((_ extract 31 16) extract_hdr)))) ; else it is the value extracted from the packet
    (ite extract_hdr_valid a!1 invalid))
  • obviously, the conditions (a) and (c) overlap (and (a) is wrong), but also (b) interacts weirdly with (a) in my opinion (if higher 13 bits are zero, then obviously the largest possible number would be 0x7 not 0xfffc) -> I would expect the zero check to be part of the other bound, but maybe this is just artefact of the mishandled constants again

@vlstill
Copy link
Contributor Author

vlstill commented Apr 14, 2022

The condition is even clearer for the a test with only one field:

(let ((a!1 (ite ( ; if
        and (bvule #xfffb extract_hdr)                          ; 0xfffb <= extract_hdr   (a)
            (= ((_ extract 15 3) extract_hdr) #b0000000000000)  ; extract_hdr[15:3] == 0  (b)
            (bvule ((_ extract 2 0) extract_hdr) #b100)         ; extract_hdr[2:0] == 4   (c)
        ) ; end condition and
        #xffff ; <- then
        extract_hdr ; <- else
     )))
     (ite extract_hdr_valid a!1 invalid))

A correct and would have something like: (and (bvsle #xfffb extract_hdr) (bvsle extract_hdr 0x0004)).

@fruffy
Copy link
Collaborator

fruffy commented Apr 14, 2022

Thanks for looking into this? Yeah I believe it is a type conversion issue in the range matching. The validator uses its own type inference which may have flaws. I xfailed those couple programs until I have implemented the fix.

@vlstill
Copy link
Contributor Author

vlstill commented Apr 14, 2022

@fruffy, thanks. So if @mbudiu-vmw is happy with my changes this can be merged.

@@ -115,12 +143,47 @@ DoReplaceSelectRange::cartesianAppend(const std::vector<IR::Vector<IR::Expressio
return newVecs;
}

const IR::Node *DoReplaceSelectRange::preorder(IR::SelectExpression *e) {
BUG_CHECK(!inSelect, "A select nested in select: %1%", e);
inSelect = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general you don't need such a boolean flag.
It can be replaced with findContext<IR::SelectExpression>() != nullptr, called from the child node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag is there mostly to avoid introduction of more inconsistencies if AST is unexpected (or visitor is broken, which is probably significantly less likely). We could avoid it altogether but I believe a cheap assert is better then possibility on long debugging in future. If I replace it with findContext I don't think it would be that cheap.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compiler code is not optimized for speed, but rather for correctness, readability, and maintainability. I think that findContext is more readable, and it is in only one place, so it is more maintainable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

// Collects select indices which will need to be replaced with bitcast of
// the original value to unsigned. This is needed if we encounter a range
// over a signed value at the given index.
std::set<int> signedIndicesToReplace;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not a set of size_t or unsigned values?
I would change this comment to say "an index i is in this set if selectExpression->components[i] needs to be cast from int to bit". This is only needed if there is a label that has in the i-th position a range expression that contains 0 inside the range".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Because old habits die hard.) Changed to size_t.

As for the comment, sure, except that we need to replace it any time the value is signed, not only when it crosses over zero. Masks are only defined for bit values in P4. I changed it.

@mihaibudiu
Copy link
Contributor

These are minor changes requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants