Skip to content

C++: Value number performance fix #2835

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Feb 15, 2020
Merged

Conversation

MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented Feb 13, 2020

This PR fixes a performance problem observed when an FieldAddressInstruction's getField predicate returns more than one field.

The fix is the same as in the AST library: we simply filter out those FieldAddressInstructions that have too many fields.

The PR also fixes a problem caused by a ConstantInstruction having multiple value numbers due to the instruction having more than one IR return type.

@MathiasVP MathiasVP added the C++ label Feb 13, 2020
@MathiasVP MathiasVP requested a review from jbj February 13, 2020 17:05
@MathiasVP MathiasVP requested a review from a team as a code owner February 13, 2020 17:05
@MathiasVP
Copy link
Contributor Author

MathiasVP commented Feb 13, 2020

3 tests are failing now due to the change in value numbering for ConstantInstructions. This snippet is representative of all the failing testcases:

 #   56|     r56_21(int)           = Constant[1]          :
-#   56|         valnum = unique
+#   56|         valnum = r56_21, r62_3
...
 #   62|     r62_3(unsigned int)        = Constant[1]             :
-#   62|         valnum = unique
+#   62|         valnum = r56_21, r62_3

I'm guessing we do want to distinquish these two values with different value numbers.

Edit: Fixed this in 57613d5

MathiasVP and others added 3 commits February 13, 2020 21:49
…stant with different signed-ness the same value number. Instead filter those with more than one type out.
Instructions that are removed from the normal value numbering recursion
because they have a duplicated type or AST element get unique value
numbers rather than going unnumbered. This ensures comparisons of value
numbers using `!=` hold for filtered instructions.
) {
loadTotalOverlapValueNumber(_, irFunc, type, memOperand, operand)
TLoadTotalOverlapValueNumber(IRFunction irFunc, TValueNumber memOperand, TValueNumber operand) {
loadTotalOverlapValueNumber(_, irFunc, memOperand, operand)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the removal of type here make us confused about the following (inspired by #2772 (comment))?

double foo(int* p, int choice) {
  if (choice == 1)
    return *(int*)p; // 1
  else if (choice == 2)
    return (int)*(char*)p; // 2
  else
    return *(float*)p; // 3
}

I think one could argue that 1 and 3 should have the same value number before their conversion to double since they're the same bit pattern, but it seems wrong to give 1 and 2 the same value number. It seems especially wrong to give the same value number to the * in 1 and the (int) in 2.

Copy link
Contributor Author

@MathiasVP MathiasVP Feb 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the following test function to ir_gvn.ql:

void foo(int* p) {
  double d1 = *(int*)p;
  double d2 = (int)*(char*)p;
  double d3 = *(float*)p;
}

gives the following value numbers with this PR:

r162_5 = *(int*)p // r162_5, r163_5, r164_5
r163_5 = *(char*)p // r162_5, r163_5, r164_5
r162_6 = (double)*(int*)p  // r162_6, r163_6, r164_6
r163_6 = (int)*(char*)p // r162_6, r163_6, r164_6
r164_5 = *(float*)p // r162_5, r163_5, r164_5
r164_6 = (double)*(float*)p // r162_6, r163_6, r164_6

So the expressions *(int*)p and *(float*)p have the same value number. (i.e., // 1 and // 3 have the same value number before conversions to double).
But d1 and d2 are assigned the same value number, sadly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least based on the current design, *(int*)p and *(float*)p should not have the same value number. The intention is that two results will only have the same value number if they have the same bit pattern and have the same IRType.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can someone check if the removal of type here is necessary for performance of bloomberg/bde?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal is unnecessary for performance on bloomberg/bde, but we also need #2844 for it.

Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM. I only contributed an autoformat commit, so I'll allow myself to merge it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants