Skip to content

Conversation

@ilevkivskyi
Copy link
Member

After some discussions with Sully we agreed that essentially inlining all attribute access may be a good idea. For trait attribute access, this requires setting up offset tables.

I didn't add many tests, relying on existing ones, since this PR doesn't add any new functionality. Anyway, please review carefully (ignore whitespace option may be useful).

@ilevkivskyi ilevkivskyi requested a review from msullivan April 11, 2020 20:56
@ilevkivskyi
Copy link
Member Author

It looks like Windows build failed because of empty initializers. Btw @msullivan if you have suggestions for tricky corner cases worth testing (interactions with interpreted classes, separate compilation, etc) please let me know.

@ilevkivskyi
Copy link
Member Author

OK, all green now.

@ilevkivskyi
Copy link
Member Author

In other news: this PR gives a 15%-17% perf boost (unless I messed up something).

@ilevkivskyi
Copy link
Member Author

Unfortunately, the above result, only holds for gcc, for clang the perf win is only 2%.

CPyType_T2,
{offsetof(native__C, _x2), offsetof(native__C, _y2), ...},
...
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we included the offset tables as part of the trait vtables?

Copy link
Collaborator

@msullivan msullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parts I've reviewed mostly look good (involving the getattr and setattr code directly). I haven't looked that closely at the offset table stuff because it seems to me that it should be possible to merge that back into trait vtables and thus save the extra word. (Maybe keep VTableAttr if we are doing it that way, but we can keep ditching it if that makes things cleaner)

if decl_cl.is_trait and op.class_type.class_ir.is_trait:
# For pure trait access find the offset first, offsets
# are ordered by attribute position in the cl.attributes dict.
trait_attr_index = list(decl_cl.attributes).index(op.attr)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put a TODO here or something so we can find this easy if we ever want to improve the perf

@ilevkivskyi
Copy link
Member Author

@msullivan
Thanks for the suggestion, merging offset tables in the trait vtables indeed removed the memory overhead. I tried to play with this a bit and I decided to ditch VTableAttr, it is probably subjective, but it seems simpler to me to have the 3-layer table layout than mix attributes and methods in trait vtables. Also if we will need to make this backend-agnostic, then it should be easy to adjust the logic depending on particular architecture we will choose.

@ilevkivskyi ilevkivskyi requested a review from msullivan April 14, 2020 13:27
Copy link
Collaborator

@msullivan msullivan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple things, but looks great. Thanks for this!

{
CPyType_T1,
C_T1_trait_vtable, // array of method pointers
{offsetof(native__C, _x1), offsetof(native__C, _y1), ...},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it clear that this is a pointer to a table of that, not an inlined table?

op.attr))
# Otherwise, use direct or offset struct access.
attr_expr = self.get_attr_expr(obj, op, decl_cl)
if attr_rtype.is_refcounted:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. This is the existing behavior, so I think it's correct to not change it in this diff, but we really probably should do undefined checks for non-refcounted types. (Which is like, only bool and tuples of bool, but.) In another PR, maybe

'} else {')
self.emitter.emit_inc_ref(attr_expr, attr_rtype)
self.emitter.emit_line('}')
self.emitter.emit_line('{} = {};'.format(dest, attr_expr))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do the assignment to the destination before we do the checks. I think that will actually save us a load, since the compiler won't be able to assume that the value stays the same across the incref call.

@ilevkivskyi ilevkivskyi merged commit ecbd190 into python:master Apr 15, 2020
@ilevkivskyi ilevkivskyi deleted the mypyc-attr-offset branch April 15, 2020 09:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants