-
-
Notifications
You must be signed in to change notification settings - Fork 3k
[mypyc] Use struct access instead of native getters/setters #8660
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
It looks like Windows build failed because of empty initializers. Btw @msullivan if you have suggestions for tricky corner cases worth testing (interactions with interpreted classes, separate compilation, etc) please let me know. |
|
OK, all green now. |
|
In other news: this PR gives a 15%-17% perf boost (unless I messed up something). |
|
Unfortunately, the above result, only holds for gcc, for clang the perf win is only 2%. |
mypyc/codegen/emitclass.py
Outdated
| CPyType_T2, | ||
| {offsetof(native__C, _x2), offsetof(native__C, _y2), ...}, | ||
| ... | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we included the offset tables as part of the trait vtables?
msullivan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parts I've reviewed mostly look good (involving the getattr and setattr code directly). I haven't looked that closely at the offset table stuff because it seems to me that it should be possible to merge that back into trait vtables and thus save the extra word. (Maybe keep VTableAttr if we are doing it that way, but we can keep ditching it if that makes things cleaner)
| if decl_cl.is_trait and op.class_type.class_ir.is_trait: | ||
| # For pure trait access find the offset first, offsets | ||
| # are ordered by attribute position in the cl.attributes dict. | ||
| trait_attr_index = list(decl_cl.attributes).index(op.attr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put a TODO here or something so we can find this easy if we ever want to improve the perf
|
@msullivan |
msullivan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple things, but looks great. Thanks for this!
mypyc/codegen/emitclass.py
Outdated
| { | ||
| CPyType_T1, | ||
| C_T1_trait_vtable, // array of method pointers | ||
| {offsetof(native__C, _x1), offsetof(native__C, _y1), ...}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make it clear that this is a pointer to a table of that, not an inlined table?
| op.attr)) | ||
| # Otherwise, use direct or offset struct access. | ||
| attr_expr = self.get_attr_expr(obj, op, decl_cl) | ||
| if attr_rtype.is_refcounted: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. This is the existing behavior, so I think it's correct to not change it in this diff, but we really probably should do undefined checks for non-refcounted types. (Which is like, only bool and tuples of bool, but.) In another PR, maybe
mypyc/codegen/emitfunc.py
Outdated
| '} else {') | ||
| self.emitter.emit_inc_ref(attr_expr, attr_rtype) | ||
| self.emitter.emit_line('}') | ||
| self.emitter.emit_line('{} = {};'.format(dest, attr_expr)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do the assignment to the destination before we do the checks. I think that will actually save us a load, since the compiler won't be able to assume that the value stays the same across the incref call.
After some discussions with Sully we agreed that essentially inlining all attribute access may be a good idea. For trait attribute access, this requires setting up offset tables.
I didn't add many tests, relying on existing ones, since this PR doesn't add any new functionality. Anyway, please review carefully (ignore whitespace option may be useful).