-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
forward type alignments to allocators #12430
Conversation
I have tested this one and I find it very useful. It makes type alignment information to flow though all the levels way down to allocator. Allocator is not yet alignment aware but it can be fixed in next PR. One more issue needs to be addressed. Closure environments needs to be aligned too when they capture a variable of aligned type |
Test case for closures. Compile in release mode. type
m128d {.importc: "__m128d", header: "immintrin.h".} = object
proc add(a: m128d; b: m128d): m128d {.importc: "_mm_add_pd", header: "immintrin.h".}
proc set1*(a: float): m128d {.importc: "_mm_set1_pd", header: "immintrin.h".}
func `+`(a,b: m128d): m128d = add(a, b)
proc lambdaGen(a, b: float) : auto =
let x1 = set1(2.0 + a)
let x2 = set1(-23.0 - b)
let capturingLambda = proc(x: m128d): m128d =
let cc = x1 + x1
let bb = x2 + set1(12.5)
result = cc + bb + x
return capturingLambda
let f1 = lambdaGen(2.0 , 2.221)
let f2 = lambdaGen(-1.226 , 3.5)
echo f1(set1(2.0))
echo f2(set1(-23.0 |
8c0ea49
to
7bcbba5
Compare
87bdd83
to
698676e
Compare
when defined(nimAlignPragma): | ||
data {.align: MemAlign.}: UncheckedArray[byte] # start of usable memory | ||
else: | ||
data: UncheckedArray[byte] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dangerous for bootstrapping purposes. Why not keep the old, more correct variant for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I thought that you might comment on this part. Well, I changed the value for MemAlign
from 8 to 16, this is also alignment guarantee for malloc in C. This will also work for many SIMD data types if native optimizations are not turned on.
Why not keep the old, more correct variant? Well the old variant is with the new value of MemAlign
not correct anymore. It is true that alignment constraints aren't met during bootstrapping. But the question is: "Are the alignment constraints necessary during compilation?", and I would say: "No". There are no SIMD types used in the compiler yet. And if there are ever SIMD types used in the compiler, the branch without the align pragme should be dead by then.
I also wouldn't call it "dangerous", everything that might happen is, during Bootstrapping on 32 bit systems, float64 operations might be not as fast as they could be.
disabled: true | ||
""" | ||
|
||
# does not yet work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well the tests are green now, at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is disabled. Maybe I should remove it since you don't like disabled tests.
@cooldome you example is incomplete, but it got the point. I could verify it on my end that the example worked with this PR applied. But I can't explain why, I didn't do anything in lambda lifting. All I do is forwarding the alignment information to the allocators. When I added your example to the tests, it failed on 32 bit systems. So I wouldn't say I fixed anything in that direction. But I think this PR will help to make it work in the future. |
Somehow I missed that a big progress in this PR. Starting active testing. @krux02: in 32bit systems the problem that stack it self is not 16 byte aligned hence fixing allocator doesn't safe a day. I gave up SSE vectorization in 32bit mode even in C++. I do the following builds these days: 32 bit unvectorised, 64bit SSE, 64bit AVX, 64bit AVX2. Fixing 32bit would require your own calling convention which out of scope of this PR. This calling convention should be used for proc and closures. I don't think you need to do anything special in lambda lifting. LL defines the environment object and as long as |
I have tried it, looks like SSE with 16 bytes alignment now works, but not the AVX that needs 32 byte alignment. I guess it boils down to allocator that still have one fixed alignment, |
This was very painful to merge with the development branch. I won't do it again. Either merge it or reject it. |
Irrelevant CI failure, merging. |
NIM_ALIGNOF
innimbase.h
in the style ofNIM_ALIGN
.nimAlignPragma
for bootstrapping (original alignas PR didn't need it).This PR was previeously knows as
Fix for newSeq on Aligned Type
, since that part isn't solved, I renamed it.When the tests are passing, this PR should be ready for review (and possibly merge).