Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

forward type alignments to allocators #12430

Merged
merged 17 commits into from
Apr 19, 2020
Merged

Conversation

krux02
Copy link
Contributor

@krux02 krux02 commented Oct 15, 2019

  • Introduce NIM_ALIGNOF in nimbase.h in the style of NIM_ALIGN.
  • Add alignment to RTTI.
  • Forward all alignment constraints of allocations to the allocator (allocator does not support custom alignment yet).
  • Fixes a problem with SIMD types in lambda lifting.
  • This is some foundation work to solve Memory corruption on newSeq[MyAlignedType](1) #7865.
  • Introduce define nimAlignPragma for bootstrapping (original alignas PR didn't need it).

This PR was previeously knows as Fix for newSeq on Aligned Type, since that part isn't solved, I renamed it.

When the tests are passing, this PR should be ready for review (and possibly merge).

@cooldome
Copy link
Member

cooldome commented Nov 13, 2019

I have tested this one and I find it very useful. It makes type alignment information to flow though all the levels way down to allocator. Allocator is not yet alignment aware but it can be fixed in next PR.

One more issue needs to be addressed. Closure environments needs to be aligned too when they capture a variable of aligned type

@cooldome
Copy link
Member

Test case for closures. Compile in release mode.

type
  m128d {.importc: "__m128d", header: "immintrin.h".} = object

proc add(a: m128d; b: m128d): m128d {.importc: "_mm_add_pd", header: "immintrin.h".}
proc set1*(a: float): m128d {.importc: "_mm_set1_pd", header: "immintrin.h".}
func `+`(a,b: m128d): m128d = add(a, b)


proc lambdaGen(a, b: float) : auto =
  let x1 = set1(2.0 + a)
  let x2 = set1(-23.0 - b)
  let capturingLambda = proc(x: m128d): m128d =
    let cc = x1 + x1
    let bb = x2 + set1(12.5)
    result = cc + bb + x
  return capturingLambda

let f1 = lambdaGen(2.0 , 2.221)
let f2 = lambdaGen(-1.226 , 3.5)

echo f1(set1(2.0))
echo f2(set1(-23.0

@krux02 krux02 force-pushed the fix-newseq-alignedtype branch from 8c0ea49 to 7bcbba5 Compare November 22, 2019 13:43
@krux02 krux02 force-pushed the fix-newseq-alignedtype branch from 87bdd83 to 698676e Compare November 22, 2019 17:18
@krux02 krux02 changed the title WIP: Fix for newSeq on Aligned Type forward type alignments to allocators Nov 22, 2019
when defined(nimAlignPragma):
data {.align: MemAlign.}: UncheckedArray[byte] # start of usable memory
else:
data: UncheckedArray[byte]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dangerous for bootstrapping purposes. Why not keep the old, more correct variant for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I thought that you might comment on this part. Well, I changed the value for MemAlign from 8 to 16, this is also alignment guarantee for malloc in C. This will also work for many SIMD data types if native optimizations are not turned on.

Why not keep the old, more correct variant? Well the old variant is with the new value of MemAlign not correct anymore. It is true that alignment constraints aren't met during bootstrapping. But the question is: "Are the alignment constraints necessary during compilation?", and I would say: "No". There are no SIMD types used in the compiler yet. And if there are ever SIMD types used in the compiler, the branch without the align pragme should be dead by then.
I also wouldn't call it "dangerous", everything that might happen is, during Bootstrapping on 32 bit systems, float64 operations might be not as fast as they could be.

disabled: true
"""

# does not yet work
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the tests are green now, at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is disabled. Maybe I should remove it since you don't like disabled tests.

@krux02
Copy link
Contributor Author

krux02 commented Nov 29, 2019

@cooldome you example is incomplete, but it got the point. I could verify it on my end that the example worked with this PR applied. But I can't explain why, I didn't do anything in lambda lifting. All I do is forwarding the alignment information to the allocators. When I added your example to the tests, it failed on 32 bit systems. So I wouldn't say I fixed anything in that direction. But I think this PR will help to make it work in the future.

@cooldome
Copy link
Member

cooldome commented Dec 3, 2019

Somehow I missed that a big progress in this PR. Starting active testing.

@krux02: in 32bit systems the problem that stack it self is not 16 byte aligned hence fixing allocator doesn't safe a day. I gave up SSE vectorization in 32bit mode even in C++. I do the following builds these days: 32 bit unvectorised, 64bit SSE, 64bit AVX, 64bit AVX2. Fixing 32bit would require your own calling convention which out of scope of this PR. This calling convention should be used for proc and closures.

I don't think you need to do anything special in lambda lifting. LL defines the environment object and as long as new(MyEnvObject) is aligned then everything should work

@cooldome
Copy link
Member

cooldome commented Dec 3, 2019

I have tried it, looks like SSE with 16 bytes alignment now works, but not the AVX that needs 32 byte alignment. I guess it boils down to allocator that still have one fixed alignment,

@krux02
Copy link
Contributor Author

krux02 commented Apr 17, 2020

This was very painful to merge with the development branch. I won't do it again. Either merge it or reject it.

@Araq
Copy link
Member

Araq commented Apr 19, 2020

Irrelevant CI failure, merging.

@Araq Araq merged commit 4005f0d into nim-lang:devel Apr 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants