LDC 1.3 (1.2) ARMv5 issues #2058

Closed
rracariu opened this issue Apr 3, 2017 · 38 comments

Comments

@rracariu
Contributor

rracariu commented Apr 3, 2017

This is a continuation of #2024 (comment)

The issues observed so far are related to exception handling when compiling with optimizations turned on.

For example, compiling a simple test:

void main()
{
  throw new Exception("");
}

With LDC's druntime and Phobos compiled with -O2 or -O3:

= Throwing new exception of type object.Exception: 0x402ff280 (struct at 0x4008c4ac, classinfo at 0x61980, 1 structs in flight)

  • entering personality function. state: 0; ucb: 0x4008c4b0, context: 0xbe823970
  • LSDA: 0x49e94
  • callsite: 0x49e9b, action: 0x49ec2, classinfo_table: 0x49ec8, ciEncoding: 144
  • ip=0x2c1b3 654508032 32 0
  • entering personality function. state: 0; ucb: 0x4008c4b0, context: 0xbe823970
  • LSDA: 0x49e40
  • callsite: 0x49e45, action: 0x49e86, classinfo_table: 0x49e8c, ciEncoding: 144
  • ip=0x2c0ab 1090519040 396 0
    Fatal error in EH code: _Unwind_RaiseException failed with reason code: 9

With LDC's druntime and Phobos compiled with -O0 (no optimization), it seems to work:

= Throwing new exception of type object.Exception: 0x403be280 (struct at 0x400704ac, classinfo at 0x87170, 1 structs in flight)

  • entering personality function. state: 0; ucb: 0x400704b0, context: 0xbeead7e0
  • LSDA: 0x6e980
  • callsite: 0x6e988, action: 0x6e9a2, classinfo_table: 0x6e9a8, ciEncoding: 144
  • ip=0x4125b 72 8 a4
  • Found correct landing pad and actionTableStartOffset 1
  • ci_size: 4, ci_encoding: 144
  • ti_offset: 1
  • Comparing catch object.Throwable to exception object.Exception
  • Found catch clause for 0x400704ac
  • entering personality function. state: 1; ucb: 0x400704b0, context: 0xbeeada3c
  • LSDA: 0x6e980
  • callsite: 0x6e988, action: 0x6e9a2, classinfo_table: 0x6e9a8, ciEncoding: 144
  • ip=0x4125b 72 8 a4
  • Found correct landing pad and actionTableStartOffset 1
  • ci_size: 4, ci_encoding: 144
  • ti_offset: 1
  • Comparing catch object.Throwable to exception object.Exception
  • Found catch clause for 0x400704ac
  • Calling catch block for 0x403be280 (struct at 0x400704ac)
  • Setting switch value to: 0x1
  • Setting landing pad to: 0x412b0
  • Destroyed exception struct at 0x400704ac, 0 structs in flight
    object.Exception@test.d(23)
    ----------------Segmentation fault (core dumped)

When compiling with optimizations turned on, the landing pads are not found by the unwind function.

The segmentation fault is reported as a possible stack corruption by gdb (could be alignment issues, see below).

One problem I noticed is that the ARM EHABI (http://infocenter.arm.com/help/topic/com.arm.doc.ihi0038b/IHI0038B_ehabi.pdf) suggests that the _Unwind_Control_Block struct should be aligned to an 8-byte boundary. Playing with that didn't help.
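
For reference, requesting that kind of alignment in D can be sketched as below. `UcbLike` is a hypothetical stand-in with made-up fields, not the real `_Unwind_Control_Block` layout from druntime, and `isAligned8` is a helper invented for this example. Note that the `ucb` addresses printed in the logs above (0x4008c4b0 and 0x400704b0) already happen to be 8-byte aligned.

```d
// Hypothetical stand-in for an unwind control block; the real layout differs.
align(8) struct UcbLike
{
    char[8] exceptionClass; // placeholder field
    void*   cleanupFn;      // placeholder field
    uint[5] scratch;        // placeholder for unwinder-private words
}

// The type itself now demands 8-byte alignment.
static assert(UcbLike.alignof == 8);

/// Returns true if `addr` is a multiple of 8. An instance is only aligned
/// if whatever allocates it honours the type's alignment, so checking the
/// actual address at run time can still be worthwhile.
bool isAligned8(size_t addr)
{
    return (addr & 7) == 0;
}

unittest
{
    assert( isAligned8(0x4008c4b0)); // `ucb` address from the -O3 log
    assert(!isAligned8(0x4008c4ac)); // `struct at` address from the same log
}
```

With LDC this can be checked with something like `ldc2 -unittest -main -run file.d`.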

@joakim-noah
Contributor

joakim-noah commented Apr 3, 2017

Hmm, this may not be so bad if it's only an optimization issue. Can you supply a backtrace for the segfault? I suggest you try building and running the test runner at various optimization levels and see how many modules pass.

@rracariu
Contributor Author

rracariu commented Apr 3, 2017

The seg fault back trace looks like:

core was generated by `/usr/bin/test'.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0005f330 in rt.backtrace.dwarf.__T4readThZ.read() ()
(gdb) bt
#0  0x0005f330 in rt.backtrace.dwarf.__T4readThZ.read() ()
#1  0x0005ebd8 in rt.backtrace.dwarf.runStateMachine() ()
#2  0x0005e564 in rt.backtrace.dwarf.resolveAddresses() ()
#3  0x0005dad4 in rt.backtrace.dwarf.traceHandlerOpApplyImpl() ()
#4  0x00050b9c in core.runtime.defaultTraceHandler() ()
#5  0x00050b34 in core.runtime.defaultTraceHandler() ()
#6  0x0003de60 in object.Throwable.toString() ()
#7  0x00041464 in rt.dmain2.formatThrowable() ()
#8  0x00040c50 in _d_print_throwable ()
#9  0x0004127c in rt.dmain2._d_run_main() ()
#10 0x0004132c in rt.dmain2._d_run_main() ()
#11 0x0004125c in rt.dmain2._d_run_main() ()
#12 0x00041164 in _d_run_main ()
#13 0x0001ca7c in main ()

I had some trouble getting the whole testing app to run. I decided to add tests one by one (or group them).

Another thorny issue is that I can't change the kernel or (easily) the libc version. For this reason the core.atomic test fails: I get the message "A newer kernel is required to run this binary. (__kernel_cmpxchg64 helper)" when trying to run it.

But a good chunk of druntime tests are running, which is good news.

I will investigate more as time allows and keep you posted.

@kinke
Member

kinke commented Apr 3, 2017

You should be able to disable the few relevant core.atomic tests by setting this enum to false and hope that there are no other parts requiring 64-bit atomic ops. ;)

@kinke
Member

kinke commented Apr 3, 2017

Oh and as the segfault seems to occur when constructing the stacktrace for the exception msg, you should be able to skip the tracing for now by setting the runtime traceHandler to null.
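
A minimal sketch of that workaround, assuming the `Runtime.traceHandler` property from `core.runtime`; the helper name `throwsWithoutTrace` is made up for illustration:

```d
import core.runtime : Runtime;

/// Switches stack tracing off, then throws and catches an exception.
/// Returns true if no trace info was attached to it.
bool throwsWithoutTrace()
{
    // With no handler installed, druntime does not build a backtrace
    // for Throwables, so the DWARF resolver is never entered.
    Runtime.traceHandler = null;

    Throwable.TraceInfo info;
    try
    {
        throw new Exception("no trace");
    }
    catch (Exception e)
    {
        info = e.info;
    }
    return info is null;
}

unittest
{
    assert(throwsWithoutTrace());
}
```

In a real program the assignment would go at the top of `main` (or in a module constructor) so that it takes effect before the first exception is thrown.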

@rracariu
Contributor Author

rracariu commented Apr 5, 2017

@kinke Setting the atomic enum doesn't solve the problem; that was the first thing I tried. I don't know why it doesn't help, so I just removed the atomic test altogether.

I managed to make a full blown tester app and I observed this:

0.000s PASS debug32 core.bitop
0.000s PASS debug32 core.checkedint
0.070s PASS debug32 core.demangle
0.000s PASS debug32 core.exception
0.000s PASS debug32 core.internal.convert
****** FAIL debug32 core.internal.hash
core.exception.AssertError@/home/radur/ldc/ldc/runtime/druntime/src/core/internal/hash.d(370): Assertion failure
----------------
0.005s PASS debug32 core.internal.string
0.000s PASS debug32 core.math
0.005s PASS debug32 core.memory
[0x34cc0]
[0x190570]
[0x135da8]
[.......]
Segmentation fault (core dumped)

The segfault stack trace looks like:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  rt.util.container.array.__T5ArrayTS2rt19sections_elf_shared9ThreadDSOZ.Array.__invariant123() (this=...) at /home/radur/ldc/ldc/runtime/druntime/src/rt/util/container/array.d:135
135             assert(!_ptr == !_length);
(gdb) bt
#0  rt.util.container.array.__T5ArrayTS2rt19sections_elf_shared9ThreadDSOZ.Array.__invariant123() (this=...) at /home/radur/ldc/ldc/runtime/druntime/src/rt/util/container/array.d:135
#1  0x001351e8 in rt.util.container.array.__T5ArrayTS2rt19sections_elf_shared9ThreadDSOZ.Array.__invariant() (this=...) at /home/radur/ldc/ldc/runtime/druntime/src/rt/util/container/array.d:14
#2  0x0013571c in rt.util.container.array.__T5ArrayTS2rt19sections_elf_shared9ThreadDSOZ.Array.empty() (this=...) at /home/radur/ldc/ldc/runtime/druntime/src/rt/util/container/array.d:54
#3  0x00149114 in rt.sections_elf_shared.inheritLoadedLibraries() (p=0x25e7e0 "(S&") at sections_elf_shared.d:226
#4  0x0003bfbc in thread_entryPoint (arg=0x25ef88 "\360S&") at thread.d:318
#5  0x00172044 in start_thread ()
#6  0x001b2b8c in ?? ()

Which is strange.

@rracariu
Contributor Author

rracariu commented Apr 5, 2017

The exception I got in the array module was most likely there because I was using "-static" and that messed something up (different glibc version).

Here are the results for druntime:

0.000s PASS debug32 core.bitop
0.000s PASS debug32 core.checkedint
0.185s PASS debug32 core.demangle
0.000s PASS debug32 core.exception
0.000s PASS debug32 core.internal.convert
****** FAIL debug32 core.internal.hash
core.exception.AssertError@/home/radur/ldc/ldc/runtime/druntime/src/core/internal/hash.d(370): Assertion failure
----------------
0.020s PASS debug32 core.internal.string
0.000s PASS debug32 core.math
0.025s PASS debug32 core.memory
0.035s PASS debug32 core.sync.barrier
0.120s PASS debug32 core.sync.condition
0.000s PASS debug32 core.sync.config
0.070s PASS debug32 core.sync.mutex
0.120s PASS debug32 core.sync.rwmutex
0.040s PASS debug32 core.sync.semaphore
Not safe to migrate Fibers between Threads on your system. Consider setting version CheckFiberMigration for this system in thread.d
1.645s PASS debug32 core.thread
1.090s PASS debug32 core.time
0.005s PASS debug32 gc.bits
0.010s PASS debug32 gc.config
0.010s PASS debug32 gc.impl.conservative.gc
0.000s PASS debug32 gc.pooltable
0.000s PASS debug32 ldc.eh.fixedpool
0.030s PASS debug32 object
0.000s PASS debug32 rt.aApply
0.000s PASS debug32 rt.aApplyR
0.000s PASS debug32 rt.aaA
****** FAIL debug32 rt.adi
core.exception.AssertError@/home/radur/ldc/ldc/runtime/druntime/src/rt/adi.d(202): Assertion failure
----------------
0.000s PASS debug32 rt.arrayassign
0.010s PASS debug32 rt.arraybyte
0.000s PASS debug32 rt.arraycast
0.065s PASS debug32 rt.arraydouble
0.045s PASS debug32 rt.arrayfloat
0.080s PASS debug32 rt.arrayint
0.000s PASS debug32 rt.arrayreal
0.015s PASS debug32 rt.arrayshort
0.000s PASS debug32 rt.backtrace.dwarf
0.020s PASS debug32 rt.cover
0.035s PASS debug32 rt.lifetime
0.015s PASS debug32 rt.minfo
0.000s PASS debug32 rt.monitor_
0.000s PASS debug32 rt.qsort
0.000s PASS debug32 rt.switch_
0.000s PASS debug32 rt.typeinfo.ti_Aint
0.000s PASS debug32 rt.util.container.array
0.005s PASS debug32 rt.util.container.hashtab
2.920s PASS debug32 rt.util.container.treap
0.000s PASS debug32 rt.util.hash
0.015s PASS debug32 rt.util.typeinfo
0.005s PASS debug32 rt.util.utf
0.000s PASS debug32 core.sys.posix.sys.select
0.000s PASS debug32 core.sys.linux.stdio
0.000s PASS debug32 core.sys.linux.tipc

@rracariu
Contributor Author

rracariu commented Apr 5, 2017

And this is the most I got from a full druntime and phobos test before running out of memory:

0.000s PASS debug32 core.bitop
0.005s PASS debug32 core.checkedint
0.270s PASS debug32 core.demangle
0.025s PASS debug32 core.exception
0.000s PASS debug32 core.internal.convert
****** FAIL debug32 core.internal.hash
core.exception.AssertError@/home/radur/ldc/ldc/runtime/druntime/src/core/internal/hash.d(370): Assertion failure
----------------
0.000s PASS debug32 core.internal.string
0.000s PASS debug32 core.math
0.025s PASS debug32 core.memory
0.035s PASS debug32 core.sync.barrier
0.080s PASS debug32 core.sync.condition
0.000s PASS debug32 core.sync.config
0.110s PASS debug32 core.sync.mutex
0.050s PASS debug32 core.sync.rwmutex
0.030s PASS debug32 core.sync.semaphore
Not safe to migrate Fibers between Threads on your system. Consider setting version CheckFiberMigration for this system in thread.d
****** FAIL debug32 core.thread
core.thread.ThreadError@/home/radur/ldc/ldc/runtime/druntime/src/core/thread.d(3099): Error creating thread
----------------
1.125s PASS debug32 core.time
0.000s PASS debug32 gc.bits
0.025s PASS debug32 gc.config
0.690s PASS debug32 gc.impl.conservative.gc
0.000s PASS debug32 gc.pooltable
0.000s PASS debug32 ldc.eh.fixedpool
0.045s PASS debug32 object
0.005s PASS debug32 rt.aApply
0.000s PASS debug32 rt.aApplyR
0.000s PASS debug32 rt.aaA
****** FAIL debug32 rt.adi
core.exception.AssertError@/home/radur/ldc/ldc/runtime/druntime/src/rt/adi.d(202): Assertion failure
----------------
0.000s PASS debug32 rt.arrayassign
0.010s PASS debug32 rt.arraybyte
0.000s PASS debug32 rt.arraycast
0.065s PASS debug32 rt.arraydouble
0.050s PASS debug32 rt.arrayfloat
0.420s PASS debug32 rt.arrayint
0.000s PASS debug32 rt.arrayreal
0.015s PASS debug32 rt.arrayshort
0.000s PASS debug32 rt.backtrace.dwarf
0.000s PASS debug32 rt.cover
0.380s PASS debug32 rt.lifetime
0.020s PASS debug32 rt.minfo
0.000s PASS debug32 rt.monitor_
0.000s PASS debug32 rt.qsort
0.000s PASS debug32 rt.switch_
0.000s PASS debug32 rt.typeinfo.ti_Aint
0.000s PASS debug32 rt.util.container.array
0.010s PASS debug32 rt.util.container.hashtab
3.630s PASS debug32 rt.util.container.treap
0.000s PASS debug32 rt.util.hash
0.055s PASS debug32 rt.util.typeinfo
0.020s PASS debug32 rt.util.utf
0.000s PASS debug32 core.sys.posix.sys.select
0.000s PASS debug32 core.sys.linux.stdio
0.000s PASS debug32 core.sys.linux.tipc
0.185s PASS debug32 std.algorithm.comparison
****** FAIL debug32 std.algorithm.iteration
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/algorithm/iteration.d(2087): joiner: internal error
----------------
****** FAIL debug32 std.algorithm.mutation
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/algorithm/mutation.d(467): Assertion failure
----------------
1.165s PASS debug32 std.algorithm.searching
0.080s PASS debug32 std.algorithm.setops
0.720s PASS debug32 std.algorithm.sorting
****** FAIL debug32 std.array
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/array.d(3422): Assertion failure
----------------
0.730s PASS debug32 std.ascii
0.080s PASS debug32 std.base64
0.200s PASS debug32 std.bigint
****** FAIL debug32 std.bitmanip
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/bitmanip.d(2752): Assertion failure
----------------
0.025s PASS debug32 std.complex
0.075s PASS debug32 std.concurrency
0.045s PASS debug32 std.container.array
0.010s PASS debug32 std.container.binaryheap
0.005s PASS debug32 std.container.dlist
0.005s PASS debug32 std.container
****** FAIL debug32 std.container.rbtree
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/container/rbtree.d(994): Assertion failure
----------------
0.420s PASS debug32 std.container.slist
0.100s PASS debug32 std.container.util
0.815s PASS debug32 std.conv
0.050s PASS debug32 std.csv
****** FAIL debug32 std.datetime
core.time.TimeException@/home/radur/ldc/ldc/runtime/phobos/std/datetime.d(29706): Directory /usr/share/zoneinfo/ does not exist.
----------------
****** FAIL debug32 std.digest.crc
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/crc.d(458): Assertion failure
----------------
****** FAIL debug32 std.digest.digest
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/digest.d(266): Assertion failure
----------------
0.030s PASS debug32 std.digest.hmac
****** FAIL debug32 std.digest.md
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/md.d(491): Assertion failure
----------------
****** FAIL debug32 std.digest.murmurhash
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/murmurhash.d(630): Assertion failure
----------------
****** FAIL debug32 std.digest.ripemd
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/ripemd.d(660): Assertion failure
----------------
****** FAIL debug32 std.digest.sha
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/sha.d(1109): Assertion failure
----------------
1.080s PASS debug32 std.encoding
0.050s PASS debug32 std.exception
0.060s PASS debug32 std.experimental.allocator.building_blocks.affix_allocator

@joakim-noah
Contributor

joakim-noah commented Apr 6, 2017

Looks pretty good. You can start the tester up with a list of just the remaining phobos tests and try those too, right?

Update: Are these built without optimization, ie -O0? Because it doesn't appear to be exception-handling that's failing with these tests.

@rracariu
Contributor Author

rracariu commented Apr 6, 2017

Yep. Using -O0 for test runner.
Yeah I can run them manually.

There are some issues on some modules, as the test shows, but they appear to be isolated.

@rracariu
Contributor Author

Here is the list of tests that ran in release mode:

0.000s PASS release32 core.atomic
0.000s PASS release32 core.bitop
0.000s PASS release32 core.checkedint
0.090s PASS release32 core.demangle
0.005s PASS release32 core.exception
0.000s PASS release32 core.internal.convert
****** FAIL release32 core.internal.hash
core.exception.AssertError@/home/radur/ldc/ldc/runtime/druntime/src/core/internal/hash.d(370): Assertion failure
----------------
0.000s PASS release32 core.internal.string
0.000s PASS release32 core.math
0.030s PASS release32 core.memory
0.020s PASS release32 core.sync.barrier
0.110s PASS release32 core.sync.condition
0.000s PASS release32 core.sync.config
0.075s PASS release32 core.sync.mutex
0.085s PASS release32 core.sync.rwmutex
0.045s PASS release32 core.sync.semaphore
Not safe to migrate Fibers between Threads on your system. Consider setting version CheckFiberMigration for this system in thread.d
****** FAIL release32 core.thread
core.thread.ThreadError@/home/radur/ldc/ldc/runtime/druntime/src/core/thread.d(3099): Error creating thread
----------------
1.900s PASS release32 core.time
0.000s PASS release32 gc.bits
0.000s PASS release32 gc.config
1.820s PASS release32 gc.impl.conservative.gc
0.000s PASS release32 gc.pooltable
0.005s PASS release32 ldc.eh.fixedpool
0.035s PASS release32 object
0.000s PASS release32 rt.aApply
0.000s PASS release32 rt.aApplyR
0.000s PASS release32 rt.aaA
****** FAIL release32 rt.adi
core.exception.AssertError@/home/radur/ldc/ldc/runtime/druntime/src/rt/adi.d(202): Assertion failure
----------------
0.005s PASS release32 rt.arrayassign
0.010s PASS release32 rt.arraybyte
0.000s PASS release32 rt.arraycast
0.060s PASS release32 rt.arraydouble
0.045s PASS release32 rt.arrayfloat
0.405s PASS release32 rt.arrayint
0.000s PASS release32 rt.arrayreal
0.015s PASS release32 rt.arrayshort
0.000s PASS release32 rt.backtrace.dwarf
0.000s PASS release32 rt.cover
0.375s PASS release32 rt.lifetime
0.020s PASS release32 rt.minfo
0.000s PASS release32 rt.monitor_
0.000s PASS release32 rt.qsort
0.000s PASS release32 rt.switch_
0.000s PASS release32 rt.typeinfo.ti_Aint
0.000s PASS release32 rt.util.container.array
0.005s PASS release32 rt.util.container.hashtab
2.670s PASS release32 rt.util.container.treap
0.000s PASS release32 rt.util.hash
0.015s PASS release32 rt.util.typeinfo
0.005s PASS release32 rt.util.utf
0.005s PASS release32 core.sys.posix.sys.select
0.005s PASS release32 core.sys.linux.stdio
0.000s PASS release32 core.sys.linux.tipc
0.065s PASS release32 std.algorithm.comparison
****** FAIL release32 std.algorithm.iteration
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/algorithm/iteration.d(2087): joiner: internal error
----------------
****** FAIL release32 std.algorithm.mutation
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/algorithm/mutation.d(467): Assertion failure
----------------
0.430s PASS release32 std.algorithm.searching
0.030s PASS release32 std.algorithm.setops
0.285s PASS release32 std.algorithm.sorting
****** FAIL release32 std.array
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/array.d(3422): Assertion failure
----------------
0.745s PASS release32 std.ascii
0.025s PASS release32 std.base64
0.170s PASS release32 std.bigint
****** FAIL release32 std.bitmanip
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/bitmanip.d(2752): Assertion failure
----------------
0.030s PASS release32 std.complex
0.075s PASS release32 std.concurrency
0.035s PASS release32 std.container.array
0.000s PASS release32 std.container.binaryheap
0.005s PASS release32 std.container.dlist
0.000s PASS release32 std.container
****** FAIL release32 std.container.rbtree
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/container/rbtree.d(994): Assertion failure
----------------
0.410s PASS release32 std.container.slist
0.060s PASS release32 std.container.util
0.640s PASS release32 std.conv
0.055s PASS release32 std.csv
****** FAIL release32 std.datetime
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/datetime.d(1034): Assertion failure
----------------
****** FAIL release32 std.digest.crc
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/crc.d(458): Assertion failure
----------------
****** FAIL release32 std.digest.digest
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/digest.d(266): Assertion failure
----------------
0.035s PASS release32 std.digest.hmac
****** FAIL release32 std.digest.md
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/md.d(491): Assertion failure
----------------
****** FAIL release32 std.digest.murmurhash
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/murmurhash.d(630): Assertion failure
----------------
****** FAIL release32 std.digest.ripemd
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/ripemd.d(660): Assertion failure
----------------
****** FAIL release32 std.digest.sha
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/digest/sha.d(1109): Assertion failure
----------------
0.805s PASS release32 std.encoding
0.100s PASS release32 std.exception
0.045s PASS release32 std.experimental.allocator.building_blocks.affix_allocator

[manual run]
3.790s PASS release32 std.experimental.allocator.building_blocks.allocator_list
1.120s PASS release32 std.experimental.allocator.building_blocks.bitmapped_block
0.000s PASS release32 std.experimental.allocator.building_blocks.bucketizer
0.130s PASS release32 std.experimental.allocator.building_blocks.fallback_allocator

[hangs forever] std.experimental.allocator.building_blocks.free_list

0.140s PASS release32 std.experimental.allocator.building_blocks.free_tree
****** FAIL release32 std.experimental.allocator.building_blocks.kernighan_ritchie
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/experimental/allocator/building_blocks/kernighan_ritchie.d(319): Assertion failure
----------------
0.000s PASS release32 std.experimental.allocator.building_blocks.null_allocator
0.030s PASS release32 std.experimental.allocator.building_blocks.quantizer
0.045s PASS release32 std.experimental.allocator.building_blocks.region
0.000s PASS release32 std.experimental.allocator.building_blocks.scoped_allocator
0.000s PASS release32 std.experimental.allocator.building_blocks.segregator
0.015s PASS release32 std.experimental.allocator.building_blocks.stats_collector
0.000s PASS release32 std.experimental.allocator.common
0.025s PASS release32 std.experimental.allocator.gc_allocator
0.000s PASS release32 std.experimental.allocator.mallocator
0.000s PASS release32 std.experimental.allocator.mmap_allocator
0.020s PASS release32 std.experimental.allocator
0.025s PASS release32 std.experimental.allocator.showcase
0.010s PASS release32 std.experimental.allocator.typed
****** FAIL release32 std.experimental.logger.core
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/experimental/logger/core.d(1878): Assertion failure
----------------
0.015s PASS release32 std.experimental.logger.filelogger
****** FAIL release32 std.experimental.logger.multilogger
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/experimental/logger/multilogger.d(136): Assertion failure
----------------
0.005s PASS release32 std.experimental.logger.nulllogger
0.000s PASS release32 std.experimental.ndslice.internal
0.015s PASS release32 std.experimental.ndslice.iteration
0.050s PASS release32 std.experimental.ndslice
0.025s PASS release32 std.experimental.ndslice.selection
****** FAIL release32 std.experimental.ndslice.slice
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/experimental/ndslice/slice.d(3350): byte
----------------
****** FAIL release32 std.net.curl
std.net.curl.CurlException@/home/radur/ldc/ldc/runtime/phobos/std/net/curl.d(3943): Failed to load curl, tried "libcurl.so", "libcurl.so.4", "libcurl-gnutls.so.4", "libcurl-nss.so.4", "libcurl.so.3".
----------------
2.090s PASS release32 std.net.isemail
0.180s PASS release32 std.range.interfaces
****** FAIL release32 std.range
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/range/package.d(7528): Assertion failure
----------------
0.115s PASS release32 std.range.primitives
0.005s PASS release32 std.regex.internal.backtracking
0.585s PASS release32 std.regex.internal.generator
0.005s PASS release32 std.regex.internal.ir
0.110s PASS release32 std.regex.internal.kickstart
0.300s PASS release32 std.regex.internal.parser
18.475s PASS release32 std.regex.internal.tests
****** FAIL release32 std.regex
core.exception.AssertError@/home/radur/ldc/ldc/runtime/phobos/std/regex/package.d(1682): Assertion failure
----------------
0.000s PASS release32 std.internal.cstring
409.750s PASS release32 std.internal.math.biguintarm
0.010s PASS release32 std.internal.math.biguintcore
0.000s PASS release32 std.internal.math.biguintnoasm
0.020s PASS release32 std.internal.math.errorfunction
0.070s PASS release32 std.internal.math.gammafunction
0.000s PASS release32 std.internal.scopebuffer
0.030s PASS release32 std.internal.test.dummyrange

@rracariu
Contributor Author

As you can see I got the atomic test included, by patching core.atomic as follows:

--- a/src/core/atomic.d
+++ b/src/core/atomic.d
@@ -12,7 +12,7 @@ module core.atomic;

 version (LDC)
 {
-    enum has64BitCAS = true;
+    enum has64BitCAS = false;

     // Enable 128bit CAS on 64bit platforms if supported.
     version(D_LP64)
@@ -285,7 +285,7 @@ else version( LDC )
                     cast(shared int*)here, *cast(int*)&ifThis, *cast(int*)&writeThis);
                 res = *(cast(T*)&rawRes);
             }
-            else static if(T.sizeof == long.sizeof)
+            else static if(T.sizeof == long.sizeof && has64BitCAS)
             {
                 static assert(is(T : double));
                 long rawRes = cast(T)llvm_atomic_cmp_xchg!long(
@@ -1510,7 +1510,7 @@ if(__traits(isFloating, T))
         auto asInt = atomicLoad!(ms)(*ptr);
         return *(cast(typeof(return)*) &asInt);
     }
-    else static if(T.sizeof == long.sizeof)
+    else static if(T.sizeof == long.sizeof && has64BitCAS)
     {
         static assert(is(T : double));
         auto ptr = cast(const shared long*) &val;
@@ -1590,10 +1590,10 @@ version( unittest )
         testCAS!(shared Klass)( new shared(Klass) );

         testType!(float)(1.0f);
-        testType!(double)(1.0);

         static if( has64BitCAS )
         {
+            testType!(double)(1.0);
             testType!(long)();
             testType!(ulong)();
         }
@@ -1609,10 +1609,12 @@ version( unittest )
         shared float f = 0;
         atomicOp!"+="( f, 1 );
         assert( f == 1 );
-
-        shared double d = 0;
-        atomicOp!"+="( d, 1 );
-        assert( d == 1 );
+        static if( has64BitCAS )
+        {
+            shared double d = 0;
+            atomicOp!"+="( d, 1 );
+            assert( d == 1 );
+        }
     }

     pure nothrow unittest
@@ -1768,10 +1770,13 @@ version( unittest )

     pure nothrow @nogc @safe unittest // issue 16651
     {
-        shared ulong a = 2;
-        uint b = 1;
-        atomicOp!"-="( a, b );
-        assert(a == 1);
+        static if( has64BitCAS )
+        {
+            shared ulong a = 2;
+            uint b = 1;
+            atomicOp!"-="( a, b );
+            assert(a == 1);
+        }

         shared uint c = 2;
         ubyte d = 1;
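
The idea behind the patch, gating every 64-bit path on the enum at compile time, can be sketched in isolation. Both `has64BitCAS` as declared here and the `canCAS` helper are stand-ins written for this example, not code from core.atomic:

```d
// Stand-in for the enum in core.atomic, set to false as in the patch above.
enum has64BitCAS = false;

/// Hypothetical helper: can a value of type T be compare-and-swapped
/// on this target?
template canCAS(T)
{
    static if (T.sizeof <= 4)
        enum canCAS = true;          // byte, short and word sizes
    else static if (T.sizeof == long.sizeof)
        enum canCAS = has64BitCAS;   // 64-bit only if the target supports it
    else
        enum canCAS = false;
}

static assert( canCAS!float);
static assert(!canCAS!double);
static assert(!canCAS!ulong);

unittest
{
    // The body of a false `static if` is never compiled, so nothing in
    // here can end up referencing a 64-bit atomic helper.
    static if (canCAS!double)
        static assert(0, "not compiled while has64BitCAS is false");

    assert(canCAS!uint);
}
```

Because the check happens at compile time, the skipped branches generate no code at all, which is why the patched module no longer pulls in the `__kernel_cmpxchg64` helper for these tests.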

@joakim-noah
Contributor

Nice work, only 3-4 druntime modules failing is a good sign. Now the job is to figure out the remaining issues on ARMv5 by investigating the failing tests.

@rracariu
Contributor Author

Well, there is the original issue with exception handling that I kinda worked around to get this far.

I did a test, and it looks like you only need to compile ldc.eh.common with -O0 for EH to work. The rest of druntime, Phobos and user code can be compiled with -O3 -release.

Any idea on how to attack the EH issue?

@joakim-noah
Contributor

We had a similar EH issue with ARMv7 failing at higher optimization levels a couple years ago, which Dan fixed last year, ldc-developers/druntime#51. Maybe it's still causing problems on ARMv5, or maybe there's an issue elsewhere in the EH code this time. Only way to find out is to carefully step through in a debugger and see what's going wrong.

What makes it much easier for you is you can compile that module without optimizations, see what's happening in the debugger, then compare to the same module with optimizations, ie only re-compile and link the single module. That will give you an idea of how the two diverge.

@rracariu
Contributor Author

Thanks @joakim-noah for the links.

Reading through them I suspected that something funky is still happening with ldc.eh.common.udata4_read that Dan fixed, at least on armv5.

I did a test building with -O3 but using @optStrategy("none") to disable optimizations for a single function. It turned out that building with -O3 while disabling optimizations for ldc.eh.common.udata4_read made EH work, and the correct landing pads were found.
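
Roughly, the annotated function looks like the sketch below. The body is reconstructed from the unoptimized IR further down in this comment (alloca, zero store, `memcpy`, pointer bump), so the real source in ldc.eh.common may differ in detail. The `else` branch is only an inert stand-in so the sketch also builds with compilers other than LDC.

```d
version (LDC)
{
    import ldc.attributes : optStrategy;
}
else
{
    // Inert stand-in: other compilers just see an ordinary user-defined attribute.
    struct optStrategy { string strategy; }
}

/// Reads a 32-bit value from a possibly unaligned address and advances
/// the pointer past it.
@optStrategy("none")
uint udata4_read(ref ubyte* addr)
{
    import core.stdc.string : memcpy;

    uint udata4 = 0;
    memcpy(&udata4, addr, udata4.sizeof); // byte-wise copy, no alignment assumption
    addr += udata4.sizeof;
    return udata4;
}

unittest
{
    import core.stdc.string : memcpy;

    ubyte[8] buf;
    uint expected = 0xDEADBEEF;
    memcpy(&buf[1], &expected, expected.sizeof); // store at an odd offset

    ubyte* p = &buf[1];
    assert(udata4_read(p) == expected);
    assert(p == &buf[5]);
}
```

The unit test reads from an odd offset on purpose, since that is the case the optimized build gets wrong on this target.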

Here is a comparison dump with optimizations on and off:

Opt -O3

Assembly:

    .section    .text._D3ldc2eh6common11udata4_readFKPhZk,"axG",%progbits,_D3ldc2eh6common11udata4_readFKPhZk,comdat
    .globl  _D3ldc2eh6common11udata4_readFKPhZk
    .p2align    2
    .type   _D3ldc2eh6common11udata4_readFKPhZk,%function
_D3ldc2eh6common11udata4_readFKPhZk:
    .fnstart
    ldr r2, [r0]
    ldr r1, [r2], #4
    str r2, [r0]
    mov r0, r1
    bx  lr
.Lfunc_end1:
    .size   _D3ldc2eh6common11udata4_readFKPhZk, .Lfunc_end1-_D3ldc2eh6common11udata4_readFKPhZk
    .cantunwind
    .fnend 
    
LL:

; [#uses = 0]
; Function Attrs: norecurse nounwind
define i32 @_D3ldc2eh6common11udata4_readFKPhZk(i8** nocapture dereferenceable(4) %addr) local_unnamed_addr #2 comdat {
  %1 = load i8*, i8** %addr, align 4              ; [#uses = 2]
  %2 = bitcast i8* %1 to i32*                     ; [#uses = 1]
  %3 = load i32, i32* %2, align 1                 ; [#uses = 1]
  %4 = getelementptr i8, i8* %1, i32 4            ; [#uses = 1, type = i8*]
  store i8* %4, i8** %addr, align 4
  ret i32 %3
}

Disabling optimizations for udata4_read

@optStrategy("none")

Assembly:

.section    .text._D3ldc2eh6common11udata4_readFKPhZk,"axG",%progbits,_D3ldc2eh6common11udata4_readFKPhZk,comdat
    .globl  _D3ldc2eh6common11udata4_readFKPhZk
    .p2align    2
    .type   _D3ldc2eh6common11udata4_readFKPhZk,%function
_D3ldc2eh6common11udata4_readFKPhZk:
    .fnstart
    .save   {r4, lr}
    push    {r4, lr}
    .pad    #8
    sub sp, sp, #8
    mov r4, r0
    mov r0, #0
    str r0, [sp, #4]
    ldr r1, [r4]
    add r0, sp, #4
    mov r2, #4
    bl  memcpy
    ldr r0, [r4]
    add r0, r0, #4
    str r0, [r4]
    ldr r0, [sp, #4]
    add sp, sp, #8
    pop {r4, lr}
    bx  lr
.Lfunc_end1:
    .size   _D3ldc2eh6common11udata4_readFKPhZk, .Lfunc_end1-_D3ldc2eh6common11udata4_readFKPhZk
    .cantunwind
    .fnend
 
 LL:
 
; [#uses = 0]
; Function Attrs: noinline nounwind optnone
define i32 @_D3ldc2eh6common11udata4_readFKPhZk(i8** dereferenceable(4) %addr) local_unnamed_addr #2 comdat {
  %udata4 = alloca i32, align 4                   ; [#uses = 4, size/byte = 4]
  store i32 0, i32* %udata4
  %1 = bitcast i32* %udata4 to i8*                ; [#uses = 1]
  %2 = load i8*, i8** %addr                       ; [#uses = 1]
  %3 = call i8* @memcpy(i8* %1, i8* %2, i32 4) #0 ; [#uses = 0]
  %4 = load i8*, i8** %addr                       ; [#uses = 1]
  %5 = getelementptr i8, i8* %4, i32 4            ; [#uses = 1, type = i8*]
  store i8* %5, i8** %addr
  %6 = load i32, i32* %udata4                     ; [#uses = 0]
  %7 = load i32, i32* %udata4                     ; [#uses = 1]
  ret i32 %7
}

It looks like alignment is still causing issues; see the "Code Generation" section of http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka15414.html
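
To illustrate why a single `ldr` on a misaligned pointer can go wrong: on ARM cores before ARMv6, a word load from an address that is not a multiple of 4 does not necessarily fetch the four bytes at that address. As far as the ARM documentation describes it, the core can instead load the aligned word containing the address and rotate it right by 8 bits per byte of misalignment. The sketch below simulates that in plain D. It assumes little-endian byte order, and `wordAt`, `bytewiseLoad` and `legacyArmLdr` are names made up for this example.

```d
/// Little-endian 32-bit word assembled from the four bytes starting at `i`.
uint wordAt(const(ubyte)[] mem, size_t i)
{
    return  cast(uint) mem[i]
         | (cast(uint) mem[i + 1] << 8)
         | (cast(uint) mem[i + 2] << 16)
         | (cast(uint) mem[i + 3] << 24);
}

/// What the caller wants: the four bytes at `addr`, whatever its alignment.
uint bytewiseLoad(const(ubyte)[] mem, size_t addr)
{
    return wordAt(mem, addr);
}

/// Simulated pre-ARMv6 `ldr`: aligned word, rotated right by 8 * (addr % 4) bits.
uint legacyArmLdr(const(ubyte)[] mem, size_t addr)
{
    immutable word  = wordAt(mem, addr & ~cast(size_t) 3);
    immutable shift = cast(uint) (addr & 3) * 8;
    return shift == 0 ? word : (word >> shift) | (word << (32 - shift));
}

unittest
{
    immutable ubyte[8] mem = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88];

    assert(legacyArmLdr(mem[], 0) == bytewiseLoad(mem[], 0)); // aligned: same result
    assert(bytewiseLoad(mem[], 1) == 0x55443322);             // the bytes actually wanted
    assert(legacyArmLdr(mem[], 1) == 0x11443322);             // rotated aligned word instead
}
```

If the target behaves this way, the optimized `ldr r1, [r2], #4` returns a rotated word whenever the LSDA pointer is odd, while the `memcpy` version always returns the intended bytes.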

@kinke
Member

kinke commented Apr 24, 2017

LL-wise, all relevant stuff the optimized version does differently is

  1. replacing memcpy by an (unaligned, i.e., align 1) load from addr, and
  2. assuming &addr is 4-bytes aligned when loading from and storing to &addr.

If case 1 is the issue, that's most likely an LLVM codegen bug. Case 2 would be invalid code I think (not-word-aligned storage for the pointer passed by ref?!). Checking the registers while debugging should show what the problem here is.

@rracariu
Contributor Author

rracariu commented Apr 24, 2017

Here is a register dump from debugging the optimized function:

Reg start:
r0             0xbefff948       3204446536
r1             0x356f6  218870
r2             0x196d4  104148
r3             0xbefff8e4       3204446436
r4             0xbefff948       3204446536
r5             0x4001f190       1073869200
r6             0x0      0
r7             0x196d3  104147
r8             0xbefff94c       3204446540
r9             0x196a8  104104
r10            0xbefff98c       3204446604
r11            0xbefffce0       3204447456
r12            0xbefff9b0       3204446640
sp             0xbefff8f0       0xbefff8f0
lr             0x15884  88196
pc             0x23358  0x23358 <ldc.eh.common.udata4_read()>
cpsr           0x80000010       2147483664

Debug:

0x23358 <_D3ldc2eh6common11udata4_readFKPhZk>           ldr    r2, [r0]
    r2             0x356cf  218831
    r0             0xbefff948       3204446536

0x2335c <_D3ldc2eh6common11udata4_readFKPhZk+4>         ldr    r1, [r2], #4
    r1             0x27030000       654508032
    r2             0x356d3  218835
0x23360 <_D3ldc2eh6common11udata4_readFKPhZk+8>         str    r2, [r0]
    r2             0x356d3  218835
    r0             0xbefff948       3204446536
0x23364 <_D3ldc2eh6common11udata4_readFKPhZk+12>        mov    r0, r1
    r0             0x27030000       654508032
    r1             0x27030000       654508032

---------------------------------------------------

exit fn:
r0             0x27030000       654508032
r1             0x27030000       654508032
r2             0x356d3  218835
r3             0xbefff8e4       3204446436
r4             0xbefff948       3204446536
r5             0x4001f190       1073869200
r6             0x0      0
r7             0x196d3  104147
r8             0xbefff94c       3204446540
r9             0x196a8  104104
r10            0xbefff98c       3204446604
r11            0xbefffce0       3204447456
r12            0xbefff9b0       3204446640
sp             0xbefff8f0       0xbefff8f0
lr             0x15884  88196
pc             0x15884  0x15884 <ldc.eh.common.__T21eh_personality_commonTS3ldc2eh9libunwind13NativeContextZ.eh_personality_common()+148>
cpsr           0x80000010       2147483664


@kinke
Member

kinke commented Apr 24, 2017

Alright, it's case 1, and I think it's an LLVM ARM codegen bug. It ignores the align 1 for the LL load of addr and goes ahead and emits a ldr r1, [r2], #4, with r2 containing the value of addr (0x356cf in your case, clearly not a multiple of 4). It's a nice instruction that loads the value from memory and increments the source pointer r2 by 4 (addr += 4) at the same time, but it apparently requires an aligned address.

So either the previous LLVM optimization wrt. replacing alloca+memcpy by an unaligned load is problematic for ARM, or it's an ARM codegen bug which can't simply ignore an align 1 for load instructions.
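
To make the failure mode concrete, here is a small host-side model. It rests on an assumption not stated in this thread: that a pre-ARMv6 core handles an unaligned word load by fetching the enclosing aligned word and rotating it right by 8 * (address & 3) bits. Whether the ARM926EJ on the controller (and its kernel alignment settings) behaves exactly like this has not been verified here. Both function names are made up for illustration.

```d
/// Model of a pre-ARMv6 word load (assumption: aligned fetch plus rotate,
/// little-endian memory).
uint rotatedWordLoad(const(ubyte)[] mem, size_t address)
{
    immutable size_t base = address & ~cast(size_t) 3; // low two bits are ignored
    immutable uint word = mem[base]
        | (mem[base + 1] << 8)
        | (mem[base + 2] << 16)
        | (cast(uint) mem[base + 3] << 24);
    immutable uint rot = cast(uint) (8 * (address & 3));
    return rot == 0 ? word : (word >> rot) | (word << (32 - rot));
}

/// What udata4_read is supposed to return: four consecutive bytes from `address`.
uint byteWiseLoad(const(ubyte)[] mem, size_t address)
{
    return mem[address]
        | (mem[address + 1] << 8)
        | (mem[address + 2] << 16)
        | (cast(uint) mem[address + 3] << 24);
}

unittest
{
    static immutable ubyte[8] mem = [0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88];
    assert(0x356cf % 4 == 3);                        // address from the register dump
    assert(byteWiseLoad(mem[], 1) == 0x55443322);    // intended result
    assert(rotatedWordLoad(mem[], 1) == 0x11443322); // modelled hardware result
    assert(rotatedWordLoad(mem[], 4) == byteWiseLoad(mem[], 4)); // aligned: identical
}
```

Under that model the two results only agree for word-aligned addresses, which would explain why the personality function reads garbage from the LSDA.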

@rracariu
Contributor Author

Awesome @kinke!

On a positive note, I was able to run a pretty sophisticated vibe.d app on my controller (with the tweaks for EH and for some functions missing from my glibc version).

So great job guys, looks like we can run D apps on industrial controllers!
We just need to polish the nits to make it easy for more people to do it.

@kinke
Member

kinke commented Apr 24, 2017

Good news, thanks!
@joakim-noah: Could you please file an LLVM issue about this? I don't have an account... :]

@dnadlinger
Member

@rracariu: Great news – Many thanks for your perseverance!

@kinke
Member

kinke commented Apr 25, 2017

We just need to polish the nits to make it easy for more people to do it.

Yep, and IMO the best way to start is to group your changes to logical commits and choose an appropriate pull-request target.
E.g., your core.atomic fixes wrt. guarding a few double tests with that has64BitCAS enum should go to upstream druntime.
The EH workaround with don't-optimize-function-attribute makes sense for LDC and ARMv5 targets (the ldr instruction apparently handles unaligned addresses for all later ARM versions), so we may consider introducing a new predefined D version such as ARM_V_<N> if there's no such thing already etc.

@joakim-noah
Contributor

I can report an llvm bug, but are we sure what the problem is? If another optimization is improperly replacing alloca+memcpy, that could be the real issue; we should narrow that down first before reporting.

@kinke
Member

kinke commented Apr 26, 2017

I'm 99.9% sure emitting a load instruction requiring 4-byte alignment for an IR load with explicit align 1 has got to be a bug. I'm speculating it has to do with the following increment of the source address by 4, i.e., ARM codegen eagerly trying to exploit this with the combo-instruction without taking alignment into proper consideration (for ARM architectures older than v6).

From Radu's link:

The ARMv6 architecture introduced the first hardware support for unaligned accesses. ARM11 and Cortex-A/R processors can deal with unaligned accesses in hardware, removing the need for software routines.
Support for unaligned accesses is limited to a sub-set of load/store instructions:
LDRB/LDRSB/STRB
LDRH/LDRSH/STRH
LDR/STR

@joakim-noah
Contributor

joakim-noah commented Apr 26, 2017

OK, I will report the bug. @rracariu, which llvm version did you build ldc against?

Update: reported here, let me know if I got any of the details wrong.

@rracariu
Contributor Author

Here is what I used:

ldc2 --version
LDC - the LLVM D compiler (1.3.0git-c2678f6-dirty):
  based on DMD v2.073.2 and LLVM 3.9.1
  built with DMD64 D Compiler v2.073.2
  Default target: x86_64-unknown-linux-gnu
  Host CPU: broadwell
  http://dlang.org - http://wiki.dlang.org/LDC

  Registered Targets:
    arm     - ARM
    armeb   - ARM (big endian)
    thumb   - Thumb
    thumbeb - Thumb (big endian)

Thanks @joakim-noah !

@kinke
Member

kinke commented Apr 26, 2017

Thanks Joakim, looks really good.

@joakim-noah
Contributor

@rracariu, can you build ldc against llvm 4.0 or trunk and see if it has been fixed? 3.9.1 isn't the latest.

@kinke
Member

kinke commented Apr 26, 2017

There's been a reply regarding the LLVM issue. Apparently, adding -mattr=+strict-align to the LDC command line makes it work; I can confirm that with LLVM 3.9.1. The optimized assembly then looks like this (loading each byte separately, etc.):

_D7current11udata4_readFKPhZk:
	.fnstart
	.save	{r11, lr}
	push	{r11, lr}
	ldr	r1, [r0]
	ldrb	r12, [r1]
	ldrb	lr, [r1, #1]
	ldrb	r2, [r1, #2]
	ldrb	r3, [r1, #3]
	add	r1, r1, #4
	str	r1, [r0]
	orr	r0, r2, r3, lsl #8
	orr	r1, r12, lr, lsl #8
	orr	r0, r1, r0, lsl #16
	pop	{r11, pc}
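
For readers less used to ARM assembly, the same byte-wise strategy can be sketched in D. This is only an illustration of what the strict-align codegen does, not code from druntime; the function name is made up, and the little-endian byte order is taken from the orr/lsl pattern above.

```d
/// Byte-wise little-endian read, mirroring the ldrb/orr sequence above.
/// Never issues a word-sized load, so the alignment of addr does not matter.
uint udata4_read_bytewise(ref const(ubyte)* addr)
{
    immutable uint lo = addr[0] | (addr[1] << 8); // orr r1, r12, lr, lsl #8
    immutable uint hi = addr[2] | (addr[3] << 8); // orr r0, r2, r3, lsl #8
    addr += 4;                                     // add r1, r1, #4 ; str r1, [r0]
    return lo | (hi << 16);                        // orr r0, r1, r0, lsl #16
}

unittest
{
    static immutable ubyte[6] buf = [0xAA, 0x78, 0x56, 0x34, 0x12, 0xBB];
    const(ubyte)* p = buf.ptr + 1;                 // start at an odd offset
    assert(udata4_read_bytewise(p) == 0x12345678);
    assert(p == buf.ptr + 5);                      // cursor advanced by 4
}
```

The cost is four byte loads and three ORs instead of one word load, which is the price of correctness on cores without hardware support for unaligned access.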

@rracariu
Contributor Author

I can confirm that it works with -mattr=+strict-align on LLVM 3.9.1.

Also, it looks like the segfault when throwing the exception is no longer there.

I think the align attribute fixes a lot of other issues I encountered. I will make some time soon to rebuild the tests with the new option and see how it looks.

@JohanEngelen
Member

Shouldn't we (for now) just simply always set the LLVM flag -mattr=+strict-align for ARMv5?
(It's easy to add that)

@joakim-noah
Contributor

I haven't messed with the TargetMachine stuff much. Is the problem that this should be automatically set for ARMv5 by llvm and we can have ldc do it instead? If so, isn't it better to submit a PR to llvm upstream, if this is an intrinsic feature of this CPU arch?

@redstar
Member

redstar commented Jul 18, 2017

I think it is a fast solution to add -mattr=+strict-align for ARMv5.
It looks like this is really an LLVM bug and we should submit a PR. But getting this into LLVM can take some time.

@joakim-noah
Contributor

OK, I was trying to understand where exactly the problem comes from. I'm not sure a fast solution does much, not like we have many ldc users clamoring to use ARMv5 chips. :) Better to just get it into llvm.

Does anybody here have commit access to llvm? I think Amaury does, so we could always ask him, if nobody here does.

@redstar
Member

redstar commented Jul 18, 2017

I have LLVM commit access. Most important is to get the ok in the review. In general, you can ask one of the reviewers to commit, too.

@thewilsonator
Contributor

The LLVM mailing lists are usually pretty fast. I fixed a bug (an uncontroversial 2-line change that fixed a crash) and it was merged in less than a week. I think the main problem with relying on an LLVM fix is the time it will take to get into the versions of LLVM we support (let alone release with). We should probably submit a patch anyway though.

@rracariu
Contributor Author

I guess this issue can be closed; the originally reported problem is fixable by using -mattr=+strict-align for ARMv5.

I have moved to other HW configurations since then, so this particular configuration is no longer relevant for me. However, the ARM926EJ processor and the ARMv5-TEJ micro-architecture are still relevant today and might be used by others.

Also, for reference, here's a Go thread on why they still support it: golang/go#17082

@joakim-noah
Contributor

OK, I don't think anybody else is using this anyway.
