Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

myfuns[runtimeIndex]() gives FieldError when myfuns is a CT array of closures #14340

Closed
akavel opened this issue May 13, 2020 · 26 comments · Fixed by #16386
Closed

myfuns[runtimeIndex]() gives FieldError when myfuns is a CT array of closures #14340

akavel opened this issue May 13, 2020 · 26 comments · Fixed by #16386
Labels
VM see also `const` label

Comments

@akavel
Copy link

akavel commented May 13, 2020

When trying to run complex code at compilation time, I'm getting a weird error, with no stack trace, that I don't know how to further debug. Nimvm compiler bug?

Example

$ mkdir dali
$ cd dali
$ git clone https://github.com/akavel/dali -b comptime-exp1 .
$ nim c q_comptime.nim

Current Output

[...]
fatal.nim(49)            sysFatal
Error: unhandled exception: 'sons' is not accessible using discriminant 'kind' of type 'TNode' [FieldError]

No stacktrace, no idea where this comes from :( running git grep sons on my code doesn't seem to show up any occurences of sons :/

Expected Output

A q_comptime.dex file should be silently created and written to.

Possible Solution

  • Nimvm bug?

Additional Information

  • I am trying to make it possible to run my dali project at comptime, to allow easy building of Android .apk packages from Nim.

Nim version:

C:\prog\dali>nim -v
Nim Compiler Version 1.2.0 [Windows: amd64]
Compiled at 2020-04-03
Copyright (c) 2006-2020 by Andreas Rumpf

git hash: 7e83adff84be5d0c401a213eccb61e321a3fb1ff
active boot switches: -d:release
@akavel
Copy link
Author

akavel commented May 13, 2020

Same with choosenim devel:

fatal.nim(49)            sysFatal
Error: unhandled exception: 'sons' is not accessible using discriminant 'kind' of type 'TNode' [FieldDefect]

C:\prog\tmp01>nim -v
Nim Compiler Version 1.3.5 [Windows: amd64]
Compiled at 2020-05-13
Copyright (c) 2006-2020 by Andreas Rumpf

git hash: a3719df8b327ce33d4fdbfc7be06b058e9a2ea48
active boot switches: -d:release

akavel added a commit to akavel/jnim that referenced this issue May 13, 2020
This change makes it possible to use `jexport` together with `jnimDexWrite` to
emit .dex files. The `jnimDexWrite` proc builds a .nim file, which can then be
compiled using package dali to generate a .dex file.

Currently, the intermediate step of generating a .nim file is required, as
Nim's compile-time has a lot of limitations. In this particular case, see e.g.:
- nim-lang/Nim#14339
- nim-lang/Nim#14340
akavel added a commit to akavel/jnim that referenced this issue May 13, 2020
This change makes it possible to use `jexport` together with `jnimDexWrite` to
emit .dex files. The `jnimDexWrite` proc builds a .nim file, which can then be
compiled using package dali to generate a .dex file.

Currently, the intermediate step of generating a .nim file is required, as
Nim's compile-time has a lot of limitations. In this particular case, see e.g.:
- nim-lang/Nim#14339
- nim-lang/Nim#14340
yglukhov pushed a commit to yglukhov/jnim that referenced this issue May 13, 2020
This change makes it possible to use `jexport` together with `jnimDexWrite` to
emit .dex files. The `jnimDexWrite` proc builds a .nim file, which can then be
compiled using package dali to generate a .dex file.

Currently, the intermediate step of generating a .nim file is required, as
Nim's compile-time has a lot of limitations. In this particular case, see e.g.:
- nim-lang/Nim#14339
- nim-lang/Nim#14340
@Araq
Copy link
Member

Araq commented May 14, 2020

I "fixed" the crash but your dex code is better with staticExec on a helper tool you provide. Nim's VM isn't capable of running your dex.render and once it is, it might be too slow.

@Araq Araq closed this as completed in 278b458 May 14, 2020
@timotheecour
Copy link
Member

timotheecour commented May 14, 2020

No stacktrace, no idea where this comes from :(

@akavel here are some debugging tips for next time:

  • you can compile nim with --stacktrace:on (see koch temp), it should print a stacktrace when nim crashes

and some more advanced flags when --stacktrace isn't enough:

  • -d:nimVMDebug helps to trace VM instructions
  • --stackTraceMsgs

running git grep sons on my code doesn't seem to show up any occurences of sons

it's inside compiler code (as would've been apparent with a nim compiled in debug mode)

anyway, after araq's fix, you should now get a cleaner stacktrace.

Also, always try to minimize your issues, even if a bit more work for your (eg you have Circular dependency detected warnings which may point to some other issue)

@akavel
Copy link
Author

akavel commented May 14, 2020

@timotheecour thanks; can I do it somehow quickly if I'm just using choosenim? or do I have to git clone nim, then find and follow the instructions to build it, then somehow inject to nimble or something?

@akavel
Copy link
Author

akavel commented May 14, 2020

@Araq @timotheecour Also, after some thinking, I realized I'm honestly somewhat confused as to how in particular I could best understand this issue being closed now; if that's not too much to ask for, I'd be super grateful if you could try and help me better understand a few things! Please assume no ill will in those questions, they're 100% sincere; I fully understand you guys are busy, and I'd just like to try and be helpful in a way that's best use of your time, and mine as well! Therefore, in particular:

  1. Should I understand this being closed as because the reproducer is too big for you to have time to analyze? Thus, if I try and reduce it with nim devel / koch (?), you'd still be interested in me opening an issue about this with a minimal reproducer?
  2. Or are you just not interested in the reason for this exception? Say, there's no plan to develop nimvm much more than what it is now? I always kinda assumed that the idea is for nimvm to be at feature parity with Nim proper, but I can understand I may be wrong, is that so? Thus, I should not bother minimizing this even if I were to find some time to try and do that? If that is so, could you possibly guide me to some document/comment where the intended goals & usecases of nimvm are explained in more detail, so that I know when and where to use it, vs. not to try using it?
  3. Or, is the root cause of the exception indeed already fixed, and once I try to git clone and run koch (?), or otherwise wait till tomorrow's nightly and run choosenim devel again, the exception won't show up anymore?
  4. Or something else alltogether?

Thanks!

@Araq Araq reopened this May 15, 2020
@Araq
Copy link
Member

Araq commented May 15, 2020

I'm sorry, the reason this was closed was mostly an accident. That were two bugs here, one of which I fixed but that doesn't help you. :-)

However, whether your code can work with the VM is unclear, the VM doesn't support copyMem for example. So yeah, a small reproducible example helps.

@Araq Araq added the VM see also `const` label label May 15, 2020
@akavel
Copy link
Author

akavel commented May 15, 2020

@Araq thanks! :)

As a side note, I'm happy to try and use workarounds for features not available in nimvm when possible, for example I specifically:

But when the compiler itself is breaking, it's waaaay over my head, so all I can do for now is try and report an issue, and hope one day it might get fixed... :) Will try to minimize it If I Find Time At Some Point™; for now, I find consolation for myself in that what I reported seems to be at least a deterministic reproducer ;)

@akavel
Copy link
Author

akavel commented May 16, 2020

Uh oh, so I actually couldn't resist and tried to at least find the line of code which triggers this, and it seems to be a call to sha1's secureHash. If you checkout the new branch comptime-exp2, I seem to have isolated it to a code block like below:

  #...
  when defined(comptime02):
    if true: return
  let tmp03 = secureHash(tmp02)
  if true: return
  #...

With it, I'm getting the following results

C:\prog\dali>nim c -r q_comptime.nim
[...]
fatal.nim(49)            sysFatal
Error: unhandled exception: 'sons' is not accessible using discriminant 'kind' of type 'TNode' [FieldDefect]

C:\prog\dali>nim c -r -d:comptime02 q_comptime.nim
[...]
CC: stdlib_io.nim
CC: stdlib_system.nim
CC: q_comptime.nim
Hint:  [Link]
Hint: 85514 LOC; 2.713 sec; 117.832MiB peakmem; Debug build; proj: C:\prog\dali\q_comptime.nim; out: C:\prog\dali\q_comptime.exe [SuccessX]
Hint: C:\prog\dali\q_comptime.exe  [Exec]

Surprisingly to me, when I tried to extract just the problematic lines to a separate file (included in the new branch), I got a very different error:

C:\prog\dali>nim c q_ct02.nim
[...]
gcc!
C:\Users\Mateusz\.choosenim\toolchains\nim-#devel\lib\pure\endians.nim(59, 24) Error: cannot 'importc' variable at compile time; builtin_bswap64

That's all I could get to for now; maybe this helps you understand more about it already? Trying to reduce the code gradually while still reproducing the error would be more work than the "line bisect" I did for now, so again not sure when and if I'll have time for that. Cheers!

@timotheecour
Copy link
Member

@akavel thanks for not giving up! bugs should be fixed, not worked around (otherwise it'll just impact the next person).
A good reduction removes as much code as possible, including stdlib modules (right now you still have import std/sha1, which hides the problem)

=> I've reduced the problem here: #14368

note (for next time): reducing this was really a mechanical operation. All i did was add some debugging statement echo (data.len, ctx.buf.len, i, j) right above the line that gives a stacktrace error copyMem(addr ctx.buf[i], unsafeAddr data[j], 64 - i); then that led me to try on a similar input in a standalone file, and that was it; after reducing the next step was to modify the test to see the boundary between fails/succeeds.

@akavel
Copy link
Author

akavel commented May 16, 2020

@timotheecour Um, but #14368 seems to print a different error message than what I reported here, no? so I'm now confused how #14368 is a reduction of this one? Also what copyMem are you referring to in the original code?

In addition to that, with regards to the side comments, if this is about modifying stdlib or compiler, please kindly note that jumping into koch territory is definitely not trivial from a casual nim & choosenim user's persective. I know there are better reports and worse reports, but please also note this is a spectrum, and I believe reporting a poor reproducer is still better than not reporting any, so that's what I did... and also I must go with workarounds first for best effectiveness on both mine and Araq's side. I'm happy to also report simpler issues once the tougher ones are out of the way.

@timotheecour
Copy link
Member

timotheecour commented May 18, 2020

Um, but #14368 seems to print a different error message than what I reported here, no?

it's the same as one you get in original post: following your OP instructions, and updating to devel >= 278b458 (I don't think you're using latest devel since Error: unhandled exception: 'sons' is not accessible using discriminant 'kind' was already fixed in 278b458)

git clone https://github.com/akavel/dali -b comptime-exp1 && cd dali
nim c q_comptime.nim

stack trace: (most recent call last)
/Users/timothee/git_clone/nim/Nim_devel/dali/q_comptime.nim(39, 34) q_comptime
/Users/timothee/git_clone/nim/Nim_devel/dali/src/dali/dex.nim(358, 24) render
/Users/timothee/git_clone/nim/Nim_devel/lib/std/sha1.nim(207, 8) secureHash
/Users/timothee/git_clone/nim/Nim_devel/lib/std/sha1.nim(155, 45) update
/Users/timothee/git_clone/nim/Nim_devel/dali/q_comptime.nim(5, 1) template/generic instantiation from here
/Users/timothee/git_clone/nim/Nim_devel/lib/std/sha1.nim(155, 45) Error: index out of bounds, the container is empty
      copyMem(addr ctx.buf[i], unsafeAddr data[j], 64 - i)
                                              ^

which is the same error and same root cause as in reduction I gave in #14368

different error message than what I reported here

nim c -r -d:comptime02 q_comptime.nim is a different issue altogether (even if originating from your same initial use case)

Also what copyMem are you referring to in the original code?

the bug happens before copyMem is called as shown in stacktrace: it's happening while evaluating one of its arguments, unsafeAddr data[j]

jumping into koch territory is definitely not trivial

not judging whether it's easy/hard but koch territory just means (in this context) building nim from source with --stacktrace:on to get stacktraces in nim compiler:

it's really just: ./koch temp (once you've built koch); or, if starting from a fresh clone, it's:

git clone https://github.com/nim-lang/Nim
sh build_all.sh
./koch temp

if this is about modifying stdlib or compiler [...]

I was just trying to give helpful tips so that you can make progress debugging on your end if you're the impatient kind (like I am and some other ppl as well) and would rather dig into a problem than wait for someone to fix it. The issue you've reported is real and thanks for reporting them (which will happen in time regardless). Fixing a bug is often a sequence of steps, and anyone (including reporter) is welcome to help along those steps:
reporting reproducible bug => minimizing (ideally with minimal dependencies) => finding root cause => fixing it locally => PR => merge

Error: cannot 'importc' variable at compile time; builtin_bswap64

there is experimental support for CT FFI but it's not default enabled (and currently doesn't support intrinsics,like __builtin_bswap64, but definitely could); until then, there are 3 choices:

  • avoid it in your code altogether with when nimvm: fallbackCT() else: usualCodeRT() (simplest fix; the question is where to branch in the code)
  • adding it to vmops (not ideal)
  • make CT FFI work with intrinsics (future work)

@akavel
Copy link
Author

akavel commented May 18, 2020

@timotheecour ❤️ Thanks for your patience. After writing my last reply, I started thinking about it more, what is the reason I'm whining and complaining so much, and I realized what would probably help me overcome it and ease the testing for me would be, I think, if I had a good simple guide how to build nim devel from source in a choosenim-based environment, in a way where I could be sure it won't break my usual working Nim setup. I will try to test the koch temp commands you provided for this function (when I have time). If they work for me, I will try and open a PR for including them as a guidance/hint in the Nim bug report template.

@timotheecour
Copy link
Member

open a PR for including them as a guidance/hint in the Nim bug report template

I think we already have guides for that so you could link to it instead of creating such guide

https://github.com/nim-lang/Nim/blob/devel/.github/ISSUE_TEMPLATE/bug_report.md could IMO emphasize more on need for generating minimized reproducible examples tested against latest devel, which puts a bit more burden on the person who's writing the bug report but less burden on whoever's gonna fix it.

Another thing that could help is if compiler messages start linking to relevant doc section, so that instead of

No stack traceback available
To create a stacktrace, rerun compilation with ./koch temp r <file>

you'll get:

No stack traceback available
To create a stacktrace, rerun compilation with ./koch temp r <file>, see https://nim-lang.github.io/Nim/intern.html#bootstrapping-the-compiler

likewise with many more error messages. Good docs, like good code, should be DRY (only explained in a single place, and linked to where needed)

@BarrOff
Copy link
Contributor

BarrOff commented Nov 17, 2020

Hello, I am writing a program and got the same error message. I can't give a condensed example, as I don't know what in about 33k lines of code causes it.

Versions of Nim used: 1.0.6, 1.4.0, devel @61352d5a203e76e3334b21766bdbd4ebc71c300a
Version 1.0.6 does not seem to be affected by the problem.
1.4.0 and devel create the same error.

Using the koch temp method, I got a stacktrace, which points to the file ast.nim of the compiler, at line 1113.
This is part of proc len*(n: Indexable): int {.inline.} =
The stacktrace can be found here: stacktrace.txt

@timotheecour
Copy link
Member

timotheecour commented Nov 17, 2020

I don't know what in about 33k lines of code causes it

compile nim with -d:debug --stacktrace:on --excessiveStackTrace --stackTraceMsgs -d:nimCompilerStackraceHints

(introduced in #13351, see PR for what this does)

and show the stacktrace again

@timotheecour timotheecour changed the title nimvm: sysFatal: unhandled exception: 'sons' is not accessible using discriminant 'kind' of type 'TNode' [FieldError] vm: sysFatal: unhandled exception: 'sons' is not accessible using discriminant 'kind' of type 'TNode' [FieldError] Nov 17, 2020
@BarrOff
Copy link
Contributor

BarrOff commented Nov 17, 2020

Ok, here is the new stacktrace2.txt

@narimiran
Copy link
Member

When trying to run complex code at compilation time, I'm getting a weird error, with no stack trace, that I don't know how to further debug. Nimvm compiler bug?

Example

$ mkdir dali
$ cd dali
$ git clone https://github.com/akavel/dali -b comptime-exp1 .
$ nim c q_comptime.nim

Current Output

[...]
fatal.nim(49)            sysFatal
Error: unhandled exception: 'sons' is not accessible using discriminant 'kind' of type 'TNode' [FieldError]

@akavel I've tried to reproduce the problem, and with Nim 1.2.x I can reproduce it, but with Nim 1.4.x and devel the reported error is not happening anymore. (There is an error happening at src/dali/dex.nim(358, 24), which is let sha1 = secureHash(blob.string.substr(0x20)).Sha1Digest, but it is unrelated to this)

@narimiran
Copy link
Member

narimiran commented Nov 24, 2020

Hello, I am writing a program and got the same error message. I can't give a condensed example, as I don't know what in about 33k lines of code causes it.

Using the koch temp method, I got a stacktrace, which points to the file ast.nim of the compiler, at line 1113.
This is part of proc len*(n: Indexable): int {.inline.} =
The stacktrace can be found here: stacktrace.txt

@BarrOff Yes, it is usually something that calls len instead of safeLen, but to be able to find out what it is, the original code would be helpful.
Even if you cannot condense those 33k lines, if it is part of a public repo you can give us a link and steps how to reproduce it. (Just like OP did for their example)

If you cannot share your code, I would recommend you to go to compiler/ccgexprs.nim in Nim's repo and inside of proc genBracedInit change the line if typ.callConv == ccClosure and n.len > 1 and n[1].kind == nkNilLit: to now use n.safeLen > 1 - this is the problematic line based on your stack traces.
Tell us if it helps or not.

@akavel
Copy link
Author

akavel commented Nov 25, 2020

@narimiran from my side I agree, the error I'm getting is now different, so for my original case the issue is resolved; I guess I'm handing off to @BarrOff if they want to double down on this from some more info from their side (or if you want them to open a separate issue for their case).

@BarrOff
Copy link
Contributor

BarrOff commented Nov 27, 2020

@narimiran Right now it is still a private repo, but I hope to publish it in the near future. Please be patient just a little bit.

Just compiled Nim on Linux and FreeBSD with your suggested patch and can confirm that it now works! The error disappears and code creation works, on both plattforms.

Nim version: devel@c9a10bb9e47c5227b32f49f5876e965cc2308541

narimiran added a commit to narimiran/Nim that referenced this issue Nov 28, 2020
@BarrOff
Copy link
Contributor

BarrOff commented Dec 16, 2020

@narimiran: the repo can now be found here.

The error can be reproduced simply by calling nimble build in the repository root.

As I pointed out above, your commit fixes it for me.

@timotheecour
Copy link
Member

timotheecour commented Dec 17, 2020

As I pointed out above, your commit fixes it for me.

use the PR instead for reference + discussion: #16168
it wasn't clear that this PR actually fixed the underlying problem, instead hiding one of its symptom

EDIT: case in point, see #16375

To compile using nim > 1.0 this patch has to be applied when compiling nim, because of this issue.
Using those versions, the game reproducibly crashes, whereas it works for Nim 1.0.x

@BarrOff
Copy link
Contributor

BarrOff commented Dec 17, 2020

@timotheecour as the PR is seems to be rejected and is currently closed, I have to continue here

@narimiran @timotheecour Using the simple bracketing algorithm, I found commit 6152eb3 to cause the compilation error. Until the previous commit, compilation works, from this commit on, it doesn't.
However my knowledge of the compiler code is to limited for me to exactly point out what causes the issue. I hope this helps nevertheless.

@timotheecour
Copy link
Member

timotheecour commented Dec 17, 2020

@BarrOff

Using the simple bracketing algorithm

you mean git bisect? if not, use it, it's magic :-)

next step to debug this:

since according to https://gist.github.com/BarrOff/9ca157692d7a894749305c1911580538 it crashes here

/opt/local/nim/compiler/ccgexprs.nim(2671) expr
/opt/local/nim/compiler/ccgexprs.nim(2521) genComplexConst
/opt/local/nim/compiler/cgen.nim(1199) requestConstImpl
/opt/local/nim/compiler/ccgexprs.nim(3108) genBracedInit
/opt/local/nim/compiler/ccgexprs.nim(3010) genConstSimpleList
/opt/local/nim/compiler/ccgexprs.nim(3089) genBracedInit
/opt/local/nim/compiler/ast.nim(1115) len
/opt/local/nim/lib/system/fatal.nim(49) sysFatal

we need line info in user code + more context at the place it crashes, can you try:

    of tyProc:
      if typ.callConv == ccClosure and n.len > 1 and n[1].kind == nkNilLit:

=>

    of tyProc:
      if typ.callConv == ccClosure and n.safeLen == 0:
        echo ("BUG", p.config$n.info, isConst, p.prc.sym)
        debug(n) # from astalgo
        debug(typ)
      if typ.callConv == ccClosure and n.len > 1 and n[1].kind == nkNilLit:

(adapt as needed)

EDIT: one obvious difference from the commit that introduces the regression 6152eb3 is this:

before commit:
of nkBracket, nkPar, nkTupleConstr, nkClosure:

after commit:
no such test.

Once again, debug(n) and debug(typ) as mentioned above should help figure out what's going on

@BarrOff
Copy link
Contributor

BarrOff commented Dec 17, 2020

@timotheecour
I did not know about git-bisect, but did the same, just manually :-).

Adding your suggested change, I got an error message, trying to compile nim:
Error: undeclared field: 'sym' for type ast.PSym
Therefore I changed p.prc.sym to p.prc, which compiled.

The output can be found here
It points to a place, where I create a constant closure, the corresponding code is here

EDIT: I just changed the closure from const to be declared using let, and the error disappears. The same is true, when using var instead of const

@timotheecour
Copy link
Member

timotheecour commented Dec 18, 2020

I've reduced your code:

when true:
  #[
  Error: unhandled exception: 'sons' is not accessible using discriminant 'kind' of type 'TNode' [FieldDefect]
  ]#
  proc opl3EnvelopeCalcSin0() = discard
  type EnvelopeSinfunc = proc()
  # const EnvelopeCalcSin0 = opl3EnvelopeCalcSin0 # ok
  const EnvelopeCalcSin0: EnvelopeSinfunc = opl3EnvelopeCalcSin0 # bug
  const envelopeSin = [EnvelopeCalcSin0]
  var a = 0
  envelopeSin[a]()

that's what I meant by "please reduce" :-)

=> see PR #16386

timotheecour added a commit to timotheecour/Nim that referenced this issue Dec 18, 2020
@timotheecour timotheecour changed the title vm: sysFatal: unhandled exception: 'sons' is not accessible using discriminant 'kind' of type 'TNode' [FieldError] myfuns[runtimeIndex]() gives FieldError when myfuns is a CT array of closures Dec 18, 2020
timotheecour added a commit to timotheecour/Nim that referenced this issue Dec 19, 2020
Araq pushed a commit that referenced this issue Jan 1, 2021
mildred pushed a commit to mildred/Nim that referenced this issue Jan 11, 2021
ardek66 pushed a commit to ardek66/Nim that referenced this issue Mar 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
VM see also `const` label
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants