New rules for type inference


This example shows a problem with the current type inference algorithm.

```nim
  # this one doesn't work
  var x : set[char] = (echo "a"; {})
  # where this one does work
  var y : set[char] = {}
```

In the compiler the expression ``(echo "a"; {})`` is semantically
checked and the inferred type is ``set[empty]``. And then when the
assignment is checked, the type of the expression on the right hand
side is patched as an afterthought. This is the code in the compiler
that does it.

```nim
  proc fitNodePostMatch(c: PContext, formal: PType, arg: PNode): PNode =
    result = arg
    let x = result.skipConv
    if x.kind in {nkPar, nkTupleConstr, nkCurly} and formal.kind != tyUntyped:
      changeType(c, x, formal, check=true)
    else:
      result = skipHiddenSubConv(result)
```

Basically this only works for very simple expressions. According to
Araq this function does: "uoh, two types here that we need to combine
somehow into a real type". I see this as a problematic hack in many
ways. First of all it is a special case that only takes the simple
case into consideration ``x = {}`` where x is fully typed. Then the
second problem is, changes the type of the AST an a mutable
transformation as an afterthought (mutable changes might lead to
bugs). That is not a clean way to solve this problem, and there is a
better way to do it.

My proposal is the following:

Get rid of ``fidNode`` and ``fitNodePostMatch`` entirely. These procedures
don't work recursively on the AST, and making them traverse
recursively will introduce yet another AST traversal. This is costly
maybe incorrect and I don't think it is worth it. So I think the types
can be inferred in the first sem checking phase.

Currently when an expression like ``var x : set[char] = (echo "a"; {})``
is checked, we know that we expect the right hand side to be of type
``set[char]``, but this expected type is not passed to
``semExprWithType``.  So I suggest to add another parameter to
``semExprWithType``, the expected type. The expected type may always be
nil, but when it isn't it should be forwarded properly. With this
additional parameter the sem-checker will eventually do a sem-check on
``{}`` with with an expected type. This means that the expression ``{}``
instantly get the correct type, it won't be inferred as an empty node
anymore.

This expected type can then be used recursively to have expected types
in other contexts, such as tuple constructors, arrays and sequences.
Here is an example on how the expected type should be used recursively
in other contexts (pseudo syntax):

```nim
  var xxx : byte = (echo("a"); 123)

  # The expected type in every stmtListExpr before the last is `void`.
  # The last expression in a stmtListExpr is expected to be of the same
  # type as the whole expression.

  semExprWithType( {. (echo("a"); 123) .}, expectedType = {.byte.} )
  | semExprWithType( {. echo("a") .}, expectedType = {.void.} )
  | | semExprWithType( {. "a" .}, expectedType = nil ) -> ("a", string)
  | `-> (echo("a"), void)
  | semExprWithType( {. 123 .}, expectedType = {.byte.} ) -> (123'u8, byte)
  `-> ((echo("a"); 123), byte)

  var myarray: array[3, uint8] = [1,2,3]

  # When an array is expected, the elements of a array literal should be
  # expected to be of the arrays element type.

  semExprWithType( {. [1,2,3] .}, expectedType = {. array[3, uint8] .} )
  | semExprWithType( {. 1 .}, expectedType = {. uint8 .} ) -> (1'u8, uint8)
  | semExprWithType( {. 2 .}, expectedType = {. uint8 .} ) -> (2'u8, uint8)
  | semExprWithType( {. 3 .}, expectedType = {. uint8 .} ) -> (3'u8, uint8)
  `-> ( [1'u8, 2'u8, 3'u8],  array[3, uint8] )

  # When a tuple is expected, the elements of a tuple literal should be
  # expected to be of their corresponding type in the tuple literal.

  var mytuple: (uint8,string) = (123,"abc")
  semExprWithType( {. (123,"abc") .}, expectedType = {. (uint8,string) .} )
  | semExprWithType( {. 123 .}, expectedType = {. uint8 .} ) -> (123'u8, uint8)
  | semExprWithType( {. "abc" .}, expectedType = {. string .} ) -> ("abc", string)
  `-> ( (123'u8, "abc"), (uint8, string) )

  # The following example already works today, but how it works would change.

  type BaseObject = ref object of RootObj
  type ObjA = ref object of BaseObject
  type ObjB = ref object of BaseObject
  var x: seq[BaseObject] = @[ObjA(), ObjB()]

  semExprWithType( {. @[ObjA(), ObjB()] .}, expectedType = {. seq[BaseObject] .} )
  | semExprWithType( {. ObjA() .}, expectedType = {. BaseObject .} ) -> (ObjA(), BaseObject)
  | semExprWithType( {. ObjB() .}, expectedType = {. BaseObject .} ) -> (ObjB(), BaseObject)
  `-> (@[ObjA(), ObjB()], BaseObject), seq[BaseObject])

  # This is an example on how error messages could be improved. The bug
  # in the program is `ObjC` inherits from `RootObj` instead of `BaseObject`.
  # The current error message is ``type mismatch: got <seq[ref RootObj]> but expected 'seq[BaseObject]'``
  # The error message does not specify which element is wrong.

  type ObjC = ref object of RootObj
  var x: seq[BaseObject] = @[ObjA(), ObjB(), ObjC()]

  semExprWithType( {. @[ObjA(), ObjB(), ObjC()] .}, expectedType = {. seq[BaseObject] .} )
  | semExprWithType( {. ObjA() .}, expectedType = {. BaseObject .} ) -> (ObjA(), BaseObject)
  | semExprWithType( {. ObjB() .}, expectedType = {. BaseObject .} ) -> (ObjB(), BaseObject)
  | semExprWithType( {. ObjC() .}, expectedType = {. BaseObject .} ) -> error("type mismatch: got <ObjC> but expected 'BaseObject'")

  # This error is much more specific as it can point out the exact type
  # mismatch and the position in the array where the type mismatch
  # happens.
```

Tuples like the example above might also be detected as correct code,
because the sem checker will be able to know that in the first element
of the tuple constructor an ``uint8`` is expected, not a string literal.

steps that need to be done before this can be implemented

1. ``enforceVoidContext`` is currently ``tyTyped``. This needs to be
   changed to ``tyVoid`` consistently though the compiler.
2. The void type in the compiler is currently represented as ``nil``. I
   this is problematic as ``typ.kind`` cannot be accessed safely. I
   think the ``nil`` type can be used better as "no type expectation",
   or when seen in the AST: "type not yet inferred".

I hope that the expression flags ``efWantStmt`` and ``efDetermineType``
can be superseded by this and therefore removed. As ``efWantStmt`` is
represented as ``efWantValue``.

This could potentially improve the usability of pure enums, because
in contexts where a concrete enum type is expected, the owner of that
enum identifier is clear.

```nim
type
  MyEnum {.pure.} = enum
    valueA, valueB, valueC, valueD, amb

  OtherEnum {.pure.} = enum
    valueX, valueY, valueZ, amb

let mySet: set[MyEnum] = {valueX, amb}
```

``amb`` is declared in both MyEnum and OtherEnum, but it is clear that
``MyEnum.amb`` is meant and not ``OtherEnum.amb``, because the expected
type will be ``MyEnum``.

This RFC supersedes the following issues:
https://github.com/nim-lang/Nim/issues/11109



Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

New rules for type inference #149

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

New rules for type inference #149

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions