Description
This example shows a problem with the current type inference algorithm.
# this one doesn't work
var x : set[char] = (echo "a"; {})
# where this one does work
var y : set[char] = {}

In the compiler the expression (echo "a"; {}) is semantically
checked and the inferred type is set[empty]. And then when the
assignment is checked, the type of the expression on the right hand
side is patched as an afterthought. This is the code in the compiler
that does it.
proc fitNodePostMatch(c: PContext, formal: PType, arg: PNode): PNode =
  result = arg
  let x = result.skipConv
  if x.kind in {nkPar, nkTupleConstr, nkCurly} and formal.kind != tyUntyped:
    changeType(c, x, formal, check=true)
  else:
    result = skipHiddenSubConv(result)

Basically this only works for very simple expressions. According to
Araq this function does: "uoh, two types here that we need to combine
somehow into a real type". I see this as a problematic hack in many
ways. First, it is a special case that only handles the simple form
x = {}, where x is fully typed. Second, it changes the type of the
AST in a mutable transformation as an afterthought (mutable changes
might lead to bugs). That is not a clean way to solve this problem,
and there is a better way to do it.
My proposal is the following:
Get rid of fitNode and fitNodePostMatch entirely. These procedures
don't work recursively on the AST, and making them traverse
recursively would introduce yet another AST traversal. That is costly,
possibly incorrect, and I don't think it is worth it. Instead, I think
the types can be inferred in the first sem-checking phase.
Currently when an expression like var x : set[char] = (echo "a"; {})
is checked, we know that we expect the right hand side to be of type
set[char], but this expected type is not passed to
semExprWithType. So I suggest to add another parameter to
semExprWithType, the expected type. The expected type may always be
nil, but when it isn't it should be forwarded properly. With this
additional parameter the sem-checker will eventually do a sem-check on
{} with an expected type. This means that the expression {}
instantly gets the correct type; it won't be inferred as an empty node
anymore.
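The idea of forwarding an expected type can be sketched on a toy AST. This is only an illustration under assumptions, not compiler code: `Ty`, `Node`, `newTy` and `semExpr` are hypothetical stand-ins for the compiler's `PType`, `PNode` and `semExprWithType`, and only a handful of node kinds are modelled.

```nim
# Toy model only: Ty, Node, newTy and semExpr are hypothetical names,
# not the compiler's PType, PNode or semExprWithType.
type
  TyKind = enum
    tkVoid, tkInt, tkUInt8, tkString, tkChar, tkEmpty, tkSet

  Ty = ref object
    kind: TyKind
    elem: Ty              # element type, only used for tkSet

  NodeKind = enum
    nkIntLit, nkStrLit, nkEmptySet, nkEcho, nkStmtListExpr

  Node = ref object
    kind: NodeKind
    sons: seq[Node]
    typ: Ty               # nil means "type not yet inferred"

proc newTy(kind: TyKind, elem: Ty = nil): Ty =
  Ty(kind: kind, elem: elem)

proc semExpr(n: Node, expected: Ty = nil): Node =
  ## Types `n` in place; `expected == nil` means "no type expectation".
  result = n
  case n.kind
  of nkIntLit:
    if expected != nil and expected.kind == tkUInt8:
      n.typ = expected
    else:
      n.typ = newTy(tkInt)
  of nkStrLit:
    n.typ = newTy(tkString)
  of nkEmptySet:
    if expected != nil and expected.kind == tkSet:
      n.typ = expected                      # {} is typed immediately
    else:
      n.typ = newTy(tkSet, newTy(tkEmpty))  # falls back to set[empty]
  of nkEcho:
    for s in n.sons:
      discard semExpr(s)                    # arguments carry no expectation
    n.typ = newTy(tkVoid)
  of nkStmtListExpr:
    for i, s in n.sons:
      # Only the last child inherits the expectation; the others expect void.
      let childExpected = if i == n.sons.high: expected else: newTy(tkVoid)
      discard semExpr(s, childExpected)
    n.typ = n.sons[^1].typ

# Models: var x: set[char] = (echo "a"; {})
proc example(): Node =
  Node(kind: nkStmtListExpr, sons: @[
    Node(kind: nkEcho, sons: @[Node(kind: nkStrLit)]),
    Node(kind: nkEmptySet)])

let withExpectation = semExpr(example(), newTy(tkSet, newTy(tkChar)))
doAssert withExpectation.typ.elem.kind == tkChar      # set[char]
let withoutExpectation = semExpr(example())
doAssert withoutExpectation.typ.elem.kind == tkEmpty  # set[empty]
```

In this sketch no second pass patches the type afterwards: the empty set literal receives `set[char]` the first time it is visited, even though it is nested inside a statement list expression.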
This expected type can then be used recursively to have expected types
in other contexts, such as tuple constructors, arrays and sequences.
Here is an example on how the expected type should be used recursively
in other contexts (pseudo syntax):
var xxx : byte = (echo("a"); 123)
# The expected type in every stmtListExpr before the last is `void`.
# The last expression in a stmtListExpr is expected to be of the same
# type as the whole expression.
semExprWithType( {. (echo("a"); 123) .}, expectedType = {.byte.} )
| semExprWithType( {. echo("a") .}, expectedType = {.void.} )
| | semExprWithType( {. "a" .}, expectedType = nil ) -> ("a", string)
| `-> (echo("a"), void)
| semExprWithType( {. 123 .}, expectedType = {.byte.} ) -> (123'u8, byte)
`-> ((echo("a"); 123), byte)
var myarray: array[3, uint8] = [1,2,3]
# When an array is expected, the elements of an array literal should be
# expected to be of the array's element type.
semExprWithType( {. [1,2,3] .}, expectedType = {. array[3, uint8] .} )
| semExprWithType( {. 1 .}, expectedType = {. uint8 .} ) -> (1'u8, uint8)
| semExprWithType( {. 2 .}, expectedType = {. uint8 .} ) -> (2'u8, uint8)
| semExprWithType( {. 3 .}, expectedType = {. uint8 .} ) -> (3'u8, uint8)
`-> ( [1'u8, 2'u8, 3'u8], array[3, uint8] )
# When a tuple is expected, the elements of a tuple literal should be
# expected to be of their corresponding type in the tuple type.
var mytuple: (uint8,string) = (123,"abc")
semExprWithType( {. (123,"abc") .}, expectedType = {. (uint8,string) .} )
| semExprWithType( {. 123 .}, expectedType = {. uint8 .} ) -> (123'u8, uint8)
| semExprWithType( {. "abc" .}, expectedType = {. string .} ) -> ("abc", string)
`-> ( (123'u8, "abc"), (uint8, string) )
# The following example already works today, but how it works would change.
type BaseObject = ref object of RootObj
type ObjA = ref object of BaseObject
type ObjB = ref object of BaseObject
var x: seq[BaseObject] = @[ObjA(), ObjB()]
semExprWithType( {. @[ObjA(), ObjB()] .}, expectedType = {. seq[BaseObject] .} )
| semExprWithType( {. ObjA() .}, expectedType = {. BaseObject .} ) -> (ObjA(), BaseObject)
| semExprWithType( {. ObjB() .}, expectedType = {. BaseObject .} ) -> (ObjB(), BaseObject)
`-> (@[ObjA(), ObjB()], seq[BaseObject])
# This is an example on how error messages could be improved. The bug
# in the program is `ObjC` inherits from `RootObj` instead of `BaseObject`.
# The current error message is ``type mismatch: got <seq[ref RootObj]> but expected 'seq[BaseObject]'``
# The error message does not specify which element is wrong.
type ObjC = ref object of RootObj
var x: seq[BaseObject] = @[ObjA(), ObjB(), ObjC()]
semExprWithType( {. @[ObjA(), ObjB(), ObjC()] .}, expectedType = {. seq[BaseObject] .} )
| semExprWithType( {. ObjA() .}, expectedType = {. BaseObject .} ) -> (ObjA(), BaseObject)
| semExprWithType( {. ObjB() .}, expectedType = {. BaseObject .} ) -> (ObjB(), BaseObject)
| semExprWithType( {. ObjC() .}, expectedType = {. BaseObject .} ) -> error("type mismatch: got <ObjC> but expected 'BaseObject'")
# This error is much more specific as it can point out the exact type
# mismatch and the position in the array where the type mismatch
# happens.

Tuples like the example above might also be detected as correct code,
because the sem checker will be able to know that in the first element
of the tuple constructor a uint8 is expected, not a string literal.
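The more specific error from the `seq[BaseObject]` example can be sketched as an element-wise check against the expected element type. This is a minimal model under assumptions: `ObjType`, `isSubtype` and `checkElems` are hypothetical names, and the "element N" wording of the message is invented for illustration, not taken from the compiler.

```nim
# Hypothetical toy model of element-wise checking; not compiler code.
type
  ObjType = ref object
    name: string
    parent: ObjType       # nil for a type without a base

proc isSubtype(t, base: ObjType): bool =
  ## Walks the inheritance chain of `t` looking for `base`.
  result = false
  var cur = t
  while cur != nil:
    if cur == base:
      return true
    cur = cur.parent

proc checkElems(elems: seq[ObjType], expectedElem: ObjType): seq[string] =
  ## Returns one error per mismatching element, naming its position.
  for i, t in elems:
    if not isSubtype(t, expectedElem):
      result.add "element " & $i & ": type mismatch: got <" & t.name &
                 "> but expected '" & expectedElem.name & "'"

let rootObj = ObjType(name: "RootObj")
let baseObject = ObjType(name: "BaseObject", parent: rootObj)
let objA = ObjType(name: "ObjA", parent: baseObject)
let objB = ObjType(name: "ObjB", parent: baseObject)
let objC = ObjType(name: "ObjC", parent: rootObj)  # the bug: wrong base

# Models: var x: seq[BaseObject] = @[ObjA(), ObjB(), ObjC()]
for err in checkElems(@[objA, objB, objC], baseObject):
  echo err
```

Because each element is checked against `BaseObject` directly, the failure is attributed to the third element rather than to the sequence as a whole.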
Steps that need to be done before this can be implemented

- enforceVoidContext is currently tyTyped. This needs to be
  changed to tyVoid consistently throughout the compiler.
- The void type in the compiler is currently represented as nil. I
  think this is problematic, as typ.kind cannot be accessed safely. I
  think the nil type can be used better as "no type expectation",
  or when seen in the AST: "type not yet inferred".
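The distinction between an explicit void type and nil can be sketched as follows. The `Ty` type and the `describe` helper are hypothetical and only reuse the names `tyVoid` and `tyInt` for readability; they are not the compiler's definitions.

```nim
# Hypothetical sketch: an explicit void kind keeps `kind` safe to read,
# while nil is reserved for "no type expectation".
type
  TyKind = enum
    tyVoid, tyInt

  Ty = ref object
    kind: TyKind

proc describe(expected: Ty): string =
  if expected == nil:
    "no type expectation"
  elif expected.kind == tyVoid:   # safe: expected is known to be non-nil here
    "statement expected"
  else:
    "value of kind " & $expected.kind & " expected"

doAssert describe(nil) == "no type expectation"
doAssert describe(Ty(kind: tyVoid)) == "statement expected"
doAssert describe(Ty(kind: tyInt)) == "value of kind tyInt expected"
```

With this split, a single nil check at the start is enough, and "void" and "unknown" can no longer be confused with each other.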
I hope that the expression flags efWantStmt and efDetermineType
can be superseded by this and therefore removed. As efWantStmt is
represented as efWantValue.
This could potentially improve the usability of pure enums, because
in contexts where a concrete enum type is expected, the owner of that
enum identifier is clear.
type
  MyEnum {.pure.} = enum
    valueA, valueB, valueC, valueD, amb
  OtherEnum {.pure.} = enum
    valueX, valueY, valueZ, amb
let mySet: set[MyEnum] = {valueA, amb}

amb is declared in both MyEnum and OtherEnum, but it is clear that
MyEnum.amb is meant and not OtherEnum.amb, because the expected
type will be MyEnum.
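How an expected type could disambiguate `amb` can be sketched with a small lookup. This is an assumption-laden toy, not the compiler's symbol resolution: `EnumDecl` and `resolve` are hypothetical, and the expected type is passed as a plain name where an empty string means "no expectation".

```nim
# Hypothetical toy lookup; not the compiler's symbol resolution.
type
  EnumDecl = object
    name: string
    fields: seq[string]

proc resolve(ident: string, decls: seq[EnumDecl],
             expected: string = ""): seq[string] =
  ## Returns the qualified candidates for `ident`.
  ## An empty `expected` means there is no type expectation.
  for d in decls:
    if ident in d.fields and (expected.len == 0 or d.name == expected):
      result.add d.name & "." & ident

let decls = @[
  EnumDecl(name: "MyEnum",
           fields: @["valueA", "valueB", "valueC", "valueD", "amb"]),
  EnumDecl(name: "OtherEnum",
           fields: @["valueX", "valueY", "valueZ", "amb"])]

# Without an expectation the identifier is ambiguous.
doAssert resolve("amb", decls) == @["MyEnum.amb", "OtherEnum.amb"]
# With set[MyEnum] expected, the element expectation is MyEnum.
doAssert resolve("amb", decls, "MyEnum") == @["MyEnum.amb"]
```

The same lookup also returns no candidate for `valueX` when `MyEnum` is expected, so a wrong field would be rejected at the element rather than at the whole set literal.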
This RFC supersedes the following issues:
nim-lang/Nim#11109