==========
Nim Manual
==========

:Authors: Andreas Rumpf, Zahary Karadjov
:Version: |nimversion|

.. contents::


  "Complexity" seems to be a lot like "energy": you can transfer it from the
  end-user to one/some of the other players, but the total amount seems to remain
  pretty much constant for a given task. -- Ran

About this document
===================
**Note**: This document is a draft! Several of Nim's features may need more
precise wording. This manual is constantly evolving into a proper specification.
**Note**: The experimental features of Nim are
covered `here <manual_experimental.html>`_.
**Note**: Assignments, moves, and destruction are specified in
the `destructors <destructors.html>`_ document.
This document describes the lexis, the syntax, and the semantics of the Nim language.
To learn how to compile Nim programs and generate documentation see
`Compiler User Guide <nimc.html>`_ and `DocGen Tools Guide <docgen.html>`_.
The language constructs are explained using an extended BNF, in which ``(a)*``
means 0 or more ``a``'s, ``a+`` means 1 or more ``a``'s, and ``(a)?`` means an
optional *a*. Parentheses may be used to group elements.
``&`` is the lookahead operator; ``&a`` means that an ``a`` is expected but
not consumed. It will be consumed in the following rule.
The ``|``, ``/`` symbols are used to mark alternatives and have the lowest
precedence. ``/`` is the ordered choice that requires the parser to try the
alternatives in the given order. ``/`` is often used to ensure the grammar
is not ambiguous.

Non-terminals start with a lowercase letter, abstract terminal symbols are in
UPPERCASE. Verbatim terminal symbols (including keywords) are quoted
with ``'``. An example::

  ifStmt = 'if' expr ':' stmts ('elif' expr ':' stmts)* ('else' stmts)?

The binary ``^*`` operator is used as a shorthand for 0 or more occurrences
separated by its second argument; likewise ``^+`` means 1 or more
occurrences: ``a ^+ b`` is short for ``a (b a)*``
and ``a ^* b`` is short for ``(a (b a)*)?``. Example::

  arrayConstructor = '[' expr ^* ',' ']'

Other parts of Nim, like scoping rules or runtime semantics, are
described informally.
Definitions
===========
Nim code specifies a computation that acts on a memory consisting of
components called `locations`:idx:. A variable is basically a name for a
location. Each variable and location is of a certain `type`:idx:. The
variable's type is called `static type`:idx:, the location's type is called
`dynamic type`:idx:. If the static type is not the same as the dynamic type,
it is a super-type or subtype of the dynamic type.
An `identifier`:idx: is a symbol declared as a name for a variable, type,
procedure, etc. The region of the program over which a declaration applies is
called the `scope`:idx: of the declaration. Scopes can be nested. The meaning
of an identifier is determined by the smallest enclosing scope in which the
identifier is declared unless overloading resolution rules suggest otherwise.
An expression specifies a computation that produces a value or location.
Expressions that produce locations are called `l-values`:idx:. An l-value
can denote either a location or the value the location contains, depending on
the context.
A Nim `program`:idx: consists of one or more text `source files`:idx: containing
Nim code. It is processed by a Nim `compiler`:idx: into an `executable`:idx:.
The nature of this executable depends on the compiler implementation; it may,
for example, be a native binary or JavaScript source code.
In a typical Nim program, most of the code is compiled into the executable.
However, some of the code may be executed at
`compile-time`:idx:. This can include constant expressions, macro definitions,
and Nim procedures used by macro definitions. Most of the Nim language is
supported at compile-time, but there are some restrictions -- see `Restrictions
on Compile-Time Execution <#restrictions-on-compileminustime-execution>`_ for
details. We use the term `runtime`:idx: to cover both compile-time execution
and code execution in the executable.
The compiler parses Nim source code into an internal data structure called the
`abstract syntax tree`:idx: (`AST`:idx:). Then, before executing the code or
compiling it into the executable, it transforms the AST through
`semantic analysis`:idx:. This adds semantic information such as expression types,
identifier meanings, and in some cases expression values. An error detected
during semantic analysis is called a `static error`:idx:. Errors described in
this manual are static errors when not otherwise specified.
A `panic`:idx: is an error that the implementation detects
and reports at runtime. The method for reporting such errors is via
*raising exceptions* or *dying with a fatal error*. However, the implementation
provides a means to disable these `runtime checks`:idx:. See the section
pragmas_ for details.
Whether a panic results in an exception or in a fatal error is
implementation specific. Thus the following program is invalid; even though the
code purports to catch the `IndexDefect` from an out-of-bounds array access, the
compiler may instead choose to allow the program to die with a fatal error.

.. code-block:: nim

  var a: array[0..1, char]
  let i = 5
  try:
    a[i] = 'N'
  except IndexDefect:
    echo "invalid index"

The current implementation allows switching between these behaviors
via ``--panics:on|off``. When panics are turned on, the program dies with a
panic; when they are turned off, the runtime errors are turned into
exceptions. The benefit of ``--panics:on`` is that it produces smaller binary
code and the compiler has more freedom to optimize the code.
An `unchecked runtime error`:idx: is an error that is not guaranteed to be
detected and can cause the subsequent behavior of the computation to
be arbitrary. Unchecked runtime errors cannot occur if only `safe`:idx:
language features are used and if no runtime checks are disabled.
A `constant expression`:idx: is an expression whose value can be computed during
semantic analysis of the code in which it appears. It is never an l-value and
never has side effects. Constant expressions are not limited to the capabilities
of semantic analysis, such as constant folding; they can use all Nim language
features that are supported for compile-time execution. Since constant
expressions can be used as an input to semantic analysis (such as for defining
array bounds), this flexibility requires the compiler to interleave semantic
analysis and compile-time code execution.
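
As a minimal sketch of this interleaving, the following snippet uses a
constant computed by an ordinary procedure as an array bound. The names
``cube``, ``Size`` and ``buffer`` are hypothetical and chosen only for
illustration.

.. code-block:: nim

  # `cube` is a hypothetical helper without side effects, so it can be
  # executed at compile time.
  proc cube(n: int): int =
    n * n * n

  const Size = cube(2)           # evaluated while the code is analysed
  var buffer: array[Size, int]   # the constant feeds back into the array type

  doAssert Size == 8
  doAssert buffer.len == 8

The array type can only be checked once ``cube(2)`` has been evaluated, which
is why analysis and compile-time execution have to alternate.
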
It is mostly accurate to picture semantic analysis proceeding top to bottom and
left to right in the source code, with compile-time code execution interleaved
when necessary to compute values that are required for subsequent semantic
analysis. We will see much later in this document that macro invocation not only
requires this interleaving, but also creates a situation where semantic analysis
does not entirely proceed top to bottom and left to right.
Lexical Analysis
================
Encoding
--------
All Nim source files are in the UTF-8 encoding (or its ASCII subset). Other
encodings are not supported. Any of the standard platform line termination
sequences can be used - the Unix form using ASCII LF (linefeed), the Windows
form using the ASCII sequence CR LF (return followed by linefeed), or the old
Macintosh form using the ASCII CR (return) character. All of these forms can be
used equally, regardless of the platform.
Indentation
-----------
Nim's standard grammar describes an `indentation sensitive`:idx: language.
This means that all the control structures are recognized by indentation.
Indentation consists only of spaces; tabulators are not allowed.
The indentation handling is implemented as follows: The lexer annotates the
following token with the preceding number of spaces; indentation is not
a separate token. This trick allows parsing of Nim with only 1 token of
lookahead.
The parser uses a stack of indentation levels: the stack consists of integers
counting the spaces. The indentation information is queried at strategic
places in the parser but ignored otherwise: The pseudo-terminal ``IND{>}``
denotes an indentation that consists of more spaces than the entry at the top
of the stack; ``IND{=}`` an indentation that has the same number of spaces. ``DED``
is another pseudo-terminal that describes the *action* of popping a value
from the stack; ``IND{>}`` in turn implies pushing a value onto the stack.

With this notation we can now easily define the core of the grammar: A block of
statements (simplified example)::

  ifStmt = 'if' expr ':' stmt
           (IND{=} 'elif' expr ':' stmt)*
           (IND{=} 'else' ':' stmt)?

  simpleStmt = ifStmt / ...

  stmt = IND{>} stmt ^+ IND{=} DED  # list of statements
       / simpleStmt                 # or a simple statement

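
To relate the simplified grammar to source code, here is a small sketch.
The procedure ``classify`` is hypothetical. Each branch body is indented
further than its ``if``, ``elif`` or ``else`` line, while the ``elif`` and
``else`` keywords sit at the same indentation as the ``if``.

.. code-block:: nim

  proc classify(n: int): string =
    if n < 0:
      result = "negative"   # deeper indentation opens the branch body
    elif n == 0:            # same indentation as the `if` continues the statement
      result = "zero"
    else:
      result = "positive"

  doAssert classify(-3) == "negative"
  doAssert classify(0) == "zero"
  doAssert classify(7) == "positive"
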
Comments
--------
Comments start anywhere outside a string or character literal with the
hash character ``#``.
Comments consist of a concatenation of `comment pieces`:idx:. A comment piece
starts with ``#`` and runs until the end of the line. The end of line characters
belong to the piece. If the next line only consists of a comment piece with
no other tokens between it and the preceding one, it does not start a new
comment:

.. code-block:: nim

  i = 0     # This is a single comment over multiple lines.
    # The scanner merges these two pieces.
    # The comment continues here.

`Documentation comments`:idx: are comments that start with ``##``.
Documentation comments are tokens; they are only allowed at certain places in
the input file as they belong to the syntax tree!
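
A brief sketch of the difference, using a hypothetical procedure ``double``:
the ``##`` line sits where the syntax allows a documentation comment, here
directly at the start of the procedure body, while the ordinary ``#`` comment
is not tied to a particular position.

.. code-block:: nim

  proc double(x: int): int =
    ## Returns twice the value of ``x``.
    # An ordinary comment.
    result = 2 * x

  doAssert double(21) == 42
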
Multiline comments
------------------
Starting with version 0.13.0 of the language, Nim supports multiline comments.
They look like:

.. code-block:: nim

  #[Comment here.
  Multiple lines
  are not a problem.]#

Multiline comments support nesting:

.. code-block:: nim

  #[  #[ Multiline comment in already
     commented out code. ]#
  proc   p[T](x: T) = discard
  ]#

Multiline documentation comments also exist and support nesting too:

.. code-block:: nim

  proc foo =
    ##[Long documentation comment
    here.
    ]##

Identifiers & Keywords
----------------------
Identifiers in Nim can be any string of letters, digits
and underscores, with the following restrictions:

* begins with a letter
* does not end with an underscore ``_``
* two immediately following underscores ``__`` are not allowed::

    letter ::= 'A'..'Z' | 'a'..'z' | '\x80'..'\xff'
    digit ::= '0'..'9'
    IDENTIFIER ::= letter ( ['_'] (letter | digit) )*

Currently, any Unicode character with an ordinal value > 127 (non-ASCII) is
classified as a ``letter`` and may thus be part of an identifier but later
versions of the language may assign some Unicode characters to belong to the
operator characters instead.
The following keywords are reserved and cannot be used as identifiers:
.. code-block:: nim
:file: keywords.txt
Some keywords are unused; they are reserved for future developments of the
language.
Identifier equality
-------------------
Two identifiers are considered equal if the following algorithm returns true:

.. code-block:: nim

  import strutils  # provides replace and toLowerAscii

  proc sameIdentifier(a, b: string): bool =
    a[0] == b[0] and
      a.replace("_", "").toLowerAscii == b.replace("_", "").toLowerAscii

That means only the first letters are compared in a case-sensitive manner. Other
letters are compared case-insensitively within the ASCII range and underscores are ignored.
This rather unorthodox way to do identifier comparisons is called
`partial case-insensitivity`:idx: and has some advantages over the conventional
case sensitivity:
It allows programmers to mostly use their own preferred
spelling style, be it humpStyle or snake_style, and libraries written
by different programmers cannot use incompatible conventions.
A Nim-aware editor or IDE can show the identifiers as preferred.
Another advantage is that it frees the programmer from remembering
the exact spelling of an identifier. The exception with respect to the first
letter allows common code like ``var foo: Foo`` to be parsed unambiguously.

Note that this rule also applies to keywords, meaning that ``notin`` is
the same as ``notIn`` and ``not_in``. The all-lowercase version (``notin``,
``isnot``) is the preferred way of writing keywords.

Historically, Nim was a fully `style-insensitive`:idx: language. This meant that
it was not case-sensitive, underscores were ignored, and there was not even a
distinction between ``foo`` and ``Foo``.
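
The following sketch shows partial case-insensitivity in practice. The names
``maxValue``, ``Foo`` and ``foo`` are hypothetical, and the snippet assumes
that no strict style checking option is enabled for the compilation.

.. code-block:: nim

  let maxValue = 10

  doAssert max_value == 10   # underscores are ignored
  doAssert maxvalue == 10    # case is ignored after the first character

  type Foo = object
  var foo: Foo               # `foo` and `Foo` differ in their first letter
  doAssert foo is Foo
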
Stropping
---------
If a keyword is enclosed in backticks it loses its keyword property and becomes an ordinary identifier.
Examples

.. code-block:: nim

  var `var` = "Hello Stropping"

.. code-block:: nim

  type Type = object
    `int`: int

  let `object` = Type(`int`: 9)
  assert `object` is Type
  assert `object`.`int` == 9

  var `var` = 42
  let `let` = 8
  assert `var` + `let` == 50

  const `assert` = true
  assert `assert`

String literals
---------------
Terminal symbol in the grammar: ``STR_LIT``.
String literals can be delimited by matching double quotes, and can
contain the following `escape sequences`:idx:\ :

====================  ===================================================
Escape sequence       Meaning
====================  ===================================================
``\p``                platform specific newline: CRLF on Windows,
                      LF on Unix
``\r``, ``\c``        `carriage return`:idx:
``\n``, ``\l``        `line feed`:idx: (often called `newline`:idx:)
``\f``                `form feed`:idx:
``\t``                `tabulator`:idx:
``\v``                `vertical tabulator`:idx:
``\\``                `backslash`:idx:
``\"``                `quotation mark`:idx:
``\'``                `apostrophe`:idx:
``\`` '0'..'9'+       `character with decimal value d`:idx:;
                      all decimal digits directly
                      following are used for the character
``\a``                `alert`:idx:
``\b``                `backspace`:idx:
``\e``                `escape`:idx: `[ESC]`:idx:
``\x`` HH             `character with hex value HH`:idx:;
                      exactly two hex digits are allowed
``\u`` HHHH           `unicode codepoint with hex value HHHH`:idx:;
                      exactly four hex digits are allowed
``\u`` {H+}           `unicode codepoint`:idx:;
                      all hex digits enclosed in ``{}`` are used for
                      the codepoint
====================  ===================================================

Strings in Nim may contain any 8-bit value, even embedded zeros. However,
some operations may interpret the first binary zero as a terminator.
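
A few of the escape sequences from the table can be checked with a short
sketch like the one below. It assumes only the built-in string operations
``==`` and ``len``; the sample strings are arbitrary.

.. code-block:: nim

  doAssert "\x41\66\u0043" == "ABC"   # hex, decimal and Unicode escapes
  doAssert "tab\there".len == 8       # \t is a single character
  doAssert "\r\n" == "\c\l"           # alternative spellings of CR and LF
  doAssert "a\0b".len == 3            # an embedded zero byte is kept
  doAssert "\u{1F600}".len == 4       # stored as four UTF-8 bytes

Note that ``len`` counts bytes, so a codepoint outside the ASCII range
contributes more than one to the length.
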
Triple quoted string literals
-----------------------------
Terminal symbol in the grammar: ``TRIPLESTR_LIT``.
String literals can also be delimited by three double quotes
``"""`` ... ``"""``.
Literals in this form may run for several lines, may contain ``"`` and do not
interpret any escape sequences.
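
As a small sketch (the variable ``text`` is hypothetical), a triple quoted
literal can span lines, contain ``"`` directly, and keep backslashes as
ordinary characters:

.. code-block:: nim

  let text = """first line
  "second" line with \n kept as-is"""

  # The same value written as an ordinary string literal with escapes.
  # The two spaces come from the indentation of the second source line.
  doAssert text == "first line\n  \"second\" line with \\n kept as-is"
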
For convenience, when the opening ``"""`` is followed by a newline (there may
be whitespace between the opening ``"""`` and the newline),
the newline (and the preceding whitespace) is not included in the string. The
ending of the string literal is defined by the pattern ``"""[^"]``, so this:

.. code-block:: nim

  """"long string within quotes""""

Produces::

  "long string within quotes"

Raw string literals
-------------------
Terminal symbol in the grammar: ``RSTR_LIT``.
There are also raw string literals that are preceded with the
letter ``r`` (or ``R``) and are delimited by matching double quotes (just
like ordinary string literals) and do not interpret the escape sequences.
This is especially convenient for regular expressions or Windows paths:

.. code-block:: nim

  var f = openFile(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab

To produce a single ``"`` within a raw string literal, it has to be doubled:

.. code-block:: nim

  r"a""b"

Produces::

  a"b

``r""""`` is not possible with this notation, because the three leading
quotes introduce a triple quoted string literal. ``r"""`` is the same
as ``"""`` since triple quoted string literals do not interpret escape
sequences either.
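
The rules for raw string literals can be verified by comparing them with
equivalent ordinary literals, as in this minimal sketch:

.. code-block:: nim

  doAssert r"\t" == "\\t"     # a backslash followed by 't', not a tab
  doAssert r"\t".len == 2
  doAssert r"a""b" == "a\"b"  # a doubled quote yields a single quote
  doAssert r"C:\texts\text.txt" == "C:\\texts\\text.txt"
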
Generalized raw string literals
-------------------------------
Terminal symbols in the grammar: ``GENERALIZED_STR_LIT``,
``GENERALIZED_TRIPLESTR_LIT``.
The construct ``identifier"string literal"`` (without whitespace between the
identifier and the opening quotation mark) is a
generalized raw string literal. It is a shortcut for the construct
``identifier(r"string literal")``, so it denotes a procedure call with a
raw string literal as its only argument. Generalized raw string literals
are especially convenient for embedding mini languages directly into Nim
(for example regular expressions).
The construct ``identifier"""string literal"""`` exists too. It is a shortcut
for ``identifier("""string literal""")``.
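
A minimal sketch of both forms follows. The procedure ``tag`` is
hypothetical and not part of the standard library; any procedure taking a
single string argument could be used in the same way.

.. code-block:: nim

  # `tag` is a hypothetical procedure used only for illustration.
  proc tag(s: string): string =
    "<" & s & ">"

  let viaLiteral = tag"a\nb"     # generalized raw string literal
  let viaCall = tag(r"a\nb")     # the call it is a shortcut for
  doAssert viaLiteral == viaCall
  doAssert viaLiteral == "<a\\nb>"

  let viaTriple = tag"""say "hi" now"""
  doAssert viaTriple == "<say \"hi\" now>"
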
Character literals
------------------
Character literals are enclosed in single quotes ``''`` and can contain the
same escape sequences as strings - with one exception: the platform
dependent `newline`:idx: (``\p``)
is not allowed as it may be wider than one character (often it is the pair
CR/LF for example). Here are the valid `escape sequences`:idx: for character
literals:

====================  ===================================================
Escape sequence       Meaning
====================  ===================================================
``\r``, ``\c``        `carriage return`:idx:
``\n``, ``\l``        `line feed`:idx:
``\f``                `form feed`:idx:
``\t``                `tabulator`:idx:
``\v``                `vertical tabulator`:idx:
``\\``                `backslash`:idx:
``\"``                `quotation mark`:idx:
``\'``                `apostrophe`:idx:
``\`` '0'..'9'+       `character with decimal value d`:idx:;
                      all decimal digits directly
                      following are used for the character
``\a``                `alert`:idx:
``\b``                `backspace`:idx:
``\e``                `escape`:idx: `[ESC]`:idx:
``\x`` HH             `character with hex value HH`:idx:;
                      exactly two hex digits are allowed
====================  ===================================================

A character is not a Unicode character but a single byte. The reason for this
is efficiency: for the overwhelming majority of use-cases, the resulting
programs will still handle UTF-8 properly as UTF-8 was specially designed for
this. Another reason is that Nim can thus support ``array[char, int]`` or
``set[char]`` efficiently as many algorithms rely on this feature. The ``Rune``
type is used for Unicode characters; it can represent any Unicode character.
``Rune`` is declared in the `unicode module <unicode.html>`_.
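
The byte-oriented nature of ``char`` can be illustrated with a short sketch.
It assumes the ``runeLen`` procedure from the ``unicode`` module; the
variables ``s`` and ``vowels`` are hypothetical.

.. code-block:: nim

  import unicode

  doAssert 'A' == '\x41'
  doAssert ord('\n') == 10
  doAssert sizeof(char) == 1   # a char is a single byte

  let s = "\u00E9"             # U+00E9, encoded as two UTF-8 bytes
  doAssert s.len == 2          # len counts bytes
  doAssert s.runeLen == 1      # runeLen counts Unicode code points

  let vowels: set[char] = {'a', 'e', 'i', 'o', 'u'}
  doAssert 'e' in vowels
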
Numerical constants
-------------------
Numerical constants are of a single type and have the form::

  hexdigit = digit | 'A'..'F' | 'a'..'f'
  octdigit = '0'..'7'
  bindigit = '0'..'1'
  HEX_LIT = '0' ('x' | 'X' ) hexdigit ( ['_'] hexdigit )*
  DEC_LIT = digit ( ['_'] digit )*
  OCT_LIT = '0' 'o' octdigit ( ['_'] octdigit )*
  BIN_LIT = '0' ('b' | 'B' ) bindigit ( ['_'] bindigit )*

  INT_LIT = HEX_LIT
          | DEC_LIT
          | OCT_LIT
          | BIN_LIT

  INT8_LIT = INT_LIT ['\''] ('i' | 'I') '8'
  INT16_LIT = INT_LIT ['\''] ('i' | 'I') '16'
  INT32_LIT = INT_LIT ['\''] ('i' | 'I') '32'
  INT64_LIT = INT_LIT ['\''] ('i' | 'I') '64'

  UINT_LIT = INT_LIT ['\''] ('u' | 'U')
  UINT8_LIT = INT_LIT ['\''] ('u' | 'U') '8'
  UINT16_LIT = INT_LIT ['\''] ('u' | 'U') '16'
  UINT32_LIT = INT_LIT ['\''] ('u' | 'U') '32'
  UINT64_LIT = INT_LIT ['\''] ('u' | 'U') '64'

  exponent = ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )*
  FLOAT_LIT = digit (['_'] digit)* (('.' digit (['_'] digit)* [exponent]) |exponent)
  FLOAT32_SUFFIX = ('f' | 'F') ['32']
  FLOAT32_LIT = HEX_LIT '\'' FLOAT32_SUFFIX
              | (FLOAT_LIT | DEC_LIT | OCT_LIT | BIN_LIT) ['\''] FLOAT32_SUFFIX
  FLOAT64_SUFFIX = ( ('f' | 'F') '64' ) | 'd' | 'D'
  FLOAT64_LIT = HEX_LIT '\'' FLOAT64_SUFFIX
              | (FLOAT_LIT | DEC_LIT | OCT_LIT | BIN_LIT) ['\''] FLOAT64_SUFFIX

As can be seen in the productions, numerical constants can contain underscores
for readability. Integer and floating-point literals may be given in decimal (no
prefix), binary (prefix ``0b``), octal (prefix ``0o``), and hexadecimal
(prefix ``0x``) notation.
There exists a literal for each numerical type that is
defined. The suffix starting with an apostrophe ('\'') is called a
`type suffix`:idx:. Literals without a type suffix are of an integer type
unless the literal contains a dot or ``E|e`` in which case it is of
type ``float``. This integer type is ``int`` if the literal is in the range
``low(int32)..high(int32)``, otherwise it is ``int64``.
For notational convenience, the apostrophe of a type suffix
is optional if it is not ambiguous (only hexadecimal floating-point literals
with a type suffix can be ambiguous).
The type suffixes are:

=================  =========================
Type Suffix        Resulting type of literal
=================  =========================
``'i8``            int8
``'i16``           int16
``'i32``           int32
``'i64``           int64
``'u``             uint
``'u8``            uint8
``'u16``           uint16
``'u32``           uint32
``'u64``           uint64
``'f``             float32
``'d``             float64
``'f32``           float32
``'f64``           float64
=================  =========================

Floating-point literals may also be in binary, octal or hexadecimal
notation:
``0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64``
is approximately 1.72826e35 according to the IEEE floating-point standard.
Literals are bounds checked so that they fit the datatype. Non-base-10
literals are used mainly for flags and bit pattern representations, therefore
bounds checking is done on bit width, not value range. If the literal fits in
the bit width of the datatype, it is accepted.
Hence: 0b10000000'u8 == 0x80'u8 == 128, but 0b10000000'i8 == 0x80'i8 == -128
instead of causing an overflow error.
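
The literal rules above can be exercised with a short sketch. Every line
restates a rule from this section as an assertion; the chosen values are
arbitrary.

.. code-block:: nim

  doAssert 1_000_000 == 1000000        # underscores are for readability only
  doAssert 0xFF == 255
  doAssert 0o17 == 15
  doAssert 0b1010 == 10

  doAssert 7 is int                    # no suffix, fits into 32 bits
  doAssert 3_000_000_000 is int64      # no suffix, outside the 32-bit range
  doAssert 7.0 is float                # a dot makes it a float
  doAssert 1e3 is float                # so does an exponent
  doAssert 10'i8 is int8
  doAssert 2.5'f32 is float32

  # Non-base-10 literals are checked against the bit width only.
  doAssert 0x80'u8 == 128'u8
  doAssert int(0x80'i8) == -128
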
Operators
---------
Nim allows user defined operators. An operator is any combination of the
following characters::

       =     +     -     *     /     <     >
       @     $     ~     &     %     |
       !     ?     ^     .     :     \

(The grammar uses the terminal OPR to refer to operator symbols as
defined here.)
These keywords are also operators:
``and or not xor shl shr div mod in notin is isnot of as from``.

`.`:tok:, `=`:tok:, `:`:tok:, `::`:tok: are not available as general operators; they
are used for other notational purposes.

``*:`` is, as a special case, treated as the two tokens `*`:tok: and `:`:tok:
(to support ``var v*: T``).

The ``not`` keyword is always a unary operator; ``a not b`` is parsed
as ``a(not b)``, not as ``(a) not (b)``.

Other tokens
------------
The following strings denote other tokens::

    `   (    )     {    }     [    ]    ,  ;   [.    .]  {.   .}  (.  .)  [:

The `slice`:idx: operator `..`:tok: takes precedence over other tokens that
contain a dot: `{..}`:tok: are the three tokens `{`:tok:, `..`:tok:, `}`:tok:
and not the two tokens `{.`:tok:, `.}`:tok:.
Syntax
======
This section lists Nim's standard syntax. How the parser handles
the indentation is already described in the `Lexical Analysis`_ section.
Nim allows user-definable operators.
Binary operators have 11 different levels of precedence.
Associativity
-------------
Binary operators whose first character is ``^`` are right-associative, all
other binary operators are left-associative.

.. code-block:: nim

  proc `^/`(x, y: float): float =
    # a right-associative division operator
    result = x / y
  echo 12 ^/ 4 ^/ 8 # 24.0 (4 / 8 = 0.5, then 12 / 0.5 = 24.0)
  echo 12  / 4  / 8 # 0.375 (12 / 4 = 3.0, then 3 / 8 = 0.375)

Precedence
----------
Unary operators always bind stronger than any binary
operator: ``$a + b`` is ``($a) + b`` and not ``$(a + b)``.
If a unary operator's first character is ``@`` it is a `sigil-like`:idx:
operator which binds stronger than a ``primarySuffix``: ``@x.abc`` is parsed
as ``(@x).abc`` whereas ``$x.abc`` is parsed as ``$(x.abc)``.
For binary operators that are not keywords, the precedence is determined by the
following rules:
Operators ending in either ``->``, ``~>`` or ``=>`` are called
`arrow like`:idx:, and have the lowest precedence of all operators.
If the operator ends with ``=`` and its first character is none of
``<``, ``>``, ``!``, ``=``, ``~``, ``?``, it is an *assignment operator* which
has the second-lowest precedence.
Otherwise, precedence is determined by the first character.

================  =======================================================  ==================  ===============
Precedence level  Operators                                                First character     Terminal symbol
================  =======================================================  ==================  ===============
10 (highest)                                                               ``$ ^``             OP10
9                 ``* / div mod shl shr %``                                ``* % \ /``         OP9
8                 ``+ -``                                                  ``+ - ~ |``         OP8
7                 ``&``                                                    ``&``               OP7
6                 ``..``                                                   ``.``               OP6
5                 ``== <= < >= > != in notin is isnot not of as from``     ``= < > !``         OP5
4                 ``and``                                                                      OP4
3                 ``or xor``                                                                   OP3
2                                                                          ``@ : ?``           OP2
1                 *assignment operator* (like ``+=``, ``*=``)                                  OP1
0 (lowest)        *arrow like operator* (like ``->``, ``=>``)                                  OP0
================  =======================================================  ==================  ===============

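
To make the first-character rule concrete, the sketch below defines two
hypothetical operators, ``+*`` and ``*+``. They exist only for this
illustration; their precedence follows from their first character, not from
what they compute.

.. code-block:: nim

  # Both operators are hypothetical.
  proc `+*`(a, b: int): int = a + b   # first character '+': level 8
  proc `*+`(a, b: int): int = a * b   # first character '*': level 9

  doAssert 2 +* 3 *+ 4 == 14          # parsed as 2 +* (3 *+ 4)
  doAssert 2 *+ 3 +* 4 == 10          # parsed as (2 *+ 3) +* 4

  # Unary operators bind stronger than any binary operator.
  doAssert $1 & "2" == "12"           # parsed as ($1) & "2"
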
Whether an operator is used as a prefix operator is also affected by preceding
whitespace (this parsing change was introduced with version 0.13.0):

.. code-block:: nim

  echo $foo
  # is parsed as
  echo($foo)

Spacing also determines whether ``(a, b)`` is parsed as an argument list
of a call or whether it is parsed as a tuple constructor:

.. code-block:: nim

  echo(1, 2) # pass 1 and 2 to echo

.. code-block:: nim

  echo (1, 2) # pass the tuple (1, 2) to echo

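
The effect of the space can be observed with two overloads, as in this
sketch. The procedure ``describe`` and the variable ``calls`` are
hypothetical; one overload takes two integers, the other a single tuple.

.. code-block:: nim

  var calls: seq[string]

  # Both overloads of `describe` are hypothetical.
  proc describe(a, b: int) = calls.add "two ints"
  proc describe(pair: (int, int)) = calls.add "one tuple"

  describe(1, 2)    # no space: an argument list with two arguments
  describe (1, 2)   # space: a single tuple argument

  doAssert calls == @["two ints", "one tuple"]
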
Grammar
-------
The grammar's start symbol is ``module``.

.. include:: grammar.txt
   :literal:

Order of evaluation
===================
Order of evaluation is strictly left-to-right, inside-out, as is typical for
most other imperative programming languages:

.. code-block:: nim
    :test: "nim c $1"

  var s = ""

  proc p(arg: int): int =
    s.add $arg
    result = arg

  discard p(p(1) + p(2))

  doAssert s == "123"

Assignments are not special; the left-hand-side expression is evaluated before the
right-hand side:

.. code-block:: nim
    :test: "nim c $1"

  var v = 0
  proc getI(): int =
    result = v
    inc v

  var a, b: array[0..2, int]

  proc someCopy(a: var int; b: int) = a = b

  a[getI()] = getI()

  doAssert a == [1, 0, 0]

  v = 0
  someCopy(b[getI()], getI())

  doAssert b == [1, 0, 0]

Rationale: Consistency with overloaded assignment or assignment-like operations;
``a = b`` can be read as ``performSomeCopy(a, b)``.

However, the concept of "order of evaluation" is only applicable after the code
has been normalized. Normalization involves template expansion and the
reordering of arguments that have been passed to named parameters:

.. code-block:: nim
    :test: "nim c $1"

  var s = ""

  proc p(): int =
    s.add "p"
    result = 5

  proc q(): int =
    s.add "q"
    result = 3

  # Evaluation order is 'b' before 'a' due to template
  # expansion's semantics.
  template swapArgs(a, b): untyped =
    b + a

  doAssert swapArgs(p() + q(), q() - p()) == 6
  doAssert s == "qppq"

  # Named arguments are evaluated in the order of the declared parameters,
  # not in the order in which they are written at the call site:
  proc construct(first, second: int) =
    discard

  # 'p' is evaluated before 'q'!
  construct(second = q(), first = p())

  doAssert s == "qppqpq"

Rationale: This is far easier to implement than hypothetical alternatives.
Constants and Constant Expressions
==================================
A `constant`:idx: is a symbol that is bound to the value of a constant
expression. Constant expressions are restricted to depend only on the following
categories of values and operations, because these are either built into the
language or declared and evaluated before semantic analysis of the constant
expression:
* literals
* built-in operators
* previously declared constants and compile-time variables
* previously declared macros and templates
* previously declared procedures that have no side effects beyond
possibly modifying compile-time variables
A constant expression can contain code blocks that may internally use all Nim
features supported at compile time (as detailed in the next section).
Within such a code block, it is possible to declare variables and then later
read and update them, or declare variables and pass them to procedures that
modify them. However, the code in such a block must still adhere to the
restrictions listed above for referencing values and operations outside the
block.
The ability to access and modify compile-time variables adds flexibility to
constant expressions that may be surprising to those coming from other
statically typed languages. For example, the following code echoes the beginning
of the Fibonacci series **at compile-time**. (This is a demonstration of
flexibility in defining constants, not a recommended style for solving this
problem!)

.. code-block:: nim
    :test: "nim c $1"

  import strformat

  var fib_n {.compileTime.}: int
  var fib_prev {.compileTime.}: int
  var fib_prev_prev {.compileTime.}: int

  proc next_fib(): int =
    result = if fib_n < 2:
      fib_n
    else:
      fib_prev_prev + fib_prev
    inc(fib_n)
    fib_prev_prev = fib_prev
    fib_prev = result

  const f0 = next_fib()
  const f1 = next_fib()

  const display_fib = block:
    const f2 = next_fib()
    var result = fmt"Fibonacci sequence: {f0}, {f1}, {f2}"
    for i in 3..12:
      add(result, fmt", {next_fib()}")
    result

  static:
    echo display_fib

Restrictions on Compile-Time Execution
======================================
Nim code that will be executed at compile time cannot use the following
language features:
* methods
* closure iterators
* the ``cast`` operator
* reference (pointer) types
* FFI
Wrappers that use FFI and/or ``cast`` are also disallowed. Note that
this includes the wrappers in the standard library.
Some or all of these restrictions are likely to be lifted over time.
Types
=====
All expressions have a type that is known during semantic analysis. Nim
is statically typed. Declaring a new type in essence defines an identifier
that denotes this custom type.
These are the major type classes:
* ordinal types (integer, bool, character, and enumeration types,
  and subranges thereof)
* floating-point types
* string type
* structured types
* reference (pointer) type
* procedural type
* generic type
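To make the type classes concrete, here is a small sketch that declares one
type of several of these classes. All type and variable names are hypothetical
and exist only for this illustration:

.. code-block:: nim

  type
    Weekday = enum mon, tue, wed        # ordinal type (enumeration)
    Percent = range[0..100]             # ordinal type (subrange of int)
    Point = object                      # structured type
      x, y: float                       # fields of a floating-point type
    PointRef = ref Point                # reference type
    BinaryOp = proc (a, b: int): int    # procedural type
    Pair[T] = tuple[first, second: T]   # generic type

  let day = tue
  let p: Percent = 42
  let pt = PointRef(x: 1.5, y: 2.0)
  let plus: BinaryOp = proc (a, b: int): int = a + b
  let names: Pair[string] = (first: "a", second: "b")

  doAssert ord(day) == 1
  doAssert p == 42
  doAssert pt.x == 1.5
  doAssert plus(1, 2) == 3
  doAssert names.second == "b"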
Ordinal types
-------------
Ordinal types have the following characteristics:
- Ordinal types are countable and ordered. This property allows
  functions such as ``inc``, ``ord``, and ``dec`` to be defined on ordinal
  types.
- Ordinal types have a smallest possible value. Trying to count further
  down than the smallest value produces a panic or a static error.
- Ordinal types have a largest possible value. Trying to count further
  up than the largest value produces a panic or a static error.

Integers, bool, characters, and enumeration types (and subranges of these
types) are ordinal types.
A distinct type is an ordinal type if its base type is an ordinal type.
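The characteristics above can be sketched with an enumeration and a character.
The ``Direction`` type is a hypothetical example; ``inc``, ``dec``, ``ord``,
``succ``, ``low`` and ``high`` come from the ``system`` module:

.. code-block:: nim

  type Direction = enum
    north, east, south, west

  var d = north
  inc d                               # count up to the next value
  doAssert d == east
  doAssert ord(d) == 1                # position of the value in the ordering
  dec d                               # count back down
  doAssert d == north

  doAssert low(Direction) == north    # smallest possible value
  doAssert high(Direction) == west    # largest possible value

  # Characters are ordinal as well.
  doAssert succ('a') == 'b'
  doAssert ord('a') == 97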
Pre-defined integer types
-------------------------
These integer types are pre-defined:
``int``
  the generic signed integer type; its size is platform-dependent and equals
  the size of a pointer. This type should be used in general. An integer
  literal that has no type suffix is of this type if it is in the range
  ``low(int32)..high(int32)``; otherwise the literal's type is ``int64``.

intXX
  additional signed integer types of XX bits use this naming scheme
  (example: ``int16`` is a 16-bit wide integer).
  The current implementation supports ``int8``, ``int16``, ``int32``, ``int64``.
  Literals of these types have the suffix ``'iXX``.

``uint``
  the generic `unsigned integer`:idx: type; its size is platform-dependent
  and equals the size of a pointer. An integer literal with the type
  suffix ``'u`` is of this type.

uintXX
  additional unsigned integer types of XX bits use this naming scheme
  (example: ``uint16`` is a 16-bit wide unsigned integer).
  The current implementation supports ``uint8``, ``uint16``, ``uint32``,
  ``uint64``. Literals of these types have the suffix ``'uXX``.
Unsigned operations all wrap around; they cannot lead to over- or
underflow errors.
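A short sketch of the literal suffixes and of unsigned wrap-around follows.
The variable names are arbitrary, and the size check assumes a native (C-like)
backend:

.. code-block:: nim

  let a = 5        # no suffix, fits in int32: type int
  let b = 5'i8     # suffix 'i8: type int8
  let c = 5'u      # suffix 'u: type uint
  let e = 5'u16    # suffix 'u16: type uint16

  doAssert a is int
  doAssert b is int8
  doAssert c is uint
  doAssert e is uint16
  doAssert sizeof(int) == sizeof(pointer)

  # Unsigned arithmetic wraps around instead of raising an overflow error.
  var x = high(uint8)   # 255
  x += 1
  doAssert x == 0'u8

  var y = 0'u8
  y -= 1
  doAssert y == 255'u8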
In addition to the usual arithmetic operators for signed and unsigned integers
(``+ - *`` etc.) there are also operators that formally work on *signed*
integers but treat their arguments as *unsigned*. They are mostly provided
for backwards compatibility with older versions of the language that lacked
unsigned integer types. These unsigned operations for signed integers use
the ``%`` suffix by convention:
======================  ======================================================
operation               meaning
======================  ======================================================
``a +% b``              unsigned integer addition
``a -% b``              unsigned integer subtraction
``a *% b``              unsigned integer multiplication
``a /% b``              unsigned integer division
``a %% b``              unsigned integer modulo operation
``a <% b``              treat ``a`` and ``b`` as unsigned and compare
``a <=% b``             treat ``a`` and ``b`` as unsigned and compare
``ze(a)``               extends the bits of ``a`` with zeros until it has the
                        width of the ``int`` type
``toU8(a)``             treats ``a`` as unsigned and converts it to an
                        unsigned integer of 8 bits (but still the
                        ``int8`` type)
``toU16(a)``            treats ``a`` as unsigned and converts it to an
                        unsigned integer of 16 bits (but still the
                        ``int16`` type)
``toU32(a)``            treats ``a`` as unsigned and converts it to an
                        unsigned integer of 32 bits (but still the
                        ``int32`` type)
======================  ======================================================
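The following sketch illustrates a few of these operators on ``int`` values.
It covers only the comparison, addition and division operators, and the
variable names are arbitrary:

.. code-block:: nim

  let a = -1
  let b = 1

  # Signed comparison: -1 is less than 1.
  doAssert a < b

  # Unsigned comparison: the bit pattern of -1 is the largest unsigned value.
  doAssert b <% a
  doAssert not (a <% b)

  # Unsigned addition wraps around instead of raising an overflow error.
  let top = high(int)
  doAssert top +% 1 == low(int)

  # Unsigned division: all bits set, divided by 2, gives high(int).
  doAssert a /% 2 == high(int)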
`Automatic type conversion`:idx: is performed in expressions where different
kinds of integer types are used: the smaller type is converted to the larger.
A `narrowing type conversion`:idx: converts a larger to a smaller type (for
example ``int32 -> int16``). A `widening type conversion`:idx: converts a
smaller type to a larger type (for example ``int16 -> int32``). In Nim only
widening type conversions are *implicit*:
.. code-block:: nim

  var myInt16 = 5i16
  var myInt: int
  myInt16 + 34     # of type ``int16``
  myInt16 + myInt  # of type ``int``
  myInt16 + 2i32   # of type ``int32``
However, ``int`` literals are implicitly convertible to a smaller integer type
if the literal's value fits this smaller type and such a conversion is less
expensive than other implicit conversions, so ``myInt16 + 34`` produces
an ``int16`` result.
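As a hedged sketch of these rules (the variable names ``small`` and ``wide``
are arbitrary), the snippet below checks the result types with ``is`` and
shows that a narrowing conversion has to be written explicitly:

.. code-block:: nim

  var small: int16 = 5
  var wide: int32 = small          # widening: performed implicitly
  doAssert wide == 5

  wide = 1000
  small = int16(wide)              # narrowing: explicit conversion required
  doAssert small == 1000

  doAssert (small + 34) is int16   # the literal 34 fits into int16
  doAssert (small + wide) is int32 # the smaller operand is widened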
For further details, see `Convertible relation
<#type-relations-convertible-relation>`_.
Subrange types