Nemerle language (part 6)
Author: Chistyakov Vladislav
Original: http://www.rsdn.ru/article/nemerle/TheNemerle-part-6/Nemerle-Lang-part-6.rsdnml.xml
Translator: Pavel Klinov
There are several key things to note before starting to learn Nemerle’s syntax.
First, not only is it extensible by-design but extensible dynamically. Any “using” directive in the source code may plug in a new syntax extension. This means, in particular, that any expression or even an upper-level syntactic construct, such as a method or a property, located in the scope of a syntax extension could be interpreted in a different (with respect to the base syntax) way.
However, it is also important to understand that there exists the so-called base syntax which is in effect even if no macros are used.
In fact, the language of Nemerle can be classified as follows (depending on whether syntax macros are used):
- The base language: the syntax parsed by the compiler when no syntax extension is plugged.
- The language extended with macros from the Nemerle.Core namespace (those are a part of the standard macro library).
- The language extended with macros from other namespaces which belong to the standard macro library or even from external macro libraries.
The base language is extremely minimalistic and could barely be used. It even lacks such features as operator priorities. Therefore, its reasonable use cases are limited to education (see the first article in this series) or developing a replacement for the standard library, for example, to alter the syntax or the semantics of the standard operators or expressions.
However, in spite of its limited usability we will still use the base language in all sections related to Nemerle’s syntax.
One can get some impression of Nemerle’s syntax just by enriching the syntax of the base language by the standard macros. Other syntax extensions should be considered as pluggable DSLs (domain-specific languages).
Even though Nemerle allows syntax extensions, it is not possible to change the syntax of the base language or the standard macros (other than by changing the compiler and the standard macro library, respectively). However, one can:
- Extend the syntax.
- Adjust interpretation of existing syntactic constructs.
The principal tool for using both capabilities is macros.
Extending the syntax is conceptually simple: adding a new syntactic macro and using its namespace in the code will force the compiler to recognize the new syntax.
On the other hand, it may well be less clear what is meant by interpretation of existing syntactic constructs.
The crux is that the compiler actually parses a more abstract language than is specified by the syntax of Nemerle. For example, imagine the compiler parsing a reference to a certain type. What it actually parses is not the syntax of a type reference but rather the syntax of a Nemerle expression. Later, when the compiler begins typing the obtained AST (abstract syntax tree), it checks whether the parsed expression is a valid type reference and reports an error if it is not.
One may wonder whether such two-stage syntactic checking actually makes sense. The key here is that the second check does not happen immediately after the parsing stage. A macro can intervene between the parsing and the typing stages and rewrite the expression so that it becomes a valid piece of Nemerle code. This is what the macros from the Nemerle.ComputationExpressions namespace do: they implement functionality similar to that of computation expressions in F#.
This capability is called syntax re-interpretation. Along with syntax changing it is used for implementation of various sorts of pluggable DSLs.
WARNING
We will use the final syntax of Nemerle (i.e. the one accepted by the compiler and not just the parser) to describe the base syntax. However, we will explicitly designate constructs which allow for syntax re-interpretation.
A CompilationUnit defines the structure of a file being compiled. This structure in Nemerle is somewhat simpler than its C# prototype, thanks to the recursive nature of the language design.
CompilationUnit = NamespaceBody;
A compilation unit in Nemerle works just like the body of a namespace, which allows for:
- A simpler syntax and, consequently, faster learning.
- Using global (assembly-level) attributes inside both compilation units and namespaces. This feature is of little use for simple attributes. However, for macros it lets the programmer define the namespace which the macro is bound to.
An example of adding types via a macro-attribute is shown below. This macro uses the method Define from the type GlobalEnv (a reference to which is obtained via the property typer.Env) to add a type. This enables declaring new types in the current namespace.
using Nemerle;
using Nemerle.Collections;
using Nemerle.Compiler;
using Nemerle.Compiler.Parsetree;
using Nemerle.Compiler.Typedtree;
using Nemerle.Utility;

namespace MacroLibrary
{
  [MacroUsage(MacroPhase.BeforeInheritance, MacroTargets.Assembly)]
  macro CreateClass(name : PExpr)
  {
    def typer = Macros.ImplicitCTX();

    def toName(expr : PExpr) : Name
    {
      | <[ $(name : name) ]> => name
      | _ => Message.FatalError(expr.Location, "Expected a simple name.")
    }

    def newTypeBuilder = typer.Env.Define(
      <[ decl: public class $(toName(name) : name) { } ]>);

    Message.Hint($"The $(newTypeBuilder.FullName) type is defined.");
    newTypeBuilder.Compile();
  }
}
This macro could be used as follows:
using MacroLibrary;

[assembly: CreateClass(TestClass1)]

namespace NamespaceA
{
  [assembly: CreateClass(TestClass2)]

  namespace NamespaceB
  {
    [assembly: CreateClass(TestClass3)]
  }
}
This example will print the following in the console:
hint: The TestClass1 type is defined.
hint: The NamespaceA.TestClass2 type is defined.
hint: The NamespaceA.NamespaceB.TestClass3 type is defined.
which indicates that the types have been created in those namespaces where the macro was applied.
A namespace declaration syntax is exactly the same as in C#:
NamespaceDeclaration = "namespace" QualifiedIdentifier
"{" NamespaceBody "}" ";"?;
Here is an example:
namespace NamespaceA
{
  // NamespaceA body
  namespace NamespaceB
  {
    // NamespaceA.NamespaceB body
    namespace NamespaceC
    {
      // NamespaceA.NamespaceB.NamespaceC body
    }
  }

  namespace NamespaceC
  {
    // NamespaceA.NamespaceC body
  }
}

namespace NamespaceC
{
  // NamespaceC body
}

namespace NamespaceA.NamespaceB.NamespaceC
{
  // NamespaceA.NamespaceB.NamespaceC body
}
A namespace body consists of a collection of “using” directives, global attribute declarations, and a list of members. All these constructs are optional, as stated in the following grammar:
NamespaceBody = (UsingDirective
| GlobalAttributeSection
| NamespaceDeclaration
| TypeDeclaration
)*;
Example:
using System;

namespace NamespaceA
{
  using System.Text.RegularExpressions;

  namespace NamespaceB
  {
    public class Class1 // class: NamespaceA.NamespaceB.Class1
    {
      private WordRegex : Regex = Regex(@"\w+");
    }
  }

  public class Class1 // class: NamespaceA.Class1
  {
    // System.Text.RegularExpressions.Regex
    private WordRegex : Regex = Regex(@"\w+");
  }
}
The “using” directive comes in two flavors which enable the developer to:
- open a namespace or a type,
- define a synonym for a type or a namespace.
UsingDirective = UsingAliasDirective | UsingNamespaceDirective;
UsingAliasDirective = "using" Identifier "=" Type ";";
UsingNamespaceDirective = "using" NamespaceName ";";
Types in an open namespace can be referred to via short, i.e., non-qualified, names.
namespace NamespaceA
{
  class ClassA { }
}

using NamespaceA;

module Program
{
  Main() : void
  {
    def a = ClassA(); // equivalent to NamespaceA.ClassA()
  }
}
Now, an open namespace may contain syntactic macros. In that case, the parser will honor their syntax extensions in the scope of the using directive which has opened the namespace.
using Nemerle.Text;

def res =
  regexp match("10va") // regexp match is a syntactic macro
  {
    | @"(?<a : int>\d+)(va)?" => a
    | _ => -1
  };

assert(res == 10);
Also, using may open a type, in which case all of its static members are accessible via non-qualified names.
using System.Console; // The System.Console module
WriteLine("Test"); // System.Console.WriteLine
The second using flavor allows for declaring synonyms for types or namespaces. This helps avoid ambiguities in case several namespaces contain types with the same name:
using SCG = System.Collections.Generic;
using SC = System.Collections;

def q = SCG.Queue(); // q's type is System.Collections.Generic.Queue[int]
q.Enqueue(1);

def q = SC.Queue(); // q's type is System.Collections.Queue
q.Enqueue(1);
The list of types supported in Nemerle can be found in the corresponding section. It includes predefined types, such as tuples and the functional type, and user-defined (or just user) types.
Nemerle supports the following user-defined types:
- Classes (class)
- Structs (struct)
- Interfaces (interface)
- Variant types (variant)
- Enumerations (enum)
- Delegates (delegate)
In addition, Nemerle supports type alias definitions:
TypeDeclaration = ClassDeclaration | StructDeclaration
| InterfaceDeclaration | EnumDeclaration
| DelegateDeclaration | VariantDeclaration
| TypeAliasDeclaration;
TypeAliasDeclaration = "type" Identifier TypeParameters? "=" Type;
A type alias allows one to define an alternative, usually shorter, name for a type or to expose the type under another namespace.
All references to aliases are automatically substituted by references to the corresponding types. As such, no alias references are present in the generated code.
A reference to an alias is equivalent to a reference to the corresponding type, therefore types and their aliases could be freely mixed in the source code.
NOTE
Note that aliases do not introduce new types. They are simply synonyms of existing types.
Examples:
using SCG = System.Collections.Generic;

namespace Nemerle.Collections
{
  public type Seq[T] = SCG.IEnumerable[T];
}

using System.Linq; // Enumerable.Range
using Nemerle.Collections;

module Program
{
  Test() : Seq[int]
  {
    Enumerable.Range(0, int.MaxValue)
  }
}
This example introduces a short alias Seq for IEnumerable. It is declared in the Nemerle.Collections namespace so will be accessible wherever it is open or via the fully qualified name Nemerle.Collections.Seq.
NOTE
The alias Seq is a part of the standard library so can be used without a prior declaration.
Some more examples:
internal type MapStrToList[T] = SCG.Dictionary[string, SCG.List[T]];
type DynStr = System.Text.StringBuilder;
As opposed to the “using” directive, aliases, once declared, could be used in all files of the project. Furthermore, if they are declared using the keyword “public”, they are accessible in other projects which use the corresponding assembly.
Classes, structs, interfaces, and modules (ClassDeclaration, StructDeclaration, InterfaceDeclaration)
The syntax of classes, modules, interfaces, and structs, is basically the same. Therefore, it is easier to describe the class syntax and the differences for other constructs than to describe each in its own section.
ClassDeclaration = Attributes? Modifiers? "partial"? "class" Name TypeParameters? SuperTypes? ConstraintsClause* "{" TypeMemberDeclaration* "}";
Name = Identifier;
The syntactic differences between classes, modules, interfaces, and structs are the following:
- Keywords: class, module, interface, and struct.
- Allowed modifiers (see the “Modifiers” section for more information on the applicability of modifiers).
- Modules cannot implement interfaces since all their methods are treated as static. However, a module can have a base class (for most practical purposes, modules are similar to static classes in C#).
- Interfaces cannot have base classes, only base interfaces.
- Interface member declarations look slightly different, in particular, they cannot have access modifiers and bodies.
SuperTypes = ":" TypeNames;
TypeNames = TypeName ("," TypeName)*;
Base types, also known as supertypes, are all the types that the given type is a subtype of.
A class may declare a base class. Every class has exactly one direct base class; if that is object (System.Object), it need not be stated explicitly.
A class may, but is not required to, implement one or more interfaces. If it extends a base class, then the list of implemented interfaces follows the name of the base class.
Extending a base class allows for inheriting implementation of its methods. Implementing an interface allows only for interface inheritance, i.e., enables the programmer to use the class in all contexts where the interface is expected.
An interface may extend one or more other interfaces (which are called base interfaces in that case).
Only interfaces may act as base types for structs since the latter cannot inherit method implementations.
A module may have a base class but may not implement interfaces. This is due to the fact that modules only contain static members whereas interfaces, on the contrary, only allow for non-static members.
Example:
using System;
using System.Console;

public class Person
{
  // Constructor for creating and initializing class instances
  public this(name : string, lastName : string, age : DateTime)
  {
    Name     = name;
    LastName = lastName;
    Age      = age;
  }

  public Name     : string   { get; }
  public LastName : string   { get; }
  public Age      : DateTime { get; }
}

[Record] // This generates a constructor which initializes all fields of the class
public class Employee : Person
{
  public Position   : string { get; }
  public Department : string { get; }
}

def employee1 = Employee(
  name="John",
  lastName="Carmack",
  age=DateTime(1970, 08, 20),
  position="Co-founder",
  department="Game development");
def employee2 = Employee("John", "Romero", DateTime(1967, 10, 28), "Co-founder", "Game design");
def employees = [employee1, employee2];
def result = employees.Filter(e => e.Age.Year >= 1970);

foreach (employee in result)
  WriteLine($"$(employee.Name), $(employee.LastName) $(employee.Age)");
Output:
John, Carmack 20.08.1970 00:00:00
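The supertype rules described above (base class first, then interfaces; interfaces extending interfaces; structs limited to interfaces) can be illustrated with a minimal sketch. All type names here (IShape, INamedShape, Figure, Circle, Square) are hypothetical and exist only for this illustration.

```nemerle
using System;
using System.Console;

// Interface members have neither access modifiers nor bodies.
public interface IShape
{
  Area() : double;
}

// An interface may extend one or more base interfaces.
public interface INamedShape : IShape
{
  Name : string { get; }
}

public class Figure
{
  public Describe() : string { "a figure" }
}

// The base class comes first, followed by the implemented interfaces.
public class Circle : Figure, INamedShape
{
  public this(radius : double) { Radius = radius; }

  public Radius : double { get; }
  public Name   : string { get { "circle" } }
  public Area() : double { Math.PI * Radius * Radius }
}

// A struct may list only interfaces among its supertypes.
public struct Square : IShape
{
  _side : double;

  public this(side : double) { _side = side; }
  public Area() : double { _side * _side }
}

public module Program
{
  public Main() : void
  {
    def circle = Circle(1.0);
    def square = Square(2.0);

    // Describe is inherited from Figure; Name and Area come from the interfaces.
    WriteLine($"$(circle.Name): $(circle.Area()), $(circle.Describe())");
    WriteLine($"square: $(square.Area())");
  }
}
```

Circle inherits the implementation of Describe from Figure, whereas the interfaces only oblige it to provide Name and Area.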
TypeParameters = "[" TypeParamName ("," TypeParamName)* "]";
TypeParamName = Identifier;
Type parameters stand for names of the types to be substituted by real types upon instantiation of the parameterized type or a parameterized method call.
A type or a method whose declaration uses type parameters is called generic.
Although generic types and methods resemble C++ templates, they are not quite the same. C++ templates are compile-time entities whereas generic types and methods exist at run time. On the one hand, they can be compiled into IL and placed into an assembly, but on the other hand, they are substantially more restricted than templates.
The restrictions are the following:
- A type parameter cannot directly be used as a base type. However, it can be used as a parameter of a base type.
- Unless there are constraints on a type parameter (see the next section), only System.Object methods can be invoked on its instances, that is, Equals, GetHashCode, GetType, and ToString. In addition, type parameter instances can be implicitly cast to object.
- If constraints (interfaces) are declared for a type parameter, then it is possible to call methods of those interfaces on instances of the type parameter, as well as the methods of System.Object. Also, instances can be implicitly or explicitly cast to System.Object or to the interfaces.
- If the special “new” constraint appears in the list of constraints, then one may instantiate the type parameter using the zero-argument constructor. In this case, the constructor call is replaced by Activator.CreateInstance(), which is relatively slow. No other constructors can be used to instantiate the type parameter. When necessary, this can be achieved via the Abstract Factory pattern or simply by using a functional object to instantiate the required type. The factory or the functional object may be stored in fields of the generic type or passed as arguments.
- The literal “null” can be cast to the type of a type parameter (or be used in its context) only if the constraint “class” is declared for that type parameter (see the next section). Otherwise, one may use the default value produced by the macro default() from the standard library. That macro creates the default value for the given type. For reference types this value is always “null”, while for value types it is the value generated by the corresponding zero-argument constructor (for example, for integers this value is zero).
- Type parameters may not be used in attribute declarations.
- Type parameters may not be used to refer to static members or inner types.
WARNING
Note that, unlike in C#, an object whose type is a type parameter may be compared to null only if the “class” constraint is declared for that type parameter.
An interested reader may find more information about generic types and methods in any book chapter or manual on generics in .NET or C#.
Below is a simplified example of implementing a “dynamic array”: a collection which can store an arbitrary number of objects of an arbitrary type.
using Nemerle.Imperative; // required to use "break"

using System;
using System.Collections.Generic;
using System.Console;

public class MyList[T] : IList[T]
{
  public Count : int { get; private set; }
  mutable _items : array[T] = array(16);

  private EnsureCapacity(capacity : int) : void
  {
    when (capacity > _items.Length)
    {
      def data = array(Math.Max(_items.Length * 2, capacity));
      Array.Copy(_items, data, _items.Length);
      _items = data; // switch to the enlarged buffer
    }
  }

  #region System.Collections.Generic.IList[T] Members

  public Add(value : T) : void
  {
    Insert(Count, value);
  }

  public IndexOf(item : T) : int
  {
    Array.IndexOf(_items, item, 0, Count)
  }

  public Insert(index : int, item : T) : void
  {
    EnsureCapacity(Count + 1);

    if (index > Count)
      throw IndexOutOfRangeException("index");
    else when (index < Count)
      Array.Copy(_items, index, _items, index + 1, Count - index);

    _items[index] = item;
    Count++;
  }

  public Item[index : int] : T // enables index-based access
  {
    get { _items[index] }
    set { _items[index] = value; }
  }

  public RemoveAt(index : int) : void
  {
    // shift the tail (the elements after index) one position to the left
    Array.Copy(_items, index + 1, _items, index, Count - index - 1);
    Count--;
  }

  #endregion

  public Clear() : void { Count = 0; }

  public Contains(item : T) : bool
  {
    IndexOf(item) >= 0
  }

  public CopyTo(arr : array[T], arrayIndex : int) : void
  {
    Array.Copy(_items, 0, arr, arrayIndex, Count);
  }

  public IsReadOnly : bool { get { false } }

  public Remove(item : T) : bool
  {
    def index = IndexOf(item);

    when (index >= 0) // index 0 is a valid position
      RemoveAt(index);

    index >= 0
  }

  #region System.Collections.Generic.IEnumerable[T] Members

  public GetEnumerator() : IEnumerator[T]
  {
    foreach (item in _items with i)
      if (i == Count)
        break;
      else
        yield item;
  }

  #endregion
}

public module Program
{
  Main() : void
  {
    def myList = MyList();

    myList.Add(1);
    myList.Add(2);
    myList.Add(3);
    myList.Insert(1, 42);
    myList.Insert(0, 0);

    WriteLine($"xs = '..$myList' Count=$(myList.Count)");

    def x = myList[2]; // index-based access
    WriteLine($"xs[2] = $x");

    _ = myList.Remove(2);
    myList.RemoveAt(2);

    WriteLine($"xs = '..$myList' Count=$(myList.Count)");
  }
}
Here is the output:
xs = '0, 1, 42, 2, 3' Count=5
xs[2] = 42
xs = '0, 1, 3' Count=3
ConstraintsClause = "where" TypeParamName ":" Constraints;
Constraints = Constraint ("," Constraint)*;
Constraint = "class" | "struct" | "new" | "enum" | SuperType;
SuperType = TypeName;
Type parameters may be constrained. Constraints restrict the list of valid type substitutes for the given type parameter. They also allow the programmer to refer to members of the type parameter.
Nemerle supports the following kinds of constraints on allowed substitutes for type parameters:
- class – substitutes must be reference types.
- struct – substitutes must be value types, that is, structs or primitive number types.
- new – substitutes must have the default public zero-argument constructor.
- enum – substitutes are enums.
- «SuperType» – substitutes must be subtypes of the type declared in the constraint.
The constraint list may optionally include one or more interfaces and one base type (SuperType), which is not an interface, e.g., a struct or a class. Other type parameters, constrained or not, may also be used as supertypes for the given type parameter.
In contrast to C#, Nemerle allows for specifying the same interface multiple times with different type arguments if they are not unifiable, i.e., if no substitution can turn them into the same real type. For example, one may declare IComparable[string], IComparable[object], but not IComparable[string], IComparable[T] (where T is a type parameter) because T=string would unify the last two types.
A type substituted for a type parameter must satisfy all constraints applied to the parameter. In particular, it must be a subtype of all types specified in the SuperType constraint. For example, if ICollection[T] and IDictionary[TKey, TValue] appear in the constraint list, then valid substitutes must implement both interfaces. One such class is Dictionary[TKey, TValue].
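As a minimal sketch of how constraints are written and what they enable, consider the following. The names Algorithms, Max, and Cache are hypothetical and serve only as an illustration; the sketch assumes that the constraints clause of a method follows its return type, in the same way as it follows the supertype list of a class.

```nemerle
using System;
using System.Console;

public module Algorithms
{
  // The SuperType constraint IComparable[T] makes CompareTo available on values of type T.
  public Max[T](a : T, b : T) : T
    where T : IComparable[T]
  {
    if (a.CompareTo(b) >= 0) a else b
  }
}

// The "class" constraint restricts T to reference types,
// which makes the null literal usable in the context of T.
public class Cache[T]
  where T : class
{
  mutable _item : T = null;

  public Put(item : T) : void { _item = item; }
  public HasItem : bool { get { _item != null } }
}

public module Program
{
  public Main() : void
  {
    WriteLine(Algorithms.Max(3, 7));            // int implements IComparable[int]
    WriteLine(Algorithms.Max("apple", "pear")); // string implements IComparable[string]

    def cache = Cache();
    WriteLine(cache.HasItem);
    cache.Put("value"); // T is inferred as string here
    WriteLine(cache.HasItem);
  }
}
```

Without the constraint on Max, only the System.Object methods would be callable on a and b, so the CompareTo call would be rejected by the compiler.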
EnumDeclaration = Attributes? Modifiers? "enum" Name EnumBase? "{" EnumMember* "}";
EnumBase = ":" IntegralType;
IntegralType = "sbyte" | "byte" | "short" | "ushort" | "int" | "uint" | "long" | "ulong";
EnumMember = Attributes? "|" Name ("=" ConstantExpression)?;
ConstantExpression = PExpr; // only constant expressions are allowed
Name = Identifier;
If there is no value specified for a member of an enumeration, it is taken as the value of the previous member plus one. The default value of the first member is zero. However, if the value is specified, then it takes precedence.
It is possible to use the System.Flags attribute when using enumerations for specifying bit flags. This improves robustness of such enumerations.
WARNING
It is quite possible that future versions of Nemerle will prohibit bit operations over enumerations not tagged with that attribute.
Example 1:
using System.Console;

public enum X : long
{
  | A
  | B
  | C
}

def print(x)
{
  def typeName = x.GetType().Name;
  def underlyingType = x.GetType().GetEnumUnderlyingType();

  WriteLine($"$typeName.$x = $(x :> long) ($underlyingType)");
}

print(X.A);
print(X.B);
print(X.C);
Here is the output:
X.A = 0 (System.Int64)
X.B = 1 (System.Int64)
X.C = 2 (System.Int64)
Example 2:
using System.Console;

enum X : long
{
  | A
  | B = 40 + 2
  | C
}

def print(x)
{
  def typeName = x.GetType().Name;
  def underlyingType = x.GetType().GetEnumUnderlyingType();

  WriteLine($"$typeName.$x = $(x :> long) ($underlyingType)");
}

print(X.A);
print(X.B);
print(X.C);
Output:
X.A = 0 (System.Int64)
X.B = 42 (System.Int64)
X.C = 43 (System.Int64)
Example 3:
using System; // FlagsAttribute
using System.Console;

[Flags]
public enum X : long
{
  | A = 0x1
  | B = 0x2
  | C = 0x4
}

def print(x)
{
  def typeName = x.GetType().Name;
  def underlyingType = x.GetType().GetEnumUnderlyingType();

  WriteLine($"$typeName.$x = $(x :> long) ($underlyingType)");
}

print(X.A);
print(X.B | X.C);
print(X.C);
Output:
X.A = 1 (System.Int64)
X.B, C = 6 (System.Int64)
X.C = 4 (System.Int64)
VariantDeclaration =
Attributes? Modifiers? "variant" Name TypeParameters?
SuperTypes?
ConstraintsClause*
"{"
(TypeMemberDeclaration | VariantOptionDeclaration)*
"}";
Variant types in Nemerle are a particular implementation of algebraic data types (ADT). They have the following characteristics:
- Each variant type may have the so-called options. Options are subtypes of the variant type (or derived types, in OOP parlance). Consequently, the variant type is the base type for all its options.
- Each option defines at least one constructor (which describes the option). As such, any variant type is described by a set of constructors which are used to, first, construct instances of the type and, second, recognize options via pattern matching.
The set of options is fixed and cannot be extended, e.g., through inheritance. In OOP terminology, all options are sealed (or final) types.
Each variant type itself is abstract (again, in OOP terms), i.e. cannot be directly instantiated. Only options may have instances.
The variant type combines traits of ADT implementations in functional languages, such as ML and OCaml, with those of object-oriented languages.
Its OOP features include the capability of inheriting from a base type (which in this case has to be a class), the capability of implementing interfaces, and introducing and overriding virtual members. Therefore, the variant type in Nemerle is not a pure implementation of ADT but rather a hybrid tool which mixes OOP and ADT features.
Each variant type is closed in the sense that its only legitimate subtypes are its options, which have to be enumerated in its declaration. The options, in turn, may not have subtypes. However, the variant type itself may be a subtype of another, unsealed type. If that base type is not explicitly specified, then it is assumed to be System.object.
The variant type is a reference type.
Examples of variant types are given in the next section.
VariantOptionDeclaration = Attributes? "|" Name VariantOptionBody?;
VariantOptionBody = "{" TypeMemberDeclaration* "}" ";"?;
Variant type options (just options in what follows) describe subtypes of a variant type.
For each option there is an implicit constructor which takes an argument for each field or auto-property (including private members). Every field that should not be initialized via the constructor has to be tagged with the macro-attribute RecordIgnore. The implicit constructor is generated by the macro-attribute Record, which is applied by default.
The default access modifier for option members is public. Non-public members are also initialized by the constructor but are not taken into account during pattern matching (since the latter operation has access to only public values).
An option is associated with meta-information which maps fields and auto-properties to arguments of the implicit constructor. This allows for recognizing values of variant types by using the pattern “constructor” during pattern matching.
The compiler checks pattern completeness during pattern matching on a variant type. It reports warnings if some values of the type are not covered by the specified patterns. In addition, the compiler does not allow for an explicit pattern overlap (when a preceding pattern matches a value(s) covered by a subsequent pattern).
The auto-generated implicit constructor may not be the only constructor for a variant type. Other constructors may be declared but may not be used during pattern matching.
The implicit constructor may be replaced by a user-defined one. This is achieved by placing the “new” keyword just before “this”. The new constructor may contain arbitrary code but it is assumed to initialize fields by values of the corresponding arguments. Otherwise, pattern matching may not work correctly for ill-initialized options.
Enumerations (enum) may be considered as a special, degenerate case of variant types. However, there are important differences between the two:
- Enumerations are value types while variant types are reference types.
- A variable of an enumeration type may take on values besides those listed in the enumeration declaration. Values of variant type variables are restricted to options of the type (one exception is null which could be assigned to any variable of any reference type).
- An enumeration may not have members which are not elements of the enumeration. In contrast, a variant type may have arbitrary class members.
Example:
using System.Console;

variant Color
{
  | Red
  | Yellow
  | Green
  | Rgb   { red : byte; green : byte; blue : byte; }
  | Alpha { color : Color; alpha : byte; }

  public ToRgb() : byte * byte * byte
  {
    match (this)
    {
      | Red          => (255b, 0b, 0b)
      | Green        => (0b, 255b, 0b)
      | Yellow       => (255b, 255b, 0b)
      | Rgb(r, g, b) => (r, g, b)
      | Alpha(x, _)  => x.ToRgb()
    }
  }
}

def print(color : Color) : void
{
  | Red    | Rgb(255, 0, 0)   => WriteLine("Red");
  | Yellow | Rgb(255, 255, 0) => WriteLine("Yellow");
  | Green  | Rgb(0, 255, 0)   => WriteLine("Green");
  | Rgb(r, g, b)              => WriteLine($"Rgb($r, $g, $b)")
  | Alpha(x, _)               => print(x)
}

print(Color.Yellow());
print(Color.Rgb(255, 255, 0));
print(Color.Alpha(Color.Green(), 128));

def colors = [Color.Yellow(), Color.Green(), Color.Rgb(255, 0, 0)];
WriteLine(colors.Map(_.ToRgb()));
Output:
Yellow
Yellow
Green
[(255, 255, 0), (0, 255, 0), (255, 0, 0)]
DelegateDeclaration = Attributes? Modifiers? "delegate" MethodHeader ";";
Nemerle supports a dedicated functional type for operating on functions as first-class objects (in particular, for passing around references to functions). However, it also supports a special kind of type called a delegate.
Delegates have been added to Nemerle for a better compatibility with other .NET languages in which they are the only means of passing references to functions.
The following are the main differences between the functional type and delegates:
- A delegate type has to be declared or imported from external libraries prior to its use. The functional type does not require a declaration.
- A delegate declaration defines a new type which is incompatible with delegates having the same signature (and the same name).
- Instances of the same delegate may be grouped (concatenated) into a single, combined delegate. Such delegates are subtypes of MulticastDelegate. An invocation of a multicast delegate calls all the delegates it consists of. The result of the last function (the last concatenated delegate) is taken as the final result of the call.
- Delegates are used for declaring events. The functional type is unique to Nemerle and thus may not be used for event declarations.
- .NET delegates support covariance and contravariance, but in a restricted way. The functional type supports neither since it is implemented on the basis of types which do not support them.
Nemerle automatically casts a value of the functional type to a delegate when their signatures coincide. The inverse cast is not permitted. However, every delegate has the Invoke method, which may be used to transform a delegate into a functional object, as illustrated below:
using System;
using System.Console;

def test(f : int -> int) : int
{
  f(5)
}

// The functional object is cast to a delegate.
def d : Func[int, int] = x => x * x;
// A reference to the Invoke method of a delegate is passed
// to the function which expects a functional object.
WriteLine(test(d.Invoke));
Output:
25
Here is an example of delegate declaration and usage:
using System.Console;

public delegate Convert(x : int) : string;

def convert(value : int) : string
{
  "'" + value.ToString() + "'"
}

def print(value : int, convert : Convert) : void
{
  WriteLine(convert(value));
}

print(42, Convert(convert)); // explicit instantiation of a delegate
print(42, convert);          // implicit instantiation of a delegate
print(42, _.ToString("X"));
Output:
'42'
'42'
2A
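The grouping of delegates mentioned in the list of differences above can be sketched as follows. The sketch relies on the standard .NET method Delegate.Combine rather than on any Nemerle-specific syntax; the names increment, twice, and both are hypothetical.

```nemerle
using System;
using System.Console;

public module Program
{
  public Main() : void
  {
    // Two functional values, implicitly cast to the same delegate type.
    def increment : Func[int, int] = x => { WriteLine("increment"); x + 1 };
    def twice     : Func[int, int] = x => { WriteLine("twice");     x * 2 };

    // Delegate.Combine returns System.Delegate, hence the explicit cast.
    def both = Delegate.Combine(increment, twice) :> Func[int, int];

    // Both delegates are invoked in the order of concatenation;
    // the value returned by the last one (twice) becomes the result of the call.
    WriteLine(both(10));
  }
}
```

The return value of increment is discarded here, which is why multicast delegates are mostly useful for functions called for their side effects, such as event handlers.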
Modifiers = AccessModifiers | "mutable" | "volatile" | "static" | "new"
| "abstract" | "sealed" | "override" | "virtual"
| "partial" | "extern";
AccessModifiers = "public" | "private" | "protected" | "internal";
Whether or not a modifier could be applied to an entity depends on the kind of the entity and whether it is a top-level type or a member of a type (including an inner type).
Multiple access modifiers in the same declaration are not permitted.
The access modifiers public, private, protected, and internal define visibility of the type or the member from outer code. Allowed access modifiers for top-level types (i.e., those declared in a namespace rather than in an enclosing type) are public and internal. The default modifier is internal.
The semantics of access modifiers for top-level types is defined as follows:
- public: the type is visible in its assembly as well as in other assemblies referencing it.
- internal: the type is visible only in its assembly and in assemblies listed in the attribute InternalsVisibleToAttribute.
The following are the available access modifiers for type members and inner types:
- public: the member (or the inner type) is accessible from outside of its enclosing type. It is accessible from both within and outside the assembly, provided that the enclosing type is accessible from outside.
- private: the member or the type is accessible only inside the enclosing type.
- protected: the member or the type is accessible only inside the enclosing type and its subtypes.
- internal: the member or the type is accessible from outside its enclosing type but only within its assembly and assemblies specified in the attribute InternalsVisibleToAttribute (again, provided that the enclosing type is accessible).
- protected internal: the member is accessible from all types of its assembly as well as from within subtypes of the enclosing type even if they belong to another assembly.
The actual access level of a type member depends on its own access modifier and on the modifiers applied to its enclosing types. For example, if a type is declared as public but at least one of its enclosing types is private, then the actual access level will be private.
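The effect of nesting on the actual access level can be sketched as follows. The names Outer, Hidden, Value, and Describe are hypothetical and serve only as an illustration:

```nemerle
public class Outer
{
  // A private inner type: visible only inside Outer.
  private class Hidden
  {
    // Declared public, but effectively private because Hidden is private.
    public Value : int = 42;
  }

  // Accessible from all types of the same assembly
  // and from subtypes of Outer in other assemblies.
  protected internal Describe() : string
  {
    def h = Hidden();
    "Hidden.Value = " + h.Value.ToString()
  }
}
```

Code outside Outer can see neither Hidden nor Hidden.Value, even though Value itself is marked public.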
The modifiers abstract, sealed, and static are self-explanatory.
An abstract type may contain abstract members (i.e., members without a body). Such types may not be instantiated but they can be inherited.
Sealed types are those which may not be inherited, i.e., they are the opposite of abstract types.
Static types are those whose members are all static. In Nemerle such types are called modules.
Unsurprisingly, abstract and sealed modifiers may not be applied to the same type.
The “new” modifier may be applied to an inner type to avoid the warning that it hides an inner type with the same name inherited from a base type.
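A minimal sketch of the abstract, sealed, and static (module) modifiers working together. The types Shape, Square, and Program are hypothetical:

```nemerle
using System.Console;

// An abstract type: cannot be instantiated, may declare members without a body.
abstract class Shape
{
  public abstract Area() : double;
}

// A sealed type: cannot be inherited from.
sealed class Square : Shape
{
  side : double;

  public this(side : double) { this.side = side; }

  public override Area() : double { side * side }
}

// A module: all of its members are implicitly static.
module Program
{
  Main() : void
  {
    def shape : Shape = Square(2.0);
    WriteLine(shape.Area());
  }
}
```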
Attributes = AttributeSection+;
AttributeSection = "[" (AttributeTarget ":")? AttributeList ","? "]";
AttributeTarget = "field" | "event" | "method" | "param" | "property"
| "return" | "type" | "assembly" | "module";
AttributeList = Attribute ("," Attribute)*;
Attribute = AttributeName AttributeArguments?;
AttributeArguments = "(" AttributeArgument (","AttributeArgument)* ","? ")";
AttributeArgument = PositionalArgument | NamedArgument;
PositionalArgument = PExpr;
NamedArgument = Identifier "=" PExpr;
AttributeName = TypeName;
Attributes in Nemerle come in two flavors: attributes in the standard .NET sense and macro-attributes. They are treated the same way during the parsing stage. In both cases their descriptions are stored in the AST.
For .NET attributes AttributeName must refer, directly or indirectly, to a non-generic type inherited from System.Attribute. If an attribute name ends with Attribute, then that suffix may be omitted while referring to the attribute.
For macro-attributes AttributeName must coincide with a macro name.
The parser always creates Nemerle expressions (PExpr) when parsing attributes. If the attribute being parsed turns out to be a macro-attribute, then the expression is passed to the macro as is. If it is an ordinary attribute, then named parameters will be recognized. It will be checked that they follow positional parameters.
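The following sketch shows both flavors side by side. TagAttribute and Point are hypothetical names; Record is the standard macro-attribute that generates a constructor from the fields of a type, while AttributeUsage and Obsolete are ordinary .NET attributes:

```nemerle
using System;

// An ordinary .NET attribute type. AllowMultiple is passed as a named
// argument and follows the positional one.
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false)]
public class TagAttribute : Attribute
{
  public Name : string;

  public this(name : string) { Name = name; }
}

[Record]            // macro-attribute: expanded at compile time
[Tag("geometry")]   // TagAttribute, with the "Attribute" suffix omitted
public class Point
{
  public X : int;
  public Y : int;

  [Obsolete("Use X and Y directly")]   // a single positional argument
  public Sum() : int { X + Y }
}
```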
Nemerle supports the following data types:
- Classes (class), user-defined.
- Structures (struct), user-defined.
- Interfaces (interface), user-defined.
- Variant types (variant), user-defined.
- Enumerations (enum), user-defined.
- Arrays (array), built-in. Nemerle supports both single-dimensional and multi-dimensional arrays.
- The built-in functional type. It describes a reference to a function.
- Tuple (tuple), built-in. It describes an ordered, finite set of values (possibly of different types).
- Delegates (delegate), built-in.
- Primitive data types, built-in.
- Nullable types. Essentially, Nullable-types are not types on their own. Nevertheless, they have their specific syntax and are treated in a special way at run time. Therefore it makes sense to consider them as a sort of types.
Primitive data types are categorized into strings, numerical types, and void. For more details please refer to the “Built-in data types” section of the first article in the series.
All types are split into two disjoint categories: reference types and value types.
Reference types include classes, interfaces, variant types, arrays, the functional type, tuples (consisting of four or more components), delegates, strings, and boxed types.
Value types include structs, enums, tuples (with fewer than four components), primitive numerical data types, and Nullable-types.
Instances of reference types are always kept in the managed heap and are passed by reference.
NOTE
In contrast to C#, Nemerle does not support pointers, so all reference types are safe (managed). For interaction with unmanaged code one may use IntPtr, a special API, and the DllImportAttribute attribute.
Value types are passed by value, except for function arguments marked with the ref or out keywords. Instances of value types may be allocated on the stack or inside other objects.
Boxed types are value-types which have been “packed”, i.e., their values have been moved to the heap and are accessed via a reference. Boxing allows for using interfaces implemented in value-types or invoking methods of System.Object on them. However, it is not possible to access members of a boxed value. For that, it has to be unboxed first.
Boxing in Nemerle is invoked by casting a value-type to System.Object or an interface implemented by the type.
NOTE
Note that boxing does not happen when defining constraints to type arguments (when referring to interfaces implemented by the type substituted for a type parameter). This allows for efficient code generation when value-types replace type parameters of generic types.
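A small sketch of boxing, unboxing, and the constrained generic case mentioned in the note. The function compare and the variable names are hypothetical. The example assumes that a type annotation on a definition is enough to trigger the cast to System.Object or to an interface:

```nemerle
using System;
using System.Console;

// A constrained generic function: CompareTo is called through the
// constraint, so a value-type argument does not need to be boxed.
def compare[T](a : T, b : T) : int
  where T : IComparable[T]
{
  a.CompareTo(b)
}

def x = 42;

def boxed : object = x;                  // boxing: cast to System.Object
def comparable : IComparable[int] = x;   // boxing: cast to an implemented interface
def unboxed = boxed :> int;              // unboxing: explicit cast back to int

WriteLine(unboxed + 1);
WriteLine(comparable.CompareTo(7));
WriteLine(compare(1, 2));
```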
Type Declarations (Type)
A type declaration consists of a type name used to refer to the type from other places in the application, e.g., when defining a type argument for a function.
Nemerle supports a special syntax for declaring functional types, arrays, tuples, and Nullable-types. Other types are declared in a common way.
Type = TypeName | ArrayType | TupleType | FunctionType | NullableType;
// Special type declaration syntax
FunctionType = Type "->" Type; // the functional type
TupleType = Type ("*" Type)+; // tuples
ArrayType = "array" "[" Types? "]"; // a single-dimensional array
ArrayType = "array" "." "[" Rank "]" "[" Types? "]"; // a multi-dimensional array
Rank = ['0'..'9']; // dimension of a multi-dimensional array
Types = Type ("," Type)*; // a comma-separated list of types
NullableType = Type "?"; // the type must be a value-type
// The common syntax for declaring other types
TypeName = QualifiedIdentifier
TypeName = Identifier TypeArguments?;
         | TypeName : 285 "." Identifier TypeArgumentList?
         | TypeName : 283 "?";
TypeArguments = "."? "[" TypeArgument ("," TypeArgument)* "]";
TypeArgument = TypeName | "_";
TypeName encompasses user-defined types (structs, classes, variant types, and type aliases) defined in the project or in plugged assemblies, and built-in types (string, int, double, void, etc.).
The built-in types, with the exception of void, are all type aliases defined in the standard library (Nemerle.dll). They belong to the Nemerle.Core namespace, which is open by default. More information about the built-in types can be found in the “The built-in types” section of the first article.
When referring to a generic type, one can use the wildcard “_” in place of a specific type argument. The corresponding type parameter is then initialized with a type variable, which can take on any type. All type variables have to be resolved to obtain a specific instantiation of a generic type.
A reference to a type which has at least one type parameter unspecified is called “non-fixed”. A non-fixed type reference cannot be used to declare types or their members. However, it can be used inside a method. One such use case is declaring that a local function returns a list without specifying the type of its elements, as the following example demonstrates:
def function() : list[_]
{
...
}
In this case, the type of the list elements will be inferred on the basis of the function body or how it is invoked.
A single-dimensional array of integers (int):
array[int]
Usage examples:
An immutable variable holding a reference to an array of 42 integers initialized with zeroes (zero is the default value for int):
def elements : array[int] = array(42);
A mutable variable of the integer array type initially holding the null value:
mutable elements : array[int];
// initialization
elements = array(3);
// changing the element at index 1 (indexing starts from 0)
elements[1] = 42;
An immutable variable holding a reference to an array of integers 1, 2, and 3:
def elements = array[1, 2, 3];
A two-dimensional array of strings (string):
array[2, string]
An immutable variable holding a reference to a two-dimensional array of integers of 42×42 elements. The array is initialized with zeros:
def elements : array[2, int] = array(42, 42);
A mutable variable of a three-dimensional array type initialized with the null value.
mutable elements : array[3, int];
// initialization of elements
elements = array.[3][[[1, 2, 3], [1, 2, 3], [1, 2, 3]]];
// updating an element
elements[0, 1, 2] = 42;
// A zero-argument function that does not return a value
void -> void
// A zero-argument function that returns an int value
void -> int
// A function taking a string and returning an int value
string -> int
// A function taking a string and an int and returning an int value
string * int -> int
// A function taking a string and an int and
// returning a tuple consisting of a double value and a bool value
string * int -> double * bool
def doWork(predicate : DateTime -> bool) : void
{
...
when (predicate(DateTime.Now))
DoStaff();
...
}
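As a further sketch, a value of a functional type can be supplied in several ways. The names apply and add are hypothetical:

```nemerle
using System.Console;

// Takes a two-argument function and applies it to a and b.
def apply(f : int * int -> int, a : int, b : int) : int
{
  f(a, b)
}

def add(x : int, y : int) : int { x + y }

WriteLine(apply(add, 2, 3));              // a local function
WriteLine(apply((x, y) => x * y, 2, 3));  // a lambda expression
WriteLine(apply(_ - _, 2, 3));            // partial application of an operator
```

Note that the argument list of the functional type is written as the tuple type int * int.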
// A tuple consisting of a string element and an int element
string * int
// A tuple consisting of three elements of types int, double, and bool respectively
int * double * bool
def doWork() : int * bool
{
  ...
  if (ok) (42, true) else (-1, false)
}
def (result, ok) = doWork();
when (ok) DoStaff(result);
Tuples and functions have much in common in Nemerle. Essentially, function arguments, if there is more than one, are internally handled as a tuple. This is revealed by how functional types are described.
Nemerle supports auto-casts of tuples to function arguments, provided they are compatible. That is the reason the following example is a valid piece of Nemerle code:
def foo() : int * bool { ... }
def bar(value : int, ok : bool) : void { ... }
bar(foo());
def result = foo();
bar(result);
def (_, ok) as result = foo();
when (ok) bar(result);
System.Collections.Generic.List[int]
List[int]
SCG.List[int]
using System.Collections.Generic;
using SC = System.Collections;
def foo() : Queue[int]
{
  def queue = Queue(); // System.Collections.Generic.Queue[int]
  queue.Enqueue(42); // this line is sufficient to infer the type of queue
  queue
}
def queue = Queue.[long](); // System.Collections.Generic.Queue[long]
queue.Enqueue(42);
def top = queue.Peek();
assert(top == 42);
def queue = SC.Queue(); // System.Collections.Queue
queue.Enqueue(42);
def top : int = queue.Peek() :> int;
assert(top == 42);
Note that “.” is necessary to avoid indexing ambiguities when specifying types for constructors of generic types. This is not terribly elegant but, fortunately, Nemerle almost always allows for not specifying type arguments at all. The compiler can infer them based on how the types are used.
def print[T](value : T?)
where T: struct
{
if (value.HasValue)
WriteLine(value);
else
WriteLine("No value!");
}
mutable x : int? = null;
print(x);
x = 42;
print(x);
mutable y : DateTime? = DateTime.Now;
print(y);
y = null;
print(y);
No value!
42
26.12.2011 18:25:21
No value!
Members of a type include both members declared in the type itself and in its supertypes.
TypeMemberDeclaration = FieldDeclaration
| MethodDeclaration
| PropertyDeclaration
| EventDeclaration
| IndexerDeclaration
| ConstructorDeclaration
| TypeDeclaration;
Type members are restricted in the following way:
- Names of fields, properties, events, and nested types have to be different from names of other members of the same type.
- Method names have to be different from names of non-method members of the same type.
- The signature of a method or a constructor has to be different from signatures of other methods or constructors of the same type. Furthermore, differences only in ref/out argument modifiers are not sufficient for the signatures to be considered different.
- Signatures of all indexers in a type have to be different from each other.
- Inherited type members may be hidden (concealed) by members of the subtype.
A field is a type member and a variable which may be associated with instances of the type or the type itself (the latter is true for static fields). A FieldDeclaration introduces a single new field.
FieldDeclaration = Attributes? FieldModifiers? "mutable"? Name VariableInitializer? ";";
FieldModifiers = VariableInitializer = "=" PExpr;
Fields split into static and non-static. Static fields are those marked with the static modifier or those that are members of a module. In the latter case “static” is unnecessary since module members can only be static.
Fields may have initializers: expressions which specify the initial value. The body of an initializing expression is copied into the constructors in such a way that it is evaluated exactly once, whichever constructor is used. If one constructor invokes another, the initialization is placed only in the caller to prevent multiple evaluations.
Fields may be mutable or immutable. Values of immutable fields may be specified only in a constructor or an initializer. Immutability of a field does not extend to immutability of objects the field refers to. Whether or not an object can be modified depends on the implementation of its type.
The value of a mutable field may be changed using the assignment operator “=”, special macros, such as “++”, or by passing the field via a ref- or an out-parameter.
Mutable fields may be marked with the volatile modifier. Some compiler optimization techniques may affect the order in which instructions are carried out over the field, which may lead to unexpected and unpredictable effects in a multi-threaded environment with a lack of proper synchronization. Such reordering is restricted for volatile fields. This modifier may not be applied to fields that are structs.
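The kinds of fields described above can be sketched in one hypothetical class (Counter and its members are illustrative names only):

```nemerle
using System.Console;

class Counter
{
  // A static mutable field with an initializer; it belongs to the type.
  public static mutable Created : int = 0;

  // An immutable instance field: may be assigned only in a constructor
  // or in an initializer.
  name : string;

  // A mutable instance field with an initializer.
  mutable count : int = 0;

  public this(name : string)
  {
    this.name = name;
    Created++;
  }

  public Increment() : int
  {
    count++;
    count
  }

  public override ToString() : string { name + ": " + count.ToString() }
}

def c = Counter("clicks");
_ = c.Increment();
_ = c.Increment();
WriteLine(c);
WriteLine(Counter.Created);
```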
A method is a function which, being a member of a type, carries out a specific computation over its data.
MethodDeclaration = Attributes? Modifiers? SyntaxExtension? FunctionHeader Implements? MethodBody;
Implements = "implements" Implement ("," Implement)*;
Implement = TypeName "." Identifier;
MethodBody = Block ";"? | ";";
Methods are classified as follows (the corresponding keywords are listed in parentheses):
- Virtual methods (virtual) are methods which can be overridden in subtypes. A subtype implementation will be used even if the method is invoked from the base type.
- Abstract methods (abstract) are virtual methods which have no body. Such a method can only be a member of an abstract type (a class or a variant type). It has to be overridden in a subtype unless that subtype is an abstract type.
- Sealed methods (sealed) are methods overriding some virtual methods of the base type and simultaneously prohibiting further overriding in subtypes.
- Static methods (static) are methods which belong to a type as opposed to its instances. Essentially, static methods are functions declared within a specific type.
- Non-static (or instance) methods are methods which implicitly take an extra argument called “this”. Through that argument they can read and update the internal, usually hidden, state of the instance.
- External methods (extern) are used for importing methods from libraries written in unmanaged languages. Such methods may not have a body. Importing is specified using the DllImport attribute.
A method may implement one or more methods of an interface which is a base type of the method's enclosing type. This means that it takes control when an instance of its type is cast to the interface and an implemented method of that interface is called.
Implementation of an interface method can be explicit or implicit.
Implicit implementation means that the implementing method must have the same signature as the interface method. It also has to be non-static and public.
For explicit implementation, the implementing method must also be non-static and its signature must match the signature of the interface method but the name of the latter has to be specified in the “implements” section. The name of the implementing method does not have to (but may) coincide with the interface method’s name. Also, it need not be public.
A method implementing an interface method must not be static or external. However, it may be virtual, abstract, or sealed.
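Both kinds of implementation can be sketched as follows. IGreeter, Polite, Casual, and show are hypothetical names:

```nemerle
using System.Console;

interface IGreeter
{
  Greet(name : string) : string;
}

class Polite : IGreeter
{
  // Implicit implementation: public, non-static, same signature.
  public Greet(name : string) : string { "Hello, " + name }
}

class Casual : IGreeter
{
  // Explicit implementation: the method has its own name and is not public.
  // The interface method is named in the "implements" section.
  SayHi(name : string) : string implements IGreeter.Greet { "Hi, " + name }
}

def show(greeter : IGreeter) : void
{
  WriteLine(greeter.Greet("Nemerle"));
}

show(Polite());
show(Casual());
```

In the second class the method SayHi is reachable from outside only through the IGreeter interface.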
A method may introduce a syntax extension (SyntaxExtension). Such extensions may do arbitrary things, not limited to constructing the AST of the method. In particular, they can add implementations of fields, properties, or other methods.
A method declaration begins a new scope in which its type parameters and method arguments are accessible.
The name of a method has to be different from names of other, non-method members of the same type. It may coincide with names of other methods but only if their signatures are different.
Declaration of multiple methods with the same name but different signatures is called overloading. It is permitted only if the signatures differ in their arguments. Overloading merely on the basis of the return value is not permitted.
Examples of method declarations and usage can be found in the section “Classes, structs, interfaces, and modules”.
The header is a part of a method or a local function declaration. It consists of the mandatory part (the name and the list of arguments in round brackets) and the optional part: type parameters (TypeParameters), the return type, and constraints (ConstraintsClause).
FunctionHeader =
Name TypeParameters? "(" Parameters? ")" (":" Type)? ConstraintsClause*;
Type parameters and arguments must have different names. Names of type parameters may be used for declaring types of arguments, the return value, and in other places inside the declaration of a method or a local function.
A property is a type member providing access to the state of an instance or the type, such as array length or salary of an employee. Properties may be considered as a tool for abstracting the state of an object. Referring to a property is syntactically similar to accessing a field. However, differently from fields, properties do not store values but rather provide an interface for accessing them (by means of accessors – methods invoked upon reading or writing the value).
PropertyDeclaration = Attributes? Modifiers? Name ":" Type "{" PropertyBody "}";
Getter = Attributes? Modifiers? "get" MethodBody;
Setter = Attributes? Modifiers? "set" MethodBody;
PropertyBody = FieldDeclaration? Getter Setter? | FieldDeclaration? Setter Getter?;
The compiler generates a corresponding method for each accessor, whose name is obtained by prepending “get_” (for read-accessors) or “set_” (for write-accessors) to the name of the property. For example, it will generate methods get_Size and set_Size for the read-write property Size. These methods do not have to be used to access the property, but they are useful when an accessor is needed as a reference to a function.
One of the property accessors may have access modifiers that differ from those of the property itself. It is allowed to increase the level of protection but not to reduce it. For example, a public property may have a private write accessor, which is useful for allowing modifications only from within the type and not from outside. However, it is not permitted to specify a public accessor for a non-public property. If such accessor is required, the property access modifier has to be changed first.
One can declare fields within the body of a property. Such fields are accessible only from that property’s accessors. Their names are randomly altered by the compiler to prevent unintentional modifications. However, property fields can be accessed from macros since internally they are members of the same type as the property itself.
Nemerle supports the so-called autoproperties in addition to regular properties. They come in handy in those cases when a property does not carry out any computation in its accessors but merely returns (or updates) the value of a certain field.
There are two kinds of autoproperties:
- get-autoproperties have only get-accessors. They behave similarly to immutable fields, e.g., can be initialized only in constructors.
- Regular autoproperties can have both get- and set-accessors.
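A sketch of a regular property with a restricted write accessor next to both kinds of autoproperties. Employee and its members are hypothetical names, and the example assumes a compiler version that supports get-autoproperties:

```nemerle
using System.Console;

class Employee
{
  // A get-autoproperty: may be assigned only in a constructor.
  public Name : string { get; }

  // A regular autoproperty with both accessors.
  public Title : string { get; set; }

  // A regular property backed by an explicit field.
  // The write accessor is more restricted than the property itself.
  mutable _salary : int;

  public Salary : int
  {
    get { _salary }
    private set { _salary = value }
  }

  public this(name : string, salary : int)
  {
    Name = name;
    Salary = salary;
  }
}

def e = Employee("Ann", 1000);
e.Title = "Engineer";
WriteLine($"$(e.Name), $(e.Title), $(e.Salary)");
```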
Examples of declaring and using properties are listed in the section “Classes, structs, interfaces, and modules”.
An event is a type member used for issuing notifications about changes to the internal state of the type or its instance or changes happening to the outer data structures. Client code can register listeners (functions with a matching signature) which will be invoked upon occurrence of the event.
EventDeclaration = Attributes? Modifiers? "event" Name ":" Type EventBodyBlock?;
EventBodyBlock = "{" EventBody "}";
EventBody = EventGetter | EventSetter;
EventGetter = "add" MethodBody ("remove" MethodBody)?;
EventSetter = "remove" MethodBody ("add" MethodBody)?;
The type of an event must be a delegate (cf. section “Delegates”), whose access level is no lower than that of the event.
Events may have accessors (auto-generated by the compiler if not explicitly declared).
The EventBodyBlock may not be declared for abstract events.
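A minimal sketch of declaring, subscribing to, and raising an event with auto-generated accessors. Button, Clicked, and Click are hypothetical names; EventHandler and EventArgs come from the System namespace:

```nemerle
using System;
using System.Console;

class Button
{
  // The event type is a delegate; the accessors are generated by the compiler.
  public event Clicked : EventHandler;

  public Click() : void
  {
    // Raise the event only if at least one listener is registered.
    when (Clicked != null)
      Clicked(this, EventArgs.Empty);
  }
}

def button = Button();
button.Clicked += (_, _) => WriteLine("clicked");
button.Click();
```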
Indexers are used for defining a “facade” for a user-defined type such that it behaves like an array, e.g., supports index-based (x[i]) access to its internal data. This is useful for implementing custom collections supporting index-based access to their elements (examples include lists, dictionaries, etc.).
IndexerDeclaration =
Attributes? Modifiers? Name "[" Parameters "]" ":" Type "{" PropertyBody "}";
The default indexer is auto-generated for indexers with the name “Item”. Such indexers allow for applying the indexing operator directly to instances of the type. In all other cases a named property is added so that indexing is applied to that property.
Indexers are a kind of property. As such, everything said above regarding properties also applies to indexers.
Example:
using System.Console;

[Record]
partial public class B
{
  protected internal class C {}

  public NamedIndexer[index : int] : string
  {
    get { "'" + Item[index] + "'" }
  }

  public Item[index : int] : string
  {
    _data : array[string] = array(42);

    get { _data[index] }
    set { _data[index] = value; }
  }
}

def b = B(array["0", "1", "2", "3", "4", "5", "6"]);
b[6] = "six";

WriteLine(b[5]);
WriteLine(b[6]);
WriteLine();
WriteLine(b.NamedIndexer[5]);
WriteLine(b.NamedIndexer[6]);
Output:
5
six

'5'
'six'
ConstructorDeclaration = Attributes? Modifiers? Name "(" Parameters? ")" MethodBody;
Parameters
Parameters = FormalParameter (", " FormalParameter)* ", "?;
FormalParameter = ParameterPrefix? NameOr_ ":" TypePrefix? Type ("=" DefaultValue)?;
ParameterPrefix = Attributes? SyntaxExtensions? ("this" | "params")?;
TypePrefix = "ref" | "out";
NameOr_ = Name | "_";
DefaultValue = PExpr;
The modifier “this” can be applied to the first argument of a method (such methods are called extension methods). Such methods are static but may be invoked like non-static methods (i.e., via “.”).
The modifier “params” can be applied to the last argument of a method (such an argument is then called an argument array). Methods whose declaration contains an argument array may take an arbitrary number of arguments. The type of extra arguments (i.e., not specified explicitly) has to match the element type of the argument array.
Arguments may be referential and returnable (prefixes “ref” and “out”, respectively).
Values of referential arguments are passed by reference (managed pointer). This means that only names of mutable variables can act as a value of such an argument. Their values may be changed during the call.
Returnable arguments are similar to referential ones except that they do not allow for passing a value into a function but can only be used for returning a value. A returnable argument must be assigned inside the method before it is read there, and it must have been assigned a value before the method returns.
Referential and returnable arguments, as well as argument arrays, may not have default values (DefaultValue), which may be specified for other arguments.
Only constant expressions, i.e. those evaluated to a constant/literal at compile-time, may act as default values.
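The argument kinds described above can be sketched in one hypothetical module. Helpers and all of its members are illustrative names; the example assumes that the extension method is visible to the calling code because both are in the same namespace:

```nemerle
using System.Console;

public module Helpers
{
  // An extension method: "this" on the first argument.
  public Twice(this s : string) : string { s + s }

  // An argument array: "params" on the last argument.
  public Sum(params values : array[int]) : int
  {
    mutable total = 0;
    foreach (v in values)
      total += v;
    total
  }

  // Referential arguments: the caller's variables are modified.
  public Swap(a : ref int, b : ref int) : void
  {
    def tmp = a;
    a = b;
    b = tmp;
  }

  // A returnable argument: assigned before the method returns.
  public TryHalve(value : int, half : out int) : bool
  {
    half = value / 2;
    value % 2 == 0
  }

  // A default value given by a constant expression.
  public Greet(name : string, greeting : string = "Hello") : string
  {
    greeting + ", " + name
  }
}

mutable x = 1;
mutable y = 2;
Helpers.Swap(ref x, ref y);

mutable half = 0;
def even = Helpers.TryHalve(10, out half);

WriteLine("ab".Twice());
WriteLine(Helpers.Sum(1, 2, 3));
WriteLine($"$x $y $even $half");
WriteLine(Helpers.Greet("Nemerle"));
```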
Syntax extensions may be used in argument declarations. Syntax extensions are macros introducing new syntactic constructs (in this case, at the level of an argument). They can be created by adding a new syntax macro at the argument level.
Operators in Nemerle are declared similarly to static methods, except for one naming difference. Operator names are strings of operator characters prefixed with “@” (this is done to transform the string of characters into a syntactically correct identifier).