Copied from this Forum post and somewhat extended.
What enums ought to be
The general idea behind D’s enums is that they’re “named literals,” and a replacement for value-like C macros, i.e. those that are used as expressions for objects. At least that’s how I model them in my head. You have a literal and you want to give it a name―because its identifier conveys intent, because it’s a complicated expression you don’t want to repeat, because it references symbols inaccessible/private in usage scope, or for whatever reason. An enum is quite different from an immutable static variable, which e.g. has an address, is immutable, and can be initialized at program startup, i.e. at run-time.
Proposal
Enums with indirections are tricky, but my idea is―I hope―both sane and simple:
1. The expression initializing an enum must be admissible as the initializer of a static immutable variable at compile-time.
2. When such an enum is used, the initializing expression is evaluated anew at the point of use, creating a fresh value.
3. An expression that refers to an enum value is an rvalue, even if its initializing expression is not (e.g. because it’s a static immutable variable that’s initialized at compile-time).
Rationale
Condition 1 ensures that enums cannot reference mutable global state, which they arguably shouldn’t be able to. This does not require the enum’s type to actually be immutable, though, and thus won’t itself bar later modifications of the value obtained from the enum.
Condition 2 ensures that the possibly-mutable parts of the object graph are duplicated on use. Only immutable parts can be shared between evaluations.
Condition 3 ensures that enums behave like literals, which are never lvalues. Essentially, if an enum is defined by an lvalue expression E, it behaves as if it was (() => E)() instead, which is an rvalue expression.
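A minimal sketch of what Condition 3 means in practice, using a case that already behaves this way today. The helper `isLvalue` and the names `stored` and `named` are hypothetical and not part of the original post. The helper relies on `auto ref`, which binds lvalues by reference and rvalues by value.

```d
// Hypothetical helper: true if the argument was passed as an lvalue.
bool isLvalue(T)(auto ref T x)
{
    return __traits(isRef, x);
}

static immutable int stored = 42; // a variable: has an address
enum int named = stored;          // an enum initialized by an lvalue expression

void main()
{
    assert(isLvalue(stored));            // the variable is an lvalue
    assert(!isLvalue(named));            // the enum is an rvalue
    assert(!isLvalue((() => stored)())); // so is the (() => E)() form
}
```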
The primary difference between a pure factory function with no arguments and an enum is intent and that enums can be grouped and have switch support. Technically, using an enum is close to invoking a pure factory function.
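To illustrate the comparison, here is a small sketch that places an enum next to a pure factory function with the same content. `makeValues` is a hypothetical name chosen for this example. Each use of either one yields a distinct array, so mutating one result does not affect another.

```d
enum int[] values = [1, 2, 3];

// Hypothetical pure factory function producing the same content.
int[] makeValues() pure
{
    return [1, 2, 3];
}

void main()
{
    int[] a = values, b = values;
    int[] c = makeValues(), d = makeValues();

    assert(a == b && a !is b); // equal content, separate arrays
    assert(c == d && c !is d); // same for the factory function

    a[0] = 10;
    assert(b[0] == 1); // the other use of the enum is unaffected
}
```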
Examples
It is a well-known phenomenon that an enum of slice type allocates a new dynamic array each time it is evaluated.
enum int[] values = [1, 2, 3];

void main()
{
    int[] xs = values, ys = values;
    // As if:
    //     int[] xs = [1, 2, 3], ys = [1, 2, 3];
    assert(xs !is ys);
}
A simple case with two layers of indirection: everything is newly allocated.
enum int[][] values = [[1, 2], [1, 2]];

void main()
{
    auto xs = values, ys = values;
    // As if:
    //     int[][] xs = [[1, 2], [1, 2]], ys = [[1, 2], [1, 2]];
    assert(xs !is ys);
    assert(xs[0] !is ys[0]);
    assert(xs[0] !is xs[1]);
}
The examples above show how the language handles these cases today, and this is also what I propose. Current behavior and the proposal diverge once we modify the two layers of indirection a bit:
enum int[][] values = () {
    auto v = [1, 2];
    return [v, v];
}();

void main()
{
    auto xs = values, ys = values;

    // Currently: as if
    //     int[][] xs = [[1, 2], [1, 2]], ys = [[1, 2], [1, 2]];
    assert(xs !is ys);
    assert(xs[0] !is ys[0]);
    assert(xs[0] !is xs[1]);

    // Proposed: as if
    //     int[][] xs = () { auto v = [1, 2]; return [v, v]; }();
    //     int[][] ys = () { auto v = [1, 2]; return [v, v]; }();
    // assert(xs !is ys);       // same
    // assert(xs[0] !is ys[0]); // same
    // assert(xs[0] is xs[1]);  // inverted
}
With 2 or more levels, a slice can contain two pointers that point to the same object. A literal of such an object is impossible to spell out directly (i.e. without a lambda expression), but it’s worth being able to have it. The factory function for values can call dup on one of the uses of v to explicitly decouple the two entries, but with the current language semantics, the non-tree layout isn’t possible. If we take the idea to heart that an enum ought to be like a C macro in that the use of the enum is replaced by the definition in a rather mindless way (except that the replacement is syntactic in C and would be semantic in D), it seems to me that complex object graphs should be admissible and retained as specified in the creation expression. It makes the language more consistent, too: normally you need dup to decouple two references to the same array, but in this niche case the current semantics decouple them implicitly, which can surprise programmers.
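The sharing and the explicit decoupling via dup can be sketched with ordinary run-time factory functions, whose behavior is not in question. The names `makeShared` and `makeDecoupled` are hypothetical. Under the proposal, an enum initialized like `makeShared` would keep the shared layout on each use, which is an assumption about the proposed semantics and not what the language does today.

```d
// Both entries refer to the same inner array.
int[][] makeShared() pure
{
    auto v = [1, 2];
    return [v, v];
}

// The second entry is an explicit copy, so the entries are independent.
int[][] makeDecoupled() pure
{
    auto v = [1, 2];
    return [v, v.dup];
}

void main()
{
    auto s = makeShared();
    assert(s[0] is s[1]);
    s[0][0] = 9;
    assert(s[1][0] == 9); // visible through the other entry

    auto d = makeDecoupled();
    assert(d[0] !is d[1]);
    d[0][0] = 9;
    assert(d[1][0] == 1); // the copy is unaffected
}
```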