Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reusable Pattern Matching #4057

Open
HenriqueNas opened this issue Aug 25, 2024 · 17 comments
Open

Reusable Pattern Matching #4057

HenriqueNas opened this issue Aug 25, 2024 · 17 comments
Labels
feature Proposed language feature that solves one or more problems

Comments

@HenriqueNas
Copy link

HenriqueNas commented Aug 25, 2024

Problem

Dart currently offers powerful features with pattern matching like if-case, switch expressions, destructuring... which greatly enhance code readability and safety:

var json = {
  'user': ['Lily', 13]
};
var { 'user': [name, age] } = json;

However, I believe these feature could empower even more developers by introducing a mechanism to define reusable patterns, much like we define functions or classes:

patterndef ValidUser(String name, int age) = { 'user': [name, age] };

This would be particularly useful in scenarios where the same pattern is used across multiple switch statements or within different parts of the application. Additionally, allowing the definition of pattern constants could help in enforcing consistency across the codebase and reducing redundancy.

Proposal

By adding the patterndef keyword in Dart that allows developers to define patterns once and reuse them multiple times. Here's an example of what this could look like:

// patterndef have similar syntax of typedef
patterndef PrimaryColors() = Color.red || Color.green || Color.blue;

void validateColor() {
  final color = Color.red;

  // using switch expressions
  final isPrimary = switch (color) {
    PrimaryColors => true,
    _ => false,
  };

  // using if-case
  if (color case PrimaryColors) {
    print('Is primary color: $isPrimary');
  } else {
    print('Is not primary color: $isPrimary');
  }

  // using destructuring
  final PrimaryColors(colorIsPrimary) =color;
}
@HenriqueNas HenriqueNas added the feature Proposed language feature that solves one or more problems label Aug 25, 2024
@HenriqueNas
Copy link
Author

HenriqueNas commented Aug 25, 2024

This is my first issue here, so I'd really appreciate any feedback to help me improve it ❤️!

Here's another example to illustrate this proposal:

patterndef ValidUser(String name, int age) = {
  'name': var name,
  'age': var age,
};

const userMap = {
  'name': 'John Doe',
  'id': '123',
  'age': 30,
};

validateUser() {
  // same as:
  // if (userMap case {'name': String name, 'age': int age}) {}
  if (userMap case ValidUser(:name, :age)) {
    print('Valid user: $name, $age');
  } else {
    print('Invalid user: $userMap');
  }

  // same as:
  // final {'name': name, 'age': age} = userMap;
  final ValidUser(:name, :age) = userMap;

  final userFromMap = switch (userMap) {
    ValidUser(:name, :age) => User(name, age),
    _ => throw ArgumentError('Invalid user MAP: $userMap'),
  };

  switch (userMap) {
    case ValidUser(:name, :age):
      if (age < 18) {
        print('User $name is underage');
      } else {
        print('User $name is an adult');
      }
      break;
    default:
      print('Invalid user: $userMap');
  }
}

class User {
  User(this.name, this.age);

  final String name;
  final int age;
}

@HenriqueNas HenriqueNas changed the title feat(patterndef): Reusable Pattern Matching feature Reusable Pattern Matching Aug 25, 2024
@hydro63
Copy link

hydro63 commented Aug 29, 2024

While i love the idea, the problem that i think needs to be addressed is variable declaration patterns. Since declaration patterns define new variables, how should compiler act when it encounters a variable redeclaration? Should it automatically redefine the variable, or throw error for trying to redeclare the variable? This is not problem with normal patterns, since the developper can see what the pattern does, but since this pattern is reusable, the developper can't know what it declares. This kind of ambiguity can make it hard to work with and is very error prone.

Do you propose that Validate(String user, int age) = ... can be used like case Validate(validUser, validAge) so that instead of defining String user and String age the compiler defines validUser and validAge with the values? As such the developper would be able to name the variables himself.

So the your code could look like this:

patterndef ValidUser(String name, int age) = {
  'name': var name,
  'age': var age,
};
...

switch (userMap) {
  case ValidUser(:validName, :validAge):
    if (validAge < 18) {
      print('User $validName is underage');
    } else {
      print('User $validName is an adult');
    }
    break;
  default:
    print('Invalid user: $userMap');
}

Since we also don't want to overcomplicate the things, i also propose syntax for redeclaring the variables, instead of making new ones:

case var ValidUser(:name, :age): //will act the same way as your proposal, but without ambiguity

Every time you use a patterndef and redefine a variable, you have to explicitly use var keyword, else the compiler throws declaration error.

@hydro63
Copy link

hydro63 commented Aug 29, 2024

After thinking about it some more, i'm completely sold on this idea and would like to propose an addition it.

Proposal

If Dart were to have patterndef it would be nice to add a way to combine map patterns. More specifically i would like to add a way to achieve TypeScript-like pattern composition. Take this typescript code:

type User = {"username": string, "nickname": string, "id": number, "isMember": boolean};
type MemberFields = {"rank": number, "joined": string};
type MemberUser = User & MemberFiels;

Here are the patterns for User and MemberFields:

patterndef User(String username, String nick, int id, bool isMember) = {
  "username": var username,
  "nickname": var nick,
  "id": var id,
  "isMember": var isMember
};

patterndef MemberFields(int rank, String joined) = {
  "rank": var rank,
  "joined": var joined,
};

Now imagine you are getting some json, and you want to validate if it's user and check if he is privilaged. I propose typescript-like composition of patterns, where instead of validating if json is a User and if he is a Member separately, having multiple conditions, why not just compose the patterns and check once?

The composition would only work with Map patterns. and there would be two different kinds of composition - Strict, and Optional.

Strict composition

Strict composition pattern would pass only if both subpatterns pass. The composite pattern would capture variables from both subpatterns and pass them to the user. Here is my proposed syntax:

patterndef MemberUser(/*args*/) = User(/*usrAgrs*/) & MemberFields(/*memberArgs*/);
// equivalent to
patterndef MemberUser(/*args*/){
  "username": var username,
  "nickname": var nick,
  "id": var id,
  "isMember": var isMember,
  "rank": var rank,
  "joined": var joined,
};

Optional composition

Optional composition pattern would pass if the left pattern passes, regardless of if the right side pattern fails. The composite pattern would capture variables from left patterns, and try to capture as many variables from right pattern. If some variable in right pattern is not found, it would instead assign null to the variable.

patterndef UserPossiblyMember(/*args*/) = User(/*usrAgrs*/) + MemberFields(/*memberArgs*/);
// equivalent to
patterndef UserPossiblyMember(/*args*/){
  "username": var username,
  "nickname": var nick,
  "id": var id,
  "isMember": var isMember,
// would be optional fields, if not found assign null
  "rank"?: var rank,
  "joined"?: var joined,
};

Usage

Here is example validating a user

// before
if(user case User(...)){
  if(user case Member(...)){
    doSomething();
  }
}

// after
if(user case var User(usename, nickname, id, isMember) & Member(rank, joined)){
  print("$username $nickname $id $isMember $rank $joined");
}

Edit

I've just realised a problem with the proposed patterndef syntax - the identifier([member]) syntax clashes with the object patternmatching syntax

ValidateUser(:name, :age) // user defined pattern
ValidateUser(:name, :age) // class ValidateUser with properies name and age

As such, new syntax is needed for the patterndef to work, without breaking the whole codebase

@lrhn
Copy link
Member

lrhn commented Aug 29, 2024

This is a number of features, of growing complexity.

The fundamental level is named patterns, which allows giving a pattern a name, and reusing it by referencing that name.

// Pattern on `Object` values.
patterndef Object UrlPattern = (Url() || String()) && var url;

A pattern has an associated matched value type, which the definition should include. It affects how the pattern is interpreted, you can't just have a pattern without a matched value type. (But you can probably write nothing and get Object? as default.)
Usage would be if (value case UrlPattern).

The next step is parameterized patterns. First type parameterized patterns:

// Pattern on `Object?` values, parameterized by a type `R`.
patterndef Object? Opt<R extends Object> = (R _ || null) && var value;

Then there are first order patterns, which take patterns as arguments.

// Pattern on Object? which takes type `R` and *pattern on `R`* as parameter.
patterndef Object? Chk<R extends Object>([#Pattern<R> extraCheck = _]) = (R value && extraCheck);
void main() {
   if (someValue() case Chk<int>(>= 0)) {
     print(value.toRadixString(16));
   }
}

This gets hard to parse because the grammar cannot see that Chk<int>(...) is a pattern definition reference, not an object pattern on the type Chk, so it's not ready to parse a pattern there. But on the other hand it would look for a colon, maybe prefixed by an identifier, and wont find that. Might need extra syntax to invoke the definition.

Actually, before that there could be parameterization of when clauses:

patterndef Object GoodInt(int min) = int v && >= value;
patterndef Object GoodInt2(int min) = int v when v >= value;

Both of these are complicated by patterns needing to be constant, but then that just means that all the arguments to these patterndefs need to be constant values or constant patterns.

Then there is the holy grail: First class patterns, higher order pattern definitions, where you can take pattern definitions (pattern functions, really) as argument to non pattern functions.
That's extremely unlikely. We don't have higher order types either.

But we could potentially take pattern definitions as arguments to pattern definitions, since everything must be constant.
But that may also require specifying the "type" of a pattern, which may also need specifying the variables that the pattern captures.

Imagine:

patterndef Object => {int number} IntValue(Object => {int number} fallback) =
    int value || fallback;
void main() {
  if (someValue() case IntValue(String(length: var value) || List(length: var value))) {
    print(value.toRadixString(16));
  }
}

The IntValue pattern function takes a pattern as argument, which must accept Object and if matching should produce an int value definition.

Again, the syntax is horribly close to existing pattern syntax, and may need special punctuation.

It's also unlikely that we'll want to this far. Or anywhere in the direction of abstracting over patterns.

Use functions for abstraction, don't abstract over patterns. Functions are the most general functionality, they can contain pattern matches and any other code too. The more restricted the grammar you're abstracting over, the more special-casing you may end up needing.

I'd just write:

({String name, int age})? validUser(Object input) =>
  switch (input) {
    {'user': [String name, int age]}) => (name: name, age: age),
    _ => null;
  };

// ...
  switch (validUser(input)) {
    case (: var name, : var age)):
      if (age< 18) {
        print('User $name is underage');
      } else {
        print('User $name is an adult');
      }
      break;
    default:
      print('Invalid user: $input');
  }

@rrousselGit
Copy link

rrousselGit commented Aug 30, 2024

^

I'd just write:

({String name, int age})? validUser(Object input) =>
  switch (input) {
    {'user': [String name, int age]}) => (name: name, age: age),
    _ => null;
  };

// ...
  switch (validUser(input)) {

That's not really a reusable pattern. It's a function.
At that point, the switch doesn't bring value anymore in your example.

IMO the more realistic approach is to use extension methods:

extension on Color  {
  Color? get casePrimaryColor {
    switch (this) {
      case Colors.red:
      case Colors.blue:
      case Colors.green:
        return this;
      default:
         return null;
  }
}

Then used as:

switch (color) {
  case Color(casePrimaryColor: final color?):
     print('Is primary color: $color');
}

@hydro63
Copy link

hydro63 commented Aug 30, 2024

I think we have been thinking about reusable pattern matching in a wrong direction. There is no real benefit to making a patterndef a new keyword, when most of the time you can just use a function. So how about instead of creating an alias for some pattern, we let the developper delegate pattern matching to a function.

Proposal

I propose allowing patterns to delegate the validation and matching to functions, which would allow more composable patterns, optional complex validation, automatic typing of the data and quick structure building. If the function throws or returns null, the pattern is wrong and is refuted.

Syntax

Here is a proposed syntax of the pattern type:

functionIdentifier [var_name]
// example
matchUser user // captures data to var user
matchUser // only validate the correctness of the pattern

As far as i know, this kind of pattern doesn't overlap with any current pattern grammar, since the compiler knows what is a class and what is a function. The compiler would be able to infer the type of the variable from the return type of the function.

Example

Let's say we have a user, which itself has a field of sent messages (pretend "string" is actually type definition):

{
  "user_id": "string",
  "user_info": {
    "username": "string",
    "nickname": "string",
    "email": "string",
    "avatar": "string"
  },
  "messages": [
    { "message_id": "string", "content": "string" },
    { "message_id": "string", "content": "string" },
    { "message_id": "string", "content": "string" },
  ]
}

This is how we would validate it and load it up into a structure using my proposal:

class User {
  String id;
  UserInfo info;
  List<Message> messages;

  User(this.id, this.info, this.messages);

  static User? match(dynamic objToMatch) {
    // quick match and map => match a list and map the elements to Message
    // if its not a list, or the elements are not Message, refute it
    List<Message>? matchMessages(dynamic obj) {
      if (obj is List) return obj.map<Message>((e) => Message.match(e)!).toList();
      return null;
    }

    if (objToMatch
        case {
          "user_id": String id,
          "user_info": UserInfo.match userInfo,
          "messages": matchMessages messages,
        }) {
      return User(id, userInfo, messages);
    }
    return null;
  }
}

class UserInfo {
  String username, nickname, email, avatar;

  UserInfo(this.username, this.nickname, this.email, this.avatar);

  static UserInfo? match(dynamic objToMatch) {
    if (objToMatch
        case {
          "username": String username,
          "nickname": String nickname,
          "email": String email,
          "avatar": String avatar
        }) {
      return UserInfo(username, nickname, email, avatar);
    }
    return null;
  }
}

class Message {
  String id, content;
  Message(this.id, this.content);

  static Message? match(dynamic objToMatch) {
    if (objToMatch case {"message_id": String id, "content": String content}) {
      return Message(id, content);
    }
    return null;
  }
}

And then we could use it directly in if case and switch statements

if(incomingObject case User.match){
  print("its a User");
}

// with capturing data
if(incomingObject case User.match user){
  print("${user.id} ${user.info} ${user.messages.length}");
}

Benefits

It would allow very quick validation and loading of the structures, where the data would closely resemble the structure they are coming from. The class would also be able to choose whether some values are valid (example adult >= 18) and refuse the pattern.

In relation to pattterndef, if you need to check the same pattern multiple times, you probably have a data structure you are validating. In that case, why not just combine the validation with the data and make it much easier for use in other parts of your program.

With the data classes and macros, that are in the works, it would also be much easier to write, since you could use a macro (for example @ToPattern()) to add the matching function automatically.

I also can't think of any detriments this feature would bring, though it needs to be studied some more.

@lrhn
Copy link
Member

lrhn commented Aug 30, 2024

That's not really a reusable pattern. It's a function.

True. And the function is the unit of code-reuse and abstraction in the language, which is why it works.

Using extension getters is also viable, getters is one way to get user code into a pattern, and it's the most flexible one. (Another one is overriding > etc. Overriding == with "custom behavior" is an accident waiting to happen.

(I'd extend object pattern property naming to entire constant cascade-selector-chains, so you don't need to write extensions to do number case int(abs(): < 10) && var smallNum or Result(stringContent?.trim(): var trimContent). Maybe also provide a way to perform an operation with the value as argument, but I guess an extension <T> on T { doWith<R>(R Function(T) action) => action(this); } would allow List<S>(first.doWith(computeSomething): MorePattern) easily enough.)

The thing here is that abstracting over a pattern really is abstracting over a test. A pattern is one way to do a test, but using a function means you can choose another implementation too. You can't embed that in the middle of a larger pattern, but you can combine multiple test functions in another test function.

As Tennent said in "Principles of Programming Languages", any meaningful syntactic class can be abstracted. It's not completely unreasonable to be able to name a pattern, so it can be reused. We do it for types because it's useful. We don't need to, today you can write any type inline, but sometimes you want to give it a name.

My worry is that pattern abstraction is a feature with a large risk of feature creep. Should it be able to include a when clause? Can it be parameterized by types? Constant values? Functions to be called in when clauses? Other patterns? Matched variables ("out" parameters)? With extra complications from patterns needing to be mostly constant, but when-clauses do not.
And it's not clear that the simplest possible version would be good enough, since lots of the examples in this thread want more than just naming a pattern. They parameterize in one way or another.

From a principle in the same book, a language should have zero, one or an infinite amount of any feature. Dart has zero levels of pattern abstraction. First order patterns, aka. pattern definitions, pattern aliases which directly expand to a pattern, is an option. Can probably even take other patterns as arguments. I'm not sure it's worth its own complexity cost, but it's likely doable.
As for higher order patterns ... I'd definitely not go there. Dart doesn't have higher order types either. It makes static reasoning harder, or it risks exponential unfolding at compile time, and it makes inference much, much harder.

So, for the heck of it, let's define a first-order pattern alias, with hygienic local variables, so it cannot abstract over bound variables:

<pattern-alias> ::=
  'patterndef' <pattern-type>? <identifier> 
     <pattern-type-parameters>? 
     <pattern-parameter-list>?  '=' <pattern> ('when' <expression>)?
     '[' <pattern-pattern-parameter-list>? ']'

-- Signature of a pattern. The matched value type, and the type that the pattern promotes that to 
-- if it matches. If the promoted type is omitted, we can either infer it, or default it to not promoting.
-- If both are omitted, we can infer or default to `Object? -> Object?`
<pattern-type> ::= (<type> '->') <type>

-- Like a type parameter list, but allowing `const` in front of names.
-- Must be `const` if referenced outside of `when` clause.
<pattern-type-parameters> ::= '<' <pattern-type-parameter> (',' <pattern-type-parameter>)* '>'
<pattern-type-parameter> ::= 'const'? <identifier> ('extends' <type>)

-- Normal "value" parameters, like normal parameter list, but can be marked `const`.
-- Must be const if used outside of `when` clause. 
<pattern-parameter-list> ::= '(' ... normal parameter list, but allowing `const` modifier on parameters ...')'

-- Like a parameter list, including optional and named parameters, but for pattern parameters which have a
-- pattern type. Non-empty.
<pattern-pattern-parameter-list> ::= 
      <pattern-pattern-parameters> (','?
          | ',' '[' <pattern-pattern-parameters> (','? ']' | ']' ',' '{' <pattern-named-pattern-parameters> ','? '}' 
          | ',' '{' <pattern-named-pattern-parameters> ','? '}')
    | '[' <pattern-pattern-parameters> (','? ']' | ']' ',' '{' <pattern-named-pattern-parameters> ','? '}' 
    | '{' <pattern-named-pattern-parameters> ','? '}')
<pattern-pattern-parameters> ::= <pattern-pattern-parameter> (',' <pattern-pattern-parameter>)* ']'
<pattern-pattern-parameter> ::= <pattern-type>? <identifier> ('=' <pattern>)?
<pattern-named pattern-parameters> ::= 
        'required'? <pattern-pattern-parameter> (',' 'required'? <pattern-pattern-parameter>)*
<pattern-pattern-parameter> ::= <pattern-type>? <identifier> ('=' <pattern>)?


<pattern> ::= ... | <pattern-alias-reference>

<pattern-alias-reference> ::= 
  <qualified-identifier> <argumentPart>? '[' (<argumentList> ','?)? ']'

which you can use as:

/// Matches [List<T>] that starts and ends with elements matched by [First] and [Last].
patterndef List<T> ListEnds<T>[T First, T Last] = [First && Last] || [First, ..., Last];

/// Matches a [List<T>] with [minLength] &le; `length` &le; [maxLength].
///
/// If [First] and/or [Last] are are provided and the list is non-empty,
/// the [List.first]/[List.last] elements must match.
patterndef Object?->List<T> LimitedList<T>[{T First = _, T Last = _}]({int minLength = 1, int? maxLength}) =
  List<T>(length: >= minLength && var length) && 
               ([] || ListEnds<T>[First, Last]) when maxLength == null || length <= maxLength;

// ...
  if (value case LimitedList<int>[> 0](minLength: 2, maxLength 4)) {
    // `value` promoted to `List<int>`
    // Known to have length 2..4 and first element >0.
    ..
  }

or

patterndef Json = null || num() || String() || List<Object?>() || Map<String, Object?>();

// ...
   if (value case Json[] && var json) { ... } // the `[]` is required to show the "template invocation".

The pattern-type before the name has the form "matched-value-type -> promoted-type". The pattern assumes the matched value type is the former, and if it matches, it promotes to the latter.
If only one type is written, it's the promoted type. The pattern is assumed to apply to Object? and narrow to that type. If the single type has a !, then then pattern must exhaust that type.
If no type is written at all, it defaults to a matched value type of Object? and a promoted type is inferred from the pattern.

The <...> are normal type parameters, except that they can be marked const. If they occur outside of the when clause, they must be marked const, and if const, the type argument must be a constant type.

The '(...)are normal value parameters, except that they can be marked asconst. If they occur outside of the whenclause, they must be markedconst, and if const, the argument expression must be a constant expression (and is in a const` context).

The [...] are pattern parameters, which can be given patterns as argument that must match the declared pattern type. If optional, they can have a default pattern. If no default pattern is written, they default to Never(), which rejects all values.

The pattern parameters and value parameters can be referenced in the aliased pattern, the type parameters can be referenced everywhere. The aliased pattern may include a when clause.
This when clause is executed immediately when the pattern matching succeeds, in the middle of matching the pattern that the alias is put into, before the variables captured by the pattern alias go out of scope.
That also ensures that a ListEnds<double>[!= 0, _] && List(.first: double(atan: ...)) (where double.atan is an extension getter calling atan from dart:math which is undefined for the value zero) doesn't continue with an invalid value. (We can already do computation in the middle of a pattern using extension getters, so it's not new and special, just a little different.)

The pattern signature, MatchedValueType -> PromotedType does not account for:

  • Variable bindings.
  • Having a with clause
  • Being exhaustive for the matched value type.

For the last point, we can define <pattern-type> ::= <type> ('->' <type>|'!')? where type! is exhaustive on the matched value type, and therefore doesn't need to promote. (If you want as T in there to promote to T, ... do it some other way.)
A patterndef with a with clause can't be exhaustive.

As for exfiltrating bindings ... that's a bigger can of worms. I'd probably suggest defining a "return type/value" for a pattern, which creates a record value out of some of the matched variables or other constants.
So, maybe allow the RHS to end with 'return' <recordLiteral>' where the record literal can only reference constant values, parameter values and bound variables of the pattern itself, and change ` to

<pattern-type> ::= (<type> '->')? <type> '!'? (':' <recordType>)?

so you can write

// Makes `T first` and `T last` available as "object pattern" properties of a match.
patterndef List<T>:(T first, T last) FirstLast<T> = 
     List<T>(isNotEmpty: true, : var first, :var last) return (first, last)

and use it by putting : pattern after the [..] of a template invocation:

// ...
  if (someList case FirstLast<int>[]:(var fst && >= 0, var lst && >= 0) when fst < lst) { ... }

If you write nothing, the return type is () and the alias has an implicit return (). Or use null.

So, ... possible. Not entirely sure how desirable this is. There is much complexity because we are passing in three kinds of parameters: Types, values and other patterns, in order to create a single pattern, and we need a way to say how to extract values that is not returning them. Or rather, a template is like a combination match + return value.

I would also think it'll be better to build on top of functions. They are the more general abstraction mechanism.
So, like @hydro63 says, imagine a pattern could call a method with the matched value, and match on the return:

if (value case Foo() && .(@toPoint(_): (int x, int y))) ...

where .(<propertyPatterns> is short for MatchedValueType(<propertyPatterns>) and
@<expression> evaluates expression with _ bound to the matched value. (Probably over-complicated, but still.)
Then toPoint just have to return null if it doesn't match, and a record with two integers if it does, but it can be any function, not just a pattern.

@hydro63
Copy link

hydro63 commented Aug 30, 2024

If we were to compare patterndef and "function as pattern", i would say that "function as pattern" is a lot better solution than patterndef, in both simple and complex cases. While patterndef could have a use case in quick and simple reusable patterns, i'd argue that if you need to reuse pattern multiple times, you don't need reusable pattern, you need a class. And if we were to allow parameters in the patterndef, they would very quickly become "C Macros + Regex" but worse, since the patterndef would basically function as inline replace, while in code doing basically the same as Regex. Nobody would be able to use them, interpret them, or even write them.

As @lrhn pointed out, and as i have also failed to realise (hence bad proposals), patterns are not just some constructs, they are tests. The Dart compiler basically compiles the pattern into validation function with variable bindings. Thinking of them as their own separate construct, and building on it, will only lead to more complexity and insufficient solutions. Instead of thinking of patterns as their own thing, thinking of them as functions makes it easier to use and extend their functionality.

Baseline functionality

Before we continue with this discussion, it's good to understand what we want to achieve with the reusable patterns. Here are the most essential points:

  1. It has to be easy to write and should not have much overhead associated with it
  2. The pattern has to be easily embeddable to other patterns
  3. The pattern has to be strongly typed, not relying on dynamic "hacks" to get around the compiler
  4. The pattern has to also bind data into variables

And this are the features that would be very nice to have:

  1. Reusable pattern matching should have extensive use cases, not just for quick and "hacky" validation
  2. It should support more complex pattern matching, with extensive logic

And so, if reusable pattern matching were to be added to Dart, "functions as patterns" makes a lot more sense than patterndef. Unlike typedef, patterndef would create a lot of issues and would be a hell to use. typedef only works, because types are not code. There is a lot less interations between types and code, than there is between patterns and code. Types and code only interact superficially, as a guarding mechanism, not as code. Meanwhile patterns perform many different and more complex functions, making it impossible to just "alias" away the complexity.

Improvement to syntax

The improvement concerns the variable binding and destructuring syntax. I propose an "operator" -> (more like a hint to compiler and the developper for readibility) that would pass the matched value to next subpattern which could only be either declaration, or object destructuring pattern (others are prohibited). This is done so that it's not abused:

function ["->" [declaration_pattern | object_destructure_pattern] ]
// no binding pattern
functionName
// variable binding
functionName -> var_name
// object destructuring syntax => function matches Point(int x, int y);
functionName -> (:x, :y)

I would also make it that the pattern only accept functions with signature T? Function<T>(dynamic), meaning no passing of special parameters to the functions. If you wanted to give arguments to the matching function, you would need to create a higher order function and pass it the argument.

User? matchOnPrivilage(dynamic obj, int privilage){
  ...
  if(user.privilage != privilage) return null;
  // other matching logic;
}
User? matchOnAdmin(dynamic obj){
  return matchOnPrivilage(obj, 10);
}

if(json case matchOnAdmin -> (:user_id, :email)) {
  print("$user_id, $email");
}

Also, i'd like to ask @HenriqueNas to look throught through the proposed solutions, so that we can better understand the requirements.

@HenriqueNas
Copy link
Author

HenriqueNas commented Aug 30, 2024

For me, the idea of using functions as pattern matches sounds really good to solve the initial problem:
reducing code duplication through reusable pattern matching.
This syntax: function -> value, for inline pattern matching from function, is such an elegant way to do it... Love it !!

Here's the refactored initial examples using this new proposed syntax:

class User {
  const User(this.name, this.age);
  final String name;
  final int age;

  factory User.fromJson(dynamic user) {
    if (user case { 'name': String name, 'age': int age }) {
      return User(name, age);
    }
    throw 'Invalid argument';
  }
}

class Vehicle {
  const Vehicle(this.wheels);

  factory Vehicle.fromJson(dynamic vehicle) => Vehicle(vehicle['wheels'] as int);
  
  final int wheels;
}

class Car extends Vehicle {
  const Car() : super(4);
}
class Bicycle extends Vehicle {
  const Bicycle() : super(2);
}

// I might not get the ideia, but it would work like that, right ?
Vehicle getVehicle(dynamic vehicle) {
  return switch (vehicle) {
    Vehicle.fromJson -> (:wheels) when wheels == 4 => const Car(),
    Vehicle.fromJson -> (:wheels) when wheels == 2 => const Bicycle(),
    Vehicle.fromJson -> vehicle => vehicle,
   _ => throw 'Invalid argument',
  };
}

const json = {
  // valid user
  'name': 'John Doe',
  'age': 30,
  // valid vehicle
  'wheels': 2,
};

void run() {
  try {
    if (json case User.fromJson -> (:name, :age)) {
      print('user name: $name'); // John Doe
    }

    final Vehicle vehicle = getVehicle(json);

    print('vehicle type: ${vehicle.runtimeType}'); // Bicycle
  } catch (error) {
    print(error);
  }
}

Anything I missed ?

@hydro63
Copy link

hydro63 commented Aug 30, 2024

@HenriqueNas You didn't miss anything.

// I might not get the ideia, but it would work like that, right ?
Vehicle getVehicle(dynamic vehicle) {
  return switch (vehicle) {
    Vehicle.fromJson -> (:wheels) when wheels == 4 => const Car(),
    Vehicle.fromJson -> (:wheels) when wheels == 2 => const Bicycle(),
    Vehicle.fromJson -> vehicle => vehicle,
    _ => throw 'Invalid argument',
  };
}

Yes, that's exactly how i want it to work. I want "function as patterns" over "patterndef", because it would also allow the when clause to do some form of checking, and because it is intuitive to write and read.

Now we just have to wait for the feedback from Dart Language team, to see how feasable the proposal is, and what doesn't work.

One thing that is needed to be addressed (at least i think so), is the constrains of the pattern matching process (which function can it take, is null a valid return argument or does it refute the pattern).

@lrhn
Copy link
Member

lrhn commented Aug 31, 2024

I would also make it that the pattern only accept functions with signature T? Function<T>(dynamic)

I'd allow a function with a subtype of Record? Function(M) where M is the matched value type at the point.
That is: any function that can be called with the matched value as argument, and which returns either null or a record.

Forcing it to be a record ensures that accessing properties had no side effect, and making it nullavle gives a way to reject.

But it is a little special-cased.
If you're going to do -> (: var x, : var y) anyway, it doesn't matter what the return type is. That code will reject on null and accept on the record type anyway.

We could allow the function to have any return type, and return value, treating the function as a transformer, not a matcher, and then you call do

case ... || transform -> (:var x, :var y)

to only match if it is a non-null record, or you ca use any other pattern:

 ... nullOrEmpty -> false

If you want to have a general transformer with the Record? return type requirement, you can just have a return type of (T,) for the type T that you really want, so it has exactly the same power, the Record? just has a built-in choice of how to reject quickly and how to return multiple values, but nothing that you can't easily get by just using that return type anyway.

This behavior is why I suggested that the function call was part of the object pattern, as a way to do "user code" access on the object, just like calling an extension getter. Because that is a possible implementation today.
Every function R Function(M) that you want to call, you can create as extension on M {R get foo {...this...}} and match as M(foo: (:var x, :var y) and have the full functionality being asked for here already.

The functionality is there, it's just inconvenient. So we need a way to not have to create the extension getter for an existing (constant?) function. With a shorter syntax.

I suggested .(...) as a shorthand for an object pattern of the current matched value type. That's generally useful.
I also suggested allowing more accessors (selectors) in the object pattern, like .([2]: ...).

Then we just need a way to call a function with the "current value" as an argument. That's what the "pipe operator" is about in expressions.(There are multiple issues for that.) Assume we have a pipe operator expr -> functionExpr([extraArgs]), where functionExpr evaluates to a function value that takes the value of expr as first argument, and then maybe some more arguments. Then it would make sense to allow a pipe as an accessor on an object pattern too:

 case .(-> foo(): ...)

Nothing new or specialized, but generalizing object patterns and making their syntax shorter, and using an existing (well, if we add it first) way to call a static function as a selector.

@hydro63
Copy link

hydro63 commented Aug 31, 2024

Edit - this whole comment should be disregarded, see comment below

Could you please provide a code example? I just can't really wrap my head around the syntax.

Also i don't mind allowing record type as return value, i'm just of the opinion, that we should start it out sort of restricted to simple cases, and when we see that developpers want more, it would be easier to make the compiler more loose, than making it very loose from the start and than making it stricter if some kind of coding pattern causes problems

I don't oppose making it transformers instead of matchers, but what i really don't want to is them being tied to a class, i would also allow anynomous functions to be passed to the pattern.

Syntax suggestion

If the functions in patterns will function as transformers, and we are to allow any transforming function to be part of the pattern, i suggest using the following syntax:

// transformer
function[arguments]

"->" = piping operator

where user will be able to pass the required arguments, if he needs to (i'm still opposed to passing arguments), and the -> will function as operator passing the matched value to subpattern. I also suggest using $ as a placeholder for the passed value, so that the person can just pipe it into another transformer. The $ would also not be accesable from the condition body. Here is the example of piping:

// example with piping and argument passing
// mapMessages(dynamic, bool shouldSkip) -> if shouldSkip is enabled, skip non matching element, and continue
// just a rudimentary showcase of passing arguments, with no real purpose
var msg = switch(json){
  List() -> $.map[(e) => Message.match(e)] -> var messageList => messageList,
  {
    "messages": mapMessages[$, true] -> var messageList,
    "user_id": var user_id
  } => messageList,
  _ => throw "Bad json"
};

// example with subsequent object destructuring
...
if(json case {
    "user_info": UserInfo.match[$] -> $(username: var user, nickname: var nick)
  }){
  print("$user, $nick"); // good
  print("${$.username}") // throws since the $ is no longer a variable
}

// matchPoint is a transformer function
// normal record destructuring => matchPoint() returns (int x, int y)
matchPoint[$] -> (int x, int y)
// or (since we know the data type)
matchPoint[$] -> (var x,  var y)

// Object destructuring => matchPoint() returns Point(int x, int y);
matchPoint[$] -> $(:x, :y)
// or
matchPoint[$] -> $(x: var x, y: var y)

This syntax would allow proper piping, with type safety. I believe this syntax should be used if we are to have a piping mechanism in patterns.

@hydro63
Copy link

hydro63 commented Sep 1, 2024

Actually, i can't think of a reason to make piping possible in the pattern. Or at the very least genuine piping that you could chain.

I also don't know about allowing returning null, since i can't think of any reason you would want to capture null.

I would allow only a matcher - function, that could return some value or null. The matcher would have to have the signature T? Function(M), and passing arguments would not be possible. The return type could also be a Record type, as you pointed out, and null would be used to refute the pattern.

That means that -> could only be used once in a subpattern, not more. And it would work to pass the return value to some assignment pattern.

My suggested syntax of function[arguments] should be completely disregarded, since there would be no way to pass arguments. The $ should also not bind the currently matched value, since there is no reason to. Instead it should be used in object pattern as a placeholder

Object destructuring syntax

If i understand it correctly, your proposed syntax for object destructuring is .(attr: var a, ...), right? If not, could you please provide code snippets where it's used, so that i can understand it?

The syntax i propose is similar, with the only change being readibility. Here it is:

matchSomeObject -> $(attr: var a, ...)

I only changed the $ symbol, because i think this way, it's more readable.

TLDR: Multiple chaining is probably a mistake and my previous comment should be disregarded.

@lrhn
Copy link
Member

lrhn commented Sep 1, 2024

The not currently existing syntax that I suggest (at least as a strawman syntax) are:

  • Object pattern with implicit type (see Allow inferring the name on object patterns  #2563): currently you have to write case MapEntry<Foo, Bar>(: var key, :var value)reven if the matched value type is known to be MapEntry<Foo, Bar>. I suggest allowing .(: var key, :var value) to mean object pattern on the entire matched value type. (Using $ is a problem because that's hypothetically already allowed as the name of a type.)

  • Object pattern selectors (see Allow extractor patterns to use full selector chains. #2433): currently the only property selector you can use in an object pattern is a single identifier devouring a getter. I suggest allowing any non-assignment cascade selector: identifier, [expr], and identifier(args)
    Example:

     Result<List<int>>(
        asValue():  .(length: >= 2, [1]: == p))
    

    Some APIs are exposing property accessors that are not getters. Some take arguments, and [] is the canonical example of a parametrized getter. The style guide also suggests asFoo and toFoo be methods, but they can be useful in pattern matching. All expressions must still be constant.

  • allow more than one selector (still Allow extractor patterns to use full selector chains. #2433). If your can write .(foo:.(bar: pattern)), we might as well allow you to write .(foo.bar: pattern) directly.

  • add a pipe operator/selector (see Pipeline-operator for inline invocation of static functions. #43). Suggested syntax and semantics: expr->exprNoCall(args)
    where exprNoCalls can be any expression evaluating to a function, but any function calls must be parenthesized, so that the next function call parentheses seen are the ones linked to the ->. Adds the value of expr as the first argument to the call of that function. And -> would be one of the selectors allowed by object patterns.

With those features, there is no need to add extra functionality to patterns to allow reusable matching or value transformation, they just work from the existing features:

case SomeNum(
    toString()->int.tryParse: var integer?)

There is no need to require specific return types for functions you call on the matched value, because that would prevent you from using existing functions. The String(->int.tryParse: var v?) pattern would be the canonical ready to check if a string value holds a number, and it doesn't return a record, and String(indexOf("needle"): >= 0 && var i) could be a way to check the contents of a string, and indexOf returns int.

(That all said, the pipe operator is itself a little too special cased. You can see I said to put the value first in the argument list. Why not last? The general feature is let: let v = e in foo(v, args).
Which means that the pipe operator should probably be expr->(v)=>expr.)

@hydro63
Copy link

hydro63 commented Sep 1, 2024

Going according to the requirements that i would like the new syntax to have:

  1. It has to be easy to write and should not have much overhead associated with it
  2. The pattern has to be easily embeddable to other patterns

I feel that all of those new syntaxes would have it's use cases in the normal flow, but i don't think piping, or at least genuine piping, is good to have in patterns.

Many of the syntaxes you propose have great functionality, but also somewhat jarring syntax, that makes the pattern more difficult to understand.

I think that if all the new syntax were to be added, it would indeed create the Regex problem, where with the more we try to do in the Regex, the difficulty of understanding of what it does grows exponentially. Meaning they would work wonderfully in small patterns, but medium sized patterns would be difficult to understand and the complex ones are basically foreign language

I think this proposals could work nicely together:

  1. Object pattern with implicit type
  2. Allow more than one selector

The new syntax for

Allow extractor patterns to use full selector chains

could probably work, but i feel that at least some of the functionality could be substituted by functions.

One thing i definitely don't want to see in patterns is chained piping. As i've already stated, if you need to chain transformers multiple times in a pattern, which is essentially a testing function, you ARE doing something wrong, and should probably rethink what you are doing.

Also piping is quite uncommon in programming languages, being mostly confined to functional programming. The syntax feels foreign and doesn't blend well with the current Dart syntax. Most developpers would have a hard time understanding what is actually happening in the pattern.

While piping would work well in normal code, i don't feel like it would do great in patterns.

That being said, most of my concerns are with making the new patternmatching syntax relatively easy to understand and follow. If such syntax is found, that supports the new proposals, while also mending well with the current pattern syntax, i would have no objections.

New syntax needs to be easy to follow in order to be usable (at least according to me)

PS. I still think that functions as patterns is a more elegant solutions to reusable patterns and some of the other issues, but that's just my opinion

PS 2. I have also never written a programming language, so take it more as my personal opinion

@lrhn
Copy link
Member

lrhn commented Sep 1, 2024

if you need to chain transformers multiple times in a pattern, which is essentially a testing function, you ARE doing something wrong, and should probably rethink what you are doing.

Sure, that's what code reviews are for.
It's incredibly hard to give you the ability to do one transformation, and not more than one.

The moment you allow running arbitrary user code insider the pattern, then the doors are open. Users can run arbitrary code there. Then it's just a matter of making running reasonable code convenient. And we do allow almost arbitrary code, because we allow extension getters. The only thing we don't allow is parameterization, each individual operation needs its own getter, locked down at compile time (or have configurable getters, that you can configure before running the pattern - but it starts being more complicated than just writing if chains then.)
That's not convenient, but it is possible.

@hydro63
Copy link

hydro63 commented Sep 2, 2024

Sure, that's what code reviews are for.
The moment you allow running arbitrary user code insider the pattern, then the doors are open.

I do understand that, but still think if we were to allow piping in patterns, and leave it at that, it would possibly kill the feature. It's true that what the developper does should be at his own discresion, and the issues he faces will be purely the fault of his own wrong planning and programming.

That being said, piping in patterns in wrong circumstances adds a lot of complexity. So in order to alleviate at least some of the complexity, i propose adding new lint and formatting rules, that would strive to make the piping part more readable. This way, the user gets as much freedom as before, while also getting at least some kind of support for making it easier understand (formatting), some kind of mechanism to tell him that he is probably abusing the feature (linter)

Here are my rules:

  1. Formattter tries to keep the transformer chain on one line => foo -> bar -> baz
  2. Linter raises a warning if you try to have a transformer chain (3+ transformers in a row) - that's typically a threshold where things probably went wrong => examples:
    GOOD: foo -> bar -> var x or foo -> bar -> Obj(:var prop)
    WARNING: foo -> bar -> baz -> var x

I think that's a good compromise between allowing piping in patterns, and trying to keep the readibility to the maximum. Making it suggestion, rather than banning the piping in pattern matching, ensures the user has the ability to do what he wants, while also supporting the user to not abuse the feature.

This way it still aligns with the requirements i set out (readibility), while also aligning with your requirement to make the feature powerful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Proposed language feature that solves one or more problems
Projects
None yet
Development

No branches or pull requests

4 participants