Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize compilation of switch statements for untagged variants #7135

Merged
merged 4 commits into from
Nov 5, 2024

Conversation

cristianoc
Copy link
Collaborator

@cristianoc cristianoc commented Oct 30, 2024

First step: remove the distinction between cases with and without payload in the toplevel algorithm.

On this test:

@unboxed
type rec t =
  | Boolean(bool)
  | @as(null) Null
  | String(string)
  | Number(float)
  | Object(Dict.t<t>)
  | Array(array<t>)

type group = {
  id: string,
  name: string,
}

let decodeGroup = group => {
  switch group {
  | (dict{"id": String(id), "name": String(name)}) =>
  (id, name)
  | _ => ("e", "f")
  }
}

Before:

function decodeGroup(group) {
  let match = group.id;
  if (match === undefined) {
    return [
      "e",
      "f"
    ];
  }
  if (match === null) {
    return [
      "e",
      "f"
    ];
  }
  if (typeof match !== "string") {
    return [
      "e",
      "f"
    ];
  }
  let match$1 = group.name;
  if (match$1 !== undefined && !(match$1 === null || typeof match$1 !== "string")) {
    return [
      match,
      match$1
    ];
  } else {
    return [
      "e",
      "f"
    ];
  }
}

After:

function decodeGroup(group) {
  let match = group.id;
  if (match === undefined) {
    return [
      "e",
      "f"
    ];
  }
  if (typeof match !== "string") {
    return [
      "e",
      "f"
    ];
  }
  let match$1 = group.name;
  if (match$1 !== undefined && typeof match$1 === "string") {
    return [
      match,
      match$1
    ];
  } else {
    return [
      "e",
      "f"
    ];
  }
}

The 3 cases have become 2: check for optional fields and check for which case it is.

@cristianoc
Copy link
Collaborator Author

There's the question of whether the check for optional fields could also be incorporated at the pattern matching level.
Can't remember where that happens -- need to take a look.
An alternative is to do a back-end optimisation post-generation, though that would need to check that the two instances of return ["e", "f"]; in the different conditionals are both the same code, and either without side effects, or produce equivalent side effect (returning), which might not be trivial.

@cristianoc cristianoc requested a review from zth October 30, 2024 05:27
@cristianoc cristianoc mentioned this pull request Nov 1, 2024
@cristianoc cristianoc force-pushed the untagged-pattern-matching branch from a236b78 to 6bfad7d Compare November 1, 2024 12:29
@cometkim
Copy link
Member

cometkim commented Nov 1, 2024

An alternative is to do a back-end optimisation post-generation, though that would need to check that the two instances of return ["e", "f"]; in the different conditionals are both the same code, and either without side effects, or produce equivalent side effect (returning), which might not be trivial.

Yes, it wouldn't be trivial and often inefficient at runtime. The simplest way to reduce computation and run safely with side effects is to create a function with a captured return, but I'm not sure if it's optimizable.

Or we can ignore it as this kind of repetition is very well compressed in the real world via HTTP Content-Encoding

@cometkim
Copy link
Member

cometkim commented Nov 1, 2024

Umm, with side effects it compiles to

function decodeGroup(group) {
  let match = group.id;
  if (match !== undefined) {
    if (typeof match === "string") {
      let match$1 = group.name;
      if (match$1 !== undefined) {
        if (typeof match$1 === "string") {
          return [
            match,
            match$1
          ];
        }
        console.log("aaaaaaaa");
        return [
          "e",
          "f"
        ];
      }
      console.log("aaaaaaaa");
      return [
        "e",
        "f"
      ];
    }
    console.log("aaaaaaaa");
    return [
      "e",
      "f"
    ];
  }
  console.log("aaaaaaaa");
  return [
    "e",
    "f"
  ];
}

(Makes bit better to understand the flow without optimization)

Compiler obviously emits unnecessary branches. It should be:

function decodeGroup(group) {
  let match = group.id;
  if (match !== undefined) {
    if (typeof match === "string") {
      let match$1 = group.name;
      if (match$1 !== undefined) {
        if (typeof match$1 === "string") {
          return [
            match,
            match$1
          ];
        }
      }
    }
  }
  console.log("aaaaaaaa");
  return [
    "e",
    "f"
  ];
}

(undefined checks also can be removed)

@cristianoc
Copy link
Collaborator Author

Optional fields: #7143

First step: remove the distinction between cases with and without payload in the toplevel algorithm.

On this test:
```res
@unboxed
type rec t =
  | Boolean(bool)
  | @as(null) Null
  | String(string)
  | Number(float)
  | Object(Dict.t<t>)
  | Array(array<t>)

type group = {
  id: string,
  name: string,
}

let decodeGroup = group => {
  switch group {
  | (dict{"id": String(id), "name": String(name)}) =>
  (id, name)
  | _ => ("e", "f")
  }
}
```

Before:
```js
function decodeGroup(group) {
  let match = group.id;
  if (match === undefined) {
    return [
      "e",
      "f"
    ];
  }
  if (match === null) {
    return [
      "e",
      "f"
    ];
  }
  if (typeof match !== "string") {
    return [
      "e",
      "f"
    ];
  }
  let match$1 = group.name;
  if (match$1 !== undefined && !(match$1 === null || typeof match$1 !== "string")) {
    return [
      match,
      match$1
    ];
  } else {
    return [
      "e",
      "f"
    ];
  }
}
```

After:
```
function decodeGroup(group) {
  let match = group.id;
  if (match === undefined) {
    return [
      "e",
      "f"
    ];
  }
  if (typeof match !== "string") {
    return [
      "e",
      "f"
    ];
  }
  let match$1 = group.name;
  if (match$1 !== undefined && typeof match$1 === "string") {
    return [
      match,
      match$1
    ];
  } else {
    return [
      "e",
      "f"
    ];
  }
}

```

The 3 cases have become 2: check for optional fields and check for which case it is.
@cristianoc cristianoc force-pushed the untagged-pattern-matching branch from cd472da to 73cf47d Compare November 2, 2024 09:44
@cristianoc
Copy link
Collaborator Author

After rebasing on master, it's close to optimal for the motivating example.
Need to explore more this approach, then make it work (it breaks on lots of examples).

let decodeGroup = group => {
  switch group {
  | (dict{"id": String(id), "name": String(name)}) =>
  (id, name)
  | _ => ("e", "f")
  }
}

compiles to:

function decodeGroup(group) {
  let id = group.id;
  if (typeof id !== "string") {
    return [
      "e",
      "f"
    ];
  }
  let name = group.name;
  if (typeof name === "string") {
    return [
      id,
      name
    ];
  } else {
    return [
      "e",
      "f"
    ];
  }
}

@cristianoc
Copy link
Collaborator Author

cristianoc commented Nov 2, 2024

This is a decompilation of the lambda code even before getting to the pattern matching compilation later on in the back-end:

let decodeGroup = fun group ->
  try
    let id = group.id in
    let match_val = group.dictValuesType in
    
    match id with
    | 1 -> 
        (match group.name with
         | 1 -> (id, group.name)
         | _ -> raise Exit)
    | _ -> raise Exit
  with Exit ->
    (0, ("e", "f"))

It means that the presence of String(id) changes the compilation, even if for the purpose of code generation it could be treated as if it were just id.

@cristianoc
Copy link
Collaborator Author

This is a decompilation of the lambda code even before getting to the pattern matching compilation later on in the back-end:

let decodeGroup = fun group ->
  try
    let id = group.id in
    let match_val = group.dictValuesType in
    
    match id with
    | 1 -> 
        (match group.name with
         | 1 -> (id, group.name)
         | _ -> raise Exit)
    | _ -> raise Exit
  with Exit ->
    (0, ("e", "f"))

It means that the presence of String(id) changes the compilation, even if for the purpose of code generation it could be treated as if it were just id.

There's something deep in the very first phases of pattern matching algorithm where pairs of values cannot be a single case.
That is true for pairs, as is e.g. for records with 2 fields.
It would be pretty hard to change that, so this code can be considered optimal given reasonable constraints.

@cristianoc
Copy link
Collaborator Author

@zth this is ready for a functional review.
Then the code can be cleaned up a bit, but the functionality stays.

@zth zth requested a review from cknitt November 3, 2024 18:40
@zth
Copy link
Collaborator

zth commented Nov 3, 2024

Tagging @cknitt as well.

cristianoc added a commit that referenced this pull request Nov 4, 2024
@cristianoc cristianoc changed the title Experiment for untagged pattern matching. Optimize compilation of switch statements for untagged variants Nov 4, 2024
@cristianoc
Copy link
Collaborator Author

This illustrates why the optimisation is not safe on tagged variants: #7146

@cknitt
Copy link
Member

cknitt commented Nov 4, 2024

Will test against a real-world project.

@cristianoc
Copy link
Collaborator Author

Will test against a real-world project.

Great! I was going to ask.

@cknitt
Copy link
Member

cknitt commented Nov 4, 2024

Tested against a real-world project, and the output looked very good! 👍

Compared to alpha.4:

  • Actually not that many changes
  • Only correct changes
  • Concrete examples of nice improvements:
    • if x { return true; } else { return false } gone
    • for React refs created with React.useRef(Nullable.null), does not unnecessarily check for undefined anymore

@zth
Copy link
Collaborator

zth commented Nov 4, 2024

Agreed, looks great in my tests as well! 👍

let match$1 = match["thirdProp"];
if (match$1 !== undefined && !(match$1 === null || !(Array.isArray(match$1) && match$1.length === 2))) {
if (match$1 !== undefined && Array.isArray(match$1) && match$1.length === 2) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be further simplified as a simple simplification, in a separate PR.

@cristianoc cristianoc merged commit 48f30e5 into master Nov 5, 2024
20 checks passed
@cristianoc cristianoc deleted the untagged-pattern-matching branch November 5, 2024 06:39
if (match$3 === "My name is") {
let match$4 = match$2[1];
if (match$4 === undefined || match$4 === null || match$4 === false || match$4 === true) {
if (match$1 === undefined || match$1 === null || match$1 === false || match$1 === true) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically this could be simplified to == null || type of "boolean", but I'm not sure it's that important tbh.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants