Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[move-compiler-v2] add parser code for lambda types #14792

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

brmataptos
Copy link
Contributor

@brmataptos brmataptos commented Sep 28, 2024

Description

Extend syntax

  • lambda expression: move |args| body has copy, store
  • function types: |T1, T2| T3: copy+store
  • curry construction:
    • from lambda: |a, b| f(a, x, b, 3)
    • from call syntax with "holes": f(_, x, _, 3)
    • as discussed in offsite (and move-model/src/ast.rs), curry expression takes a bitmask whose 1 bits indicate argument positions which receive lambda parameters (in order); 0 bits in mask entry correspond to curry parameters (in order).
  • arbitrary function calls:
    • (|x, y| x + y)(2, 3)
  • suppress printing common abilities (all functions have drop)

Modify exp_builder and lambda_lifter to generate function values.

Modify model to track "used" functions in addition to "called" functions to be able to catch all dependencies when function values are created but not called.

Attaches an AbilitySet to Type::Fun and Exp::Lambda based on source. Adds a new ExpCall operation in parser/expansion ASTs to be able to carry more generalized function calls through to move-model, which already can support this through Invoke, which previously was underutilized. Added basic type checking for function abilities.

Current Gaps

I was targeting the test cases return_func.move and doable_func.move, but this PR is still missing

  • inference of ability store for function values built by curry with only storable parameters
  • allowing storable function types as type parameters and function returns

I will try to patch those in while this is being reviewed, or add them to a later PR.

How Has This Been Tested?

Added more lambda tests under move-compiler-v2/tests/lambda/ which are run "with" and "without" lambda features enabled. Currently, many things pass through to hit "not yet implemented" errors in bytecode gen, etc.

Ran and carefully checked all tests under third_party and aptos.

Key Areas to Review

Key features are illustrated in test move-compiler-v2/tests/lambda/storable/doable_func.move and the two corresponding output files, along with the test return_func.move in the same directory. Most other test outputs should be largely unchanged, although error messages related to lambda use in default configuration have changed.

Tricky features are:

  • parser is modified for lambda syntax extensions.
  • in exp_builder.rs: several paths with function calls are modified to check for _ in function call parameters, in which case the call becomes a curry operation.
    • translate_possible_curry_exp_list is like translate_exp_list but also generates a mask based on the _ operands,
    • translate_curry_or_call unifies some repeated code for function calls (if mask == 0) or curry construction (if mask != 0). (This includes some freeze code for call arguments that previously seemed applied only in some cases, now applied more uniformly for the Call case.)
  • in ty.rs: unification checks Fun abilities as well as other things.
  • in lambda_lifter.rs, we (1) reject lambdas without move free-var handling, (2) try to reduce lambda to curry, by checking for a simple function call with simple args (each either a unique lambda parameter or a constant/free var).

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Move Compiler
  • Other (specify)

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented Sep 28, 2024

⏱️ 2h 25m total CI duration on this PR
Slowest 15 Jobs Cumulative Duration Recent Runs
rust-move-unit-coverage 17m 🟩
rust-cargo-deny 15m 🟩🟩🟩🟩🟩 (+4 more)
general-lints 12m 🟩🟩🟩🟩🟩 (+4 more)
rust-move-unit-coverage 12m 🟩
rust-move-unit-coverage 11m 🟩
rust-move-tests 10m 🟩
rust-move-tests 9m 🟩
check-dynamic-deps 8m 🟩🟩🟩🟩🟩 (+4 more)
rust-move-tests 8m 🟥
rust-move-unit-coverage 5m 🟥
rust-move-unit-coverage 4m 🟥
rust-move-unit-coverage 4m 🟥
rust-move-unit-coverage 4m 🟥
rust-move-unit-coverage 3m 🟥
semgrep/ci 3m 🟩🟩🟩🟩🟩 (+4 more)

settingsfeedbackdocs ⋅ learn more about trunk.io

Copy link

codecov bot commented Sep 28, 2024

Codecov Report

Attention: Patch coverage is 79.67290% with 87 lines in your changes missing coverage. Please review.

Please upload report for BASE (09-27-extend_parser_for_lambda_types@243ea0f). Learn more about missing BASE report.

Files with missing lines Patch % Lines
third_party/move/move-compiler/src/parser/ast.rs 24.2% 25 Missing ⚠️
...hird_party/move/move-compiler/src/expansion/ast.rs 25.0% 24 Missing ⚠️
third_party/move/move-model/src/exp_rewriter.rs 55.8% 15 Missing ⚠️
third_party/move/move-model/src/ty.rs 56.5% 10 Missing ⚠️
third_party/move/move-model/src/ast.rs 50.0% 7 Missing ⚠️
...arty/move/move-compiler/src/expansion/translate.rs 96.2% 4 Missing ⚠️
...piler-v2/src/env_pipeline/unused_params_checker.rs 92.3% 1 Missing ⚠️
...d_party/move/move-model/src/builder/exp_builder.rs 98.4% 1 Missing ⚠️
Additional details and impacted files
@@                           Coverage Diff                           @@
##             09-27-extend_parser_for_lambda_types   #14792   +/-   ##
=======================================================================
  Coverage                                        ?    59.9%           
=======================================================================
  Files                                           ?      852           
  Lines                                           ?   207826           
  Branches                                        ?        0           
=======================================================================
  Hits                                            ?   124528           
  Misses                                          ?    83298           
  Partials                                        ?        0           
Flag Coverage Δ
59.9% <79.6%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from 7f59256 to ae73f37 Compare September 30, 2024 17:09
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from dbd482d to 270214e Compare September 30, 2024 17:09
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from ae73f37 to 243ea0f Compare September 30, 2024 18:43
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from 270214e to afbad9b Compare September 30, 2024 18:44
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from 243ea0f to f3d30eb Compare October 3, 2024 01:42
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from a40a3c9 to 71ef91f Compare October 3, 2024 01:43
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from f3d30eb to 0cc6890 Compare October 3, 2024 01:52
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from 71ef91f to 3a1d362 Compare October 3, 2024 01:53
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from 0cc6890 to 1b54f30 Compare October 3, 2024 04:42
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch 2 times, most recently from c46a244 to b3c1106 Compare October 3, 2024 06:09
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from ad600f0 to 017f471 Compare October 3, 2024 17:01
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from b3c1106 to a65ee0f Compare October 3, 2024 17:01
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from 017f471 to b7d429a Compare October 10, 2024 04:18
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from a65ee0f to cbb77b0 Compare October 10, 2024 04:18
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from b7d429a to c99de22 Compare October 10, 2024 05:25
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from cbb77b0 to b2185c2 Compare October 10, 2024 05:25
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from c99de22 to d3254ad Compare October 18, 2024 18:03
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from b2185c2 to 3f98e56 Compare October 18, 2024 18:04
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from d3254ad to 7e21319 Compare October 18, 2024 18:08
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from 3f98e56 to e24bf4e Compare October 18, 2024 18:08
@brmataptos brmataptos force-pushed the 09-27-extend_parser_for_lambda_types branch from 7e21319 to 7cfe2c4 Compare October 21, 2024 07:10
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from e24bf4e to 2a740e8 Compare October 21, 2024 07:10
Base automatically changed from 09-27-extend_parser_for_lambda_types to main October 21, 2024 07:48
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from 2a740e8 to dabc427 Compare October 26, 2024 00:05
@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from dabc427 to 11fec45 Compare October 31, 2024 00:31
@brmataptos brmataptos marked this pull request as ready for review October 31, 2024 04:18
@brmataptos brmataptos changed the title add parser code for lambda types [move-compiler-v2] add parser code for lambda types Oct 31, 2024
@@ -215,6 +216,10 @@ fn check_privileged_operations_on_structs(env: &GlobalEnv, fun_env: &FunctionEnv
| Operation::MoveFrom
| Operation::MoveTo => {
let inst = env.get_node_instantiation(*id);
if inst.is_empty() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these debug prints will be removed later I believe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, debug prints can stay but they should be protected by a module flag DEBUG which is off when submitted, so people debugging other parts of the compiler are not bothered by this.

Those debug prints however look like temporary and should be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps. Note that these will only be printed if the following assert will fail, so I'm leaving them in for now for debugging.

let z = 11;
move |x| alt_multiply(x, z)
} else if (key == 11) {
let g = move |x, y| mod3::multiply(x, y);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test for move |x, y| { ... } if the syntax is supported?

@@ -488,7 +488,7 @@ pub enum Type_ {
// &mut t
Ref(bool, Box<Type>),
// (t1,...,tn):t
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

may need to update the comment to reflect ability set?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not |t1, ..., tn|: t as in the concrete syntax?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's actually no : in the current function type. I'm going to follow Wolfgang's suggestion and make it |t1,...tn|t has store+copy to continue avoiding :.

fn parse_call_args(context: &mut Context) -> Result<Spanned<Vec<Exp>>, Box<Diagnostic>> {
let start_loc = context.tokens.start_loc();
let args = parse_comma_list(
consume_token(context.tokens, Tok::LParen)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any difference after this refactor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this was residue from the removed .. notation.

let mut used = BTreeSet::new();
let mut visitor = |e: &ExpData| {
match e {
ExpData::Call(_, Operation::MoveFunction(mid, fid), _) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these two cases cannot be merged like the one below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason. Fixed.

@@ -2180,6 +2224,9 @@ impl GlobalEnv {
/// Allocates a new node id.
pub fn new_node_id(&self) -> NodeId {
let id = NodeId::new(*self.next_free_node_id.borrow());
// if id == NodeId(425) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments for debugging

@@ -2930,17 +2982,18 @@ impl<'env> ModuleEnv<'env> {
return deps;
}
for fun_env in self.get_functions() {
let called_funs = fun_env.get_called_functions().expect("called functions");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When should we use used_functions instead of called_functions and vice versa?

@@ -136,6 +136,52 @@ pub(crate) enum OldExpStatus {
InsideOld,
}

#[derive(Debug, PartialEq, Eq, Copy, Clone)]
pub enum ModelCallKind {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain more on the difference between ModelCallKind and CallKind and when to use them?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question -- why do we now need to distinguish UnOp/BinOp, and what has that todo with lambdas?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relates to the _ notation. It really would be confusing if _ + 3 turns into a function.

Maybe I should try to remove the _ changes. It may be more trouble than it's worth.


/// Translate a list of expressions and deliver them together with their types.
///
/// If Move Language Version >= 2.1, then if any expressions are `_`, then the positions are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe 2.2 instead of 2.1?

AbilitySet::FUNCTIONS,
);

// Problematic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this comment mean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a legacy from debugging. I previously had a call to check_types here, and with much effort found that it was causing some problems. After I changed it to unify it was working better.

Copy link
Contributor

@wrwg wrwg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This came out more complex than I was hoping for. Major ask is to reduce back to the existing expression variants (use Operation::Closure which already existed), and change the syntax to |T| has A instead of colon. I wish we would not have done the f(_) for the MVP, this creates significant complexity all over the place, and I'm not sure whether I can check correctness of all those changes just from this PR.

@@ -276,6 +277,40 @@ Resource representing an account.
</dl>


</details>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why in this PR? Just joking...

@@ -786,10 +786,13 @@ impl AbilitySet {
| (Ability::Store as u8)
| (Ability::Key as u8),
);
/// Abilities for user-defined/"primitive" functions (not closures)
pub const DEFINED_FUNCTIONS: AbilitySet =
Self((Ability::Copy as u8) | (Ability::Drop as u8) | (Ability::Store as u8));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Store doesn't seem to be right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After our last discussion, yes, it should only be Store if it's public.

Lambda(NodeId, Pattern, Exp),
Lambda(NodeId, Pattern, Exp, LambdaCaptureKind, AbilitySet),
/// Represents a reference to a Move Function as a function value.
MoveFunctionExp(NodeId, ModuleId, FunId),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The philosophy of the exp language to keep it as small as possible. We do not need this thing, it can be folded into 'Curry', and Curry can be renamed to make it more neutral, like Closure. A closure can capture 0 arguments, why not.

We furthermore do not and should not have a new ExpData variant for it. The current design uses Call(..Operation..) where possible, and so should this. There is already the Closure operation, you can simply extend it

Call(_, Operation::Closure(f, mask), vec[captured])

This is rather important to me. I took great care to design ExpData like it is today, and it benefited us already many times. Lots of the code changes you did in visitors and elsewhere should become unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have 2 distinct expressions we need to build:

  • create a function value from a defined function by (ModuleId, FunId)
  • create a function value from a function value and 0 or more parameters
    Your Closure operation doesn't handle the latter case, as the "function value" need not be a reference to a defined function (ModuleId, FunId) but can be an arbitrary expression.

If f in Operation::Closure(f, mask) can refer to an expression, then we have bigger consistency problems.

We might conceivably have Operation::Closure(mask) with the function value as the first argument to the Call expression, though it's a bit irregular.

OTOH, we might only allow directly creating a closure based on a Move function. This is more limiting, in that |x| f(x) can only be used in MVP if f is a user function, not if it's a function value. I guess perhaps that is what you had in mind. I prefer more symmetry.

Lambda(NodeId, Pattern, Exp, LambdaCaptureKind, AbilitySet),
/// Represents a reference to a Move Function as a function value.
MoveFunctionExp(NodeId, ModuleId, FunId),
/// Represents a Curry expression. mask has a 1 bit for position of each delayed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is more logical to set a bit if a parameter is captured and not the opposite.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current rep, total args = bitcount(mask) + actual args.

With the alternative you suggest, actual args = bitcount(mask), and we need to check the function type to know how many args to expect in total.

True, we need to consult the function type to check the types of remaining args, but we get more information out of this rep.

if let Type::Fun(argt, _) = &ty {
if argt.deref().has_function() {
if let Type::Fun(args, _, _) = &ty {
if args.deref().has_function() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deref is a trait function for * and not idiomatic to use, as_ref should be preferred.

let fun_type = self.check_type(loc, &fun_type, expected_type, context);

let id = self.parent.parent.env.new_node(loc.clone(), fun_type);
return ExpData::MoveFunctionExp(id, module_id, fun_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be Call(id, Closure(module_id, fund_id, 0x0), vec![])

Reduce the number of variants in the IR is a key to maintainable and correct code.

@@ -3778,7 +3948,9 @@ impl<'env, 'translator, 'module_translator> ExpTranslator<'env, 'translator, 'mo
.struct_table
.contains_key(&global_var_sym)
{
self.check_language_version(loc, "resource indexing", LanguageVersion::V2_0);
if !self.check_language_version(loc, "resource indexing", LanguageVersion::V2_0) {
return None;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait a minute -- why are we returning None here? (And the comment above?) This generates an error and this is all we need. The program is consistent independent of Move language version. This change here and elsewhere seems unnecessary.

if let Type::Reference(_, inner_ty) = ty {
self.new_node_id_with_type_loc(
&Type::Reference(ReferenceKind::Mutable, inner_ty),
if mask != 0u128 && !matches!(cand, AnyFunEntry::UserFun(_)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment what you are doing here would be helpful. What magic number is 0u128?

I'm worried about this change and the complexity of that we support _. The MVP doesn't need _, it would have been SO much easier to leave the type checking mostly the same, and just derive the closure from the lambda without new function during lifting, after type checking. We would be already working at VM part.

fun_exp_or_oper: FunExpOrOper,
arg_types: Vec<Type>,
translated_args: Vec<Exp>,
mask: u128,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the mask for closures, it should be only u64. More is not needed. u128 should not be put into an instruction, to large for it, but its not needed anyway.

};
let body = Box::new(parse_exp(context)?);
let abilities_start = context.tokens.start_loc();
let abilities = parse_abilities(context)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The abilities of a function type should use |T|R with A and not :. It looks strange to have two colons in an argument declaration, as in f(x: |T|:copy. Thats also the reason why it is not |T|:R.

@@ -1834,6 +1866,12 @@ impl AstDebug for Exp_ {
w.comma(rhs, |w, e| e.ast_debug(w));
w.write(")");
},
E::ExpCall(arg, sp!(_, rhs)) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering where is ast_debug tested?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh. It's used for debugging code, so it's tested when you debug. 😏

@brmataptos brmataptos force-pushed the 09-27-add_parser_code_for_lambda_types branch from 5e47d93 to 626b1ed Compare November 12, 2024 19:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants