Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Lua] Rewrite syntax #1811

Merged
merged 29 commits into from
Jan 16, 2019
Merged

[Lua] Rewrite syntax #1811

merged 29 commits into from
Jan 16, 2019

Conversation

Thom1729
Copy link
Collaborator

This is a total rewrite of the Lua syntax.

The previous Lua syntax was tmLanguage-derived and fairly rudimentary. It had a single named context and relied heavily on large, complex regexps with many capture groups. As a result, it had trouble handling complex expressions or unexpected whitespace. In addition, it uses many negative lookbehinds to work around the simplistic parsing.

This new syntax definition is written as a sublime-syntax from the ground up. It should be capable of providing exact highlighting of any Lua code (with some minor caveats about line lookahead). In addition, it provides standard meta scopes for use by package developers.

The previous syntax used few enough scopes that the new syntax should be generally compatible with it.

I don't use Lua myself, so I'd appreciate feedback and examples from people who do. Some known work still remains:

  • Highlighting support functions. (Should I just do the standard library?)
  • Ignoring the shebang. (Not sure how to test this, because it basically only works on the first line of a file and doing it elsewhere would break things.)
  • Removing the nested lists in the syntax.
  • Improving block meta scoping.

Copy link
Contributor

@rwols rwols left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's awesome to see a syntax definition having less than 500 lines. Well done.

The entity.name.function scope is not always applied in the places that I would expect it to be applied. Here is a complete runnable lua script:

-- should fact not be entity.name.function? It is now variable.function
local function fact(n)
    if n == 0 then return 1
    else return n * fact(n-1)
    end
end
print(fact(12))


-- OK: f is entity.name.function
function f(x) return x + 2 end
print(f(0))

-- While I understand that it would make the syntax a bit hacky,
-- it would be nice if f were entity.name.function here.
-- This would be a nice extra, don't view this as important.
f = function(x) return x + 3 end
print(f(0))

Account = {}
-- Why is the "Account:" part not entity.name.function?
-- From what I understand from your syntax tests this is a deliberate decision,
-- but with this choice, using Goto Symbol would give me a long list of "new" symbols.
function Account:new(o)
    o = o or {}
    -- setmetatable et al should be support.function
    setmetatable(o, self)
    self.__index = self
    return o
end

function Account:deposit(amount)
    self.balance = self.balance + amount
end

function Account:__tostring()
    -- "self" should be variable.language (like in C++, Python, C#, et al)
    return "Account with balance " .. self.balance
end

-- The syntax sugar for single-argument functions also includes tables
account = Account:new{balance=100}
account:deposit(200)
print(account)

Lua/Lua.sublime-syntax Outdated Show resolved Hide resolved
Lua/Lua.sublime-syntax Outdated Show resolved Hide resolved
Lua/Lua.sublime-syntax Show resolved Hide resolved
Lua/Lua.sublime-syntax Show resolved Hide resolved
Lua/Lua.sublime-syntax Show resolved Hide resolved
Lua/Lua.sublime-syntax Outdated Show resolved Hide resolved
Lua/Lua.sublime-syntax Show resolved Hide resolved
Lua/Lua.sublime-syntax Outdated Show resolved Hide resolved
Lua/Lua.sublime-syntax Show resolved Hide resolved
Lua/Symbol List.tmPreferences Show resolved Hide resolved
Lua/Lua.sublime-syntax Show resolved Hide resolved
@Thom1729
Copy link
Collaborator Author

A variable assigned to a function literal should now be scoped entity.name.function.

In function Account:new(o) end, only new is the name of the function, so Account and the colon should not be scoped entity.name.function. This is a deliberate change from the old syntax. However, your point about the symbol list is well taken. I've added an explicit Symbol List.tmPreferences that should address this. With this decoupled, we could expand the symbols if desired (e.g. from Account:new to function Account:new(o)).

I've implemented the basic builtin functions, plus the constants _G and _VERSION. Following the example of JavaScript, we might want to implement all of the builtin modules.

Can you mark as resolved the conversations that you think are addressed?

@rwols
Copy link
Contributor

rwols commented Dec 21, 2018

Can you mark as resolved the conversations that you think are addressed?

I tried looking for that button, but I can't seem to find it. Is it some sort of repository setting that @wbond has to enable?

At this point I'm trying out the syntax on some code bases and it seems to work well. It's basically done from my point of view.

Since the syntax can track wether we are in a "do ... end" block perfectly, and there is no preprocessor of sorts, and you cannot re-assign to keywords (e.g. end = "foo" is not allowed), it might be safe to check for stray end keywords in the main context.

Following the example of JavaScript, we might want to implement all of the builtin modules.

Possibly, but I think this is mergeable for now.

There is one feature that you might want to consider looking deeper into: one may yield from a coroutine via the coroutine module. With the current syntax, coroutine.yield("foo") is scoped as a variable.function. I think that is okay and doable, but you might have other ideas about this. Either way I just wanted you to be aware of this.

@Thom1729
Copy link
Collaborator Author

I've resolved the conversations. Not sure why the button wasn't showing up.

Unexpected end will now be scoped invalid.illegal. This can happen at top level, but also inside a repeat loop (which is ended by until).

I added support scopes for builtin modules. This should ensure that important builtin features like coroutine.yield are marked. Syntactically, that is a function, so I'd rather mark it thus rather than as a keyword.

@FichteFoll
Copy link
Collaborator

Exciting. I've done some Lua in the past and also made a couple local fixes on the default syntax many years ago and took a couple notes on issues I noticed while working with it.

I'll test and review the new syntax in the next couple days.

Copy link
Collaborator

@FichteFoll FichteFoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good over here. I only went through the test file, inspected a couple sample files and left a couple comments where I don't agree. I expect the actual definition to be of good quality.

The only things that is missing is highlighting of placeholders for string.format (https://www.lua.org/manual/5.3/manual.html#pdf-string-format) and regular expressions (or "patterns" as they are called in Lua) (https://www.lua.org/manual/5.3/manual.html#6.4.1). The problem with the latter is that they are indistinguishable from normal strings syntactically. They are usually used as the first argument to a :gsub call (and similar), though I wonder whether that would be worth it.

-- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ meta.function
-- ^^^ variable.other - entity
-- ^ punctuation.accessor
-- ^^^ variable.other - entity
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a meta.property, not variable.other.

Works correctly for foo.abc.bar = function() end;, although there is no test for it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


--STATEMENTS

end;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why apply the keyword scope as well? For color schemes that don't set a background color on invalid.illegal, this will look like a normal keyword.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Lua/tests/syntax_test_lua.lua Show resolved Hide resolved
Lua/tests/syntax_test_lua.lua Outdated Show resolved Hide resolved
@FichteFoll
Copy link
Collaborator

FichteFoll commented Dec 27, 2018

More:

  1. Function calls don't have the meta.function-call scope for the function identifier. I don't think this has been standardized yet, but afaik this should cover both the function identifier and the parametres.

  2. Built-in function calls don't have variable.function. I'm actually not sure how to treat this as I did add this scope to usages in Python (in addition to support.function) and make them italic in my color scheme. I could target variable.function in general, but I only want them to be italicized when the functions are actually called. Open to opinions.

@FichteFoll
Copy link
Collaborator

FichteFoll commented Dec 27, 2018

Another thing to consider:

In #421 (comment) I suggest the usage of string.unquoted.key for unquoted keys as they are implicitly converted to strings in the resulting table. You haven't commented on that so far and this would be the ideal opportunity to try this in a field test, so to say. It also applies to Javascript, which you're quite involved with, too.

@Thom1729
Copy link
Collaborator Author

Thom1729 commented Jan 2, 2019

In #421 (comment) I suggest the usage of string.unquoted.key for unquoted keys as they are implicitly converted to strings in the resulting table. You haven't commented on that so far and this would be the ideal opportunity to try this in a field test, so to say. It also applies to Javascript, which you're quite involved with, too.

I agree in principle and in practice. In the particular case of JavaScript, apparently they once tried highlighting unquoted string keys with string and people complained. I'd certainly like to implement this if we can find a way around people's objections. (JS Custom currently provides this as an option, and that might be a viable solution for people who are really, really, really attached to the old behavior.)

Copy link
Collaborator

@FichteFoll FichteFoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Placeholders/patterns can always be added at a later time, if we figure out a good way to differentiate them from normal strings or just apply it to all strings.

However, the commit history looks quite messy. I suppose cherry-picking onto a rebased master branch would be the best option.

By the way, it seems only people with commit access to the branch can mark discussions as resolved, so you'll need to do that for my remarks.

Copy link
Member

@wbond wbond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed this one little thing when looking at the commit history.

Recently I've been squashing history to try and reduce the crazy history of the repo. I'll take the commit summaries and leave them as the commit description for the squash if you are okay with that.

Lua/Lua.sublime-syntax Outdated Show resolved Hide resolved
@Thom1729
Copy link
Collaborator Author

Thom1729 commented Jan 7, 2019

Yeah, the commit history is a mess. Sorry about that. A squash sounds like the way to go.

@wbond wbond merged commit 98c7230 into sublimehq:master Jan 16, 2019
@Thom1729 Thom1729 deleted the lua-rewrite-syntax branch January 16, 2019 14:24
deathaxe pushed a commit to deathaxe/sublime-packages that referenced this pull request Jun 9, 2019
* [Lua] Rewrite syntax
* [Lua] Added .lua to storage.type.function.
* [Lua] Accept anonymous function names.
* [Lua] entity.name.function in function assignments.
* [Lua] Accept table as bare function argument.
* [Lua] Fix escape sequences.
* [Lua] Add self.
* [Lua] Remove redundant name key.
* [Lua] Improved symbol list.
* [Lua] Use storage.modifier for local.
* [Lua] Implemented basic builtins.
* [Lua] Mark unexpected end as invalid.
* [Lua] Added builtin modules.
* [Lua] Use string.quoted.multiline for compatibility.
* [Lua] Added hex literal punctuation highlighting.
* [Lua] Scope unquoted string.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants