Make it compilable as a c library #44

Closed
cgrand wants to merge 1 commit

cgrand commented Feb 22, 2023

I copied the missing files (one Makefile and two templates) from grammars under the tree-sitter organization to allow compiling the Clojure grammar to a dynamic library.

sogaiu (Owner) commented Feb 22, 2023

In principle I think I'm ok with adding a Makefile for the stated purpose.

Would you mind explaining your use case in more detail though [1]?

The Makefile does more than just allow compilation as a dynamic library; it also has installation bits, including, for example, pkg-config related processing. I think the particular one you provided may have had its start here, but there are multiple versions floating around [2], so I'd like to examine the details a bit before deciding.

Also, AFAIU, it won't work as-is on Windows (unless one uses something like mingw-w64 / msys2).

A dynamic library can also be created using the tree-sitter cli as is already done by nvim-treesitter, Emacs 29+, AUR package, etc. This approach allows building a dynamic library for Windows as well. I can understand if you don't want to be dependent on the tree-sitter cli, just mentioning the option.


[1] At the moment, I don't think this grammar works with everything in ClojureDart. IIUC, there is at least one thing that is different. I haven't tried to support this yet and I'm not sure how practical it will be as I don't understand the details at this time.

[2] Of the 220+ tree-sitter grammar repositories I looked at, around 15 (6%) of them have a Makefile that seems to serve this particular purpose.

cgrand (Author) commented Feb 22, 2023

Thank you for the quick response! As I'm new to tree-sitter I'm certainly doing things wrong.
I took the files either from the json or java grammar (json I believe).
Anyway, to the point: what's my motive? I want to use tree-sitter in ClojureDart through C FFI, and while reading about tree-sitter I have had a hard time understanding how one goes from a grammar file to an actually loadable dynamic library in a standard way.
I see that generate and test seem to compile any grammar thrown at them, but I'm puzzled not to find a subcommand for generating the lib.

Thanks to your pointers, I am beginning to understand that all tools seem to develop their own custom build solutions (e.g. https://github.com/nvim-treesitter/nvim-treesitter/blob/7ba61ca7f5fe48c95afa03a8298788b67207fd6d/lua/nvim-treesitter/install.lua#L302-L432).

The actual shared contract is either the JSON file (if the tree-sitter CLI is installed) or the .c/.h files generated by the CLI on another machine.

Closing this PR and scripting my own build ;-)
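
For readers following along, here is a minimal sketch of what such a hand-rolled build might look like. It is not the script the author ended up writing, nor the logic nvim-treesitter uses. It assumes `tree-sitter generate` has already produced `src/parser.c`, that a C compiler is reachable as `$CC` (falling back to `cc`), and that the output should be named `<name>.<ext>`. The function names `shared_lib_ext` and `build_grammar_lib` are hypothetical.

```shell
# Hypothetical helper: map a kernel name (as printed by `uname -s`) to the
# shared-library extension commonly used on that platform.
shared_lib_ext() {
  case "$1" in
    Darwin) echo dylib ;;
    MINGW*|MSYS*|CYGWIN*) echo dll ;;
    *) echo so ;;
  esac
}

# Hypothetical helper: compile the generated parser found in
# <grammar_dir>/src into <out_dir>/<name>.<ext>.
# Assumption: src/parser.c exists and includes its header via src/tree_sitter/.
build_grammar_lib() {
  grammar_dir=$1
  name=$2
  out_dir=$3
  ext=$(shared_lib_ext "$(uname -s)")

  # Reuse the positional parameters as the list of source files.
  set -- "$grammar_dir/src/parser.c"
  # Some grammars ship an external scanner written in C; add it if present.
  if [ -f "$grammar_dir/src/scanner.c" ]; then
    set -- "$@" "$grammar_dir/src/scanner.c"
  fi

  mkdir -p "$out_dir" &&
    "${CC:-cc}" -shared -fPIC -O2 -I "$grammar_dir/src" "$@" \
      -o "$out_dir/$name.$ext"
}
```

Run from the grammar's checkout on Linux, `build_grammar_lib . clojure /tmp` should leave the library at `/tmp/clojure.so`. On Windows this still presumes a Unix-like toolchain such as mingw-w64 / msys2, which is the limitation raised earlier in the thread.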

As for ClojureDart, I don't believe you need to change anything in the grammar: ClojureDart has made no change to the reader (so as not to break cljc). We are just using / for tagged literals, but your grammar handles it just fine.

However while quickly playing I noticed that your grammar doesn't prevent nil, true or false from being read as symbols. This can be spotted in places where not all values are allowed. For example #nil () or ^true ().

I have a question: I see that ^ doesn't introduce its own node but is flattened, is it by design or because of a right-recursion issue?

Thanks!

cgrand closed this Feb 22, 2023

sogaiu (Owner) commented Feb 22, 2023

@cgrand

I have some notes here about the various files involved going from grammar.js to parser.c + friends. Maybe the notes will be of some use. (BTW, it is possible to invoke tree-sitter generate with an argument of grammar.json and skip using grammar.js, but you'll need to prepare grammar.json some other way [1].)

> I see that generate and test seem to compile any grammar thrown at them, but I'm puzzled not to find a subcommand for generating the lib.

Yes, I also find that unintuitive and opened an issue about the confusing nature of "automatic compilation".

FWIW, the generate subcommand can build the shared object in recent unreleased versions of the tree-sitter cli (since this commit). Here is a sample invocation:

tree-sitter generate --build --libdir /tmp

That should produce clojure.so (or clojure.dylib, etc.) under /tmp. tree-sitter generate --help should reveal some other options and flags (possibly of interest might be --abi).
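
Before wiring the result into an FFI binding, it can help to confirm that the library actually exports the grammar's entry point, which generated parsers name `tree_sitter_<grammar name>` (so `tree_sitter_clojure` here). The following is a rough sketch, assuming `nm` is installed; the function names `language_symbol` and `exports_language` are hypothetical.

```shell
# A generated parser.c exposes one entry point named tree_sitter_<grammar name>.
language_symbol() {
  printf 'tree_sitter_%s\n' "$1"
}

# Hypothetical check: succeed if <lib> lists that entry point among its
# global symbols. Assumes `nm` is available; on macOS the symbol carries a
# leading underscore, hence the optional `_` in the pattern.
exports_language() {
  lib=$1
  name=$2
  # Fail early if nm cannot read the file (missing, or not an object file).
  syms=$(nm -g "$lib" 2>/dev/null) || return 1
  printf '%s\n' "$syms" |
    grep -Eq "[[:space:]]_?$(language_symbol "$name")\$"
}
```

For example, `exports_language /tmp/clojure.so clojure && echo ok` is expected to print `ok` for a library built as described above.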

If you don't mind using Rust tooling like cargo, you can get access to the various changes made since the latest release (0.20.7, from 2022-09), including the above functionality, by building the cli from source.

If you have other general tree-sitter questions, perhaps you know already, but FWIW, there is a GitHub discussions area and a recently started Discord server (announced here). I don't mind trying to answer questions, but I'm not the most knowledgeable about this topic :)


Regarding ClojureDart, thanks for the explanation. I will take a closer look and try to understand properly (Ah...I think I get it now, I failed to think of / as a symbol before).

> However while quickly playing I noticed that your grammar doesn't prevent nil, true or false from being read as symbols. This can be spotted in places where not all values are allowed. For example #nil () or ^true ().

Yes, in general, tree-sitter grammars / parsers are made to be on the side of "tolerant" in accepting invalid input. It's what the tree-sitter creator maxbrunsfeld has advocated a few times.

> I have a question: I see that ^ doesn't introduce its own node but is flattened, is it by design or because of a right-recursion issue?

Regarding why ^ and #^ are handled the way they are, this was arrived at when working with the constraints of:

  • not having a nested set of nodes like rewrite-clj does (it's kind of a pain to find out what the metadata is actually on in the context of editors and other tooling)
  • getting the metadata to be part of what it is supposed to be metadata of
  • getting it to work with tree-sitter

I did experiment with the idea of ^ living on its own, but I think people find the current approach a bit easier to work with for the tasks I've heard of so far.


[1] I did this recently by writing grammar.edn and transforming it into grammar.json. No Clojure version of this code yet though. Perhaps some day.
