-
Notifications
You must be signed in to change notification settings - Fork 48
RFC: include and exclude files by shebang contents and other criteria #648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
jfly
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the contribution! Have you seen the previous discussion about this in #33?
I'd prefer the approach originally proposed by @zimbatm, as it's designed to have a minimal performance impact (only actually open the file if it's executable and has no extension). Does this PR force us to read every file in a repo?
c0e672c to
176a18f
Compare
D'oh 🤦. I have now; thanks :)
I believe the answer is "no", unless the user has configured a formatter with
Would it make sense to implement this as an opt-out switch? E.g |
|
(Full disclosure: I still haven't read the code, sorry.) But perhaps it would be useful to split the conversation into 2 questions:
For 1), I see no reason to consider files other than files both have no extension and are executable. Can you think of a situation this would not handle well? For 2), I we have 2 options:
I wonder if we're better off writing something ourselves. The mimetype library is 6k+ lines of go code that I haven't read, and it handles a whole bunch of things we don't care about (image files, archive formats, etc). |
|
I like that you just went ahead and implemented a solution. To solve the executable matching, I believe the simplest solution would be to parse the shebangs and extract which executable will be running. With a bit of heuristics to handle In that case we could add a Would you be up for implementing something like that instead? |
176a18f to
8e4ae84
Compare
|
If matching on the parsed shebangs of extension-less executable scripts is precisely the feature set that this project is prepared to accept, then I am happy to rework this PR in the manner described:
For what it's worth, I can think of some use-cases that fail one or more of these strictures:
Of course, the fact that I can think of these use-cases is no strong indication that any current or potential I'd therefore like to float an idea inspired in part by this comment on #33: using This permits configurations like the following: [formatter.mylanguage]
command = "mycommand"
select = ["{{ and .LooksLikeScript (rematch `^possibly-versioned-exe[[:digit:]]*$` .InterpreterName) }}"]Please let me know what you think of this idea. Thanks! |
This changeset also introduces some new methods on `walk.File`, including a method for extracting a shebang (if present), a method for extracting the interpreter portion of the shebang, a method indicating whether or not the file is executable, and a convenience method that indicates if the file "looks like" a script.
8e4ae84 to
7bb0563
Compare
|
Thanks for the examples! Comments inline.
Would these even have shebangs? I wouldn't expect shell libraries to have them, I would expect them to have file extensions. The file in git itself notably doesn't have a shebang and does have an extension: https://github.com/git/git/blob/master/git-sh-setup.sh.
Good example! To extend @zimbatm's original files_executable = { "" = [ "bash" ], "*.in" = [ "bash" ] }
This is a very cool idea! It has some nice properties, and is very flexible. One downside is that go becomes less of an implementation detail. I'm curious what @zimbatm and @brianmcgee think. I personally would probably prefer something simpler and less flexible (like the original |
The point is that not all shell scripts without extensions are executable. If you are looking for an example stored in source in this form, here's one. Such scripts may indeed have shebangs: for instance, I've used shebangs in non-executable scripts to inform ShellCheck of the target shell dialect, and to inform my editor that I want shell syntax highlighting.
This doesn't account for the fact that If the maintainers of this project stipulate that executable, (possibly?) extension-less scripts are the only use-case they wish to support, then I am happy to rework this PR. But I can't agree that this is the only use-case, or the only use-case worth supporting. |
|
The whole discussion here is about how opinionated treefmt should be here. Your latest idea with go templates is an elegant way to achieve maximal flexibility, but I'd prefer treefmt take a more opinionated stance on how we expect scripts to be written. Specifically:
You're absolutely correct that this does not cover all use cases. If projects cannot (or prefer not) to structure their code this way, they can still use treefmt's include mechanism to explicitly include those exceptions (ooh would be interesting if we could put multiple
@tomeon, thank you for staying engaged throughout this somewhat bike-sheddy conversation. I will see if I can get the other maintainers to weigh in so we can give you a clear "path to acceptance" of this PR. |
Simplifies the Matcher approach, removing the complicated Matcher types in favour of simple functions. Signed-off-by: Brian McGee <brian@bmcgee.ie>
|
@tomeon, I spent some time reviewing the approach and pushed a commit that simplifies the implementation. I like how flexible the template approach is. It's a nice progressive enhancement users can reach for when they need something more powerful than just globs. Where I'm undecided yet is the potential for issues or user confusion resulting from the interplay of the two mechanisms and the fact that they can now get themselves into "a lot more trouble" with templates. This could be addressed with clearer logging, perhaps. Some of what @jfly talks about also resonates. TLDR: It's an intriguing approach; I'm undecided whether it's the right one yet. I'd like to let it marinate a bit more and get @zimbatm to weigh in, too. |
Intended for cases like scripts, which may lack extensions indicating their file/language type.
An incidental change is the introduction of the
format.Matcherinterface and various types implementing this interface. The idea is to encapsulate various bits of logic:I do not have very much experience in Go, so I'm not certain whether this approach is a good idea or, if so, whether it is implemented sensibly. The intentions are:
CompositeMatcheris a bit verbose/unwieldy, but using it is IMO straightforward -- one call toWants(path)for each path under consideration), andThanks in advance for your consideration!