-
-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add basic modeline support #7788
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks pretty good, thanks for taking this on!
helix-view/src/modeline.rs
Outdated
const LENGTH_TO_CHECK: usize = 256; | ||
|
||
static MODELINE_REGEX: Lazy<Regex> = | ||
Lazy::new(|| Regex::new(r"(?:\S+\s+)?(?:vi|vim|Vim|ex):\s*(?:set? )?(.*)(?::[^:]*)?").unwrap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex should be anchored (that makes it both faster and correct since the modeline is specific about how it may be prefixed We also don't need capture groups here. Since modeline support escapes (with \:
we can't phrase the option with regec anyway (it's technically possible but not worth the effort. You can simply find the regec match and then use the rest of the line (from the start if the regec). That also removes the need for non-capturing groups (since you will not compute captures anywya which sinplfies the regec a bit further
Lazy::new(|| Regex::new(r"(?:\S+\s+)?(?:vi|vim|Vim|ex):\s*(?:set? )?(.*)(?::[^:]*)?").unwrap()); | |
Lazy::new(|| Regex::new(r"^(\S*\s+)?(vi|[vV]im[<=>]?\d*|ex):\s*(set?\s+)?").unwrap()); |
helix-view/src/modeline.rs
Outdated
} | ||
|
||
fn parse_line(line: &str) -> Option<Self> { | ||
if let Some(captures) = MODELINE_REGEX.captures(line) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As described above simply use find instead of captures here
helix-view/src/modeline.rs
Outdated
let mut indent_style = None; | ||
let mut line_ending = None; | ||
|
||
let options = &captures[1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And then do line[march.end..]
helix-view/src/modeline.rs
Outdated
let mut line_ending = None; | ||
|
||
let options = &captures[1]; | ||
for option in options.split([' ', '\t', ':']) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a simple split here doesn't work since mode lines allow escaping with \:
you need to handle that here. You can take a look at the query_atoms
function to see how you can use split to split on a char that can be escaped (in that case it was spaces).
Right now it won't matter much since you don't support option with arbitrary values yet but it's good uo be correct from the start
helix-view/src/modeline.rs
Outdated
None | ||
} | ||
|
||
fn merge(&mut self, other: Self) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is unnecessary. Instead of Creating a new ModeLine
for each line that is parsed simply make the parsing accept a mutable reference to self and write the results directly into self
helix-view/src/modeline.rs
Outdated
// regexes can't operate over chunks yet, but we can at least | ||
// limit how much we potentially need to copy because modelines | ||
// are typically quite short. | ||
let line = Cow::<str>::from(line.slice(..(LENGTH_TO_CHECK.min(line.len_chars())))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just ignore lines longer than LENGTH_TOK_CJECK
since otherwise we cut off mode lines in the middle and get a wrong result. I don't think this occurs in practice anyway
Also one more thing: I think the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me this feels like plugin territory rather than something that should be built into the core. I believe only each editor supports their own modeline natively: vim variants are the only ones to support vim modelines, only emacs supports emacs modelines natively, vscode doesn't support any modelines natively AFAIK. The translation between filetypes and line ending values from Vi to Helix feels a little hacky to me here: we're basically parsing and trying to intepret Vi set
commands. If we do support this in core, I'm not sure where we draw the lines for which Vi settings should be supported. And do we support interpreting emacs modelines as well?
helix-view/src/modeline.rs
Outdated
} | ||
|
||
impl Modeline { | ||
pub fn parse(text: &Rope) -> Self { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pub fn parse(text: &Rope) -> Self { | |
pub fn parse(text: RopeSlice) -> Self { |
helix-view/src/modeline.rs
Outdated
text.lines_at(text.len_lines()) | ||
.reversed() | ||
.take(LINES_TO_CHECK), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this iterator be reversed again after the take
? The lines at the end of the file would have the opposite merging behavior as the lines at the beginning otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fascinatingly enough no, the vim implementation also reads from the end of the file up
vim modelines are pretty standard even in editors other than vim. Kakoune also parses these by default (https://github.com/mawww/kakoune/blob/master/rc/detection/modeline.kak). I think it would be reasonable to mirror kakoune here. It supports the following options currently:
|
yeah, i don't think this belongs in a plugin. i'm not tied to vim syntax specifically (we could also totally just do our own thing) but it feels pretty important to be able to have some way to override settings like file type that heuristics aren't always going to correctly identify (i have quite a few files that aren't identifiable by filename or shebang). |
this should resolve the issues mentioned so far (other than the extended list of options that kakoune supports - should i implement those now or can it wait?). let me know if there are any other issues! |
helix-core/src/modeline.rs
Outdated
} | ||
|
||
// if vim and helix use different names for a language, put that mapping here | ||
fn vim_ft_to_helix_language(val: &str) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@the-mikedavis was concerned about too much vim emulation here. I tend to agree, kakoune doesn't do this either. We don't need to translate file types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair enough - i do kind of wonder what the right answer is for files that are intended to be shared between people using vim and helix though. maybe we do want our own format in addition?
i added a second commit which adds support for parsing a helix-specific modeline, since that is probably going to be more useful going forward. i think parsing some amount of vim modelines is useful because they are so widespread, but for configuration that doesn't exist in vim (or is different in vim, such as language names), we are going to want a different method for configuring that. this currently uses the same toml configuration syntax we use for the configuration file, but not sure if that's what we actually want (given that my understanding is that the toml configuration file is eventually intended to go away in favor of some kind of scripted configuration - telling people to rewrite their local configuration file is different from telling them to fix modelines in every file they want to use). let me know if you think this is fine or if we can come up with something better here. |
currently only supports setting language, indent style, and line endings
for example, if the language name differs between vim and helix
.take(LINES_TO_CHECK), | ||
) { | ||
// can't guarantee no extra copies, since we need to regex and | ||
// regexes can't operate over chunks yet, but we can at least |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now use regex cursor so we actually can do regex search without buffering so that should be used here.
The length to check limit id probably unnecessary with that
self.line_ending = vim_ff_to_helix_line_ending(val); | ||
} | ||
} | ||
"noet" | "noexpandtab" => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should habdle expandtab
without an explicit shiftwidth. I think that isn't too rare. We could simply set the indentation to the tabwidth currently set for that document. This would match sw=0
(which we may also want to handle) and is generally what a user would expect.
One minor csbeat in both cases: If the document ks already configured to use spaces then expandtab should do nothing. That will need special representation in the ModeLine struct but IMO it's worth it since that is pretty common.
|
||
if let Some(pos) = HELIX_MODELINE_REGEX.find(line) { | ||
let config = &line[pos.end()..]; | ||
match toml::from_str::<ModelineConfig>(config) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think tonl is they syntax I would go for here. We are removing toml hopefzlly soon and generally it makes more sense to me what vim does and emulate typable commands (everything you allow users to set here has an interactive typable command that allows setting that config per document interactively). The parsing should be pretty simple. So for example it would be // helix: line-ending=lf indent=4 lang=rust
In terms of aliases you should also just support the aliases these commands support.
@@ -668,6 +672,7 @@ impl Document { | |||
focused_at: std::time::Instant::now(), | |||
readonly: false, | |||
jump_labels: HashMap::new(), | |||
modeline, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a fan of adding a field for this. Mofeline detection should only run when reading a file for the first time and set the buffer setting. In the future with the new config system there will be a dedicated buffer config. For now all fields that you allow to overwrite already have fields on document and you should only set these fields automatically after parsing the ModeLine/bypass the existing autodetection
}, | ||
), | ||
( | ||
"/* vim: set noai ts=4 sw=4: */", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Funny
This :
at the end of the directive is not required in Vim
And this commit only works with the : at the end
There is some exoteric examples that I tested if Vim can properly detect the file type. Vim got it right in all but the FSharp example, that is expected because Vim doesn't have by default the file type definition for FSharp. In my tests the only thing that seems to break detection here is that requirement of a : at the end of the directive. https://github.com/lucasew/nix-flake-shell/tree/c896400199f12a293ec6ee7d54b5f2e19f0f0a99/tests |
I don't know common it is parse vim's modelines. Gedit can parse expandtab, tabstop, shiftwidth, wrap and textwidth: https://help.gnome.org/users/gedit/stable/gedit-plugins-modelines.html.en The Kate editor supports their own Kate-specific modelines: But I imagine a lot of Helix's users will come from vim/neovim, which parse them. I agree parsing vim's modelines would be a good fit for a plugin. Emacs can parse Vim's modelines using a plugin: Regarding the trailing colon. I'm one of the vim users who waw aware that's not required, so I don't put in in my own files. I'm not sure how common that is. |
fixes #729
currently only supports setting language, indent style, and line endings, but maybe this is enough?