Rule: `multiple-statements-per-line` #246

LiamPattinson · 2024-12-18T15:59:45Z

Fixes #64

This was a lot harder to implement than I expected!

Adds a function that replaces all text in strings and comments with whitespace/'!' respectively, allowing text rules to avoid matching on characters in that context.
- The difficulty here is catching multiline strings, and the fact that the following is valid:

"This is a character string &
  & going over &
  !!! surprise comment !!!
  & multiple lines"

Adds rule that finds semi-colons in code and replaces them with a newline and an appropriate amount of indentation.

I've left this as a draft as I'd like to add a companion rule, superfluous-semicolon, for a semicolon at the end of a line, and I still need to add tests.

ZedThree · 2024-12-18T18:05:18Z

Could you do this without the blanking by something like: find ; nodes in the AST, then check if the next non-comment sibling is on the same line?

LiamPattinson · 2024-12-18T18:41:32Z

As far as I can tell ; doesn't show up in the AST. If you switch the search in check.rs from root_node.named_descendants() to just root_node.descendants() and print every node kind, things like ::, , and = show up, but there's no sign of ;. I'm not sure if I'm just missing something though.

LiamPattinson · 2024-12-19T12:20:42Z

While working on this further I've found something that might be a tree-sitter-fortran bug. The following code raises syntax errors, but as far as I know it's valid (albeit awful) Fortran:

program p
  implicit none
  integer :: i, j
  ! 1st, 2nd/3rd, and 4th semicolons are superfluous
  ! 3rd/2nd separates the two statements
  ;i = 1;; j = 2;
  write (*,*) i, j
end program p

It doesn't like a semicolon at the start of a line or multiples grouped together.

I can't see a single instance of either one of these scenarios in my limited collection of test projects, so this is a very low priority bug.

LiamPattinson · 2024-12-19T13:02:43Z

I've added a rule superfluous-semicolon, but ultimately I don't think it would be a good idea to merge these rules in their current state.

Blanking all comments and strings is a very expensive operation, and as we can't have one rule return multiple kinds of violation, it has to be done twice per file if both rules are active. Benchmarking suggests this slows the code by ~7%:

❯ hyperfine -i './target/release/fortitude check --output-format=concise --preview --ignore=S081,S082 ../geos-chem'
Benchmark 1: ./target/release/fortitude check --output-format=concise --preview --ignore=S081,S082 ../geos-chem
  Time (mean ± σ):      1.763 s ±  0.046 s    [User: 8.042 s, System: 0.396 s]
  Range (min … max):    1.684 s …  1.808 s    10 runs
 
  Warning: Ignoring non-zero exit code.
 
❯ hyperfine -i './target/release/fortitude check --output-format=concise --preview ../geos-chem'
Benchmark 1: ./target/release/fortitude check --output-format=concise --preview ../geos-chem
  Time (mean ± σ):      1.883 s ±  0.075 s    [User: 8.973 s, System: 0.414 s]
  Range (min … max):    1.761 s …  1.996 s    10 runs
 
  Warning: Ignoring non-zero exit code.

The difference would be a lot less noticeable in a realistic IO-constrained run (geos-chem takes almost 10 seconds on my machine with full output!), but this still indicates that we need a less expensive solution to this problem.

@ZedThree Would it be possible to have tree-sitter recognise semicolons in the AST?

ZedThree · 2024-12-19T13:28:09Z

Ah, I think the issue is that "end-of-statement" is an unnamed node and done entirely through an external scanner.

We could change the grammar to something like:

_end_of_statement: $ => choice(';', $._scan_end_of_statement),

ZedThree · 2025-01-23T18:32:41Z

We've just released tree-sitter-fortran 0.4.0 which has ; as an unnamed node, so we should now be able to significantly simplify this rule

LiamPattinson · 2025-01-23T18:42:21Z

Fantastic, I'll try to get that going now.

LiamPattinson · 2025-01-23T19:19:03Z

I've got it working with the new tree-sitter-fortran features -- this is a huge improvement!

TODO:

Add tests
Match local indentation when applying fixes -- I'll add some generic functions for this, as it's likely to come up again.

ZedThree · 2025-01-27T17:46:03Z

fortitude/src/ast.rs

+        let line = src
+            .slice(TextRange::new(start_line, start_line + start_offset))
+            .to_string();
+        line.chars().take_while(|&c| c.is_whitespace()).collect()


We might have some fun with tabs here! (although tabs are technically not valid characters in fortran source)

I figured tabs might be present in some codes, and hence I wrote indentation to return a string of the current indentation level rather than a count of the number of spaces. That way it can include a mix of tabs and spaces. I think this is the safest way to handle them, but could be totally wrong on that.

On this topic, we should probably have a 'no tabs allowed' rule.

LiamPattinson · 2025-01-27T17:49:06Z

I think I'm finally there. The fixes behave strangely with regards to indentation if the user is doing anything really weird, like using multiple semicolons in a row, but should hold up in most cases.

LiamPattinson added 3 commits December 18, 2024 12:43

Add function for blanking strings and comments

409df02

Add rule multiple-statements-per-line

8000e05

Fix comment in blank_comments_and_strings

5bbd23f

Add rule superfluous-semicolon

2943671

LiamPattinson mentioned this pull request Jan 2, 2025

Trailing semicolons #256

Closed

LiamPattinson added 2 commits January 23, 2025 18:44

Merge branch 'main' into rule/multiple_statements_per_line

f490e95

Convert semicolon rules to AstRule

d7e53de

LiamPattinson added 2 commits January 27, 2025 16:41

Merge branch 'main' into rule/multiple_statements_per_line

f1be0e0

Preserve indentation when replacing semicolon, add tests

729fc7b

LiamPattinson marked this pull request as ready for review January 27, 2025 17:45

ZedThree reviewed Jan 27, 2025

View reviewed changes

ZedThree approved these changes Jan 27, 2025

View reviewed changes

LiamPattinson merged commit bcec2b6 into main Jan 27, 2025
25 checks passed

LiamPattinson deleted the rule/multiple_statements_per_line branch January 27, 2025 17:56

ZedThree added the rule A new rule for the linter label Jan 28, 2025

chenrui333 mentioned this pull request Mar 4, 2025

fortitude 0.7.0 chenrui333/homebrew-tap#469

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Rule: `multiple-statements-per-line` #246

Rule: `multiple-statements-per-line` #246

Uh oh!

LiamPattinson commented Dec 18, 2024

Uh oh!

ZedThree commented Dec 18, 2024

Uh oh!

LiamPattinson commented Dec 18, 2024

Uh oh!

LiamPattinson commented Dec 19, 2024

Uh oh!

LiamPattinson commented Dec 19, 2024

Uh oh!

ZedThree commented Dec 19, 2024

Uh oh!

ZedThree commented Jan 23, 2025

Uh oh!

LiamPattinson commented Jan 23, 2025

Uh oh!

LiamPattinson commented Jan 23, 2025 •

edited

Loading

Uh oh!

ZedThree Jan 27, 2025

Uh oh!

LiamPattinson Jan 27, 2025

Uh oh!

LiamPattinson commented Jan 27, 2025

Uh oh!

Uh oh!

Uh oh!

Rule: multiple-statements-per-line #246

Rule: multiple-statements-per-line #246

Uh oh!

Conversation

LiamPattinson commented Dec 18, 2024

Uh oh!

ZedThree commented Dec 18, 2024

Uh oh!

LiamPattinson commented Dec 18, 2024

Uh oh!

LiamPattinson commented Dec 19, 2024

Uh oh!

LiamPattinson commented Dec 19, 2024

Uh oh!

ZedThree commented Dec 19, 2024

Uh oh!

ZedThree commented Jan 23, 2025

Uh oh!

LiamPattinson commented Jan 23, 2025

Uh oh!

LiamPattinson commented Jan 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ZedThree Jan 27, 2025

Choose a reason for hiding this comment

Uh oh!

LiamPattinson Jan 27, 2025

Choose a reason for hiding this comment

Uh oh!

LiamPattinson commented Jan 27, 2025

Uh oh!

Uh oh!

Uh oh!

Rule: `multiple-statements-per-line` #246

Rule: `multiple-statements-per-line` #246

LiamPattinson commented Jan 23, 2025 •

edited

Loading