Skip to content

New vector_logic_linter #978

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 23, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,7 @@ Collate:
'undesirable_function_linter.R'
'undesirable_operator_linter.R'
'unneeded_concatenation_linter.R'
'vector_logic_linter.R'
'with_id.R'
'xp_utils.R'
'zzz.R'
Expand Down
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -78,6 +78,7 @@ export(trailing_whitespace_linter)
export(undesirable_function_linter)
export(undesirable_operator_linter)
export(unneeded_concatenation_linter)
export(vector_logic_linter)
export(with_defaults)
export(with_id)
import(rex)
Expand Down
1 change: 1 addition & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,7 @@ function calls. (#850, #851, @renkun-ken)
* `expect_length_linter()` Require usage of `expect_length(x, n)` over `expect_equal(length(x), n)` and similar
* `expect_identical_linter()` Require usage of `expect_identical()` by default, and `expect_equal()` only by exception
* `expect_comparison_linter()` Require usage of `expect_gt(x, y)` over `expect_true(x > y)` and similar
* `vector_logic_linter()` Require use of scalar logical operators (`&&` and `||`) inside `if()` conditions and similar
* `any_is_na_linter()` Require usage of `anyNA(x)` over `any(is.na(x))`
* `assignment_linter()` now lints right assignment (`->` and `->>`) and gains two arguments. `allow_cascading_assign` (`TRUE` by default) toggles whether to lint `<<-` and `->>`; `allow_right_assign` toggles whether to lint `->` and `->>` (#915, @michaelchirico)
* `infix_spaces_linter()` gains argument `exclude_operators` to disable lints on selected infix operators. By default, all "low-precedence" operators throw lints; see `?infix_spaces_linter` for an enumeration of these. (#914 @michaelchirico)
Expand Down
59 changes: 59 additions & 0 deletions R/vector_logic_linter.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
#' Enforce usage of scalar logical operators in conditional statements
#'
#' Usage of `&` in conditional statements is error-prone and inefficient.
#' `condition` in `if (condition) expr` must always be length-1, in which
#' case `&&` is to be preferred. Ditto for `|` vs. `||`.
#'
#' This linter covers inputs to `if()` and `while()` conditions and to
#' [testthat::expect_true()] and [testthat::expect_false()].
#' @evalRd rd_tags("vector_logic_linter")
#' @seealso
#' [linters] for a complete list of linters available in lintr.
#' <https://style.tidyverse.org/syntax.html#if-statements>
#' @export
vector_logic_linter <- function() {
Linter(function(source_file) {
if (length(source_file$xml_parsed_content) == 0L) {
return(list())
}

xml <- source_file$xml_parsed_content

# ensures the expr is in the cond part of `if/while (cond) expr` --
# if on the XML parse tree is structured like
# <expr>
# <IF> | <expr><SYMBOL_FUNCTION_CALL>
# <OP-LEFT-PAREN>
# <expr> ... </expr> # <- loop condition
# <OP-RIGHT-PAREN>
# <expr> ... </expr> # <- evaluation; includes BRACEs if present
# <ELSE> # (here & below is optional)
# <expr> ... </expr>
# </expr>
# we _don't_ want to match anything on the second expr, hence this
xpath <- "//*[
(self::AND or self::OR)
and ancestor::expr[
not(preceding-sibling::OP-RIGHT-PAREN)
and preceding-sibling::*[
self::IF
or self::WHILE
or self::expr[SYMBOL_FUNCTION_CALL[text() = 'expect_true' or text() = 'expect_false']]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be caused to trip by

if (a) {
  expect_true(ok)
  x <- 2
  a | b
}

I don't fully understand the XPath, maybe it's worth elaborating what all the parts are for.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to admit I have lost some of the subtlety myself since I wrote it >1 year ago :)

the original code has a few more named objects which might help (not really to me at a glance):

  # correct nodes are AND2 and OR2
  bad_node <- "self::AND or self::OR"
  # ensures the expr is in the cond part of `if/while (cond) expr` --
  #   if on the XML parse tree is structured like
  #   <expr>
  #     <IF> | <expr><SYMBOL_FUNCTION_CALL>
  #     <OP-LEFT-PAREN>
  #     <expr> ... </expr> # <- loop condition
  #     <OP-RIGHT-PAREN>
  #     <expr> ... </expr> # <- evaluation; includes BRACEs if present
  #     <ELSE>             # (here & below is optional)
  #     <expr> ... </expr>
  #  </expr>
  #  we _don't_ want to match anything on the second expr, hence this:
  main_call_cond <- "not(preceding-sibling::OP-RIGHT-PAREN)"
  known_scalar_calls <- .XpTextInTable(c("expect_true", "expect_false"))
  target_call_node <- paste0(
    "preceding-sibling::",
    c(
      "IF", "WHILE",
      sprintf("expr[SYMBOL_FUNCTION_CALL[%s]]", known_scalar_calls)
    ),
    collapse = " or "
  )

  expr_inside_call_cond <- .XpOr(
    sprintf(
      "preceding-sibling::expr[SYMBOL_FUNCTION_CALL[not(%s)]]",
      known_scalar_calls
    ),
    "preceding-sibling::OP-LEFT-BRACKET"
  )

  xpath <- sprintf(
    "//*[(%s) and ancestor::expr[(%s) and (%s)] and not(ancestor::expr[%s])]",
    bad_node,
    main_call_cond,
    target_call_node,
    expr_inside_call_cond
  )
  bad_expr <- xml2::xml_find_all(xml, xpath)

I added this example as a test and it still works. Here's the tree of your example:

 <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
 <exprlist>
   <expr line1="1" col1="1" line2="5" col2="1" start="19" end="91">
     <IF line1="1" col1="1" line2="1" col2="2" start="19" end="20">if</IF>
     <OP-LEFT-PAREN line1="1" col1="4" line2="1" col2="4" start="22" end="22">(</OP-LEFT-PAREN>
     <expr line1="1" col1="5" line2="1" col2="5" start="23" end="23">
       <SYMBOL line1="1" col1="5" line2="1" col2="5" start="23" end="23">a</SYMBOL>
     </expr>
     <OP-RIGHT-PAREN line1="1" col1="6" line2="1" col2="6" start="24" end="24">)</OP-RIGHT-PAREN>
     <expr line1="1" col1="8" line2="5" col2="1" start="26" end="91">
       <OP-LEFT-BRACE line1="1" col1="8" line2="1" col2="8" start="26" end="26">{</OP-LEFT-BRACE>
       <expr line1="2" col1="3" line2="2" col2="17" start="39" end="53">
         <expr line1="2" col1="3" line2="2" col2="13" start="39" end="49">
           <SYMBOL_FUNCTION_CALL line1="2" col1="3" line2="2" col2="13" start="39" end="49">expect_true</SYMBOL_FUNCTION_CALL>
         </expr>
         <OP-LEFT-PAREN line1="2" col1="14" line2="2" col2="14" start="50" end="50">(</OP-LEFT-PAREN>
         <expr line1="2" col1="15" line2="2" col2="16" start="51" end="52">
           <SYMBOL line1="2" col1="15" line2="2" col2="16" start="51" end="52">ok</SYMBOL>
         </expr>
         <OP-RIGHT-PAREN line1="2" col1="17" line2="2" col2="17" start="53" end="53">)</OP-RIGHT-PAREN>
       </expr>
       <expr line1="3" col1="3" line2="3" col2="8" start="57" end="62">
         <expr line1="3" col1="3" line2="3" col2="3" start="57" end="57">
           <SYMBOL line1="3" col1="3" line2="3" col2="3" start="57" end="57">x</SYMBOL>
         </expr>
         <LEFT_ASSIGN line1="3" col1="5" line2="3" col2="6" start="59" end="60">&lt;-</LEFT_ASSIGN>
         <expr line1="3" col1="8" line2="3" col2="8" start="62" end="62">
           <NUM_CONST line1="3" col1="8" line2="3" col2="8" start="62" end="62">2</NUM_CONST>
         </expr>
       </expr>
       <expr line1="4" col1="3" line2="4" col2="7" start="75" end="79">
         <expr line1="4" col1="3" line2="4" col2="3" start="75" end="75">
           <SYMBOL line1="4" col1="3" line2="4" col2="3" start="75" end="75">a</SYMBOL>
         </expr>
         <OR line1="4" col1="5" line2="4" col2="5" start="77" end="77">|</OR>
         <expr line1="4" col1="7" line2="4" col2="7" start="79" end="79">
           <SYMBOL line1="4" col1="7" line2="4" col2="7" start="79" end="79">b</SYMBOL>
         </expr>
       </expr>
       <OP-RIGHT-BRACE line1="5" col1="1" line2="5" col2="1" start="91" end="91">}</OP-RIGHT-BRACE>
     </expr>
   </expr>
 </exprlist>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's very thoroughly tested, it's okay for me. Generally, I find "constructive" XPaths easier to understand, e.g. here select all relevant calls (IF, WHILE, ...), go into their first expr and check for top-level vector logic.

But feel free to keep the working XPath. May also be worth performance checking due to the use of //* which requires traversing all nodes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember it started simpler:

//expr[IF or WHILE]
/expr[preceding-sibling::OP-LEFT-PAREN and following-sibling::OP-RIGHT-PAREN]
/*[self::AND or self::OR]

but I rewrote it to handle the much more complicated cases like if (x && y | z), which did pick up a fair amount of new true positives in google.

May also be worth performance checking due to the use of //* which requires traversing all nodes.

My sense so far is that it's really hard to slow down XPath on R AST-sized XML trees to the point where I notice. OTOH, I do have in mind at some point to run some timing tests of all the linters on a CRAN sample to see if we have any performance issues to worry about. I assume things like object_usage_linter that require doing other I/O tasks besides an XPath search will be much slower. Maybe this could be another .dev/ script.

]
]
and not(ancestor::expr[
preceding-sibling::expr[SYMBOL_FUNCTION_CALL[not(text() = 'expect_true' or text() = 'expect_false')]]
or preceding-sibling::OP-LEFT-BRACKET
])
]"
bad_expr <- xml2::xml_find_all(xml, xpath)

return(lapply(
bad_expr,
xml_nodes_to_lint,
source_file = source_file,
lint_message = "Conditional expressions require scalar logical operators (&& and ||)",
type = "warning"
))
})
}
3 changes: 2 additions & 1 deletion R/zzz.R
Original file line number Diff line number Diff line change
Expand Up @@ -107,7 +107,8 @@ default_linters <- with_defaults(
spaces_left_parentheses_linter(),
T_and_F_symbol_linter(),
trailing_blank_lines_linter(),
trailing_whitespace_linter()
trailing_whitespace_linter(),
vector_logic_linter()
)

#' Default undesirable functions and operators
Expand Down
1 change: 1 addition & 0 deletions inst/lintr/linters.csv
Original file line number Diff line number Diff line change
Expand Up @@ -52,3 +52,4 @@ trailing_whitespace_linter,style default
undesirable_function_linter,style efficiency configurable robustness best_practices
undesirable_operator_linter,style efficiency configurable robustness best_practices
unneeded_concatenation_linter,style readability efficiency
vector_logic_linter,default efficiency best_practices
1 change: 1 addition & 0 deletions man/best_practices_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 2 additions & 1 deletion man/default_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions man/efficiency_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

7 changes: 4 additions & 3 deletions man/linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

24 changes: 24 additions & 0 deletions man/vector_logic_linter.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions tests/testthat/default_linter_testcode.R
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,9 @@ someComplicatedFunctionWithALongCamelCaseName <- function(x)
if (1 > 2 && 2 > 3 && 3 > 4 && 4 > 5 && 5*10 > 6 && 5 > 6 && 6 > 7 && x == NA) {T} else {F}
}

# vector_logic
if (1 & 2) FALSE else TRUE

# no_tab
# pipe_continuation
# seq_linter
Expand Down
91 changes: 91 additions & 0 deletions tests/testthat/test-vector_logic_linter.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
test_that("vector_logic_linter skips allowed usages", {
expect_lint("if (TRUE) 5 else if (TRUE) 2", NULL, vector_logic_linter())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: Do we have a linter for this kind of unreachable code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have unreachable_code_linter but the logic is limited so far. we can keep extending that.


expect_lint("if (TRUE || FALSE) 1; while (TRUE && FALSE) 2", NULL, vector_logic_linter())

# function calls and extractions may aggregate to scalars -- only catch
# usages at the highest logical level
expect_lint("if (agg_function(x & y)) 1", NULL, vector_logic_linter())
expect_lint("if (DT[x | y, cond]) 1", NULL, vector_logic_linter())
})

test_that("vector_logic_linter blocks simple disallowed usages", {
expect_lint(
"if (TRUE & FALSE) 1",
rex::rex("Conditional expressions require scalar logical operators"),
vector_logic_linter()
)

expect_lint(
"while (TRUE | TRUE) 2",
rex::rex("Conditional expressions require scalar logical operators"),
vector_logic_linter()
)
})

test_that("vector_logic_linter detects nested conditions", {
expect_lint(
"if (TRUE & TRUE || FALSE) 4",
rex::rex("Conditional expressions require scalar logical operators"),
vector_logic_linter()
)

expect_lint(
"if (TRUE && (TRUE | FALSE)) 4",
rex::rex("Conditional expressions require scalar logical operators"),
vector_logic_linter()
)

# don't match potentially OK usages nested within calls
expect_lint("if (TRUE && any(TRUE | FALSE)) 4", NULL, vector_logic_linter())
# even if the usage is nested in those calls (b/181915948)
expect_lint("if (TRUE && any(TRUE | FALSE | TRUE)) 4", NULL, vector_logic_linter())

# don't match potentially OK usages in the branch itself
lines <- trim_some("
if (TRUE) {
x | y
}
")
expect_lint(lines, NULL, vector_logic_linter())
})

test_that("vector_logic_linter catches usages in expect_true()/expect_false()", {
expect_lint(
"expect_true(TRUE & FALSE)",
rex::rex("Conditional expressions require scalar logical operators"),
vector_logic_linter()
)

expect_lint(
"expect_false(TRUE | TRUE)",
rex::rex("Conditional expressions require scalar logical operators"),
vector_logic_linter()
)

# ditto with namespace qualification
expect_lint(
"testthat::expect_true(TRUE & FALSE)",
rex::rex("Conditional expressions require scalar logical operators"),
vector_logic_linter()
)

expect_lint(
"testthat::expect_false(TRUE | TRUE)",
rex::rex("Conditional expressions require scalar logical operators"),
vector_logic_linter()
)

# valid nested usage within aggregator
expect_lint("testthat::expect_false(any(TRUE | TRUE))", NULL, vector_logic_linter())
})

test_that("vector_logic_linter doesn't get mixed up from complex usage", {
lines <- trim_some("
if (a) {
expect_true(ok)
x <- 2
a | b
}")
expect_lint(lines, NULL, vector_logic_linter())
})