Skip to content

New vector_logic_linter #978

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 23, 2022
Merged

New vector_logic_linter #978

merged 4 commits into from
Mar 23, 2022

Conversation

MichaelChirico
Copy link
Collaborator

Part of #962

@MichaelChirico
Copy link
Collaborator Author

This one might be a default as well. Relevant call-out in the style guide:

https://style.tidyverse.org/syntax.html?q=else#if-statements

@@ -0,0 +1,81 @@
test_that("vector_logic_linter skips allowed usages", {
expect_lint("if (TRUE) 5 else if (TRUE) 2", NULL, vector_logic_linter())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW: Do we have a linter for this kind of unreachable code?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have unreachable_code_linter but the logic is limited so far. we can keep extending that.

@AshesITR
Copy link
Collaborator

Agreed. Please add it to the default_linters (i.e. as a tag and to the R list).
NB this means example "bad" lint files need to be extended to produce a lint.
Also, should add a NEWS section "New default linters", I think there are a couple new default linters since 2.0.1

@MichaelChirico
Copy link
Collaborator Author

Also, should add a NEWS section "New default linters", I think there are a couple new default linters since 2.0.1

Makes sense, but I think out of scope for this PR; added to #833 since we already have an issue (and pending stale PR) about re-tooling NEWS.

#' @evalRd rd_tags("vector_logic_linter")
#' @seealso
#' [linters] for a complete list of linters available in lintr.
#' <https://style.tidyverse.org/syntax.html?q=else#if-statements>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ?q= URL param is unnecessary

and preceding-sibling::*[
self::IF
or self::WHILE
or self::expr[SYMBOL_FUNCTION_CALL[text() = 'expect_true' or text() = 'expect_false']]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be caused to trip by

if (a) {
  expect_true(ok)
  x <- 2
  a | b
}

I don't fully understand the XPath, maybe it's worth elaborating what all the parts are for.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have to admit I have lost some of the subtlety myself since I wrote it >1 year ago :)

the original code has a few more named objects which might help (not really to me at a glance):

  # correct nodes are AND2 and OR2
  bad_node <- "self::AND or self::OR"
  # ensures the expr is in the cond part of `if/while (cond) expr` --
  #   if on the XML parse tree is structured like
  #   <expr>
  #     <IF> | <expr><SYMBOL_FUNCTION_CALL>
  #     <OP-LEFT-PAREN>
  #     <expr> ... </expr> # <- loop condition
  #     <OP-RIGHT-PAREN>
  #     <expr> ... </expr> # <- evaluation; includes BRACEs if present
  #     <ELSE>             # (here & below is optional)
  #     <expr> ... </expr>
  #  </expr>
  #  we _don't_ want to match anything on the second expr, hence this:
  main_call_cond <- "not(preceding-sibling::OP-RIGHT-PAREN)"
  known_scalar_calls <- .XpTextInTable(c("expect_true", "expect_false"))
  target_call_node <- paste0(
    "preceding-sibling::",
    c(
      "IF", "WHILE",
      sprintf("expr[SYMBOL_FUNCTION_CALL[%s]]", known_scalar_calls)
    ),
    collapse = " or "
  )

  expr_inside_call_cond <- .XpOr(
    sprintf(
      "preceding-sibling::expr[SYMBOL_FUNCTION_CALL[not(%s)]]",
      known_scalar_calls
    ),
    "preceding-sibling::OP-LEFT-BRACKET"
  )

  xpath <- sprintf(
    "//*[(%s) and ancestor::expr[(%s) and (%s)] and not(ancestor::expr[%s])]",
    bad_node,
    main_call_cond,
    target_call_node,
    expr_inside_call_cond
  )
  bad_expr <- xml2::xml_find_all(xml, xpath)

I added this example as a test and it still works. Here's the tree of your example:

 <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
 <exprlist>
   <expr line1="1" col1="1" line2="5" col2="1" start="19" end="91">
     <IF line1="1" col1="1" line2="1" col2="2" start="19" end="20">if</IF>
     <OP-LEFT-PAREN line1="1" col1="4" line2="1" col2="4" start="22" end="22">(</OP-LEFT-PAREN>
     <expr line1="1" col1="5" line2="1" col2="5" start="23" end="23">
       <SYMBOL line1="1" col1="5" line2="1" col2="5" start="23" end="23">a</SYMBOL>
     </expr>
     <OP-RIGHT-PAREN line1="1" col1="6" line2="1" col2="6" start="24" end="24">)</OP-RIGHT-PAREN>
     <expr line1="1" col1="8" line2="5" col2="1" start="26" end="91">
       <OP-LEFT-BRACE line1="1" col1="8" line2="1" col2="8" start="26" end="26">{</OP-LEFT-BRACE>
       <expr line1="2" col1="3" line2="2" col2="17" start="39" end="53">
         <expr line1="2" col1="3" line2="2" col2="13" start="39" end="49">
           <SYMBOL_FUNCTION_CALL line1="2" col1="3" line2="2" col2="13" start="39" end="49">expect_true</SYMBOL_FUNCTION_CALL>
         </expr>
         <OP-LEFT-PAREN line1="2" col1="14" line2="2" col2="14" start="50" end="50">(</OP-LEFT-PAREN>
         <expr line1="2" col1="15" line2="2" col2="16" start="51" end="52">
           <SYMBOL line1="2" col1="15" line2="2" col2="16" start="51" end="52">ok</SYMBOL>
         </expr>
         <OP-RIGHT-PAREN line1="2" col1="17" line2="2" col2="17" start="53" end="53">)</OP-RIGHT-PAREN>
       </expr>
       <expr line1="3" col1="3" line2="3" col2="8" start="57" end="62">
         <expr line1="3" col1="3" line2="3" col2="3" start="57" end="57">
           <SYMBOL line1="3" col1="3" line2="3" col2="3" start="57" end="57">x</SYMBOL>
         </expr>
         <LEFT_ASSIGN line1="3" col1="5" line2="3" col2="6" start="59" end="60">&lt;-</LEFT_ASSIGN>
         <expr line1="3" col1="8" line2="3" col2="8" start="62" end="62">
           <NUM_CONST line1="3" col1="8" line2="3" col2="8" start="62" end="62">2</NUM_CONST>
         </expr>
       </expr>
       <expr line1="4" col1="3" line2="4" col2="7" start="75" end="79">
         <expr line1="4" col1="3" line2="4" col2="3" start="75" end="75">
           <SYMBOL line1="4" col1="3" line2="4" col2="3" start="75" end="75">a</SYMBOL>
         </expr>
         <OR line1="4" col1="5" line2="4" col2="5" start="77" end="77">|</OR>
         <expr line1="4" col1="7" line2="4" col2="7" start="79" end="79">
           <SYMBOL line1="4" col1="7" line2="4" col2="7" start="79" end="79">b</SYMBOL>
         </expr>
       </expr>
       <OP-RIGHT-BRACE line1="5" col1="1" line2="5" col2="1" start="91" end="91">}</OP-RIGHT-BRACE>
     </expr>
   </expr>
 </exprlist>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's very thoroughly tested, it's okay for me. Generally, I find "constructive" XPaths easier to understand, e.g. here select all relevant calls (IF, WHILE, ...), go into their first expr and check for top-level vector logic.

But feel free to keep the working XPath. May also be worth performance checking due to the use of //* which requires traversing all nodes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember it started simpler:

//expr[IF or WHILE]
/expr[preceding-sibling::OP-LEFT-PAREN and following-sibling::OP-RIGHT-PAREN]
/*[self::AND or self::OR]

but I rewrote it to handle the much more complicated cases like if (x && y | z), which did pick up a fair amount of new true positives in google.

May also be worth performance checking due to the use of //* which requires traversing all nodes.

My sense so far is that it's really hard to slow down XPath on R AST-sized XML trees to the point where I notice. OTOH, I do have in mind at some point to run some timing tests of all the linters on a CRAN sample to see if we have any performance issues to worry about. I assume things like object_usage_linter that require doing other I/O tasks besides an XPath search will be much slower. Maybe this could be another .dev/ script.

@MichaelChirico MichaelChirico merged commit 9f4da55 into master Mar 23, 2022
@MichaelChirico MichaelChirico deleted the vector_logic branch March 23, 2022 05:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants