Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make scalar_in_linter() configurable #2574

Merged
merged 10 commits into from
May 21, 2024
1 change: 1 addition & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,7 @@
* `any_is_na_linter()` is extended to catch the unusual usage `NA %in% x` (#2113, @MichaelChirico).
* `make_linter_from_xpath()` errors up front when `lint_message` is missing (instead of delaying this error until the linter is used, #2541, @MichaelChirico).
* `paste_linter()` is extended to recommend using `paste()` instead of `paste0()` for simply aggregating a character vector with `collapse=`, i.e., when `sep=` is irrelevant (#1108, @MichaelChirico).
* `scalar_in_linter` is now configurable to allow other `%in%` like operators to be linted. `%chin%` is no longer linted by default. (@F-Noelle)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the default argument is NULL for this parameter, the second sentence is no longer needed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linter previously did lint the %chin% in its unconfigurable state. As this is no longer the case due to the default changing to NULL this is a notable change. If there is a better way to phrase that I am all ears.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The larger point of the change you are making is allowing to configure this linter. %chin% was just your use case, and the developers who are reading this NEWS item don't need to know that. For users of some other package, it would have been a different operator (say %nin%).

I personally don't use {data.table} much, so I wasn't even aware of this infix operator; as a result, this part of the NEWS item was confusing for me because I thought this was an operator in base that I didn't know about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you might misunderstand something here. %chin% was never my use case. I know of {data.tables}, but never used it in any of my projects. I just through it would be easy to make this linter configurable to work with my use cases, hence the pull request.

Previously the linter in its unconfigurable form linted %in% and %chin%, nothing else. When I made the update I wanted to make it backwards compatible to its original use case, so I opted to make %chin% the default for the new parameter.

@MichaelChirico then suggested to demote %chin% from its special status making this PR a breaking change, which I am completely on board with. But having a breaking change is NEWS worthy in my opinion, because people might have relied on the old behaviour aka %chin% being linted.

As this is no longer the case I want to highlight that simple fact as part of the NEWS. Just because I don't rely on it doesn't mean that nobody else did. If there is a better way of phrasing this I am all ears.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have nothing more to add here over my previous reply.

I will let @MichaelChirico decide the best phrasing for this news item, if he thinks it merits being phrased differently.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do make a breaking change, the corresponding bullet should be in its own section to highlight the change more clearly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @F-Noelle about the NEWS item, it looks good to me. Maybe we can emphasize better that it's a {data.table} operator.

@IndrajeetPatil see here, I'm not quite sure your point:

//SPECIAL[text() = '%in%' or text() = '%chin%']

  • On current main, x %chin% 'a' will lint
  • On this PR, x %chin% 'a' will only lint under scalar_in_linter("%chin%") or similar.

F-Noelle marked this conversation as resolved.
Show resolved Hide resolved

### New linters

Expand Down
27 changes: 17 additions & 10 deletions R/scalar_in_linter.R
Original file line number Diff line number Diff line change
@@ -1,12 +1,13 @@
#' Block usage like x %in% "a"
#'
#' `vector %in% set` is appropriate for matching a vector to a set, but if
#' that set has size 1, `==` is more appropriate. `%chin%` from `{data.table}`
#' is matched as well.
#' that set has size 1, `==` is more appropriate.
#'
#' `scalar %in% vector` is OK, because the alternative (`any(vector == scalar)`)
#' is more circuitous & potentially less clear.
#'
#' @param in_operators Character vector of additional functions that behave like the `%in%` operator
F-Noelle marked this conversation as resolved.
Show resolved Hide resolved
#'
#' @examples
#' # will produce lints
#' lint(
Expand All @@ -16,7 +17,7 @@
#'
#' lint(
#' text = "x %chin% 'a'",
#' linters = scalar_in_linter()
#' linters = scalar_in_linter(in_operators = "%chin%")
#' )
#'
#' # okay
Expand All @@ -28,27 +29,33 @@
#' @evalRd rd_tags("scalar_in_linter")
#' @seealso [linters] for a complete list of linters available in lintr.
#' @export
scalar_in_linter <- function() {
scalar_in_linter <- function(in_operators = NULL) {
# TODO(#2085): Extend to include other cases where the RHS is clearly a scalar
# NB: all of logical, integer, double, hex, complex are parsed as NUM_CONST
xpath <- "
//SPECIAL[text() = '%in%' or text() = '%chin%']
xpath <- glue("
//SPECIAL[{xp_text_in_table(c('%in%', {in_operators}))}]
/following-sibling::expr[NUM_CONST[not(starts-with(text(), 'NA'))] or STR_CONST]
/parent::expr
"
")

Linter(linter_level = "expression", function(source_expression) {
xml <- source_expression$xml_parsed_content

bad_expr <- xml_find_all(xml, xpath)
in_op <- xml_find_chr(bad_expr, "string(SPECIAL)")
lint_msg <-
paste0("Use == to match length-1 scalars, not ", in_op, ". Note that == preserves NA where ", in_op, " does not.")
msg <- ifelse(
in_op == "%in%",
"Use == to match length-1 scalars instead of %in%. Note that == preserves NA where %in% does not.",
glue(
"{in_op} behaves similar to %in%. ",
F-Noelle marked this conversation as resolved.
Show resolved Hide resolved
"Use operators like == to match length-1 scalars instead of operators like %in%."
)
)

xml_nodes_to_lints(
bad_expr,
source_expression = source_expression,
lint_message = lint_msg,
lint_message = msg,
type = "warning"
)
})
Expand Down
2 changes: 1 addition & 1 deletion inst/lintr/linters.csv
Original file line number Diff line number Diff line change
Expand Up @@ -90,7 +90,7 @@ repeat_linter,style readability
return_linter,style configurable default
routine_registration_linter,best_practices efficiency robustness
sample_int_linter,efficiency readability robustness
scalar_in_linter,readability consistency best_practices efficiency
scalar_in_linter,readability consistency best_practices efficiency configurable
semicolon_linter,style readability default configurable
semicolon_terminator_linter,defunct
seq_linter,robustness efficiency consistency best_practices default
Expand Down
1 change: 1 addition & 0 deletions man/configurable_linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 2 additions & 2 deletions man/linters.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 7 additions & 5 deletions man/scalar_in_linter.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 7 additions & 5 deletions tests/testthat/test-scalar_in_linter.R
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ test_that("scalar_in_linter skips allowed usages", {

expect_lint("x %in% y", NULL, linter)
expect_lint("y %in% c('a', 'b')", NULL, linter)
expect_lint("c('a', 'b') %chin% x", NULL, linter)
expect_lint("c('a', 'b') %in% x", NULL, linter)
expect_lint("z %in% 1:3", NULL, linter)
# scalars on LHS are fine (often used as `"col" %in% names(DF)`)
expect_lint("3L %in% x", NULL, linter)
Expand All @@ -15,16 +15,18 @@ test_that("scalar_in_linter skips allowed usages", {
})

test_that("scalar_in_linter blocks simple disallowed usages", {
linter <- scalar_in_linter()
lint_in_msg <- rex::rex("Use == to match length-1 scalars, not %in%.")
lint_chin_msg <- rex::rex("Use == to match length-1 scalars, not %chin%.")
linter <- scalar_in_linter(in_operators = c("%chin%", "%notin%"))
lint_in_msg <- rex::rex("Use == to match length-1 scalars instead of %in%.")
F-Noelle marked this conversation as resolved.
Show resolved Hide resolved
lint_chin_msg <- rex::rex("%chin% behaves similar to %in%.")
lint_notin_msg <- rex::rex("%notin% behaves similar to %in%.")

expect_lint("x %in% 1", lint_in_msg, linter)
expect_lint("x %chin% 'a'", lint_chin_msg, linter)
expect_lint("x %notin% 1", lint_notin_msg, linter)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a negative test case (no lints) here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test where I add negative test cases based on configuration and negative test cases for the additional in operators in general.

Is there by the way an expect_no_lint function? I would have expected a wrapper for expect_lint(content, NULL, ...).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea! #2580

})

test_that("multiple lints are generated correctly", {
linter <- scalar_in_linter()
linter <- scalar_in_linter(in_operators = "%chin%")

expect_lint(
trim_some('{
Expand Down
Loading