@jhorzek jhorzek commented Jan 5, 2026

The following currently fails:

correlation::correlation(data = data.frame(a = sample(c(NA, 1:5), 1000, replace = TRUE),
                                           b = sample(c(NA, 1:5), 1000, replace = TRUE)),
                         method = "auto",
                         missing = "keep_pairwise")

The issue arises because the utils_find_correlationtype function assumes there is no missing data when it evaluates all(x %% 1 == 0). I've replaced this with a basic check that ignores NAs.
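The behaviour described above can be reproduced in base R. The sketch below uses two hypothetical helpers, `is_count_old()` and `is_count_new()`, as stand-ins for the check before and after the change; they are not functions from the package, and the surrounding package code may differ in detail.

```r
# Hypothetical stand-ins for the check before and after the change;
# the real code lives inside the package and may differ in detail.
is_count_old <- function(x) {
  all(x %% 1 == 0)
}

is_count_new <- function(x) {
  any(!is.na(x)) && all(x %% 1 == 0, na.rm = TRUE)
}

x_whole   <- c(1, 2, NA, 4)        # whole numbers with a missing value
x_decimal <- c(1.5, 2, NA, 4)      # a non-whole value is present
x_all_na  <- c(NA_real_, NA_real_) # nothing but missing values

is_count_old(x_whole)    # NA: used in if (), this raises an error
is_count_old(x_decimal)  # FALSE: a single FALSE lets all() decide despite the NA
is_count_new(x_whole)    # TRUE
is_count_new(x_decimal)  # FALSE
is_count_new(x_all_na)   # FALSE: the guard stops all(logical(0)) from returning TRUE
```

Note that the old check only misbehaves when every non-missing value is a whole number. The `any(!is.na(x))` guard matters because `all()` on an empty logical vector returns `TRUE`, which would otherwise flag an all-NA column as count data.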

Copilot AI left a comment

Pull request overview

This pull request fixes a bug where method = "auto" fails when the dataset contains missing values. The failure occurred in the .vartype() function, which checks variable types to determine the appropriate correlation method.

Changes:

  • Updated the is_count check to handle NA values: added an any(!is.na(x)) guard and passed na.rm = TRUE to all()

Comment on lines 60 to 61
if (any(!is.na(x)) && all(x %% 1 == 0, na.rm = TRUE)) {
  out$is_count <- TRUE

Copilot AI Jan 13, 2026

There appears to be trailing whitespace at the end of these lines. Please remove the trailing spaces to maintain code consistency.

Suggested change
if (any(!is.na(x)) && all(x %% 1 == 0, na.rm = TRUE)) {
  out$is_count <- TRUE

Author

I reformatted the code with air.

Comment on lines 60 to 62
if (any(!is.na(x)) && all(x %% 1 == 0, na.rm = TRUE)) {
  out$is_count <- TRUE
}

Copilot AI Jan 13, 2026

Consider adding a test case for this specific bug fix to ensure that method = "auto" works correctly with missing data. For example, test the scenario mentioned in the PR description where correlation is called with method = "auto" and missing = "keep_pairwise" on data with NA values.

Author

I've added a very basic test for missing data combined with method = "auto" on continuous data.
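For illustration, a regression test along these lines might look like the sketch below. It is not the test added in this PR. It assumes the testthat package is available, reuses the data and call from the PR description, and adds a seed for reproducibility. Whole-number data are used because that is the case in which the old check returned NA.

```r
library(testthat)

# Data mirror the example in the PR description; the seed is an addition
# so that the sketch is reproducible.
set.seed(123)
dat <- data.frame(
  a = sample(c(NA, 1:5), 1000, replace = TRUE),
  b = sample(c(NA, 1:5), 1000, replace = TRUE)
)

test_that("method = 'auto' runs when the data contain NAs", {
  skip_if_not_installed("correlation")
  # Passing NA as the regexp asserts that no error is raised
  expect_error(
    correlation::correlation(
      data = dat,
      method = "auto",
      missing = "keep_pairwise"
    ),
    NA
  )
})
```

Asserting only the absence of an error keeps the test independent of which correlation method "auto" ends up selecting.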
