-
-
Notifications
You must be signed in to change notification settings - Fork 59
Using method = "auto" fails if there is missing data due to the type checking #365
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request fixes a bug where using method = "auto" fails when the dataset contains missing values. The issue occurred in the .vartype() function which performs type checking to determine the appropriate correlation method.
Changes:
- Updated the is_count check to properly handle NA values by adding
any(!is.na(x))guard andna.rm = TRUEparameter to theall()function
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
R/utils_find_correlationtype.R
Outdated
| if (any(!is.na(x)) && all(x %% 1 == 0, na.rm = TRUE)) { | ||
| out$is_count <- TRUE |
Copilot
AI
Jan 13, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There appears to be trailing whitespace at the end of these lines. Please remove the trailing spaces to maintain code consistency.
| if (any(!is.na(x)) && all(x %% 1 == 0, na.rm = TRUE)) { | |
| out$is_count <- TRUE | |
| if (any(!is.na(x)) && all(x %% 1 == 0, na.rm = TRUE)) { | |
| out$is_count <- TRUE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reformatted the code with air.
R/utils_find_correlationtype.R
Outdated
| if (any(!is.na(x)) && all(x %% 1 == 0, na.rm = TRUE)) { | ||
| out$is_count <- TRUE | ||
| } |
Copilot
AI
Jan 13, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a test case for this specific bug fix to ensure that method = "auto" works correctly with missing data. For example, test the scenario mentioned in the PR description where correlation is called with method = "auto" and missing = "keep_pairwise" on data with NA values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a very basic test for missing data combined with method = "auto" on continuous data.
The following currently fails:
The issue arises because the utils_find_correlationtype function assumes no missing data in all(x %% 1 == 0). I've replaced this with a basic check that ensures that NAs are ignored.