-
Notifications
You must be signed in to change notification settings - Fork 26
Switching from summarise() to reframe() to avoid dplyr warnings #518
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: devel_pre
Are you sure you want to change the base?
Conversation
cying111
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested on test data, looks all good to me.
| hitsDF <- hitsDF %>% dplyr::select(queryHits, chr, start, end, strand) %>% | ||
| group_by(queryHits, chr, strand) %>% | ||
| summarise(start = max(start), end = min(end), .groups = "drop") %>% | ||
| reframe(.by = c(queryHits, chr, strand), start = max(start), end = min(end)) %>% ungroup() %>% |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need ungroup() after reframe()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reframe() gives ungrouped data frame, so ungroup() is redundant (for example above). Other than that, the change looks good to me.
It resolved the github issue #502
| txScore.noFit = weighted.mean(txScore.noFit, readCount_tmp)) %>% | ||
| group_by(chr, strand, start, end) %>% | ||
| summarise(readCount = sum(readCount), | ||
| reframe(.by = c(chr, strand, start, end), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the warning really is coming from this usage only, where txScore is from multiple samples, hence, multiple rows will be produced in this step.
A quick fix would be change it to mutate, cause there is no summarising happening at this step.
All the other instances of summarise no need to change actually. I have tested on my side for this.
Thanks
No description provided.