-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added expansion factors for the mz and rt range used in fillPeaks.chrom. #3
base: master
Are you sure you want to change the base?
Conversation
Hi, I am trying a few things with the new code, and would like to add
|
Yes I would expect the same. It is not the case? If not I can investigate next week when I get back. |
Ok. Damn. I will investigate next week. I wonder if these are at the extremes of the retention time range and getPeaks doesn't like it or something. |
The added zeros were indeed because getPeaks doesn't handle if the rt range of the file is exceeded. I have added a check to limit rtmin and rtmax to the limits in each file (in my fork, pull request not submitted yet). There is still something strange going on. getPeaks is returning slightly lower values for some peaks in some samples even with expanded peakrange. I have not figure out why yet. I guess I will have to understand exactly how getPeaks works to figure that out so no pull request yet. Example to see the problems: library(faahKO)
library(RUnit)
library(stringr)
xsg <- group(faahko)
xsgf1 <- fillPeaks(xsg, method="chrom")
xsgf1111 <- fillPeaks(xsg, method="chrom", expand.rt=1.1, expand.mz=1.1)
select = !(peaks(xsgf1)[xsgf1@filled, "into"] <= peaks(xsgf1111)[xsgf1111@filled, "into"])
View(peaks(xsgf1)[xsgf1@filled[select],])
View(peaks(xsgf1111)[xsgf1111@filled[select],]) Any ideas? |
I have tracked down the problem. The problem is that getPeaks calculates the average peakwidth: pwid <- diff(stime[iret])/diff(iret) With the expanded peakrange that obviously lead to inclusion of additional scans as intended. rmat[i,7] <- pwid*sum(ymax) This average peakwidth is occasionally lower if the extra scan has a slightly shorter than "normal" "width". This in turn makes the calculated area lower. The difference is really really small in the cases I have seen. So the only solutions I can see is to use the peakwidth for each scan for the calculation or not use this particular unit test. I would be hesitant to change such a fundamental calculation in xcms... |
I have added min.width.mz and min.width.rt arguments. In my test dataset setting a reasonable min.width.mz was much more important than expanding the range by a factor. It almost cut the zeros in half whereas the expansion factor only cut a few percent point. The dataset had some features with ridiculously low mz ranges making re-integration impossible: |
…krange matrix must not be coersed to a vector when there is only one peak to expand.
I think this adds some nice features that can optionally be 'turned off' if the user desires to use the original functionality. I support pulling this, and am currently using it in my own code to prevent certain cases when Pwid = Na which I believe is being caused by the M/z range being too narrow. |
Sorry, I think there are a number of issues here. If this is implemented, the defaults should be FALSE or NULL. Can someone please explain why expand.rt and expand.mz are needed in the first place in fillPeaks? Is this:
cheers |
This is workaround for when the scan and rt ranges found in the original profiling might not be representable, however they are defined. The reasons I could come up with for why this seem to be a pervasive problem with some datasets include:
As you can see from the plot above mz ranges go all the way down to 0 and the re-integration of peaks are therefore done on completely different terms even if you probably won't argue that the accuracy differs that much. It is true that using the expansion factors run the risk of integration other peaks. But if a reasonable setting of min.width.rt and in particular min.width.mz creates overlap then this peak can never be trusted anyway; you have overlap that cannot be resolved by your system. I would think that the minimum settings would be much less dangerous than using the group minima and maxima. Why must the defaults be NULL/FALSE? The current defaults (expand.mz=1,expand.rt=1,min.width.mz=0,min.width.rt=0) leaves the ranges unchanged and should be identical to the current version. |
It looks like my first pull request was accepted (the expand.mz and expand.rt) but not the min.width.mz and min.width.rt pull request. Was that intended? |
Was a decision made on this issue? |
I'd still very much like to get this finished... |
No description provided.