Issue with FindMarkers() methodology

The two parts of the find markers function: 1) fold change calculation and 2) p value calculation.  Fold change is calculated using raw counts, while the p value is calculated using normalized data.  Why is this?  I do not believe raw counts are comparable to normalized data in this scenario, and here is why: 

Raw counts vs normalized counts result in a change of directionality of many peaks.  Example: peak A might have a positive logFC value with raw counts, but it becomes negative when using normalized data.  This means that the input data for the p value and the input data for the fold change are contradicting each other.  

I calculated fold change on the raw counts and normalized counts and looked at how this changes direction of peak fold changes.  
```
# raw counts
u.load <- mtx[, load.cells]
u.load <- rowMeans(u.load)+1
u.normal <- mtx[, norm.cells]
u.normal <- rowMeans(u.normal)+1
fc <- u.load/u.normal

table(fc > 1)
FALSE  TRUE 
55299 47509

a <- fc > 1

# normalized counts
mtx2 <- atac[['peaks']]@data
u.load <- mtx2[, load.cells]
u.load <- rowMeans(u.load)+1
u.normal <- mtx2[, norm.cells]
u.normal <- rowMeans(u.normal)+1
fc <- u.load/u.normal

table(fc > 1)
FALSE  TRUE 
19302 83506

b <- fc > 1

table(a == b)
FALSE  TRUE 
36399 66409

```
In this case, out of 102808 peaks, 36399 of them change direction depending on whether raw vs normalized data is used as input.  For these peaks, the input for the p value and the input for the fold change calculation are directly contradicting each other.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Issue with FindMarkers() methodology #795

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue with FindMarkers() methodology #795

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions