Seurat3 and subsetting features with dashes (-) in the name #1212

Closed
jeremycfd opened this issue Mar 8, 2019 · 9 comments

Comments

@jeremycfd

Hello,

I've found that I'm unable to use the subset() function to subset based on features that have dashes in their name.

Aggreg.data <- Read10X(data.dir = "./filtered_feature_bc_matrix/")
test <- CreateSeuratObject(counts = Aggreg.data, min.cells = 50, min.features = 0)

This works fine:

summary(test@assays$RNA@data["CD8A",])
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000 0.000 0.000 3.556 3.000 94.000

test2 <- subset(x = test, subset = CD8A > 0)
summary(test2@assays$RNA@data["CD8A",])
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.00 3.00 6.00 9.81 14.00 94.00

However, this does not work:

test3 <- subset(x = test, subset = MT-CO3 > 0)
Error in FetchData(object = object, vars = expr.char[vars.use], cells = cells) :
None of the requested variables were found:
test3 <- subset(x = test, subset = 'MT-CO3' > 0)
Error in FetchData(object = object, vars = expr.char[vars.use], cells = cells) :
None of the requested variables were found:

What I find more troubling is that if a single requested variable is found, Seurat will not warn you that the "missing" variables were not used in the subset:

test4 <- subset(x = test, subset = 'MT-CO3' > 0 & nFeature_RNA > 500)
summary(test4@assays$RNA@data["MT-CO3",])
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00 28.00 50.00 81.78 102.00 1506.00

This is kind of strange to me because Seurat will specifically convert underscores to dashes in feature names, so I assume this is not intended. Does anyone know if there is a workaround for this, or should I replace all underscores and dashes in the features tsv file before reading in the data?

Thanks!

@andrewwbutler
Collaborator

Hi,

To use subset or WhichCells on features that have dashes in their names you'll need to surround the feature in backticks. E.g:

test3 <- subset(x = test, subset = `MT-CO3` > 0)

This is consistent with how base R subsets data.frames based on column names.

As far as the missing variable problem, when you put the name in quotes, R will treat it as a string rather than a variable so it will evaluate the expression 'MT-CO3' > 0, which evaluates to TRUE (you can verify this outside of subset by just entering that expression in the command line).
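The difference between the three spellings can be checked in base R without Seurat. The sketch below uses a small made-up data frame (the values and the `expr` name are hypothetical, not from the dataset in this issue) and assumes a common locale in which digits sort before letters.

```r
# Toy stand-in for expression data: rows are cells, columns are features.
# Values are invented for illustration only.
expr <- data.frame(`MT-CO3` = c(0, 5, 2),
                   nFeature_RNA = c(600, 300, 900),
                   check.names = FALSE)

# Quoted: a string compared with a number. R coerces 0 to "0" and compares
# the two strings, giving a single TRUE that filters nothing.
quoted <- "MT-CO3" > 0

# Backticked: a reference to the column, compared element-wise.
kept <- subset(expr, `MT-CO3` > 0)

# Neither quoted nor backticked: parsed as the variable MT minus the variable CO3.
parsed <- quote(MT-CO3 > 0)
vars_seen <- all.vars(parsed)
```

Here `quoted` is one logical value rather than one per cell, which is why a condition such as `'MT-CO3' > 0 & nFeature_RNA > 500` ends up filtering on `nFeature_RNA` alone.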

@jeremycfd
Author

Ah, thanks so much for the help!

@katie-connor

Hi there, I am having this exact warning coming up when I am running my shiny app.

The app works, but I can't get rid of this warning message. If I replace input$single_cell_gene with a literal "genename", there is no problem. I have also tried wrapping it as as.character(input$single_cell_gene).

cell_type_violin <- reactive({
  p <- VlnPlot(object = data, features = input$single_cell_gene, pt.size = input$dot_size)
  p + theme(axis.text.x = element_text(size = rel(0.7))) + labs(x = NULL)
})

This has been bugging me for ages - any help much appreciated!

@wgk51

wgk51 commented Mar 1, 2021

This fix, to have the gene name surrounded by backticks, no longer works in Seurat 4.0. Any ideas for how to get around this? Trying to subset genes with "-" in the name and getting the below error. I get the same error if I put any other gene name in ticks, even those that subset just fine without them.

Error in FetchData(object = object, vars = unique(x = expr.char[vars.use]), :
None of the requested variables were found:

@andrewwbutler
Collaborator

Hi @wgk51, can you provide an example?

@swaggggyzhao

swaggggyzhao commented Sep 27, 2021

I got the same problem as @wgk51.
Example:
I did: WhichCells(PC, expression='Ighv7-1'>0 )
Error in FetchData(object = object, vars = unique(x = expr.char[vars.use]), :
None of the requested variables were found:

@cwseidel

Putting a gene name in back ticks works fine for me. But how does one put a gene name into a variable?
subset(obj, subset = `Nkx1-2` > 0) # success!
mygene <- "Sox10"
subset(obj, subset = mygene > 0) # doesn't work! :(
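The reason `mygene > 0` fails is that the expression is captured as written, so the lookup is for a feature literally called `mygene`. A minimal base R sketch of two ways around this follows. It uses a made-up genes-by-cells matrix (`expr` and its values are hypothetical) rather than a Seurat object. In Seurat, the first option would presumably map to `FetchData(obj, vars = mygene)` followed by `subset(obj, cells = ...)`, but that mapping is an assumption and is not tested here.

```r
# Toy genes-by-cells matrix standing in for an assay's data (invented values).
expr <- matrix(c(0, 3, 0, 1,
                 2, 0, 0, 4),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("Sox10", "Nkx1-2"), paste0("cell", 1:4)))

mygene <- "Nkx1-2"

# Option 1: index by name. No backticks are needed because the name is
# used as a string, not parsed as code.
cells_by_index <- colnames(expr)[expr[mygene, ] > 0]

# Option 2: build the condition from the string, then evaluate it against
# a cells-by-genes data frame. call() with as.name() is equivalent to
# typing the backticked form by hand.
df <- as.data.frame(t(expr))
cond <- call(">", as.name(mygene), 0)
cells_by_expr <- rownames(df)[eval(cond, df)]
```

Both options return the same cell names, so the choice is mostly about whether the downstream function wants a list of cells or an expression.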

@paulitikka

paulitikka commented Sep 5, 2022

I have the same problem:

So this works:

length(WhichCells(data, slot = 'data', expression = Pitx2 > 0)) #[1] 379

But this does not work:

a=c('Pitx2')
length(WhichCells(data, slot = 'data', expression = a[1] > 0))
#Error in FetchData.Seurat(object = object, vars = unique(x = expr.char[vars.use]), : None of the requested variables were found:

#This would be needed to loop many gene names at one go. E.g. via:
vars=c('Pitx2','Sox2','Shh')
for(i in 1:length(vars)) {
length(WhichCells(data, slot = 'data', expression =vars[i]> 0))}

My workaround for this gene variables issue is to make another function 'tasta' and loop that:

tasta <- function(data, vars, rangi) {
  ax1 <- c()  # number of expressing entries per cluster, filled in below
  for (i in 1:rangi) {
    conda <- data@meta.data[, 'seurat_clusters'] == strtoi(names(table(data@meta.data[, 'seurat_clusters']))[i])
    testa <- data.frame(GetAssayData(data[vars, conda]))
    ax1 <- append(ax1, sum(testa > 0))
  }
  return(ax1)
}

#Test variables for the function:
vars=c('Pitx2','Sox2','Shh', 'Cdh5');
x=c();ax1=c();testa=c(); a <- NULL; rangi=max(as.numeric(data@meta.data[,'seurat_clusters']))

#Testing the function:
for(i in 1:length(vars)) {a <- c(a,tasta(data,vars=vars[i],rangi))}

#Making the result as a datamatrix which has clusters in rows and gene names in columns (this solution should be defaulted):
hei=data.frame(matrix(a,nrow = 14, ncol = 4,byrow = FALSE)); names(hei)=vars;
hei$'Total Number of Cells'=table ( Idents( data) ); # Good to have also the total number of cells in clusters (rows)
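The same clusters-by-genes table can be sketched more compactly when the gene names are used as strings for indexing instead of being passed through an expression. The example below is base R only, with a made-up matrix and cluster vector (`expr`, `clusters` and all values are hypothetical). With a Seurat object, the matrix would presumably come from `GetAssayData()` and the clusters from `Idents()`, which is an assumption here.

```r
# Toy genes-by-cells matrix and cluster labels (invented values).
expr <- matrix(c(1, 0, 2, 0, 0, 3,   # Pitx2
                 0, 0, 0, 1, 1, 1,   # Sox2
                 5, 5, 5, 0, 0, 0),  # Shh
               nrow = 3, byrow = TRUE,
               dimnames = list(c("Pitx2", "Sox2", "Shh"), paste0("cell", 1:6)))
clusters <- factor(c("0", "0", "0", "1", "1", "1"))

vars <- c("Pitx2", "Sox2", "Shh")

# For each gene, count the cells in each cluster with expression above zero.
# The result has clusters in rows and genes in columns.
counts <- sapply(vars, function(g) tapply(expr[g, ] > 0, clusters, sum))

# Append the total number of cells per cluster as a last column.
result <- cbind(counts, total = as.vector(table(clusters)))
```

Because the dimensions come from `vars` and the cluster levels, nothing like `nrow = 14, ncol = 4` has to be hard-coded.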

@HappinessEricst

Hi, I have encountered a problem that keeps giving me an error. Here is the code I wrote:
receptor.df <- FetchData(object = ha.obj, vars=receptor.names, slot="counts")

Here is the error I keep getting:
Error in FetchData.Seurat(object = ha.obj, vars = receptor.names, slot = "counts") : None of the requested variables were found

Can anyone please help me out? I am trying to convert the ENSEMBL gene IDs to gene symbols and add them as metadata.
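This error means that none of the names in the requested vector matched a feature or metadata column, which can happen when the object is keyed by one kind of identifier and the request uses another. One way to diagnose it is to compare the two name sets directly. The sketch below is base R with invented names (`available`, `requested` and the placeholder ID are hypothetical). With a Seurat object, `available` would presumably be `rownames(ha.obj)`, which is an assumption.

```r
# Names the object actually contains (stand-in for the object's row names).
available <- c("CD4", "CD8A", "MT-CO3")

# Names being requested; one is a placeholder ID and one differs in case.
requested <- c("CD4", "ENSG_hypothetical_1", "cd8a")

# Split the request into names that exist and names that do not.
found   <- intersect(requested, available)
missing <- setdiff(requested, available)

# Matching is exact and case-sensitive, so "cd8a" is reported as missing.
message("found: ", paste(found, collapse = ", "))
message("missing: ", paste(missing, collapse = ", "))
```

If `found` is empty, the request and the object are using different identifiers and the names need to be mapped before fetching.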
