using gene_id rather than gene_name when reading in 10X data #630

Closed
@andyrussell

Description

When using the 'Read10X' command, Seurat reads the genes.tsv file from the CellRanger output and uses the gene_name (the second column of genes.tsv) as the gene names in your Seurat object. For example, a genes.tsv file looks like this:

$ head filtered_gene_bc_matrices/hg19/genes.tsv
ENSG00000243485    MIR1302-10
ENSG00000237613    FAM138A
ENSG00000186092    OR4F5
ENSG00000238009    RP11-34P13.7
ENSG00000239945    RP11-34P13.8
ENSG00000237683    AL627309.1
ENSG00000239906    RP11-34P13.14
ENSG00000241599    RP11-34P13.9
ENSG00000228463    AP006222.2
ENSG00000237094    RP4-669L17.10

(source: https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/output/matrices)

This is obviously the most useful option for well-annotated organisms such as mouse and human, as it's easier to work with shorter, distinct gene names. In my case, however, many of the gene names are identical. Is there a way to read in the gene_ids instead (i.e. the first column of the genes.tsv file from the CellRanger output)? If not, could this be added as an option in 'Read10X'?

Failing this, does anyone have advice on how to modify the genes.tsv file after CellRanger has finished, so that I can read in the modified genes.tsv file instead?
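
To make the genes.tsv question concrete, here is a minimal sketch of the kind of post-processing I have in mind, written in Python rather than R. It assumes genes.tsv is tab-separated with the gene_id in the first column and the gene_name in the second, and it writes a copy in which the second column is overwritten with the gene_id. The function name `use_gene_ids_as_names` and the file paths are hypothetical, chosen only for illustration.

```python
def use_gene_ids_as_names(src, dst):
    """Copy a genes.tsv file, replacing column 2 (gene_name) with column 1 (gene_id).

    Assumes a tab-separated file with no header line.
    Returns the number of rows written.
    """
    n_rows = 0
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fields = line.rstrip("\r\n").split("\t")
            if not fields[0]:
                continue  # skip blank lines
            gene_id = fields[0]
            # Any columns after the second are passed through unchanged.
            fout.write("\t".join([gene_id, gene_id] + fields[2:]) + "\n")
            n_rows += 1
    return n_rows


if __name__ == "__main__":
    # Hypothetical paths: adjust to your own CellRanger output directory.
    n = use_gene_ids_as_names(
        "filtered_gene_bc_matrices/hg19/genes.tsv",
        "filtered_gene_bc_matrices/hg19/genes.ids.tsv",
    )
    print(f"wrote {n} rows")
```

Since 'Read10X' reads the file named genes.tsv, the modified copy would presumably have to take the place of the original in the directory passed to 'Read10X', so I would keep a backup of the original file before swapping it in.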

Thank you in advance for any help you can give!
