equal_frequency bins in highly_variable_genes #415

Closed
@davidhbrann

Hi,

In Seurat's variable gene function, I've had some success using the equal_frequency option, where each bin contains an equal number of genes. Would it be possible to implement this option in scanpy?

If you'd like, I could submit a PR to implement this feature. I think it could be as simple as using pd.qcut instead of pd.cut. Alternatively, it could follow the style of the cell_ranger flavor: pd.cut(df['mean'], np.r_[-np.inf, np.percentile(df['mean'], np.arange(10, 105, 5)), np.inf]). I don't know how useful it would be, but I could also add an option for more bins in the cell_ranger flavor by replacing np.arange(10, 105, 5) with np.linspace(10, 100, n_bins - 1).
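
To make the difference concrete, here is a minimal sketch of the three binning variants on synthetic data. It is not scanpy's actual implementation: the DataFrame, its 'mean' column values, the random seed, and n_bins = 20 are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic, skewed "mean expression" values (an assumption for illustration only).
rng = np.random.default_rng(0)
df = pd.DataFrame({"mean": rng.lognormal(mean=0.0, sigma=1.0, size=2000)})

n_bins = 20

# Equal-width bins: every bin spans the same range of 'mean',
# so most genes pile up in the first few bins for skewed data.
equal_width = pd.cut(df["mean"], bins=n_bins)

# Equal-frequency bins: edges are quantiles, so every bin holds
# (roughly) the same number of genes.
equal_freq = pd.qcut(df["mean"], q=n_bins, duplicates="drop")

# Percentile-based edges in the style quoted above for the cell_ranger flavor.
percentiles = np.arange(10, 105, 5)
edges = np.r_[-np.inf, np.percentile(df["mean"], percentiles), np.inf]
percentile_bins = pd.cut(df["mean"], edges)

# The proposed generalisation: with n_bins = 20 it yields the same percentiles.
general_percentiles = np.linspace(10, 100, n_bins - 1)

width_counts = equal_width.value_counts()
freq_counts = equal_freq.value_counts()

print("equal-width, genes per bin:", width_counts.min(), "to", width_counts.max())
print("equal-frequency, genes per bin:", freq_counts.min(), "to", freq_counts.max())
```

Note that with the percentile edges, the first bin covers everything up to the 10th percentile and the last bin, above the 100th percentile, is empty, so these bins are not strictly equal in size the way the pd.qcut bins are.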

Best,
David
