Find a better method to represent a centroid  #61

@axiomcura

Maybe I am misunderstanding your goals or the pycytominer aggregation, so please let me know if so. You said here in the documentation that the data was aggregated by component-wise median, which I assume just means taking the median of every feature separately (or is this a typo, and you meant the geometric median?). I have read that the component-wise median generalizes poorly from 1D to higher-dimensional data and can produce unrealistic centroid profiles.

In the extreme case, if features are strongly correlated or anti-correlated, the median profile can be far from anything typical or realistic, unless this type of correlation structure has already been dealt with:

| Sample | Feature X | Feature Y |
|--------|-----------|-----------|
| S1     |    1      |    10     |
| S2     |    2      |     9     |
| S3     |    9      |     2     |
| S4     |   10      |     1     |
| Median |   5.5     |   5.5     |
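
To make the table concrete, here is a minimal sketch that reproduces the component-wise median with NumPy and measures how far it sits from the nearest observed sample. It also computes a medoid (the observed sample with the smallest total Euclidean distance to the others) purely as one illustrative alternative. The variable names are hypothetical, and none of this is meant to reflect how pycytominer or this repository actually aggregates profiles.

```python
import numpy as np

# Toy profiles from the table above: rows are S1..S4, columns are Feature X, Feature Y.
profiles = np.array([
    [1.0, 10.0],
    [2.0, 9.0],
    [9.0, 2.0],
    [10.0, 1.0],
])

# Component-wise median: each feature is reduced independently of the others.
componentwise_median = np.median(profiles, axis=0)

# Euclidean distance from that centroid to every observed sample.
dist_to_samples = np.linalg.norm(profiles - componentwise_median, axis=1)

# Medoid (illustrative alternative, an assumption of this sketch):
# the observed sample minimizing the summed distance to all other samples.
pairwise = np.linalg.norm(profiles[:, None, :] - profiles[None, :, :], axis=2)
medoid = profiles[np.argmin(pairwise.sum(axis=1))]

print("component-wise median:", componentwise_median)
print("distance to nearest sample:", dist_to_samples.min())
print("medoid:", medoid)
```

The medoid is by construction one of the measured profiles, whereas the component-wise median here lands in a region where no sample was observed. The medoid is used rather than the geometric median because these four points are collinear, so the geometric median is not unique in this toy case and (5.5, 5.5) is among its minimizers.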

Originally posted by @wli51 in #60 (comment)

Labels: Look at it later (These are issues I will return to in the future)
