Discrepancy in Long Trajectory Extraction: Sorting by Displacement Instead of Traverse Length #35

@studentK2004

While reviewing the pseudotime estimation implementation in pseudo_time.py (as of the latest version), I noticed a potential discrepancy between the methodology described in the CellDancer paper and how the current code selects long trajectories.
According to the paper (Step 5: Long Trajectory Extraction and Pseudotime Assignment), the traverse length of a trajectory is the accumulated distance $\sum_t \lVert \xi(t + \Delta t) - \xi(t) \rVert$, and the long trajectories $\{L_k(t)\}_{k=1,\ldots,m}$ are selected by this metric: the longest trajectory is chosen iteratively, and trajectories similar to it within a cutoff are eliminated.
However, in the current implementation:

- The compute_trajectory_displacement function (line 38) calculates the straight-line (Euclidean) distance from the start point to the end point of each trajectory.
- Trajectories are sorted by this displacement metric (line 595: `order = np.argsort(traj_displacement)[::-1]`) rather than by the cumulative traverse length provided by compute_trajectory_length (line 41).
- The extract_long_trajectories function (line 607) then proceeds with this sorted order, potentially prioritizing straighter paths over those with significant detours or loops.
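
To make the difference concrete, here is a minimal sketch of how the two metrics can rank the same trajectories in opposite orders. It is not the code from pseudo_time.py: the helper names `displacement`, `traverse_length`, and `rank_desc` and the two toy trajectories are hypothetical, and it uses only the Python standard library so it can be run without the project installed.

```python
import math

def displacement(traj):
    """Straight-line (Euclidean) distance from the first to the last point."""
    return math.dist(traj[0], traj[-1])

def traverse_length(traj):
    """Accumulated distance: sum of the lengths of consecutive steps."""
    return sum(math.dist(p, q) for p, q in zip(traj, traj[1:]))

def rank_desc(values):
    """Indices that sort `values` from largest to smallest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)

# Two toy 2-D trajectories (hypothetical data, not from the package).
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
detour = [(0.0, 0.0), (0.0, 2.0), (2.0, 2.0), (2.0, 0.0)]
trajectories = [straight, detour]

disp = [displacement(t) for t in trajectories]
length = [traverse_length(t) for t in trajectories]

# Ranking by displacement puts the straight path first;
# ranking by traverse length puts the detour first.
order_by_displacement = rank_desc(disp)
order_by_length = rank_desc(length)

print("displacement:", disp, "order:", order_by_displacement)
print("traverse length:", length, "order:", order_by_length)
```

In this toy case the detour covers twice the distance of the straight path but ends closer to its start, so a displacement-based sort ranks it last while a traverse-length sort ranks it first.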
