Options for dealing with missing flanking data in samples #224

@hyanwong

As discussed with @jeromekelleher, it would be useful to be able to produce inferred tree sequences in which the input samples have missing data to the left and right of a known sequence (e.g. fragments from a sequencer). This ability has been removed from #169, as it seems easier and more flexible to handle it after running inference-with-imputation on the fragmented sequences. For example, this would also allow large sections of missing data in the middle of a sample (rather than only in the flanking regions) to be marked as "truly missing and not for imputation".

I intend to write a function that does this independently of the inference process. All it needs to do is take an inferred TS and its corresponding SampleData file, remove the edges that link each sample to the tree at sites that are missing in that particular sample, and then call simplify(keep_unary=True). This issue replaces #153, and should eventually subsume #173.
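
To make the edge-removal step concrete, here is a minimal sketch in plain Python. It is not the planned implementation: the `Edge` tuple, the `clip_sample_edges` name, and the `known_intervals` mapping (sample node ID to a list of half-open `[start, end)` regions with real data) are all hypothetical, and the sketch assumes the missing regions have already been worked out from the SampleData file. In practice the edges would presumably come from the inferred tree sequence's edge table, and the clipped result would then be simplified with keep_unary=True as described above.

```python
from typing import Dict, List, NamedTuple, Tuple


class Edge(NamedTuple):
    """Hypothetical stand-in for one row of an edge table."""
    left: float
    right: float
    parent: int
    child: int


def clip_sample_edges(
    edges: List[Edge],
    known_intervals: Dict[int, List[Tuple[float, float]]],
) -> List[Edge]:
    """Keep only the parts of each sample's edges that overlap known data.

    known_intervals maps a sample node ID to the half-open [start, end)
    regions where that sample has data (an assumption of this sketch).
    Edges whose child is not listed are passed through unchanged.
    """
    clipped = []
    for edge in edges:
        if edge.child not in known_intervals:
            # Internal (non-sample) edge: leave as-is.
            clipped.append(edge)
            continue
        for start, end in known_intervals[edge.child]:
            # Intersect the edge span with this region of known data.
            left = max(edge.left, start)
            right = min(edge.right, end)
            if left < right:
                clipped.append(Edge(left, right, edge.parent, edge.child))
    return clipped


# Sample 0 only has data in [20, 70); sample 1 has a gap in the middle.
example_edges = [
    Edge(0, 50, 4, 0),
    Edge(50, 100, 5, 0),
    Edge(0, 100, 4, 1),
    Edge(0, 100, 6, 4),
]
example_known = {0: [(20, 70)], 1: [(0, 30), (60, 100)]}
example_clipped = clip_sample_edges(example_edges, example_known)
```

Taking a list of intervals per sample, rather than a single start and end, is what lets the same function cover missing data in the middle of a sample as well as in the flanks.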
