Document reading tabular pedigree formats into sgkit #1012

Open
@timothymillar

We don't currently have any IO functionality for pedigree formats. These formats are usually tabular, but their layout can vary considerably. We should document how to read in some generic examples and add them to an sgkit-style dataset.

Basic workflow:

  • Read tabular format as pandas dataframe
  • Assign sample identifiers to the sample_id variable
  • Assign parental columns to the parent_id variable
  • Optionally set coords for the parents dim (['Father', 'Mother'], ['Sire', 'Dam'], etc.)
  • Use parent_indices to generate the parents array and explain the 0-based indexing etc.
  • Do something interesting like calculating kinship.
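
The workflow above could be sketched roughly as follows. This is only an illustration under stated assumptions: the table contents, the column names (`id`, `sire`, `dam`) and the use of `.` for an unknown parent are made up for the example, and both the index lookup and the `kinship` helper are hand-written stand-ins (the issue proposes using sgkit's `parent_indices` and a kinship method for those steps). The kinship step uses the standard recursive (tabular) method for diploids and assumes parents are listed before their offspring.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical pedigree table; "." marks an unknown parent (assumed convention).
PEDIGREE_CSV = """id,sire,dam
S0,.,.
S1,.,.
S2,S0,S1
S3,S0,S1
S4,S2,S3
"""

# Read the tabular format as a pandas DataFrame, keeping identifiers as strings.
df = pd.read_csv(io.StringIO(PEDIGREE_CSV), dtype=str)

# Sample identifiers and parental columns, shaped like the sample_id and
# parent_id variables described in the issue.
sample_id = df["id"].to_numpy()             # shape (samples,)
parent_id = df[["sire", "dam"]].to_numpy()  # shape (samples, parents)
parents_coord = ["Sire", "Dam"]             # optional labels for the parents dim

# Map each parent identifier to the 0-based row index of that sample.
# -1 marks an unknown parent; note that any identifier missing from
# sample_id is also silently mapped to -1 in this sketch.
index_of = {s: i for i, s in enumerate(sample_id)}
parent = np.array([[index_of.get(p, -1) for p in row] for row in parent_id])


def kinship(parent):
    """Pedigree kinship via the recursive (tabular) method for diploids.

    Illustrative helper: assumes every parent has a lower row index
    than its offspring and that founders are unrelated and non-inbred.
    """
    n = len(parent)
    K = np.zeros((n, n))
    for i in range(n):
        p, q = parent[i]
        if p >= i or q >= i:
            raise ValueError("parents must be listed before their offspring")
        for j in range(i):
            kp = K[j, p] if p >= 0 else 0.0
            kq = K[j, q] if q >= 0 else 0.0
            K[i, j] = K[j, i] = 0.5 * (kp + kq)
        # Self-kinship is 0.5 * (1 + kinship between the two parents).
        K[i, i] = 0.5 * (1.0 + (K[p, q] if p >= 0 and q >= 0 else 0.0))
    return K


K = kinship(parent)
print(pd.DataFrame(K, index=sample_id, columns=sample_id))
```

In an sgkit-style dataset these arrays would presumably be stored along a `samples` dimension (and a `parents` dimension for the two-column arrays), with `parents_coord` as the optional coordinate labels.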

Labels: IO (issues related to reading and writing common third-party file formats), documentation (improvements or additions to documentation)
