Basic population structure analysis is a key goal for sgkit. This is a top-level issue which tracks progress on this target, following the example of #67 for GWAS.
- Method for grouping samples into "populations" (Method for grouping samples which is understood by library functions #224)
- PCA that scales to large cohorts (>10K individuals), which is essential (see PCA User Story #95, PCA #123)
- Fst (Fst function #225)
- Pairwise distance (Pairwise distance calculation #241)
@alimanfoo, is this a reasonable starting point for doing basic population structure analysis with sgkit?