We often have the use case of comparing two or more datasets where the corresponding sites have different allele mappings. For example, the ordering could be different, or one dataset could have alleles that aren't in the other.
This makes directly comparing the genotype arrays annoying and error-prone. It would be good to have some API support for re-mapping the alleles in sgkit.
Maybe something like ds.remap_alleles(allele_mapping), which returns a dataset with the remapped alleles? allele_mapping would be either:
a dict such as {'A': 1, 'G': 56, '-': 2} that would be used across all variants, or
a list of dicts, giving a separate mapping for each variant.
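To make the proposal concrete, here is a rough sketch of what the remapping could do, written against plain NumPy arrays rather than a dataset. Everything here is an assumption for illustration: remap_genotypes is a hypothetical name, not an existing sgkit function, and it assumes a (variants, alleles) array of allele strings padded with '' plus a (variants, samples, ploidy) integer genotype array where negative values mean missing.

```python
import numpy as np


def remap_genotypes(variant_allele, call_genotype, allele_mapping):
    """Hypothetical helper: rewrite genotype indices using allele_mapping.

    allele_mapping is either one dict (allele string -> new index) used for
    every variant, or a sequence with one such dict per variant.
    """
    variant_allele = np.asarray(variant_allele)
    call_genotype = np.asarray(call_genotype)
    n_variants, n_alleles = variant_allele.shape

    if isinstance(allele_mapping, dict):
        mappings = [allele_mapping] * n_variants
    else:
        mappings = list(allele_mapping)
        if len(mappings) != n_variants:
            raise ValueError("need exactly one mapping per variant")

    # lookup[v, old_index] holds the new index for that allele at variant v.
    lookup = np.full((n_variants, n_alleles), -1, dtype=np.int64)
    for v, mapping in enumerate(mappings):
        for old_index, allele in enumerate(variant_allele[v]):
            allele = str(allele)
            if allele == "":
                continue  # assumed padding for variants with fewer alleles
            lookup[v, old_index] = mapping[allele]  # KeyError if unmapped

    # Negative (missing) calls are passed through unchanged.
    missing = call_genotype < 0
    safe = np.where(missing, 0, call_genotype)
    variant_index = np.arange(n_variants)[:, None, None]
    remapped = lookup[variant_index, safe]
    return np.where(missing, call_genotype, remapped)


alleles = [["A", "G"], ["G", "A"]]
genotypes = [[[0, 1], [1, 1]], [[0, -1], [1, 0]]]
result = remap_genotypes(alleles, genotypes, {"A": 1, "G": 56, "-": 2})
```

A real implementation would also need to rewrite the allele array itself to match the new indices; the sketch leaves that out.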
There could then be an additional method that creates a mapping dict from a set of datasets. Splitting the functionality this way allows using either a fixed mapping like ACTG or one based on the alleles seen.
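The mapping-building step might look something like the following. Again, this is only a sketch under assumptions: build_allele_mapping and its fixed_order parameter are hypothetical names, the inputs are plain arrays of allele strings (one per dataset) rather than datasets, and '' is assumed to be padding.

```python
import numpy as np


def build_allele_mapping(*variant_allele_arrays, fixed_order=()):
    """Hypothetical helper: build one allele -> index dict for all inputs.

    Alleles listed in fixed_order get the first indices, in that order.
    Any other allele seen in the inputs is appended in sorted order.
    """
    mapping = {}
    for allele in fixed_order:
        if allele not in mapping:
            mapping[allele] = len(mapping)

    seen = set()
    for alleles in variant_allele_arrays:
        seen.update(str(a) for a in np.asarray(alleles).ravel())
    seen.discard("")  # assumed padding value, not a real allele

    for allele in sorted(a for a in seen if a not in mapping):
        mapping[allele] = len(mapping)
    return mapping


# A fixed mapping versus one derived purely from the alleles seen.
fixed = build_allele_mapping([["A", "G"], ["C", ""]], [["T", "A"]], fixed_order="ACGT")
derived = build_allele_mapping([["G", "A"]], [["A", "-"]])
```

The resulting dict has the same shape as the single-dict form of allele_mapping described above, so it could be applied across all variants of each dataset being compared.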