SNP Error Correction

When the binary classifiers trained for SNP filtering detect erroneous SNPs, the easiest approach is to remove them. However, removal can discard a substantial amount of valuable information, so in many applications it is preferable to correct the erroneous SNPs instead. During the final stage of MergeGenome's homogenization process, a multi-output machine learning classifier is trained to predict replacement values for the filtered erroneous SNPs within the query dataset. According to our experiments, K-Nearest Neighbors (KNN) is the most suitable choice for correcting erroneous SNPs caused by poor Beagle imputation and homogenization errors.
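To make the idea concrete, here is a minimal, dependency-free sketch of multi-output KNN correction. It is not MergeGenome's implementation. The function names (`hamming`, `knn_correct_snps`), the 0/1 allele encoding, the use of Hamming distance over the non-flagged SNPs, and the toy data are all assumptions made for illustration. Each query sample is matched to its `k` closest reference samples on the SNPs that were not flagged, and every flagged SNP is replaced by the majority value among those neighbors.

```python
from collections import Counter


def hamming(a, b, cols):
    """Count mismatches between two samples over the given SNP columns."""
    return sum(a[c] != b[c] for c in cols)


def knn_correct_snps(reference, query, bad_cols, k=3):
    """Return a copy of `query` with the SNPs in `bad_cols` re-predicted.

    reference, query: lists of samples, each a list of 0/1 allele values
    over the same ordered SNPs (hypothetical encoding).
    bad_cols: indices of the SNPs flagged as erroneous.
    """
    n_snps = len(reference[0])
    bad = set(bad_cols)
    good_cols = [c for c in range(n_snps) if c not in bad]
    corrected = []
    for sample in query:
        # Rank reference samples by distance on the trusted SNPs only.
        neighbors = sorted(
            reference, key=lambda r: hamming(sample, r, good_cols)
        )[:k]
        fixed = list(sample)
        for c in bad_cols:
            # One output per flagged SNP: majority vote among the neighbors.
            fixed[c] = Counter(n[c] for n in neighbors).most_common(1)[0][0]
        corrected.append(fixed)
    return corrected


# Toy data (made up): 6 reference samples, 2 query samples, 4 SNPs.
reference = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
query = [
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]

# Assume the SNP at index 3 was flagged as erroneous in the query dataset.
corrected = knn_correct_snps(reference, query, bad_cols=[3], k=3)
print(corrected)
```

An odd `k` avoids tied votes for biallelic 0/1 values. On real data, a library implementation such as a KNN classifier with array-based distance computation would scale far better than this pure-Python loop.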

A newer version of the code can be found here.
