Motivation: When many reads stop mapping at the same position (i.e. are soft-clipped there) without a junction being detected, that pileup is itself evidence of a potential mutation. In particular, it may indicate that a sequence absent from the reference files has been inserted at that position.
Implementation: During the mutation identification step, track reads whose mappings stop prematurely. Report these as a new evidence type, DC (discontinuity). Making these predictions may require lowering --require-match-fraction below its current default of 0.9.
- Create test data in which a portion of a reference sequence is removed, to simulate the effect of an entirely novel sequence being inserted at a position.
- Add code for detecting DC to the mutation identification step. Work out a statistic for judging whether the number of reads that stop at a position is significant, based on overall coverage or an empirical distribution.
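
A rough sketch of what the detection step could look like, in Python for illustration only and not taken from the existing code base. It assumes reads are available as (POS, CIGAR) pairs from SAM records, and it uses a one-sided binomial test against an assumed background soft-clip rate as one possible significance statistic. The names `clip_sites`, `binom_tail`, and `candidate_dc`, and the parameters `background_rate` and `alpha`, are all hypothetical.

```python
import re
from collections import Counter
from math import comb

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def clip_sites(pos, cigar):
    """Return (reference_position, side) for each soft-clipped end of a read.

    pos is the 1-based leftmost aligned reference position (SAM POS field).
    """
    # Hard clips carry no read sequence, so drop them before looking at the ends.
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    if not ops:
        return []
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    sites = []
    if ops[0][1] == "S":
        sites.append((pos, "left"))  # alignment starts abruptly here
    if len(ops) > 1 and ops[-1][1] == "S":
        sites.append((pos + ref_len - 1, "right"))  # alignment stops abruptly here
    return sites


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); adequate for typical coverage depths."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def candidate_dc(reads, coverage, background_rate=0.02, alpha=1e-3):
    """List (position, side, clipped_reads, p_value) for significant clip pileups.

    reads: iterable of (pos, cigar) pairs.
    coverage: dict mapping reference position to read depth.
    background_rate and alpha are placeholder values, not tuned defaults.
    """
    counts = Counter(site for pos, cigar in reads for site in clip_sites(pos, cigar))
    hits = []
    for (position, side), k in sorted(counts.items()):
        n = max(coverage.get(position, 0), k)  # depth can never be below k
        p_value = binom_tail(k, n, background_rate)
        if p_value < alpha:
            hits.append((position, side, k, p_value))
    return hits


# Toy data: 8 of 10 reads covering position 129 stop there with a soft clip.
reads = [(100, "30M20S")] * 8 + [(100, "50M")] * 2
coverage = {129: 10}
candidates = candidate_dc(reads, coverage)
```

The background rate could instead be estimated from the empirical distribution of clip counts across the genome, as suggested above; the fixed value here is only a stand-in.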