Skip to content

Conversation

@hyanwong
Copy link
Member

This also switches the C code to using signed ints for genotypes. Guess that's OK. It will fail tests unless it is used with a tskit version that includes tskit-dev/tskit#267 which exports tskit.MISSING_DATA, which I'm now using as a replacement for tsinfer.constants.UNKNOWN_ALLELE (I think this makes sense)

@jeromekelleher
Copy link
Member

This looks like a good start. There's a chunk of work to do to complete the change though:

  1. Need to bump the file format version numbers. I'm not sure if this is a major or minor version change: will old code be able to read the new files and will new code be able to read the old files?

  2. Need to update the low-level C code. The allele_t need to change to int8_t, and propagate the changes.

  3. We need to put in tskit>=0.2.0 in requirements.py and setup.py because of the tskit.MISSING_DATA constant. As such, we should wait until 0.2.0 is shipped before taking this further probably.

@hyanwong
Copy link
Member Author

hyanwong commented Jul 19, 2019

Thanks @jeromekelleher . It looked to me like allele_t was (bizarrely) already int8_t - but I might have been skipping over the code too quickly. ISWYM about file format & tskit requirements. I'm going to keep on developing stuff off this branch, as there's more stuff to do (e.g. allow missing data in samples), but as long as it passes the tests on my laptop with the dev version of tskit, I'm assuming that we will be able to update and push to tsinfer master later.

@hyanwong hyanwong force-pushed the change-genotypes-to-signed branch from 9de72d4 to 1241955 Compare July 19, 2019 13:43
@hyanwong
Copy link
Member Author

It looked to me like allele_t was (bizarrely) already int8_t

Unless I'm mistaken, allele_t is defined here:

typedef int8_t allele_t;

@codecov
Copy link

codecov bot commented Jul 19, 2019

Codecov Report

Merging #168 into master will decrease coverage by 0.09%.
The diff coverage is 88.52%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #168     +/-   ##
=========================================
- Coverage   93.72%   93.63%   -0.1%     
=========================================
  Files          11       11             
  Lines        2885     2907     +22     
  Branches      507      511      +4     
=========================================
+ Hits         2704     2722     +18     
- Misses        148      150      +2     
- Partials       33       35      +2
Impacted Files Coverage Δ
tsinfer/constants.py 100% <ø> (ø) ⬆️
tsinfer/inference.py 99.02% <100%> (ø) ⬆️
tsinfer/eval_util.py 88.73% <100%> (-0.03%) ⬇️
tsinfer/formats.py 95.49% <71.42%> (ø) ⬆️
tsinfer/__init__.py 78.94% <81.81%> (+3.94%) ⬆️
tsinfer/algorithm.py 93.84% <93.33%> (+0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cf3c68a...1241955. Read the comment docs.

@jeromekelleher
Copy link
Member

How does this relate to #168? If it's been superseded, can you close this pr please @hyanwong?

@hyanwong
Copy link
Member Author

hyanwong commented Sep 2, 2019

Obsoleted by #169

@hyanwong hyanwong closed this Sep 2, 2019
@hyanwong hyanwong deleted the change-genotypes-to-signed branch September 2, 2019 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants