-
Notifications
You must be signed in to change notification settings - Fork 15
Refactor batch sample matching #954
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #954 +/- ##
==========================================
- Coverage 93.23% 93.16% -0.07%
==========================================
Files 18 18
Lines 6299 6337 +38
Branches 1139 1133 -6
==========================================
+ Hits 5873 5904 +31
- Misses 290 294 +4
- Partials 136 139 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I think some classes to capture the parameters would be helpful here, as the "metadata" dictionaries are confusing and easy to make mistakes with.
You could even add some functionality to the class to abstract out some of the common boilerplate, like
class WorkDescriptor:
def common_params(self) -> dict:
return {"precision": self.precision, ... etc}
match_samples(..., **wd.common_params())
tsinfer/inference.py
Outdated
# Create work dir | ||
work_dir.mkdir(parents=True, exist_ok=True) | ||
|
||
metadata = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth making a dataclass to capture this now - it's getting a bit of out hand. "Metadata" is confusing as a name as I thought we were talking about "tskit metadata" initially. The thing we're talking about is a serialisable description of some work that's to be done... So,
@dataclasses.dataclass
class SampleBatchWorkDescriptor:
sample_data_path: str
....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 6924929
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, happy to merge
3900535
to
abcab58
Compare
Refactor to work like ancestor matching.
Some feedback on the approach would be good here.