-
Couldn't load subscription status.
- Fork 79
Ebi step 3 #1464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ebi step 3 #1464
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of deleting this lines, can you just check that is returning an empty set? I think this was the original idea of this test but got lost when scientific_name become required. This way we will have a test for a template that is not missing anything.
qiita_ware/ebi.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is not accurate because you're actually checking the data type on line 153.
Anyways, I don't think the logic here is correct. In here we should be checking the columns required by EBI. In the current code, you're checking all the required columns, for multiple reasons - however I think that the correct thing to do is checking only the EBI columns.
Also, you don't need to actually check the data type - the templates have a property that will return the columns that are required for their datatype:
In [11]: st = SampleTemplate(1)
In [12]: st.columns_restrictions
Out[12]:
{'EBI': Restriction(columns={'physical_specimen_location': 'varchar', 'collection_timestamp': 'timestamp', 'scientific_name': 'varchar', 'taxon_id': 'integer'}, error_msg='EBI submission disabled'),
'qiita_main': Restriction(columns={'description': 'varchar', 'physical_specimen_remaining': 'bool', 'dna_extracted': 'bool', 'longitude': 'float8', 'sample_type': 'varchar', 'latitude': 'float8', 'host_subject_id': 'varchar'}, error_msg='Processed data approval disabled')}
In [13]: pt = PrepTemplate(1)
In [14]: pt.columns_restrictions
Out[14]:
{'EBI': Restriction(columns={'center_name': 'varchar', 'platform': 'varchar', 'primer': 'varchar', 'experiment_design_description': 'varchar', 'library_construction_protocol': 'varchar'}, error_msg='EBI submission disabled'),
'demultiplex': Restriction(columns={'primer': 'varchar', 'barcode': 'varchar'}, error_msg='Demultiplexing disabled. You will not be able to preprocess your raw data'),
'demultiplex_multiple': Restriction(columns={'primer': 'varchar', 'barcode': 'varchar', 'run_prefix': 'varchar'}, error_msg='Demultiplexing with multiple input files disabled. If your raw data includes multiple raw input files, you will not be able to preprocess your raw data')}Given this, I think the correct path here is:
restrictions = self.sample_template.columns_restrictions['EBI'].columns.keys()
restrictions.extend(self.prep_template.columns_restrictions['EBI'].columns.keys()There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong feeling about this just thought it was easier to remove than add. If nobody opposes I will only use EBI restrictions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, thinking a bit more about this, you may be able to take advantage of the check_restrictions function from the templates:
pt_missing = self.prep_template.check_restrictions([self.prep_template.column_restrictions['EBI'])
st_missing = self.sample_template.check_restrictions([self.sample_template.column_restrictions['EBI'])And then let the user know where are the columns missing (i.e. not only telling them which columns are missing but where):
if st_missing:
# add error_msg for sample template
if pt_missing:
# add error_msg for prep tempalte|
Thanks for the titanic effort here @antgonza ! I have done one pass and I think your errors will go away after addressing some of my comments |
qiita_ware/ebi.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this isn't a private function would it be worthwhile to add the Parameters section in the numpydocs? (To clarify that s is a str)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know you didn't modify this, but can you reword this sentence so it fits in a single line in order to follow the numpy standard?
|
I did notice that in |
|
@mortonjt thanks for offering but those will be added in step 4 of this branch. |
|
Addressed all issues and questions. Can someone do a final review/merge? |
|
Sure, going over it again! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like my previous comment here was a bit confusing, sorry about that. The scientific_name should be 'root metagenome'.
'1118232' is the taxon_id (which is already present in the template).
IMO, not blocking - but would like to hear from @ElDeveloper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not blocking at all.
|
thanks @antgonza! few more comments. I've added a couple of concerns to the joy list, as I think that this PR is getting to big and they can be fixed later and/or the final code will change once your work continues moving forward. In short, the issues that I'm concerned are:
|
|
Thanks for adding. The other issues (not on the list) have been addressed. |
qiita_ware/ebi.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO, the date_to_hold parameter should be renamed. To me, this name implies that the release should be held until today.
Also, this should default to None, then in the body of the function it should be set to date.today() (or today + 365 days) if it is None. Otherwise, this will not behave as you expect it to. As it is now, date.today() will be called only once, when the def is first executed (so, only once each time Qiita is launched, I assume), and the date_to_hold will be the same for all submissions, regardless of the actual submission date.
|
A few comments on the code, but it looks good to me. It's great how much code can be deleted now that we are not relying at all on files and can instead use the data model! 😄 |
qiita_ware/ebi.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these should be dashes (-) and not underscores (_) for numpydoc
|
Ping! @adamrp @josenavas @mortonjt |
qiita_ware/ebi.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really minor but EB.Submission.valid_platforms -> EBISubmission.valid_platforms
|
Las commit was modifying only documentation - no reason to wait for the tests 😄 thanks @antgonza !! |
Sorry for the large amount of changes. Have no other idea of how to divide nicely, except making the DB changes in a different PR but that's small compared to the rest so decided to leave it here.
General comments:
Who wants to review?