Ebi step 3 #1464

antgonza · 2015-09-03T21:28:04Z

Sorry for the large amount of changes. Have no other idea of how to divide nicely, except making the DB changes in a different PR but that's small compared to the rest so decided to leave it here.

General comments:

Added scientific name (required for EBI submission) to the populate_test_db.sql so we can test
Remember that commands.submit_EBI is a work in progress and is just so we have an idea of how the calls will look
Removed iter_file_via_list_of_dicts, add_sample, add_sample_prep and add_samples_from_templates as these were required for from-file submission, which we will not support anymore as everything is being loaded from the DB
Removed _generate_library_descriptor as this was using/forcing to have some library design fields that are not required anymore
Removed all write_*_xml as now we simply call the write/generate functions
Removed write_all_xml_files cause this will defeat the idea of having networkx call/submit jobs. See point 2 of this list.
Replaced gzopen -> GzipFile cause this will allow us to set mtime (float to calculate gzip's timestamp) for testing. FYI: gzip incorporates the timestamp in the file so we need to know/define the timestamp for testing.
Removed all qiita_ware/test/test_data/* as now the files are generated and deleted on the fly.

Who wants to review?

josenavas · 2015-09-03T22:12:10Z

qiita_db/metadata_template/test/test_sample_template.py

Instead of deleting this lines, can you just check that is returning an empty set? I think this was the original idea of this test but got lost when scientific_name become required. This way we will have a test for a template that is not missing anything.

josenavas · 2015-09-03T22:28:44Z

qiita_ware/ebi.py

This comment is not accurate because you're actually checking the data type on line 153.

Anyways, I don't think the logic here is correct. In here we should be checking the columns required by EBI. In the current code, you're checking all the required columns, for multiple reasons - however I think that the correct thing to do is checking only the EBI columns.

Also, you don't need to actually check the data type - the templates have a property that will return the columns that are required for their datatype:

In [11]: st = SampleTemplate(1) In [12]: st.columns_restrictions Out[12]: {'EBI': Restriction(columns={'physical_specimen_location': 'varchar', 'collection_timestamp': 'timestamp', 'scientific_name': 'varchar', 'taxon_id': 'integer'}, error_msg='EBI submission disabled'), 'qiita_main': Restriction(columns={'description': 'varchar', 'physical_specimen_remaining': 'bool', 'dna_extracted': 'bool', 'longitude': 'float8', 'sample_type': 'varchar', 'latitude': 'float8', 'host_subject_id': 'varchar'}, error_msg='Processed data approval disabled')} In [13]: pt = PrepTemplate(1) In [14]: pt.columns_restrictions Out[14]: {'EBI': Restriction(columns={'center_name': 'varchar', 'platform': 'varchar', 'primer': 'varchar', 'experiment_design_description': 'varchar', 'library_construction_protocol': 'varchar'}, error_msg='EBI submission disabled'), 'demultiplex': Restriction(columns={'primer': 'varchar', 'barcode': 'varchar'}, error_msg='Demultiplexing disabled. You will not be able to preprocess your raw data'), 'demultiplex_multiple': Restriction(columns={'primer': 'varchar', 'barcode': 'varchar', 'run_prefix': 'varchar'}, error_msg='Demultiplexing with multiple input files disabled. If your raw data includes multiple raw input files, you will not be able to preprocess your raw data')}

Given this, I think the correct path here is:

restrictions = self.sample_template.columns_restrictions['EBI'].columns.keys() restrictions.extend(self.prep_template.columns_restrictions['EBI'].columns.keys()

I don't have a strong feeling about this just thought it was easier to remove than add. If nobody opposes I will only use EBI restrictions.

Actually, thinking a bit more about this, you may be able to take advantage of the check_restrictions function from the templates:

pt_missing = self.prep_template.check_restrictions([self.prep_template.column_restrictions['EBI']) st_missing = self.sample_template.check_restrictions([self.sample_template.column_restrictions['EBI'])

And then let the user know where are the columns missing (i.e. not only telling them which columns are missing but where):

if st_missing: # add error_msg for sample template if pt_missing: # add error_msg for prep tempalte

josenavas · 2015-09-03T23:45:19Z

Thanks for the titanic effort here @antgonza ! I have done one pass and I think your errors will go away after addressing some of my comments

ElDeveloper · 2015-09-04T18:59:17Z

@antgonza, I think I'm done with my review. Looks good so far, but it would definitely be helpful if @adamrp reviewed this as well, there's a lot going on.

mortonjt · 2015-09-06T00:54:00Z

qiita_ware/ebi.py

Since this isn't a private function would it be worthwhile to add the Parameters section in the numpydocs? (To clarify that s is a str)

I know you didn't modify this, but can you reword this sentence so it fits in a single line in order to follow the numpy standard?

mortonjt · 2015-09-06T00:56:29Z

I did notice that in ebi.py is missing proper doc strings for send_sequences() and send_xml(). Do you guys mind if I fix this in a future PR?

antgonza · 2015-09-08T13:35:42Z

@mortonjt thanks for offering but those will be added in step 4 of this branch.

antgonza · 2015-09-08T15:26:08Z

Addressed all issues and questions. Can someone do a final review/merge?

josenavas · 2015-09-08T16:37:10Z

Sure, going over it again!

josenavas · 2015-09-08T17:04:20Z

qiita_db/support_files/populate_test_db.sql

It looks like my previous comment here was a bit confusing, sorry about that. The scientific_name should be 'root metagenome'.

'1118232' is the taxon_id (which is already present in the template).

IMO, not blocking - but would like to hear from @ElDeveloper

Not blocking at all.

josenavas · 2015-09-08T18:09:08Z

thanks @antgonza! few more comments.

I've added a couple of concerns to the joy list, as I think that this PR is getting to big and they can be fixed later and/or the final code will change once your work continues moving forward. In short, the issues that I'm concerned are:

UTF-8 char errors (add test)
Add a test in which the prep template has less samples than the sample template
EBISubmission.__init__ is getting long and complex, so probably we want to revisit it. I can see this function changing once we move all the steps so I don't want to discuss it here.

antgonza · 2015-09-08T19:32:18Z

Thanks for adding. The other issues (not on the list) have been addressed.

adamrp · 2015-09-08T20:25:47Z

qiita_ware/ebi.py

IMHO, the date_to_hold parameter should be renamed. To me, this name implies that the release should be held until today.

Also, this should default to None, then in the body of the function it should be set to date.today() (or today + 365 days) if it is None. Otherwise, this will not behave as you expect it to. As it is now, date.today() will be called only once, when the def is first executed (so, only once each time Qiita is launched, I assume), and the date_to_hold will be the same for all submissions, regardless of the actual submission date.

adamrp · 2015-09-08T20:41:31Z

A few comments on the code, but it looks good to me. It's great how much code can be deleted now that we are not relying at all on files and can instead use the data model! 😄

adamrp · 2015-09-08T20:59:10Z

qiita_ware/ebi.py

these should be dashes (-) and not underscores (_) for numpydoc

antgonza · 2015-09-09T13:47:43Z

Ping! @adamrp @josenavas @mortonjt

josenavas · 2015-09-09T16:02:34Z

qiita_ware/ebi.py

Really minor but EB.Submission.valid_platforms -> EBISubmission.valid_platforms

Ebi step 3

josenavas · 2015-09-09T16:09:57Z

Las commit was modifying only documentation - no reason to wait for the tests 😄 thanks @antgonza !!

antgonza added 5 commits August 31, 2015 12:41

changes to DB

bc86472

removing not necessary test data

8564165

step 3, initial commit

664d137

merging with ebi-step-2

6320220

figuringout error

08c7b2b

josenavas reviewed Sep 3, 2015
View reviewed changes

fixing error in last commit and addressing some of @josenavas comments

40e2abf

josenavas reviewed Sep 3, 2015
View reviewed changes

antgonza added 2 commits September 4, 2015 02:46

addressing @josenavas comments

9f05273

fixing flake8

3f938bf

mortonjt reviewed Sep 6, 2015
View reviewed changes

addressing @ElDeveloper and @mortonjt

e813cf0

josenavas reviewed Sep 8, 2015
View reviewed changes

addressing @josenavas and %mortonjt comments

31a9f7b

adamrp reviewed Sep 8, 2015
View reviewed changes

addressing @adamrp comments

6286792

adamrp reviewed Sep 8, 2015
View reviewed changes

qiita_ware/ebi.py Outdated

Copy link

Contributor

adamrp Sep 8, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these should be dashes (-) and not underscores (_) for numpydoc

changing from _ to -, @adamrp

47bb358

josenavas reviewed Sep 9, 2015
View reviewed changes

qiita_ware/ebi.py Outdated

Copy link

Contributor

josenavas Sep 9, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really minor but EB.Submission.valid_platforms -> EBISubmission.valid_platforms

minor comment from @josenavas

b21fec8

josenavas added a commit that referenced this pull request Sep 9, 2015

Merge pull request #1464 from antgonza/ebi-step-3

d0ae1c4

Ebi step 3

josenavas merged commit d0ae1c4 into qiita-spots:ebi-improvements Sep 9, 2015

Uh oh!

Ebi step 3 #1464

Ebi step 3 #1464

Uh oh!

Conversation

antgonza commented Sep 3, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

josenavas commented Sep 3, 2015

Uh oh!

ElDeveloper commented Sep 4, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mortonjt commented Sep 6, 2015

Uh oh!

antgonza commented Sep 8, 2015

Uh oh!

antgonza commented Sep 8, 2015

Uh oh!

josenavas commented Sep 8, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

josenavas commented Sep 8, 2015

Uh oh!

antgonza commented Sep 8, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

adamrp commented Sep 8, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

antgonza commented Sep 9, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

josenavas commented Sep 9, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants