-
Notifications
You must be signed in to change notification settings - Fork 80
Relaxing metadata requirements #1021
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Relaxing metadata requirements #1021
Conversation
This is ready for review @antgonza Fixes #465 but in a more intelligent way (it actually chooses the right prep template) There is a bunch of other issues fixed here, I will compile the list tomorrow morning and list it here so they get closed... I am too asleep right now to do it. This is another gigantic PR so 3 reviewers are more than welcome. Also these are sensitive changes so please review carefully. Basically, the system will allow you to upload any sample/prep template, doesn't matter which columns are missing. However, it will disable some functionality based on those missing columns (preprocessing, EBI submission or processed data approval). |
You have real pep8 failures. This is a huge PR, thanks for working on this. Do you think it will be worth splitting in 2 PRs? One with the split of qiita_db/metadata_template (merge first) and another with the code changes (this should allow us to find quickly the changes/modifications). I didn't think about how difficult it will be to review until I open this and it took a while to load, even without any comments or discussion. Sorry for this. |
# requried_sample_info table; and drop the required_sample_info_status table | ||
conn_handler.add_to_queue( | ||
queue_name, | ||
"""ALTER TABLE qiita.required_sample_info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All these drops can be a single SQL statement
alter table qiita.required_sample_info drop column collection_timestamp, drop column host_subject_id;
just keep adding to the list of drop column statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@antgonza I don't think we can split in 2 PR... the reason is that the changes on the metadata template need most of the other changes in order to pass the tests... The actual code changes after this being solved (i.e. limiting the functionality of the interface) only include ~20 lines of code (it was way faster than I expected) so the first PR will still be huge, plus the master state will be inconsistent... In order to make the review more smooth, I would recommend to review the Does anyone knows why flake8 is complaining about those lines? There are 2 other similar lines in that same file and it is not complaining... (is an init.py file for reference) |
Fixes partially #247 - the mapping files generated for analysis include the prep info, unsure about the search, @squirrelo was that solved or the search still doesn't work over prep template info? Fixes #781 |
😄 green! |
@josenavas what's the status of this PR? |
This PR is being re-introduced with a bunch of small PR that I'm doing. So this PR can be closed as trying to fix the merge conflicts will be horrible. I'll be submitting a bunch of small PR today to a new branch in upstream so the entire functionality can make it. |
OK, will close. Remember that we need to have things merged before 9am tomorrow so the sooner the PRs are ready, the better. |
Yeah, I know, I'm still finding bugs on the metadata template that need to be addressed before the refactor comes in. Some of them are tricky to test... |
Fixes #941
It also includes the code from @squirrelo in #1016
I still have to push the changes on DB schema and de-activate the bits when metadata columns are not included, plus improve a little bit some parts of the code.
While implementing this change, I noticed a lot of small bugs; and I think that was a result of the complexity of the code. I think the best solution is a refactor, as started here, which divides the code in a easier-to understand structure.
Some of these bugs where caused due to improper testing. I noticed that some of the tests were copy-pasted directly from the python output (maybe from ipython) and just PEP8-ing them. This is not ok and as a remainder, I would like to say that tests should be executed "by hand" and manually checking that the result is correct; otherwise the tests are useless.
I am one day behind on this PR, I will have it ready by tomorrow (I was not able to work on this on Thursday giving the problems that we had on travis on PR #900 and I was working on solving them). Sorry about that.