-
Notifications
You must be signed in to change notification settings - Fork 80
Fix metadata obj #1075
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix metadata obj #1075
Conversation
'EBI': Restriction(columns={'collection_timestamp': 'timestamp', | ||
'physical_specimen_location': 'varchar'}, | ||
error_msg="EBI submission disabled"), | ||
'Qiita_main': Restriction(columns={'sample_type': 'varchar', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like the casing. This is not an actual issue but looks weird and perhaps all should be lower case ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lowercased!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I don't like about this name is that "qiita main" isn't really meaningful.
Additionally the meaning of each of these keys should be documented somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you recommend for this? The restrictions under the qiita_main
key are restrictions that only apply to our official Qiita main site, but they will not apply on a single user environment...
Re documentation: do you think it's enough to add a comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, then the restriction only applies to non-single-user systems right? As for better documenting this, I think we need something more than just a comment in the "constants" file. However I cannot think of a good way to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re documentation: I will add a comment here and I've created a new issue: #1111
I think I used the wrong words on my previous comment, the idea is that those restrictions only apply to our system. Other labs that may want to have a qiita installation locally with multiple users shouldn't have these restrictions either... Is one of those cases in which our system is more specialized than the rest of the users.
Thinking a bit more about this. I think that having this in a configuration file will allow us to change these restrictions dynamically, and we can have them documented there. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, yeah it makes more sense that way. Otherwise it is one of those
seemingly arbitrary restrictions of the system. If these are only for us
then they should be parametrized, although we do want to provide Qiita
as a service, the software should not be specific to our needs in any
way.
On (Apr-27-15|15:12), josenavas wrote:
@@ -6,10 +6,68 @@
The full license is in the file LICENSE, distributed with this software.
-----------------------------------------------------------------------------
+from collections import namedtuple
+from future.utils import viewkeys, viewvalues
+
+Restriction = namedtuple('Restriction', ['columns', 'error_msg'])
+
+# A dict containing the restrictions that apply to the sample templates
+SAMPLE_TEMPLATE_COLUMNS = {
- 'EBI': Restriction(columns={'collection_timestamp': 'timestamp',
'physical_specimen_location': 'varchar'},
error_msg="EBI submission disabled"),
- 'Qiita_main': Restriction(columns={'sample_type': 'varchar',
Re documentation: I will add a comment here and I've created a new issue: #1111
I think I used the wrong words on my previous comment, the idea is that those restrictions only apply to our system. Other labs that may want to have a qiita installation locally with multiple users shouldn't have these restrictions either... Is one of those cases in which our system is more specialized than the rest of the users.
Thinking a bit more about this. I think that having this in a configuration file will allow us to change these restrictions dynamically, and we can have them documented there. What do you think?
Reply to this email directly or view it on GitHub:
https://github.com/biocore/qiita/pull/1075/files#r29197135
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I agree the idea of making this parametrized. Also, we don't know
when EBI decides to change their requirements. However, in interests of
time, I think we should create an issue and I can solve it as a part of
Alpha 0.2, since that will require some refactor. Does that sound good to
you?
2015-04-27 15:24 GMT-07:00 Yoshiki Vázquez Baeza notifications@github.com:
In qiita_db/metadata_template/constants.py
#1075 (comment):@@ -6,10 +6,68 @@
The full license is in the file LICENSE, distributed with this software.
-----------------------------------------------------------------------------
+from collections import namedtuple
+from future.utils import viewkeys, viewvalues
+
+Restriction = namedtuple('Restriction', ['columns', 'error_msg'])
+
+# A dict containing the restrictions that apply to the sample templates
+SAMPLE_TEMPLATE_COLUMNS = {
- 'EBI': Restriction(columns={'collection_timestamp': 'timestamp',
'physical_specimen_location': 'varchar'},
error_msg="EBI submission disabled"),
- 'Qiita_main': Restriction(columns={'sample_type': 'varchar',
Got it, yeah it makes more sense that way. Otherwise it is one of those
seemingly arbitrary restrictions of the system. If these are only for us
then they should be parametrized, although we do want to provide Qiita as a
service, the software should not be specific to our needs in any way.
… <#14cfcfaf015149db_>
On (Apr-27-15|15:12), josenavas wrote: > @@ -6,10 +6,68 @@ > # The fulllicense is in the file LICENSE, distributed with this software. >
+from collections import namedtuple > +from future.utils import
viewkeys, viewvalues > + > +Restriction = namedtuple('Restriction',
['columns', 'error_msg']) > + > +# A dict containing the restrictions that
apply to the sample templates > +SAMPLE_TEMPLATE_COLUMNS = { > + 'EBI':
Restriction(columns={'collection_timestamp': 'timestamp', > +
'physical_specimen_location': 'varchar'}, > + error_msg="EBI submission
disabled"), > + 'Qiita_main': Restriction(columns={'sample_type':
'varchar', Re documentation: I will add a comment here and I've created a
new issue: #1111 I think I used
the wrong words on my previous comment, the idea is that those restrictions
only apply to our system. Other labs that may want to have a qiita
installation locally with multiple users shouldn't have these restrictions
either... Is one of those cases in which our system is more specialized
than the rest of the users. Thinking a bit more about this. I think that
having this in a configuration file will allow us to change these
restrictions dynamically, and we can have them documented there. What do
you think? --- Reply to this email directly or view it on GitHub:
https://github.com/biocore/qiita/pull/1075/files#r29197135—
Reply to this email directly or view it on GitHub
https://github.com/biocore/qiita/pull/1075/files#r29197966.
Jose Navas
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this sounds good we should resolve this before too long though.
Can you open an issue about this?
On (Apr-27-15|15:26), josenavas wrote:
@@ -6,10 +6,68 @@
The full license is in the file LICENSE, distributed with this software.
-----------------------------------------------------------------------------
+from collections import namedtuple
+from future.utils import viewkeys, viewvalues
+
+Restriction = namedtuple('Restriction', ['columns', 'error_msg'])
+
+# A dict containing the restrictions that apply to the sample templates
+SAMPLE_TEMPLATE_COLUMNS = {
- 'EBI': Restriction(columns={'collection_timestamp': 'timestamp',
'physical_specimen_location': 'varchar'},
error_msg="EBI submission disabled"),
- 'Qiita_main': Restriction(columns={'sample_type': 'varchar',
Yeah, I agree the idea of making this parametrized. Also, we don't know
when EBI decides to change their requirements. However, in interests of
time, I think we should create an issue and I can solve it as a part of
Alpha 0.2, since that will require some refactor. Does that sound good to
you?2015-04-27 15:24 GMT-07:00 Yoshiki Vázquez Baeza notifications@github.com:
In qiita_db/metadata_template/constants.py
#1075 (comment):@@ -6,10 +6,68 @@
The full license is in the file LICENSE, distributed with this software.
-----------------------------------------------------------------------------
+from collections import namedtuple
+from future.utils import viewkeys, viewvalues
+
+Restriction = namedtuple('Restriction', ['columns', 'error_msg'])
+
+# A dict containing the restrictions that apply to the sample templates
+SAMPLE_TEMPLATE_COLUMNS = {
- 'EBI': Restriction(columns={'collection_timestamp': 'timestamp',
'physical_specimen_location': 'varchar'},
error_msg="EBI submission disabled"),
- 'Qiita_main': Restriction(columns={'sample_type': 'varchar',
Got it, yeah it makes more sense that way. Otherwise it is one of those
seemingly arbitrary restrictions of the system. If these are only for us
then they should be parametrized, although we do want to provide Qiita as a
service, the software should not be specific to our needs in any way.
… <#14cfcfaf015149db_>
On (Apr-27-15|15:12), josenavas wrote: > @@ -6,10 +6,68 @@ > # The fulllicense is in the file LICENSE, distributed with this software. >
+from collections import namedtuple > +from future.utils import
viewkeys, viewvalues > + > +Restriction = namedtuple('Restriction',
['columns', 'error_msg']) > + > +# A dict containing the restrictions that
apply to the sample templates > +SAMPLE_TEMPLATE_COLUMNS = { > + 'EBI':
Restriction(columns={'collection_timestamp': 'timestamp', > +
'physical_specimen_location': 'varchar'}, > + error_msg="EBI submission
disabled"), > + 'Qiita_main': Restriction(columns={'sample_type':
'varchar', Re documentation: I will add a comment here and I've created a
new issue: #1111 I think I used
the wrong words on my previous comment, the idea is that those restrictions
only apply to our system. Other labs that may want to have a qiita
installation locally with multiple users shouldn't have these restrictions
either... Is one of those cases in which our system is more specialized
than the rest of the users. Thinking a bit more about this. I think that
having this in a configuration file will allow us to change these
restrictions dynamically, and we can have them documented there. What do
you think? --- Reply to this email directly or view it on GitHub:
https://github.com/biocore/qiita/pull/1075/files#r29197135—
Reply to this email directly or view it on GitHub
https://github.com/biocore/qiita/pull/1075/files#r29197966.Jose Navas
Reply to this email directly or view it on GitHub:
https://github.com/biocore/qiita/pull/1075/files#r29198175
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done #1113
2015-04-27 15:32 GMT-07:00 Yoshiki Vázquez Baeza notifications@github.com:
In qiita_db/metadata_template/constants.py
#1075 (comment):@@ -6,10 +6,68 @@
The full license is in the file LICENSE, distributed with this software.
-----------------------------------------------------------------------------
+from collections import namedtuple
+from future.utils import viewkeys, viewvalues
+
+Restriction = namedtuple('Restriction', ['columns', 'error_msg'])
+
+# A dict containing the restrictions that apply to the sample templates
+SAMPLE_TEMPLATE_COLUMNS = {
- 'EBI': Restriction(columns={'collection_timestamp': 'timestamp',
'physical_specimen_location': 'varchar'},
error_msg="EBI submission disabled"),
- 'Qiita_main': Restriction(columns={'sample_type': 'varchar',
Yeah, this sounds good we should resolve this before too long though. Can
you open an issue about this?
… <#14cfd0256077b3f3_>
On (Apr-27-15|15:26), josenavas wrote: > @@ -6,10 +6,68 @@ > # The fulllicense is in the file LICENSE, distributed with this software. >
+from collections import namedtuple > +from future.utils import
viewkeys, viewvalues > + > +Restriction = namedtuple('Restriction',
['columns', 'error_msg']) > + > +# A dict containing the restrictions that
apply to the sample templates > +SAMPLE_TEMPLATE_COLUMNS = { > + 'EBI':
Restriction(columns={'collection_timestamp': 'timestamp', > +
'physical_specimen_location': 'varchar'}, > + error_msg="EBI submission
disabled"), > + 'Qiita_main': Restriction(columns={'sample_type':
'varchar', Yeah, I agree the idea of making this parametrized. Also, we
don't know when EBI decides to change their requirements. However, in
interests of time, I think we should create an issue and I can solve it as
a part of Alpha 0.2, since that will require some refactor. Does that sound
good to you? 2015-04-27 15:24 GMT-07:00 Yoshiki Vázquez Baeza <
notifications@github.com>: > In qiita_db/metadata_template/constants.py >
#1075 (comment): > > >
@@ -6,10 +6,68 @@ > > # The full license is in the file LICENSE,distributed with this software. > >
+from collections import namedtuple > > +from future.utils import
viewkeys, viewvalues > > + > > +Restriction = namedtuple('Restriction',
['columns', 'error_msg']) > > + > > +# A dict containing the restrictions
that apply to the sample templates > > +SAMPLE_TEMPLATE_COLUMNS = { > > +
'EBI': Restriction(columns={'collection_timestamp': 'timestamp', > > +
'physical_specimen_location': 'varchar'}, > > + error_msg="EBI submission
disabled"), > > + 'Qiita_main': Restriction(columns={'sample_type':
'varchar', > > Got it, yeah it makes more sense that way. Otherwise it is
one of those > seemingly arbitrary restrictions of the system. If these are
only for us > then they should be parametrized, although we do want to
provide Qiita as a > service, the software should not be specific to our
needs in any way. > … <#14cfcfaf015149db_> > On (Apr-27-15|15:12),
josenavas wrote: > @@ -6,10 +6,68 @@ > # The full > license is in the fileLICENSE, distributed with this software. > # >
+from collections import namedtuple > +from future.utils import >
viewkeys, viewvalues > + > +Restriction = namedtuple('Restriction', >
['columns', 'error_msg']) > + > +# A dict containing the restrictions that
apply to the sample templates > +SAMPLE_TEMPLATE_COLUMNS = { > + 'EBI': >
Restriction(columns={'collection_timestamp': 'timestamp', > + >
'physical_specimen_location': 'varchar'}, > + error_msg="EBI submission >
disabled"), > + 'Qiita_main': Restriction(columns={'sample_type': >
'varchar', Re documentation: I will add a comment here and I've created a >
new issue: #1111 I think I used >
the wrong words on my previous comment, the idea is that those restrictions
only apply to our system. Other labs that may want to have a qiita >
installation locally with multiple users shouldn't have these restrictions
either... Is one of those cases in which our system is more specialized >
than the rest of the users. Thinking a bit more about this. I think that >
having this in a configuration file will allow us to change these >
restrictions dynamically, and we can have them documented there. What do >
you think? --- Reply to this email directly or view it on GitHub: >
https://github.com/biocore/qiita/pull/1075/files#r29197135 > > — > Reply
to this email directly or view it on GitHub > <
https://github.com/biocore/qiita/pull/1075/files#r29197966>. > -- Jose
Navas --- Reply to this email directly or view it on GitHub:
https://github.com/biocore/qiita/pull/1075/files#r29198175—
Reply to this email directly or view it on GitHub
https://github.com/biocore/qiita/pull/1075/files#r29198597.
Jose Navas
Can you pull from upstream? |
…fix-metadata-obj
…fix-metadata-obj
Done! |
@josenavas, from looking at the tests in Travis, it seems like there's a test case that's raising a warning that's not being catched, this might or not be intentional, but would be good to update the tests so there's no warnings being raised, see this for example:
|
Ok, I'll update the tests so no warning is raised. There are quite a few instances of those... will ASAP. |
|
||
Returns | ||
------- | ||
md_template : DataFrame | ||
Cleaned copy of the input md_template | ||
|
||
Raises |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the warning being raised when there are missing columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, I was unsure if the warnings were documented in the same way :-)
After these changes go in, do we need to update ftp://ftp.microbio.me/pub/qiita/sample_prep_template_examples.tgz ? |
# Remove the sample_id and _id_column columns | ||
db_cols.remove('sample_id') | ||
db_cols.remove(cls._id_column) | ||
headers = sorted(md_template.keys().tolist()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we no longer have to check the column names are unique? Or is this guaranteed by .keys
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point is it assumed that the md_template has been validated and clean (I think it's a safe assumption since this is a private function). Those checks are done in _clean_validate_template
. That sounds reasonable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, thanks!
@antgonza we probably will need to update those, good call. I will take a note to update those once the entire branch is ready to be merged in master. I just found a small bug on the code and I'm fixing it. However, it requires modifying multiple tests. Will ping when I update everything (and including the comments made so far). |
error_msg="Processed data approval disabled") | ||
} | ||
|
||
# A dict containing the restrictions that apply to the sample templates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sample -> prep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@antgonza @ElDeveloper this should be ready for another review, thanks guys! |
if warning_msg: | ||
warnings.warn( | ||
"Some functionality will be disabled due to missing " | ||
"columns:\n\t%s.\nCheck https://github.com/biocore/qiita/wiki" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a link so the user can access it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How should I make it a link? Note that this error is raised in qiita_db. I'm hesitant to add specific html here since this warning can be seen from the CLI...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was also thinking about this, but it's going to be tricky since this
can also be raised from a terminal and you would get an HTML formatted
error message, which is not appropriate. Ideas?
On (Apr-27-15|15:22), Antonio Gonzalez wrote:
# Retrieve the headers of the metadata template
headers = list(md_template.keys())
if warning_msg:
warnings.warn(
"Some functionality will be disabled due to missing "
"columns:\n\t%s.\nCheck https://github.com/biocore/qiita/wiki"
Should this be a link so the user can access it?
Reply to this email directly or view it on GitHub:
https://github.com/biocore/qiita/pull/1075/files#r29197809
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the issue with CLI? I think is fine ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ElDeveloper you're the CLI guru, what do you think? I prefer not to add HTML code in the CLI error/warning output, but if more people agrees I'm happy to add it...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a separate conversation/issue. BTW IMOO the parsing should be in python/templates and not in JS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works!
On (Apr-27-15|16:29), Antonio Gonzalez wrote:
# Retrieve the headers of the metadata template
headers = list(md_template.keys())
if warning_msg:
warnings.warn(
"Some functionality will be disabled due to missing "
"columns:\n\t%s.\nCheck https://github.com/biocore/qiita/wiki"
I think this should be a separate conversation/issue. BTW IMOO the parsing should be in python/templates and not in JS.
Reply to this email directly or view it on GitHub:
https://github.com/biocore/qiita/pull/1075/files#r29202450
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that means that this should stay as it is and we'll improve the qiita_pet error parsing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, can you open an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! Done: #1115
2015-04-27 17:04 GMT-07:00 Antonio Gonzalez notifications@github.com:
In qiita_db/metadata_template/base_metadata_template.py
#1075 (comment):
# Retrieve the headers of the metadata template
headers = list(md_template.keys())
if warning_msg:
warnings.warn(
"Some functionality will be disabled due to missing "
"columns:\n\t%s.\nCheck https://github.com/biocore/qiita/wiki"
yup, can you open an issue?
—
Reply to this email directly or view it on GitHub
https://github.com/biocore/qiita/pull/1075/files#r29204247.
Jose Navas
All my comments have been addressed, I only have a question, but that's not a blocking comment. @antgonza can you merge once your comments/questions are addressed? |
@antgonza can you merge if there are no outstanding issues? |
This is ready to go if there are no more comments! |
LGTM. |
This PR is built on top of #1074
This PR makes all the needed modifications in the metadata template objects to get all the tests under
qiita_db/metadata_template/tests
to pass. The PR right now is a bit big, but as the other PR in which this one depends gets merged, the size will be reduced.