Start improving search #988

Closed
squirrelo wants to merge 19 commits into biocore:master from squirrelo:start-improving-search

Conversation

squirrelo
Contributor

This puts base changes in place to make the search object more easily extendable. It also moves the computational part of searches into qiita_ware, centralizing search there as both a passthrough and the computational work.

Additionally, it allows searching over prep template data and data-type of processed data.

@coveralls

Coverage Status

Coverage increased (+0.11%) to 78.92% when pulling 929d27b on squirrelo:start-improving-search into dfe897a on biocore:master.

@squirrelo
Contributor Author

@antgonza @adamrp @ElDeveloper This is ready for review, and needs to be merged before some work on #900 can continue.

@coveralls

Coverage Status

Changes Unknown when pulling 3d29605 on squirrelo:start-improving-search into biocore:master.

@squirrelo
Contributor Author

This isn't actually failing, but for some reason the following failure happens infrequently (see how the other test session passed):

======================================================================
FAIL: test_categories (qiita_db.test.test_metadata_template.DecoratedClass)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/biocore/qiita/qiita_db/test/test_metadata_template.py", line 1311, in test_categories
    self.assertEqual(obs, exp)
AssertionError: Items in the second set but not the first:
'longitude'

I've opened an issue for it.

@@ -1144,6 +1144,21 @@ def study(self):
            "processed_data_id=%s".format(self._study_processed_table),
            [self._id])[0]

    @property
    def samples(self):
Contributor

Change this function to make use of a single SQL query. Right now it is executing at least 4.
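A minimal sketch of the suggested consolidation, using sqlite3 and made-up stand-in tables (the names only echo the Qiita schema; they are not it): one join fetches what previously took a lookup per table.

```python
import sqlite3

# Stand-in tables, purely illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processed_data (processed_data_id INTEGER, study_id INTEGER);
    CREATE TABLE required_sample_info (study_id INTEGER, sample_id TEXT);
    INSERT INTO processed_data VALUES (1, 10);
    INSERT INTO required_sample_info VALUES (10, 'S1');
    INSERT INTO required_sample_info VALUES (10, 'S2');
""")

# Rather than one query for the study id and further queries for the
# samples, a single join returns the sample ids directly:
rows = conn.execute("""
    SELECT rsi.sample_id
    FROM processed_data pd
    JOIN required_sample_info rsi USING (study_id)
    WHERE pd.processed_data_id = ?
    ORDER BY rsi.sample_id
""", (1,)).fetchall()
samples = [r[0] for r in rows]
print(samples)  # ['S1', 'S2']
```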

Contributor Author

Ah, this is a vestige of the old way of finding processed data samples. Do we want to keep this function still?

Contributor

If you don't need this property, why are you adding it here? Remove it, then.

@josenavas
Contributor

@squirrelo a few comments

@josenavas
Contributor

Thanks @squirrelo, pulled down and it looks like it works correctly.
@antgonza can you review/merge? This PR is blocking some development in my PR, as I have to change the search object too, and having it merged will be much more useful than trying to resolve the merge conflicts...

@squirrelo
Contributor Author

Also @antgonza, note that the demo DB you have is not consistent and, because of that, will not work with this pull request; there are missing prep tables referenced, etc.

@antgonza
Member

Thanks, I fixed those when @josenavas added queues for submission.
However, while trying to search for studies:

ERROR:tornado.application:Uncaught exception POST /analysis/2 (::1)
HTTPRequest(protocol='http', host='localhost:21174', method='POST',
uri='/analysis/2', version='HTTP/1.1', remote_ip='::1',
headers={'Origin': 'http://localhost:21174', 'Content-Length': '66',
'Accept-Language': 'en-US,en;q=0.8,es;q=0.6', 'Accept-Encoding':
'gzip, deflate', 'Host': 'localhost:21174', 'Accept':
'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89
Safari/537.36', 'Connection': 'keep-alive', 'Referer':
'http://localhost:21174/analysis/2', 'Cache-Control': 'max-age=0',
'Cookie': '_ga=GA1.1.641551533.1412272534;
user="ImFudGdvbnphQGdtYWlsLmNvbSI=|1426701791|debed46f478f366d37f16a99184ef92a6dc26654"',
'Content-Type': 'application/x-www-form-urlencoded'})
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py",
line 1141, in _when_complete
    callback()
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py",
line 1162, in _execute_method
    self._when_complete(method(*self.path_args, **self.path_kwargs),
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tornado/web.py",
line 2297, in wrapper
    return method(self, *args, **kwargs)
  File "/Users/antoniog/svn_programs/qiita/qiita_pet/handlers/analysis_handlers.py",
line 147, in post
    results, meta_headers = search(query, user)
  File "/Users/antoniog/svn_programs/qiita/qiita_db/search.py", line
275, in __call__
    results = self._process_to_dict(conn_handler.execute_queue('search'),
  File "/Users/antoniog/svn_programs/qiita/qiita_db/sql_connection.py",
line 349, in execute_queue
    self._rollback_raise_error(queue, sql, sql_args, e)
  File "/Users/antoniog/svn_programs/qiita/qiita_db/sql_connection.py",
line 312, in _rollback_raise_error
    str(sql_args), e)))
QiitaDBExecutionError:
Error running SQL query in queue search: SELECT
sr.study_id,sr.processed_data_id,sr.sample_id,sr.study_title FROM
(SELECT * FROM qiita.study s RIGHT JOIN qiita.study_processed_data
USING (study_id) JOIN qiita.required_sample_info USING (study_id) JOIN
qiita.common_prep_info USING (sample_id) JOIN qiita.processed_data
USING (processed_data_id) JOIN qiita.data_type USING (data_type_id)
JOIN qiita.sample_969 USING (sample_id) JOIN qiita.prep_142 USING
(sample_id) ) AS sr WHERE LOWER(sr.study_title) LIKE '%test%'
ARGS: None
Error: column reference "study_id" is ambiguous
LINE 1: SELECT sr.study_id,sr.processed_data_id,sr.sample_id,sr.stud...
               ^

ERROR:tornado.access:500 POST /analysis/2 (::1) 174.84ms
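The failure mode above is easy to reproduce in miniature. A sketch with sqlite3 and two toy tables (not the Qiita schema): when both sides of a join carry a study_id column, an unqualified reference is ambiguous, and qualifying the column resolves the error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE study (study_id INTEGER, study_title TEXT);
    CREATE TABLE study_processed_data (study_id INTEGER, processed_data_id INTEGER);
    INSERT INTO study VALUES (1, 'test study');
    INSERT INTO study_processed_data VALUES (1, 99);
""")

# Both tables expose study_id, so the bare name cannot be resolved:
err = None
try:
    conn.execute("""
        SELECT study_id
        FROM study s JOIN study_processed_data spd
        ON s.study_id = spd.study_id
    """)
except sqlite3.OperationalError as e:
    err = str(e)
print(err)  # e.g. "ambiguous column name: study_id"

# Qualifying the column (or joining with USING, which merges the pair)
# removes the ambiguity:
row = conn.execute("""
    SELECT s.study_id, spd.processed_data_id
    FROM study s JOIN study_processed_data spd
    ON s.study_id = spd.study_id
""").fetchone()
print(row)  # (1, 99)
```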

@josenavas
Contributor

Thanks @antgonza just noticed that this PR depends on #992 so it cannot be merged until that one is merged.

@josenavas
Contributor

I just noticed that @squirrelo closed the other PR. Anyway, the changes done here limit the metadata columns, so this PR cannot be merged until the columns are limited in the metadata template and a data patch is in place to fix the existing templates.

        (column_name = 'study_id' OR column_name = 'processed_data_id')"""

        tables = [c[0] for c in conn_handler.execute_fetchall(sql)]
        for t in tables:
Contributor

It is not OK to drop the columns, as you don't know which data those columns are holding. Please rename the columns instead.
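A hedged sketch of what the reviewer asks for: generate ALTER TABLE ... RENAME COLUMN statements instead of DROPs, so whatever data the columns hold survives under a new name. The table list and the old_ prefix here are purely illustrative, not the actual patch.

```python
# Hypothetical table list; in a real patch it would come from the
# information_schema lookup over column_name shown in the snippet above.
tables = ["qiita.sample_969", "qiita.prep_142"]
columns = ["study_id", "processed_data_id"]

# Rename rather than drop, preserving the data under an 'old_' prefix.
statements = [
    "ALTER TABLE {0} RENAME COLUMN {1} TO old_{1}".format(t, c)
    for t in tables
    for c in columns
]
for s in statements:
    print(s)
# First statement:
# ALTER TABLE qiita.sample_969 RENAME COLUMN study_id TO old_study_id
```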

@josenavas
Contributor

@squirrelo one more comment

@squirrelo squirrelo force-pushed the start-improving-search branch from 627f758 to f8d3fce Compare March 18, 2015 19:12
@josenavas
Contributor

Thanks @squirrelo
@antgonza can you re-try on your database to see if this fixes the issue?

@antgonza
Member

antgonza commented Mar 18, 2015 via email

@josenavas
Contributor

"Nope" that you can't try, or "nope" it didn't fix the issue?

@antgonza
Member

antgonza commented Mar 18, 2015 via email

@squirrelo
Contributor Author

Same error or new error?

@josenavas
Contributor

@antgonza did you apply the patch?

@josenavas
Contributor

@squirrelo just noticed that the patch is not complete. You need to re-generate the filepaths for those sample/prep templates that changed, as well as the QIIME mapping file. Check patch number 14 to see how it's done.

@squirrelo
Contributor Author

Given the increasing complexity of this pull request and the offline discussion about the schema changes that will have a drastic effect on this pull request, I'm closing this for now.

@squirrelo squirrelo closed this Mar 19, 2015