
Analysis cart creation #1025


Closed · squirrelo wants to merge 24 commits
90339fd
skeleton for demoing
squirrelo Mar 20, 2015
72267a0
add confirmation to removing proc data
squirrelo Mar 25, 2015
10f077d
Merge branch 'master' of https://github.com/biocore/qiita into cart-a…
squirrelo Mar 28, 2015
f1ca570
add patch to create default analyses for all existing users
squirrelo Mar 28, 2015
e362c25
add default analysis on user creation
squirrelo Mar 28, 2015
cfe9ba2
streamline UI for cart
squirrelo Mar 30, 2015
28f8480
further refining UI
squirrelo Mar 30, 2015
f0c3f80
add default analyses using only SQL
squirrelo Mar 31, 2015
6d30936
changes to tests to reflect patch
squirrelo Mar 31, 2015
4f3756f
fix more tests, add analysis_workflow steps for carts
squirrelo Mar 31, 2015
b5bf4ed
update user private_analyses to ignore default cart
squirrelo Mar 31, 2015
68488c7
update test again to reflect change
squirrelo Mar 31, 2015
90b3de2
merge upstream/master
squirrelo Apr 1, 2015
bd3d513
Merge branch 'master' of https://github.com/biocore/qiita into cart-a…
squirrelo Apr 2, 2015
f5026f8
move default analysis pull to user object
squirrelo Apr 2, 2015
874632b
implement the default_analysis in qiita_pet
squirrelo Apr 2, 2015
ddf2ff4
more comments addressed
squirrelo Apr 2, 2015
1dc1440
pep8
squirrelo Apr 2, 2015
cf1123f
remove magic numbers from tests
squirrelo Apr 2, 2015
15c71dc
replace processed_date retrieval
squirrelo Apr 3, 2015
8e129e6
add info modal for proc data
squirrelo Apr 3, 2015
e784006
more UI changes
squirrelo Apr 3, 2015
d7b4778
couple small UI changes
squirrelo Apr 3, 2015
ba42c6a
use info glyph
squirrelo Apr 3, 2015
51 changes: 46 additions & 5 deletions qiita_db/data.py
@@ -1303,12 +1303,53 @@ def data_type(self, ret_id=False):
return data_type[0]

@property
def processed_date(self):
"""Return the processed date"""
def processing_info(self):
"""Return the processing item and settings used to create the data

Returns
-------
dict
Parameter settings keyed to the parameter, along with date and
algorithm used
"""
# Get processed date and the info for the dynamic table
conn_handler = SQLConnectionHandler()
return conn_handler.execute_fetchone(
"SELECT processed_date FROM qiita.{0} WHERE "
"processed_data_id=%s".format(self._table), (self.id,))[0]
sql = """SELECT processed_date, processed_params_table,
processed_params_id FROM qiita.{0}
WHERE processed_data_id=%s""".format(self._table)
static_info = conn_handler.execute_fetchone(sql, (self.id,))

# Get the info from the dynamic table, including reference used
sql = """SELECT * from qiita.{0}
Contributor:
I'm getting confused by this SQL query; it looks like it is enough to do:

SELECT * FROM qiita.{0}
  JOIN qiita.reference USING (reference_id)
WHERE processed_params_id = {1}

I think this SQL query can return only a single value, since there will be only one row with the given processed_params_id.

Contributor Author:
Yeah, I was being way too protective about the join screwing up. Changed.

JOIN qiita.reference USING (reference_id)
WHERE processed_params_id = {1}
""".format(static_info['processed_params_table'],
static_info['processed_params_id'])
dynamic_info = dict(conn_handler.execute_fetchone(sql))

# replace reference filepath_ids with full filepaths
# figure out what columns have filepaths and what don't
ref_fp_cols = {'sequence_filepath', 'taxonomy_filepath',
'tree_filepath'}
fp_ids = [str(dynamic_info[col]) for col in ref_fp_cols
if dynamic_info[col] is not None]
# Get the filepaths and create dict of fpid to filepath
sql = ("SELECT filepath_id, filepath FROM qiita.filepath WHERE "
Contributor:
I think this query can be merged with the previous one so that a single query is done in this entire block; would you mind taking a look? I can potentially give it a shot later today if you can't.

Contributor Author:
The reason it's this way is that only sequence_filepath is a required column, so the other two may not exist. This leads to issues with JOINs that expect all three columns to exist. (A hedged sketch of a merged query is included at the end of this file's diff.)

Contributor:
Oh true, thanks!

"filepath_id IN ({})").format(','.join(fp_ids))
lookup = {fp[0]: fp[1] for fp in conn_handler.execute_fetchall(sql)}
# Loop through and replace ids
for key in ref_fp_cols:
if dynamic_info[key] is not None:
dynamic_info[key] = lookup[dynamic_info[key]]

# add missing info to the dictionary and remove id column info
dynamic_info['processed_date'] = static_info['processed_date']
dynamic_info['algorithm'] = static_info[
'processed_params_table'].split('_')[-1]
del dynamic_info['processed_params_id']
del dynamic_info['reference_id']

return dynamic_info

@property
def samples(self):
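As a follow-up to the merged-query thread above (not part of the PR): one way the filepath lookup could, in principle, be folded into the dynamic-table query is to LEFT JOIN qiita.filepath once per optional reference column, so a NULL taxonomy or tree filepath id simply stays NULL instead of breaking the join. This is only a hedged sketch against the schema implied by the code in this diff; the table aliases are illustrative, and this is not the code that was merged.

# Hedged sketch only -- not the merged implementation. Resolves the optional
# reference filepaths in the same query via LEFT JOINs, so rows whose
# taxonomy/tree filepath ids are NULL are still returned.
sql = """SELECT pp.*, r.reference_name, r.reference_version,
                seq_fp.filepath AS sequence_filepath,
                tax_fp.filepath AS taxonomy_filepath,
                tree_fp.filepath AS tree_filepath
         FROM qiita.{0} pp
         JOIN qiita.reference r USING (reference_id)
         JOIN qiita.filepath seq_fp
             ON seq_fp.filepath_id = r.sequence_filepath
         LEFT JOIN qiita.filepath tax_fp
             ON tax_fp.filepath_id = r.taxonomy_filepath
         LEFT JOIN qiita.filepath tree_fp
             ON tree_fp.filepath_id = r.tree_filepath
         WHERE pp.processed_params_id = %s""".format(
    static_info['processed_params_table'])
dynamic_info = dict(conn_handler.execute_fetchone(
    sql, (static_info['processed_params_id'],)))
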
14 changes: 14 additions & 0 deletions qiita_db/support_files/patches/21.sql
@@ -0,0 +1,14 @@
-- March 28, 2015
-- Add default analyses for all existing users
DO $do$
Contributor:
I really like this patch.

I think we can potentially use some of the SQL tricks to reduce some of the code in the qiita_db objects, which would improve the performance and quality of our code. (A hedged sketch of one such single-statement rewrite follows this patch.)

DECLARE
eml varchar;
aid bigint;
BEGIN
FOR eml IN
SELECT email FROM qiita.qiita_user
LOOP
INSERT INTO qiita.analysis (email, name, description, dflt, analysis_status_id) VALUES (eml, eml || '-dflt', 'dflt', true, 1) RETURNING analysis_id INTO aid;
INSERT INTO qiita.analysis_workflow (analysis_id, step) VALUES (aid, 2);
END LOOP;
END $do$;
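As a hedged illustration of the "SQL tricks" suggestion above (not part of the PR): the loop in this patch could, in principle, be collapsed into a single statement using a data-modifying CTE, which PostgreSQL supports from 9.1 onwards. It is shown below as it might be issued from Python through the connection handler used elsewhere in this PR; the import path is assumed, and the PR itself keeps the plpgsql loop.

# Hedged sketch only -- the patch above keeps the DO-loop. Assumes
# PostgreSQL >= 9.1 (data-modifying CTEs) and this import path.
from qiita_db.sql_connection import SQLConnectionHandler

# Insert one default analysis per user and, in the same statement, the
# matching analysis_workflow row at step 2.
sql = """WITH new_analyses AS (
             INSERT INTO qiita.analysis
                 (email, name, description, dflt, analysis_status_id)
             SELECT email, email || '-dflt', 'dflt', true, 1
             FROM qiita.qiita_user
             RETURNING analysis_id
         )
         INSERT INTO qiita.analysis_workflow (analysis_id, step)
         SELECT analysis_id, 2 FROM new_analyses"""
conn_handler = SQLConnectionHandler()
conn_handler.execute(sql)
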
6 changes: 6 additions & 0 deletions qiita_db/support_files/populate_test_db.sql
@@ -420,3 +420,9 @@ INSERT INTO qiita.collection_job (collection_id, job_id) VALUES (1, 1);

--share collection with shared user
INSERT INTO qiita.collection_users (email, collection_id) VALUES ('shared@foo.bar', 1);

--add default analysis for users
INSERT INTO qiita.analysis (email, name, description, dflt, analysis_status_id) VALUES ('test@foo.bar', 'test@foo.bar-dflt', 'dflt', true, 1), ('admin@foo.bar', 'admin@foo.bar-dflt', 'dflt', true, 1), ('shared@foo.bar', 'shared@foo.bar-dflt', 'dflt', true, 1), ('demo@microbio.me', 'demo@microbio.me-dflt', 'dflt', true, 1);

-- Attach samples to analysis
INSERT INTO qiita.analysis_sample (analysis_id, processed_data_id, sample_id) VALUES (3,1,'1.SKD8.640184'), (3,1,'1.SKB7.640196'), (3,1,'1.SKM9.640192'), (3,1,'1.SKM4.640180')
42 changes: 22 additions & 20 deletions qiita_db/test/test_analysis.py
@@ -12,7 +12,7 @@
from qiita_db.job import Job
from qiita_db.user import User
from qiita_db.exceptions import QiitaDBStatusError
from qiita_db.util import get_mountpoint
from qiita_db.util import get_mountpoint, get_count
from qiita_db.study import Study, StudyPerson
from qiita_db.data import ProcessedData
from qiita_db.metadata_template import SampleTemplate
@@ -92,36 +92,36 @@ def test_has_access_no_access(self):
def test_create(self):
sql = "SELECT EXTRACT(EPOCH FROM NOW())"
time1 = float(self.conn_handler.execute_fetchall(sql)[0][0])

new_id = get_count("qiita.analysis") + 1
new = Analysis.create(User("admin@foo.bar"), "newAnalysis",
"A New Analysis")
self.assertEqual(new.id, 3)
self.assertEqual(new.id, new_id)
sql = ("SELECT analysis_id, email, name, description, "
"analysis_status_id, pmid, EXTRACT(EPOCH FROM timestamp) "
"FROM qiita.analysis WHERE analysis_id = 3")
obs = self.conn_handler.execute_fetchall(sql)
self.assertEqual(obs[0][:-1], [3, 'admin@foo.bar', 'newAnalysis',
"FROM qiita.analysis WHERE analysis_id = %s")
obs = self.conn_handler.execute_fetchall(sql, [new_id])
self.assertEqual(obs[0][:-1], [new_id, 'admin@foo.bar', 'newAnalysis',
'A New Analysis', 1, None])
self.assertTrue(time1 < float(obs[0][-1]))

def test_create_parent(self):
sql = "SELECT EXTRACT(EPOCH FROM NOW())"
time1 = float(self.conn_handler.execute_fetchall(sql)[0][0])

new_id = get_count("qiita.analysis") + 1
new = Analysis.create(User("admin@foo.bar"), "newAnalysis",
"A New Analysis", Analysis(1))
self.assertEqual(new.id, 3)
self.assertEqual(new.id, new_id)
sql = ("SELECT analysis_id, email, name, description, "
"analysis_status_id, pmid, EXTRACT(EPOCH FROM timestamp) "
"FROM qiita.analysis WHERE analysis_id = 3")
obs = self.conn_handler.execute_fetchall(sql)
self.assertEqual(obs[0][:-1], [3, 'admin@foo.bar', 'newAnalysis',
"FROM qiita.analysis WHERE analysis_id = %s")
obs = self.conn_handler.execute_fetchall(sql, [new_id])
self.assertEqual(obs[0][:-1], [new_id, 'admin@foo.bar', 'newAnalysis',
'A New Analysis', 1, None])
self.assertTrue(time1 < float(obs[0][-1]))

sql = "SELECT * FROM qiita.analysis_chain WHERE child_id = 3"
obs = self.conn_handler.execute_fetchall(sql)
self.assertEqual(obs, [[1, 3]])
sql = "SELECT * FROM qiita.analysis_chain WHERE child_id = %s"
obs = self.conn_handler.execute_fetchall(sql, [new_id])
self.assertEqual(obs, [[1, new_id]])

def test_retrieve_owner(self):
self.assertEqual(self.analysis.owner, "test@foo.bar")
@@ -242,21 +242,23 @@ def test_retrieve_biom_tables_none(self):
self.assertEqual(new.biom_tables, None)

def test_set_step(self):
new_id = get_count("qiita.analysis") + 1
new = Analysis.create(User("admin@foo.bar"), "newAnalysis",
"A New Analysis", Analysis(1))
new.step = 2
sql = "SELECT * FROM qiita.analysis_workflow WHERE analysis_id = 3"
obs = self.conn_handler.execute_fetchall(sql)
self.assertEqual(obs, [[3, 2]])
sql = "SELECT * FROM qiita.analysis_workflow WHERE analysis_id = %s"
obs = self.conn_handler.execute_fetchall(sql, [new_id])
self.assertEqual(obs, [[new_id, 2]])

def test_set_step_twice(self):
new_id = get_count("qiita.analysis") + 1
new = Analysis.create(User("admin@foo.bar"), "newAnalysis",
"A New Analysis", Analysis(1))
new.step = 2
new.step = 4
sql = "SELECT * FROM qiita.analysis_workflow WHERE analysis_id = 3"
obs = self.conn_handler.execute_fetchall(sql)
self.assertEqual(obs, [[3, 4]])
sql = "SELECT * FROM qiita.analysis_workflow WHERE analysis_id = %s"
obs = self.conn_handler.execute_fetchall(sql, [new_id])
self.assertEqual(obs, [[new_id, 4]])

def test_retrieve_step(self):
new = Analysis.create(User("admin@foo.bar"), "newAnalysis",
15 changes: 13 additions & 2 deletions qiita_db/test/test_data.py
@@ -928,9 +928,20 @@ def test_link_filepaths_status_setter(self):
pd._set_link_filepaths_status('failed: error')
self.assertEqual(pd.link_filepaths_status, 'failed: error')

def test_processed_date(self):
def test_processing_info(self):
pd = ProcessedData(1)
self.assertEqual(pd.processed_date, datetime(2012, 10, 1, 9, 30, 27))
exp = {
'algorithm': 'uclust',
'processed_date': datetime(2012, 10, 1, 9, 30, 27),
'enable_rev_strand_match': True,
'similarity': 0.97,
'suppress_new_clusters': True,
'reference_name': 'Greengenes',
'reference_version': '13_8',
'sequence_filepath': 'GreenGenes_13_8_97_otus.fasta',
'taxonomy_filepath': 'GreenGenes_13_8_97_otu_taxonomy.txt',
'tree_filepath': 'GreenGenes_13_8_97_otus.tree'}
self.assertEqual(pd.processing_info, exp)

def test_samples(self):
pd = ProcessedData(1)
11 changes: 6 additions & 5 deletions qiita_db/test/test_job.py
@@ -15,7 +15,7 @@
from qiita_core.util import qiita_test_checker
from qiita_db.job import Job, Command
from qiita_db.user import User
from qiita_db.util import get_mountpoint
from qiita_db.util import get_mountpoint, get_count
from qiita_db.analysis import Analysis
from qiita_db.exceptions import (QiitaDBDuplicateError, QiitaDBStatusError,
QiitaDBUnknownIDError)
@@ -219,16 +219,17 @@ def test_create_exists(self):

def test_create_exists_return_existing(self):
"""Makes sure creation doesn't duplicate a job by returning existing"""
new_id = get_count("qiita.analysis") + 1
Analysis.create(User("demo@microbio.me"), "new", "desc")
self.conn_handler.execute(
"INSERT INTO qiita.analysis_sample "
"(analysis_id, processed_data_id, sample_id) VALUES "
"(3, 1, '1.SKB8.640193'), (3, 1, '1.SKD8.640184'), "
"(3, 1, '1.SKB7.640196'), (3, 1, '1.SKM9.640192'), "
"(3, 1, '1.SKM4.640180')")
"({0}, 1, '1.SKB8.640193'), ({0}, 1, '1.SKD8.640184'), "
"({0}, 1, '1.SKB7.640196'), ({0}, 1, '1.SKM9.640192'), "
"({0}, 1, '1.SKM4.640180')".format(new_id))
new = Job.create("18S", "Beta Diversity",
{"--otu_table_fp": 1, "--mapping_fp": 1},
Analysis(3), return_existing=True)
Analysis(new_id), return_existing=True)
self.assertEqual(new.id, 2)

def test_retrieve_datatype(self):
6 changes: 2 additions & 4 deletions qiita_db/test/test_setup.py
@@ -8,11 +8,9 @@

from unittest import TestCase, main

from qiita_core.util import qiita_test_checker
from qiita_db.util import get_count, check_count


@qiita_test_checker()
class SetupTest(TestCase):
"""Tests that the test database have been successfully populated"""

@@ -111,7 +109,7 @@ def test_job(self):
self.assertEqual(get_count("qiita.job"), 3)

def test_analysis(self):
self.assertEqual(get_count("qiita.analysis"), 2)
self.assertEqual(get_count("qiita.analysis"), 6)

def test_analysis_job(self):
self.assertEqual(get_count("qiita.analysis_job"), 3)
@@ -123,7 +121,7 @@ def test_analysis_filepath(self):
self.assertEqual(get_count("qiita.analysis_filepath"), 2)

def test_analysis_sample(self):
self.assertEqual(get_count("qiita.analysis_sample"), 9)
self.assertEqual(get_count("qiita.analysis_sample"), 13)

def test_analysis_users(self):
self.assertEqual(get_count("qiita.analysis_users"), 1)
11 changes: 11 additions & 0 deletions qiita_db/test/test_user.py
@@ -109,6 +109,13 @@ def test_create_user(self):
'email': 'new@test.bar'}
self._check_correct_info(obs, exp)

# make sure default analysis created
sql = ("SELECT email, name, description, dflt FROM qiita.analysis "
"WHERE email = 'new@test.bar'")
obs = self.conn_handler.execute_fetchall(sql)
exp = [['new@test.bar', 'new@test.bar-dflt', 'dflt', True]]
self.assertEqual(obs, exp)

def test_create_user_info(self):
user = User.create('new@test.bar', 'password', self.userinfo)
self.assertEqual(user.id, 'new@test.bar')
@@ -212,6 +219,10 @@ def test_set_info_bad_info(self):
with self.assertRaises(QiitaDBColumnError):
self.user.info = self.userinfo

def test_default_analysis(self):
obs = self.user.default_analysis
self.assertEqual(obs, 4)

def test_get_user_studies(self):
user = User('test@foo.bar')
self.assertEqual(user.user_studies, {1})
29 changes: 24 additions & 5 deletions qiita_db/user.py
@@ -49,6 +49,7 @@ class User(QiitaObject):
info
user_studies
shared_studies
default_analysis
private_analyses
shared_analyses

@@ -224,10 +225,21 @@ def create(cls, email, password, info=None):
# for sql insertion
columns = info.keys()
values = [info[col] for col in columns]
queue = "add_user_%s" % email
conn_handler.create_queue(queue)
# create user
sql = "INSERT INTO qiita.{0} ({1}) VALUES ({2})".format(
cls._table, ','.join(columns), ','.join(['%s'] * len(values)))
conn_handler.add_to_queue(queue, sql, values)
# create user default sample holder
sql = ("INSERT INTO qiita.analysis "
Contributor:
This is a personal opinion, but I think this highly improves code readability, and even when failures happen they're easier to read on the command line. Instead of using quotes ("), I like to use triple quotes (""") and then align the SQL in an easy-to-read way. In this example:

sql = """INSERT INTO qiita.analysis
           (email, name, description, dflt, analysis_status_id)
         VALUES (%s, %s, %s, %s, 1)"""

The cool thing about this is that if the query fails, it also gets formatted on the CLI. It is also a more natural way of reading SQL, since things align better (note the small indentation on the column names).

This is not blocking, though...

Contributor Author:
I can do this, but it is also something to put in contributing.md, and it would need to be done consistently across the entire codebase. Again, if that's agreeable I will make the change.

Contributor:
Yeah, it's not in contributing.md, so no worries if you don't want to change it. I'm doing it as I'm changing other parts of the code, but there is no documentation on this at this point.

"(email, name, description, dflt, analysis_status_id) "
"VALUES (%s, %s, %s, %s, 1)")
conn_handler.add_to_queue(queue, sql,
(email, '%s-dflt' % email, 'dflt', True))

conn_handler.execute_queue(queue)

sql = ("INSERT INTO qiita.%s (%s) VALUES (%s)" %
(cls._table, ','.join(columns), ','.join(['%s'] * len(values))))
conn_handler.execute(sql, values)
return cls(email)

@classmethod
@@ -329,6 +341,13 @@ def info(self, info):
"email = %s".format(self._table, ','.join(sql_insert)))
conn_handler.execute(sql, data)

@property
def default_analysis(self):
sql = ("SELECT analysis_id FROM qiita.analysis WHERE email = %s AND "
"dflt = true")
conn_handler = SQLConnectionHandler()
return conn_handler.execute_fetchone(sql, [self._id])[0]

@property
def sandbox_studies(self):
"""Returns a list of sandboxed study ids owned by the user"""
@@ -360,8 +379,8 @@ def shared_studies(self):
@property
def private_analyses(self):
"""Returns a list of private analysis ids owned by the user"""
sql = ("Select analysis_id from qiita.analysis WHERE email = %s AND "
"analysis_status_id <> 6")
sql = ("SELECT analysis_id FROM qiita.analysis "
"WHERE email = %s AND dflt = false")
conn_handler = SQLConnectionHandler()
analysis_ids = conn_handler.execute_fetchall(sql, (self._id, ))
return {a[0] for a in analysis_ids}
17 changes: 17 additions & 0 deletions qiita_pet/handlers/analysis_handlers.py
@@ -363,3 +363,20 @@ def validate_absolute_path(self, root, absolute_path):
root, absolute_path)
else:
raise QiitaPetAuthorizationError(user_id, absolute_path)


class SelectedSamplesHandler(BaseHandler):
@authenticated
def get(self):
# Format sel_data to get study IDs for the processed data
sel_data = defaultdict(dict)
proc_data_info = {}
sel_samps = Analysis(self.current_user.default_analysis).samples
for pid, samps in viewitems(sel_samps):
proc_data = ProcessedData(pid)
sel_data[proc_data.study][pid] = samps
# Also get processed data info
proc_data_info[pid] = proc_data.processing_info
proc_data_info[pid]['data_type'] = proc_data.data_type()
self.render("analysis_selected.html", sel_data=sel_data,
proc_info=proc_data_info)
4 changes: 4 additions & 0 deletions qiita_pet/static/js/qiita.js
@@ -26,3 +26,7 @@ function fillAbstract(table, row) {
$('#title-text-area').text($('#' + table).find('#study' + row + "-title").text());
$('#abstract-text-area').text($('#'+table).dataTable().fnGetData(row, 2));
}

function show_hide(div) {
$('#' + div).toggle();
}