Skip to content

List proc data in search #1039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
200 changes: 139 additions & 61 deletions qiita_pet/handlers/study_handlers/listing_handlers.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
from __future__ import division
from json import dumps
from future.utils import viewitems
from collections import defaultdict

from tornado.web import authenticated, HTTPError
from tornado.gen import coroutine, Task
Expand All @@ -21,6 +22,7 @@
from qiita_db.exceptions import QiitaDBIncompatibleDatatypeError
from qiita_db.util import get_table_cols
from qiita_db.data import ProcessedData
from qiita_core.exceptions import IncompetentQiitaDeveloperError

from qiita_pet.handlers.base_handlers import BaseHandler
from qiita_pet.handlers.util import study_person_linkifier, pubmed_linkifier
Expand All @@ -42,74 +44,149 @@ def _get_shared_links_for_study(study):
return ", ".join(shared)


def _build_study_info(user, results=None):
"""builds list of dicts for studies table, with all html formatted"""
def _build_single_study_info(study, info, study_proc, proc_samples):
"""Clean up and add to the study info for HTML purposes
Parameters
----------
study : Study object
The study to build information for
info : dict
Information from Study.get_info
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last two arguments are missing here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

study_proc : dict of dict of lists
Dictionary keyed on study_id that lists all processed data associated
with that study. This list of processed data ids is keyed by data type
proc_samples : dict of lists
Dictionary keyed on proc_data_id that lists all samples associated with
that processed data.
Returns
-------
dict
info-information + extra information for the study,
slightly HTML formatted
"""
PI = StudyPerson(info['principal_investigator_id'])
status = study.status
if info['pmid'] is not None:
info['pmid'] = ", ".join([pubmed_linkifier([p])
for p in info['pmid']])
else:
info['pmid'] = ""
if info["number_samples_collected"] is None:
info["number_samples_collected"] = 0
info["shared"] = _get_shared_links_for_study(study)
info["num_raw_data"] = len(study.raw_data())
info["status"] = status
info["study_id"] = study.id
info["pi"] = study_person_linkifier((PI.email, PI.name))
del info["principal_investigator_id"]
del info["email"]
# Build the proc data info list for the child row in datatable
info["proc_data_info"] = []
for data_type, proc_datas in viewitems(study_proc[study.id]):
info["proc_data_info"].extend([
_build_single_proc_data_info(pd_id, data_type, proc_samples[pd_id])
for pd_id in proc_datas])
return info


def _build_single_proc_data_info(proc_data_id, data_type, samples):
"""Build the proc data info list for the child row in datatable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an extra blank blank line here, can you remove it?

Parameters
----------
proc_data_id : int
The processed data attached to he study, in the form
{study_id: [proc_data_id, proc_data_id, ...], ...}
data_type : str
Data type of the processed data
proc_samples : dict of lists
The samples available in the processed data, in the form
{proc_data_id: [samp1, samp2, ...], ...}
Returns
-------
dict
The information for the processed data, in the form {info: value, ...}
"""
proc_data = ProcessedData(proc_data_id)
proc_info = proc_data.processing_info
proc_info['pid'] = proc_data_id
proc_info['data_type'] = data_type
proc_info['samples'] = sorted(samples)
proc_info['processed_date'] = str(proc_info['processed_date'])
return proc_info


def _build_study_info(user, study_proc=None, proc_samples=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although the code on this function looks good, I had a hard time understanding completely what the function is doing. I think it will improve the readability and maintenance of the code if parts of this function are encapsulated in their own functions. For example, the construction of the proc_data_info elements can be encapsulated into their own function. This also makes it easier to test.

Also, IMOO the html introduced here is hard to read. Is it possible to move it to a global so it becomes easier to read (less problems with PEP8)...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

refactored and now renders the gnarly stuff client-side usingt he datatables module itself.

"""Builds list of dicts for studies table, with all HTML formatted
Parameters
----------
user : User object
logged in user
study_proc : dict of lists, optional
Dictionary keyed on study_id that lists all processed data associated
with that study. Required if proc_samples given.
proc_samples : dict of lists, optional
Dictionary keyed on proc_data_id that lists all samples associated with
that processed data. Required if study_proc given.
Returns
-------
infolist: list of dict of lists and dicts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

o.O possible to have a format description, this is really convoluted... Something like:[{study_id: [{foo:bar}]}]. I think it will help to explain what is returned here...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking a bit more on how this looks like and reading a bit further the code, I think you will take advantage of using namedtuples. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly. How do those work when JSON serialized?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, probably worth give it a shot and try different things. The code as it is right now is really hard to follow and I don't think that is maintainable.
On the other hand, are you forced to use JSON? The reason why I'm asking is because it looks like you're using all this information in a javascript function, so you probably can just pass the python object and use the template engine to fill those values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has to be JSON because all this is sent over AJAX to the datatable, which is what allows the table to be dynamic for each metadata search.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it looks like serializing namedtuples to json is a bit tricky http://stackoverflow.com/q/5906831/3746629. Giving this, I think that breaking the function in multiple functions (as I suggested before) and introducing a highly detailed documentation, will help to make this code maintainable.

IMOO, this function should only create the list. The outer dictionary should be created by another function, and the inner dictionary by another function. This will put a logical break up in the code that will be easier to follow, as well as easier testable, since the error (if any) will be in 1 of these 3 functions rather than in a single one as it is right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

study and processed data info for JSON serialiation for datatables
Each dict in the list is a single study, and contains the text
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing the return description.

Notes
-----
Both study_proc and proc_samples must be passed, or neither passed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest also adding a comment on the description of each parameter: must be provided if 'proc_samples' is provided; or something along those lines

"""
build_samples = False
# Logic check to make sure both needed parts passed
if study_proc is not None and proc_samples is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's no tests for these exception cases. Would you mind adding those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

raise IncompetentQiitaDeveloperError(
'Must pass proc_samples when study_proc given')
elif proc_samples is not None and study_proc is None:
raise IncompetentQiitaDeveloperError(
'Must pass study_proc when proc_samples given')
elif study_proc is None:
build_samples = True

# get list of studies for table
study_list = user.user_studies.union(
study_set = user.user_studies.union(
Study.get_by_status('public')).union(user.shared_studies)
if results is not None:
study_list = study_list.intersection(results)
if not study_list:
if study_proc is not None:
study_set = study_set.intersection(study_proc)
if not study_set:
# No studies left so no need to continue
return []

# get info for the studies
cols = ['study_id', 'email', 'principal_investigator_id',
'pmid', 'study_title', 'metadata_complete',
'number_samples_collected', 'study_abstract']
study_info = Study.get_info(study_list, cols)
study_info = Study.get_info(study_set, cols)

infolist = []
for row, info in enumerate(study_info):
for info in study_info:
# Convert DictCursor to proper dict
info = dict(info)
study = Study(info['study_id'])
status = study.status
# Just passing the email address as the name here, since
# name is not a required field in qiita.qiita_user
PI = StudyPerson(info['principal_investigator_id'])
PI = study_person_linkifier((PI.email, PI.name))
if info['pmid'] is not None:
pmids = ", ".join([pubmed_linkifier([p])
for p in info['pmid']])
else:
pmids = ""
if info["number_samples_collected"] is None:
info["number_samples_collected"] = "0"
shared = _get_shared_links_for_study(study)
meta_complete_glyph = "ok" if info["metadata_complete"] else "remove"
# build the HTML elements needed for table cell
title = ("<a href='#' data-toggle='modal' "
"data-target='#study-abstract-modal' "
"onclick='fillAbstract(\"studies-table\", {0})'>"
"<span class='glyphicon glyphicon-file' "
"aria-hidden='true'></span></a> | "
"<a href='/study/description/{1}' "
"id='study{0}-title'>{2}</a>").format(
str(row), str(study.id), info["study_title"])
meta_complete = "<span class='glyphicon glyphicon-%s'></span>" % \
meta_complete_glyph
if status == 'public':
shared = "Not Available"
else:
shared = ("<span id='shared_html_{0}'>{1}</span><br/>"
"<a class='btn btn-primary btn-xs' data-toggle='modal' "
"data-target='#share-study-modal-view' "
"onclick='modify_sharing({0});'>Modify</a>".format(
study.id, shared))

infolist.append({
"checkbox": "<input type='checkbox' value='%d' />" % study.id,
"id": study.id,
"title": title,
"meta_complete": meta_complete,
"num_samples": info["number_samples_collected"],
"shared": shared,
"num_raw_data": len(study.raw_data()),
"pi": PI,
"pmid": pmids,
"status": status,
"abstract": info["study_abstract"]

})
# Build the processed data info for the study if none passed
if build_samples:
proc_data_list = study.processed_data()
proc_samples = {}
study_proc = {study.id: defaultdict(list)}
for pid in proc_data_list:
proc_data = ProcessedData(pid)
study_proc[study.id][proc_data.data_type()].append(pid)
proc_samples[pid] = proc_data.samples

study_info = _build_single_study_info(study, info, study_proc,
proc_samples)
infolist.append(study_info)
return infolist


Expand Down Expand Up @@ -195,12 +272,12 @@ def get(self, ignore):

if user != self.current_user.id:
raise HTTPError(403, 'Unauthorized search!')
res = None
if query:
# Search for samples matching the query
search = QiitaStudySearch()
try:
res, meta = search(query, self.current_user)
search(query, self.current_user)
study_proc, proc_samples, _ = search.filter_by_processed_data()
except ParseException:
self.clear()
self.set_status(400)
Expand All @@ -223,9 +300,10 @@ def get(self, ignore):
info={'User': self.current_user.id,
'query': query})
return
if not res:
res = {}
info = _build_study_info(self.current_user, results=res)
else:
study_proc = proc_samples = None
info = _build_study_info(self.current_user, study_proc=study_proc,
proc_samples=proc_samples)
# build the table json
results = {
"sEcho": echo,
Expand Down
82 changes: 69 additions & 13 deletions qiita_pet/templates/list_studies.html
Original file line number Diff line number Diff line change
@@ -1,35 +1,75 @@
{% extends sitebase.html %}
{% block head %}
<link rel="stylesheet" href="/static/vendor/css/jquery.dataTables.css" type="text/css">

<style>
td.details-control {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably go in style.css.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's specific to this page, so I've left it in header.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

On (Apr-13-15|12:37), Joshua Shorenstein wrote:

@@ -1,7 +1,11 @@
{% extends sitebase.html %}
{% block head %}

+<style>
+td.details-control {

it's specific to this page, so I've left it in header.


Reply to this email directly or view it on GitHub:
https://github.com/biocore/qiita/pull/1039/files#r28271335

cursor: pointer;
}
</style>
<script src="/static/vendor/js/jquery.dataTables.min.js"></script>
<script src="/static/vendor/js/jquery.dataTables.plugin.natural.js"></script>

<script type="text/javascript">
var current_study;
var ajaxURL = "/study/search/?&user={{current_user.id}}&sEcho=" + Math.floor(Math.random()*1001);
$(document).ready(function() {

function format ( d ) {
// `d` is the original data object for the row
var proc_data_table = '<h4>Processed Data</h4><table class="table" cellpadding="5" cellspacing="0" border="0" style="padding-left:50px;"><tr><th></th><th>ID</th><th>Data type</th><th>Processed Date</th><th>Algorithm</th><th>Reference</th><th>Samples</th></tr>';
for(i=0;i<d.proc_data_info.length;i++) {
var proc_data = d.proc_data_info[i];
proc_data_table += '<tr><td><input type="checkbox"></td><td>' + proc_data.pid + '</td><td>' + proc_data.data_type + '</td><td>' + proc_data.processed_date + '</td><td>' + proc_data.algorithm + '</td><td>' + proc_data.reference_name + ' ' + proc_data.reference_version + '</td><td>' + proc_data.samples.length + '</td></tr>';
}
proc_data_table += '</table>';
return proc_data_table;
}

$('#studies-table').dataTable({
"columns": [
{ "data": "checkbox" },
{ "data": "title" },
{ "data": "abstract" },
{ "data": "id" },
{ "data": "meta_complete" },
{ "data": "num_samples" },
{ "className": 'details-control', "orderable": false},
{ "data": "study_title" },
{ "data": "study_abstract" },
{ "data": "study_id" },
{ "data": "metadata_complete" },
{ "data": "number_samples_collected" },
{ "data": "num_raw_data" },
{ "data": "shared" },
{ "data": "pi" },
{ "data": "pmid" },
{ "data": "status" }
{ "data": "status" },
{"className": 'details-control', "orderable": false, "data": null, "defaultContent": '<span class="glyphicon glyphicon-chevron-down"></span>'}
],
order: [[10, "desc"], [ 1, "asc" ]],
columnDefs: [{type:'natural', targets:[3,4,5,9]}, {"targets": [ 2 ],"visible": false}],
columnDefs: [
{type:'natural', targets:[3,4,5,9]},
{"targets": [ 2 ],"visible": false},
// render the study checkbox cell
{"render": function ( data, type, row, meta ) {
return "<input type='checkbox' value='"+ row.study_id +"'>";
}, targets: [0]},
// render the title cell
{"render": function ( data, type, row, meta ) {
return "<a href='#' data-toggle='modal' data-target='#study-abstract-modal' onclick=\"fillAbstract('studies-table', "+ meta.row +")\"><span class='glyphicon glyphicon-file' aria-hidden='true'></span></a> | <a href='/study/description/"+ row.study_id +"' id='study"+ meta.row +"-title'>"+ data +"</a>";
}, targets: [1]},
// render the metadata complete cell
{"render": function ( data, type, row, meta ) {
var glyph = 'remove';
if(data === true) { glyph = 'ok' }
return "<span class='glyphicon glyphicon-"+ glyph +"'></span>";
}, targets: [4]},
// render the shared with cell
{"render": function ( data, type, row, meta ) {
var glyph = 'remove';
if(data === true) { glyph = 'ok' }
return "<span id='shared_html_"+ row.study_id +"'>"+ data +"</span><br/><a class='btn btn-primary btn-xs' data-toggle='modal' data-target='#share-study-modal-view' onclick='modify_sharing("+ row.study_id +");'>Modify</a>";
}, targets: [7]},
],
"oLanguage": {
"sSearch": "Narrow search results by column data (Title, abstract, PI, etc):",
"sLoadingRecords": "Loading table data",
"sZeroRecords": "No studies found"
},
},
"ajax": {
"url": ajaxURL + "&query=",
"error": function(jqXHR, textStatus, ex) {
Expand All @@ -39,6 +79,23 @@
}
}
});
// Add event listener for opening and closing details
$('#studies-table tbody').on('click', 'td.details-control', function () {
var table = $('#studies-table').DataTable();
var tr = $(this).closest('tr');
var row = table.row( tr );

if ( row.child.isShown() ) {
// This row is already open - close it
row.child.hide();
tr.removeClass('shown');
}
else {
// Open this row
row.child( format(row.data()) ).show();
tr.addClass('shown');
}
} );

$('#users_list').chosen({width: "100%"});
$('#users_list').on('change', function(evt, params) {
Expand Down Expand Up @@ -105,7 +162,7 @@ <h1>Metadata Search</h1>
<div class="row">
<div class="col-sm-12" id="user-studies-div">
<h1>Available Studies</h1>
<table id="studies-table" class="display table-bordered table-hover">
<table id="studies-table" class="table table-bordered">
<thead>
<tr>
<th></th>
Expand All @@ -119,10 +176,9 @@ <h1>Available Studies</h1>
<th>Principal Investigator</th>
<th>Pubmed ID(s)</th>
<th>Status</th>
<th>Expand</th>
</tr>
</thead>
<tbody>
</tbody>
</table>
<!--Abstract Modal-->
<div class="modal fade" tabindex="-1" role="dialog" aria-labelledby="myLargeModalLabel" aria-hidden="true" id="study-abstract-modal">
Expand Down
Loading