Skip to content
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
54 changes: 48 additions & 6 deletions qiita_pet/handlers/admin_processing_job.py
Original file line number Diff line number Diff line change
Expand Up @@ -105,28 +105,70 @@ def get(self):

@execute_as_transaction
def post(self):

# Get user-inputted qiita id and sample names
# Get user-inputted qiita id and sample names / tube_ids
qid = self.get_argument("qid")
snames = self.get_argument("snames").split()

# Get study give qiita id
st = Study(qid).sample_template
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

st is a not a properly descriptive name for a variable, especially one that plays an important role some ten lines later in the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would changing the variable name to study work better, or would a variable name like qt_study be more appropriate?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

study is fine.


# Stripping leading qiita id from sample names
# Example: 1.SKB1.640202 -> SKB1.640202
qsnames = list(Study(qid).sample_template)
qsnames = list(st)
for i, qsname in enumerate(qsnames):
if qsname.startswith(qid):
qsnames[i] = qsname.replace(f'{qid}.', "", 1)

# Adds tube ids to a dict with key as tube id and value as qsname
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you're munging data like this to get it into the proper shape you need, it often seems intuitive to the writer, but to later readers it can appear somewhat opaque. In these cases it's important to write comments that communicate why you're doing it and what you hope to achieve. The above comment is just an English-language summary of the code below it, which I can already read.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this be appropriate:

# Creates a way to access a tube_id by its corresponding sample name
# and vice versa, which is important to adding tube_id in parentheses
# after a sample name a few lines later

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

much improved, thanks!

tube_ids_dict = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid naming variables in terms of their data-structure (something_list, something_dict). They nearly always can be given a more descriptive name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, would tube_id_lookup be an appropriate name for the variable? Would it be better if I also change the name of the tube_id_rev variable too as well then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, those sound good.

tube_ids_rev = dict()
tube_ids = set()
if "tube_id" in st.categories:
for qsname, tid in st.get_category("tube_id").items():
formatted_name = qsname
if qsname.startswith(qid):
formatted_name = qsname.replace(f'{qid}.', "", 1)

tube_ids.add(tid)
tube_ids_dict[formatted_name] = tid
tube_ids_rev[tid] = formatted_name

# Adds tube ids after sample name in paranthesis
if len(tube_ids) > 0:
for i, sname in enumerate(snames):
if sname in qsnames:
snames[i] = f'{sname} ({tube_ids_dict[sname]})'
elif sname in tube_ids:
snames[i] = f'{tube_ids_rev[sname]} ({sname})'

# Finds duplicates in the samples
seen = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try using collections.Counter. Once you have a Counter() object populated you can get the list of duplicates with a list comprehension:
list_of_dupes = [x for x in my_counter if my_counter[x] > 1]

The code will look cleaner, but more importantly list comprehensions and Counter() will be more efficient under the hood than the basic for loop/conditional implementation you have here.

for sample in snames:
if sample in seen:
seen[sample] += 1
else:
seen[sample] = 1

duplicates = []
for sample, num in seen.items():
if num > 1:
duplicates.append(f'{sample} \u00D7 {num}')
duplicates = set(duplicates)

# Remove blank samples from sample names
blank = [x for x in snames if x.lower().startswith('blank')]
blank = set([x for x in snames if x.lower().startswith('blank')])
snames = [x for x in snames if 'blank' not in x.lower()]

# Validate user's sample names against qiita study
qsnames = set(qsnames)
if len(tube_ids) == 0:
qsnames = set(qsnames)
else:
qsnames = set([f'{x} ({tube_ids_dict[x]})' for x in qsnames])
snames = set(snames)
matching = qsnames.intersection(snames)
missing = qsnames.difference(snames)
extra = snames.difference(qsnames)

self.render("sample_validation.html", input=False, matching=matching,
missing=missing, extra=extra, blank=blank)
missing=missing, extra=extra, blank=blank,
duplicates=duplicates)
10 changes: 9 additions & 1 deletion qiita_pet/templates/sample_validation.html
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
<style>
.column {
float: left;
width: 25%;
width: 20%;
}

.row:after {
Expand Down Expand Up @@ -49,6 +49,14 @@ <h2>Blank</h2>
{% end %}
</ul>
</div>
<div class="column">
<h2>Duplicates</h2>
<ul>
{% for sample in duplicates %}
<li>{{ sample }}</li>
{% end %}
</ul>
</div>
<div class="column">
<h2>Extra</h2>
<ul>
Expand Down