Merged
Changes from 5 commits
14 changes: 14 additions & 0 deletions qiita_db/support_files/patches/python_patches/66.py
@@ -0,0 +1,14 @@
# August 31, 2018
Contributor

This patch needs to be moved to its own file.

Member

I can do this in my PR, since I'm fixing conflicts anyway. However, in the past we decided to have only one patch per release cycle, hence the idea of keeping this in the same patch.

# Strip any UTF-8 characters that are not also printable ASCII characters
# from study titles. As some analysis packages cannot interpret UTF-8
# characters, it becomes important to remove them from study titles, as
# they are used as metadata/identifiers when creating new analyses.
from qiita_db.study import Study
from re import sub

studies = Study.get_by_status('public')
Member

Just realized that this will only loop over the public studies and we need to check all studies ...

Contributor Author

Looking through the methods, I haven't yet seen a way to enumerate through all of the studies, regardless of status (Study.get_all() or similar). It looks like the source of truth for the set of valid status strings is ultimately the qiita.visibility table, which lists the following:
awaiting_approval
sandbox
private
public

A nested loop to process each status in turn isn't horrible, but this might be a good time to write a get_all(), if it's not already available.

Another alternative would be to simply perform the same action in SQL directly. It looks like the only state we're changing through the API is study_title in the qiita.study table. The update statement below has already been tested, and could be placed in 66.sql:

update qiita.study set study_title = REGEXP_REPLACE(study_title,'[^\x20-\x7E]+','','g');

What do you think?
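
For comparison, here is a minimal Python sketch of what that character class does to a title. The helper name `strip_non_ascii` and the sample title are made up for illustration; the pattern is the one from the SQL statement above.

```python
import re

# Same character class as the SQL statement: anything outside 0x20-0x7E.
NON_PRINTABLE_ASCII = re.compile(r'[^\x20-\x7E]+')


def strip_non_ascii(title):
    """Return title with every run of non-printable-ASCII characters removed.

    Hypothetical helper, not part of qiita_db.
    """
    # re.sub replaces every match, like the 'g' flag in REGEXP_REPLACE.
    return NON_PRINTABLE_ASCII.sub('', title)


# The accented letter and the en dash are dropped, not transliterated.
print(strip_non_ascii('Caf\u00e9 microbiome \u2013 pilot'))
```

Note that characters are removed rather than replaced, so a title can end up with doubled spaces where a non-ASCII character sat between two spaces. Control characters such as tabs and newlines fall outside the range as well and are stripped too.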

Member

Good point, perhaps we should open an issue about making sure that all objects have an iter method. Anyway, we normally do:

studies = Study.get_by_status('private').union(
    Study.get_by_status('public')).union(Study.get_by_status('sandbox'))
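
A rough sketch of how that union could be generalized over every status named in this thread, including `awaiting_approval`. `FakeStudy` and `all_studies` are hypothetical stand-ins so the example runs without a Qiita install; it assumes `get_by_status` returns a set, as the `.union` calls above suggest.

```python
# Stand-in for qiita_db.study.Study; only get_by_status() is modelled
# and the study ids below are made up.
class FakeStudy:
    _by_status = {
        'awaiting_approval': {4},
        'sandbox': {1, 2},
        'private': {3},
        'public': {5, 6},
    }

    @classmethod
    def get_by_status(cls, status):
        return set(cls._by_status.get(status, set()))


# Status names as listed earlier in this thread.
STATUSES = ('awaiting_approval', 'sandbox', 'private', 'public')


def all_studies(study_cls, statuses=STATUSES):
    """Union of get_by_status() over every status (hypothetical helper)."""
    return set().union(*(study_cls.get_by_status(s) for s in statuses))


print(sorted(all_studies(FakeStudy)))
```

Keeping the status names in one tuple means a new visibility value only has to be added in one place, though a real `get_all()` on the class would still be the cleaner fix.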


for study in studies:
    title = study.title
    new_title = sub(r'[^\x20-\x7E]+', '', title)
Member

Perhaps, to speed this up, it should only assign the new title if the title actually changed ...

Contributor Author

doh!

Contributor Author

done
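
One way the suggested guard might look, as a sketch only. `FakeStudy` is a hypothetical stand-in whose setter counts writes, on the assumption that assigning `study.title` in Qiita triggers a database update; `clean_titles` is likewise an illustrative name.

```python
from re import sub


class FakeStudy:
    """Hypothetical stand-in; counts writes to show which titles get updated."""

    def __init__(self, title):
        self._title = title
        self.writes = 0

    @property
    def title(self):
        return self._title

    @title.setter
    def title(self, value):
        self._title = value
        self.writes += 1  # assumed to correspond to a database UPDATE in Qiita


def clean_titles(studies):
    for study in studies:
        title = study.title
        new_title = sub(r'[^\x20-\x7E]+', '', title)
        # Only write back when the regex actually removed something.
        if new_title != title:
            study.title = new_title


studies = [FakeStudy('Gut survey'), FakeStudy('Caf\u00e9 study')]
clean_titles(studies)
print([(s.title, s.writes) for s in studies])
```

With the guard in place, titles that are already printable ASCII are never written back, so the patch only touches the rows that need it.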

    study.title = new_title
13 changes: 10 additions & 3 deletions qiita_pet/templates/edit_study.html
@@ -1,6 +1,8 @@
{% extends sitebase.html %}
{% block head %}
<script type="text/javascript" src="{% raw qiita_config.portal_dir %}/static/vendor/js/jquery.validate.min.js"></script>
<script type = "text/javascript"
src = "{% raw qiita_config.portal_dir %}/static/vendor/js/jquery.validate.min.js">
</script>
<style>
.custom-combobox {
position: relative;
@@ -32,7 +34,9 @@
var title = $(this).val();
// removing any duplicated whitespaces
title = title.replace(/ +(?= )/g, '');
// removing wite spaces from the front of the text
//remove all utf-8 encoded characters that are not also printable ASCII characters.
title = title.replace(/[^\x20-\x7E]+/g, "");
// removing white spaces from the front of the text
$(this).val(title.trimLeft());
});
$("#create_study").validate({
@@ -156,8 +160,11 @@ <h3>
{% if form_item.label.text == 'Environmental Packages' %}
{% set kwargs['size'] = len(form_item.choices) %}
{% set additional_info = 'You can select multiple entries by control-clicking (mac: command-clicking)' %}
{% elif form_item.label.text == 'Principan Investigator'%}
{% elif form_item.label.text == 'Principal Investigator'%}
Member

thanks!

{% set kwargs['class_'] = kwargs['class_'] + ' chzn-select' %}
{% elif form_item.label.text == 'Study Title'%}
{% set additional_info = 'Study titles may only contain ASCII characters' %}

{% end %}
<tr>
<td width="20%">{% raw form_item.label %} <br /> <small>{{additional_info}}</small></td>