-
Notifications
You must be signed in to change notification settings - Fork 80
Delete process data #1034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete process data #1034
Changes from 6 commits
dc0b2ec
12575f5
44057a6
2fa574e
d63fb3e
99f196f
c195dad
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1257,6 +1257,60 @@ def create(cls, processed_params_table, processed_params_id, filepaths, | |
pd.add_filepaths(filepaths, conn_handler) | ||
return cls(pd_id) | ||
|
||
@classmethod | ||
def delete(cls, processed_data_id): | ||
"""Removes the processed data with id processed_data_id | ||
|
||
Parameters | ||
---------- | ||
processed_data_id : int | ||
The processed data id | ||
|
||
Raises | ||
------ | ||
QiitaDBUnknownIDError | ||
If the processed data id doesn't exist | ||
QiitaDBStatusError | ||
If the processed data status is not sandbox | ||
""" | ||
if cls(processed_data_id).status != 'sandbox': | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm just thinking about this. Should we limit the removal only to sandbox? I think we should be able to remove the processed data in any moment except if it is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fine with me either way. However, remember that the process data is
the one that locks (sets the permissions of) everything (preprocess
data, pre template, sample ...) so allowing to only remove sandbox
kind of makes sense.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, fine with me. We can revisit this later and it is an easy change if we decide something different. I'm going to pull down the code to test. |
||
raise QiitaDBStatusError( | ||
"Illegal operation on non sandbox processed data") | ||
|
||
conn_handler = SQLConnectionHandler() | ||
|
||
analysis_ids = [str(_id[0]) for _id in conn_handler.execute_fetchall( | ||
"SELECT DISTINCT analysis_id FROM qiita.analysis_sample WHERE " | ||
"processed_data_id = {0} ORDER BY " | ||
"analysis_id".format(processed_data_id))] | ||
|
||
if analysis_ids: | ||
raise QiitaDBError( | ||
"Processed data %d is linked to (meta)analyses: %s" % | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we improve the error message here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. analysis names would be better than IDs. There is no way to link an ID to an analysis in the UI There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Alternately, raise error without the IDs and use the Logger to store the IDs There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Going with adding names and QiitaDBError so the user can see the error. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One thing I just thought of if showing names: this will potentially show names of analyses that the user does not have access to. Probably not the end of the world, but could be a privacy issue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that's fine. The only way this could happen is if the owner
made it public, then move it to sandbox, and is trying to delete it.
Which will be a full other beast ...
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. And users should not be able to put it back to sandbox if it was public and 2015-04-03 9:31 GMT-07:00 Antonio Gonzalez notifications@github.com:
Jose Navas |
||
(processed_data_id, ", ".join(analysis_ids))) | ||
|
||
# delete | ||
queue = "delete_processed_data_%d" % processed_data_id | ||
conn_handler.create_queue(queue) | ||
|
||
sql = ("DELETE FROM qiita.preprocessed_processed_data WHERE " | ||
"processed_data_id = {0}".format(processed_data_id)) | ||
conn_handler.add_to_queue(queue, sql) | ||
|
||
sql = ("DELETE FROM qiita.processed_filepath WHERE " | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also needs to remove the filepath table entries. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not needed, see my comment below (about There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's actually needed (I think you are getting confused with some
table names), if not:
Error: update or delete on table "processed_data" violates foreign key
constraint "fk_processed_data_filepath" on table "processed_filepath"
DETAIL: Key (processed_data_id)=(2) is still referenced from table
"processed_filepath".
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @antgonza sorry for the confusion, I was answering to @squirrelo comment about deleting the entry on |
||
"processed_data_id = {0}".format(processed_data_id)) | ||
conn_handler.add_to_queue(queue, sql) | ||
|
||
sql = ("DELETE FROM qiita.study_processed_data WHERE " | ||
"processed_data_id = {0}".format(processed_data_id)) | ||
conn_handler.add_to_queue(queue, sql) | ||
|
||
sql = ("DELETE FROM qiita.processed_data WHERE " | ||
"processed_data_id = {0}".format(processed_data_id)) | ||
conn_handler.add_to_queue(queue, sql) | ||
|
||
conn_handler.execute_queue(queue) | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This also needs to delete the files on the filesystem, correct? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is something that we have not discussed as a group, but I will just make a note here. If you take a look, it doesn't even remove the row from qiita.filepath table. We are planning to run "purge_filepaths" in a cron job. That function, identifies which rows from qiita.filepath are not referenced and removes them from the table and from the filesystem. The advantage of this, is that the server is more free to answer requests rather than doing maintenance work, that can be done by a worker. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For both comments: What we are currently doing is to leave the filepath entry (it doesn't affect anything) as deleting files could be pretty expensive. Additionally, the plan is to drop one of the already created functions once we have this in: #814 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, can an issue be opened specifically for that then? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See #814 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. BTW @josenavas, we discussed this decision as a group, search in your email for "deleting files and automatic jobs" There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My question then is will that job be fired off as soon as the server starts or will there be a delay? Any delay means that the files could sit around indefinitely, as the user could start the server, quickly do something, and then shut it down again without that cleanup ever firing. Also I do not have that email in any of my accounts. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry about that @antgonza I found the email. We should write those small parts on the Contributing.md file. Next week, I'm going to put a new version together and I will submit a PR, so we can all review it. The reason why I thought it was not discussed is because that is handled inconsistently across the code base, so I thought that was not agreed. Thanks for pointing that out! @squirrelo I forwarded the email to you for your reference. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure. Will just conclude by saying if this pull request is leveraging that expected functionality, that functionality should probably go in pretty quickly after this request. |
||
@property | ||
def preprocessed_data(self): | ||
r"""The preprocessed data id used to generate the processed data""" | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -785,6 +785,21 @@ def test_create(self): | |
# preprocessed_data_id, processed_Data_id | ||
self.assertEqual(obs, [[1, 2]]) | ||
|
||
def test_delete(self): | ||
"""Correctly deletes a processed data and raises an error if it has | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In theory, these descriptions have to be one-liners, otherwise they do not show up correctly (or at least they did not show up before...) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're missing 1 test: the processed data cannot be removed because it is linked to a (meta)analysis. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK |
||
been used in a (meta)analysis or if it's public""" | ||
pd = ProcessedData.create(self.params_table, self.params_id, | ||
self.filepaths, | ||
preprocessed_data=self.preprocessed_data, | ||
processed_date=self.date) | ||
ProcessedData.delete(pd.id) | ||
|
||
with self.assertRaises(QiitaDBUnknownIDError): | ||
ProcessedData.delete(pd.id) | ||
|
||
with self.assertRaises(QiitaDBStatusError): | ||
ProcessedData.delete(1) | ||
|
||
def test_create_no_date(self): | ||
"""Correctly adds a processed data with no date on it""" | ||
# All the other settings have been already tested on test_create | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -222,6 +222,26 @@ | |
} | ||
} | ||
|
||
function delete_processed_data(processed_data_id) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Outside the scope of this pull request, but we really should start using the AJAX POST capabilities instead of this form creation/submission we're currently doing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, can you add an issue so we do not forget? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Issue #1036 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks.
|
||
if (confirm('Are you sure you want to delete processed data ID: ' + processed_data_id + '?')) { | ||
var form = $("<form>") | ||
.attr("action", window.location.href) | ||
.attr("method", "post") | ||
.append($("<input>") | ||
.attr("type", "hidden") | ||
.attr("name", "processed_data_id") | ||
.attr("value", processed_data_id)) | ||
.append($("<input>") | ||
.attr("type", "hidden") | ||
.attr("name", "action") | ||
.attr("value", "delete_processed_data")); | ||
$("body").append(form); | ||
form.submit(); | ||
} else { | ||
return false; | ||
} | ||
} | ||
|
||
function make_public(pd_id) { | ||
if (confirm("Are you sure you want to make this study public?")) { | ||
var form = $("<form>") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are not raising this error, so no need to list here. In fact, there is no call to
exists
so probably useful to add.Also, you're missing the description of the
QiitaDBError
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, I guess I erased the wrong ones in one of the iterations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this needs an exists call since line 1276 will do it for you, but I agree about removing the Raises text and adding QiitaDBStatusError instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
;) that's what I was doing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call