Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(back-end): Add Quart POST route to cancel specific video processing processes #171

Merged
merged 6 commits into from
Nov 25, 2023

Conversation

MyStackOverflows
Copy link
Contributor

Welcome to PrivacyPal! 👋

Fixes: #167

Description of the change:

As a result of the conversation on #167, we decided to forgo adding review accept/reject routes for Quart since next.js can/should handle those internally. Instead, we decided to add a Quart route so that next.js can request a specific video processing process be cancelled and cleaned up. Also includes some changes according to flask8 linting.

Motivation for the change:

Conversation on #167

@MyStackOverflows MyStackOverflows added feat New feature or request area/back-end Back-end work labels Nov 24, 2023
@MyStackOverflows MyStackOverflows added this to the Term 1 Week 12 milestone Nov 24, 2023
@MyStackOverflows MyStackOverflows self-assigned this Nov 24, 2023
MyStackOverflows and others added 2 commits November 24, 2023 15:29
Signed-off-by: Paul Unger <44368989+MyStackOverflows@users.noreply.github.com>
@MyStackOverflows MyStackOverflows marked this pull request as draft November 24, 2023 23:36
@MyStackOverflows
Copy link
Contributor Author

Since, for this feature, we don't want any of the data associated with the process we're cancelling, would using SIGKILL be more appropriate than SIGTERM?

@MyStackOverflows
Copy link
Contributor Author

MyStackOverflows commented Nov 24, 2023

mainly asking because when using terminate rather than kill, the process doesn't actually seem to be ending and actually goes through till where it prints "Done processing video". Also because when calling terminate(), the Quart shutdown procedure Thuan wrote seems to be activated

@MyStackOverflows
Copy link
Contributor Author

@tthvo ping

@MyStackOverflows MyStackOverflows removed the request for review from connordoman November 24, 2023 23:51
@tthvo
Copy link
Contributor

tthvo commented Nov 24, 2023

Since, for this feature, we don't want any of the data associated with the process we're cancelling, would using SIGKILL be more appropriate than SIGTERM?

Yeh, that's correct. Since we just want to discard everything, SIGKILL would be approriate here.

Also because when calling terminate(), the Quart shutdown procedure Thuan wrote seems to be activated

Sounds like a bug to me. What are u seeing in logs?

@MyStackOverflows
Copy link
Contributor Author

Sounds like a bug to me. What are u seeing in logs?

this is what's happening when using SIGTERM rather than SIGKILL:
image
the cancel request is received, and right afterwards, the server starts shutting down. the actual video processing process isn't cancelled as we can see the "done processing" message afterwards.

@connordoman
Copy link
Contributor

Since, for this feature, we don't want any of the data associated with the process we're cancelling, would using SIGKILL be more appropriate than SIGTERM?

Is there any chance kill will be too abrupt and leave something corrupted?

@MyStackOverflows
Copy link
Contributor Author

Is there any chance kill will be too abrupt and leave something corrupted?

it shouldn't leave anything corrupt, and even if it does, the files created by the process are cleaned up by the route afterwards

@tthvo
Copy link
Contributor

tthvo commented Nov 25, 2023

it shouldn't leave anything corrupt, and even if it does, the files created by the process are cleaned up by the route afterwards

Thats right. Unless there is something else, we will monitor.

the cancel request is received, and right afterwards, the server starts shutting down. the actual video processing process isn't cancelled as we can see the "done processing" message afterwards.

Hmm, that's odd. I will have a look.

@MyStackOverflows
Copy link
Contributor Author

MyStackOverflows commented Nov 25, 2023

@tthvo this is what I have so far, it works flawlessly with SIGKILL, just exceedingly strange what happens when you SIGTERM

@tthvo
Copy link
Contributor

tthvo commented Nov 25, 2023

I think u just found the most outrageous bug there. I might have overcomplicated things a bit. Try this patch:

diff --git a/app/video-processing/server.py b/app/video-processing/server.py
index c7af27e..aa452ff 100644
--- a/app/video-processing/server.py
+++ b/app/video-processing/server.py

-@app.before_serving
+@app.while_serving
 async def before():
     # Initiate process tracker
     app.config["TRACKER"] = ProcessTracker.get_instance()
@@ -107,32 +107,7 @@ async def before():
     prune: mp.Process = mp.Process(target=tracker.main, daemon=True)
     prune.start()
 
-    # Overwrite SIGINT and SIGTERM handler to terminate subprocesses
-    old_handlers = {
-        signal.SIGINT: signal.getsignal(signal.SIGINT),
-        signal.SIGTERM: signal.getsignal(signal.SIGTERM)
-    }
-
-    def process_cleanup(_signal, _stack):
-        # FIXME: Workaround pytest pid issues
-        if app.config["ENVIRONMENT"] == "testing" or mp.current_process().name == "MainProcess":  # Only on main process
-            # Terminate all video processing processes
-            tracker.terminate_processes()
-            # Kill prune process
-            prune.kill()
-            # Ensure all subprocesses are terminated/killed
-            while prune.is_alive() or tracker.is_any_alive():
-                print(f"Sub-processes are still running. Retry in {app.config['CLEANUP_DELAY']}s.")
-                time.sleep(app.config["CLEANUP_DELAY"])
-            print("All subprocesses are terminated.")
-
-        # Call other handlers
-        try:
-            old_handler = old_handlers[_signal]
-            if callable(old_handler):
-                old_handler(_signal, _stack)
-        except Exception:
-            pass
-
-    signal.signal(signal.SIGINT, process_cleanup)
-    signal.signal(signal.SIGTERM, process_cleanup)
+    yield
+
+    prune.kill()
+    tracker.terminate_processes()

@tthvo
Copy link
Contributor

tthvo commented Nov 25, 2023

That will allow SIGTERM to terminate sub-proccesses. Only SIGINT will hang, but when deployed (k8s/ocp), SIGINT won't be used so there is no longer concern there.

@tthvo
Copy link
Contributor

tthvo commented Nov 25, 2023

Hold on. It still causes the server to shutdown anyway. Very interesting...

@MyStackOverflows
Copy link
Contributor Author

yeah I was just about to say, it still causes the issue. I'll use SIGKILL for now in /cancel_process since it directly and instantly kills the process and is more in line with the scope of this PR, we can look into the weird terminate issues on a separate issue

@tthvo
Copy link
Contributor

tthvo commented Nov 25, 2023

Sounds good. I will have a look. Is this one ready?

@MyStackOverflows MyStackOverflows marked this pull request as ready for review November 25, 2023 00:43
@MyStackOverflows MyStackOverflows merged commit 4191127 into develop Nov 25, 2023
8 checks passed
@tthvo tthvo deleted the gh-167-qaurt-review-response-route branch November 30, 2023 06:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/back-end Back-end work feat New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants