Questions:
AlbertoCasasOrtiz left a comment
LGTM! See questions in my previous comment.
This is true, and it mimics what happened before. I wasn't sure if we wanted to restart the container indefinitely, so I went with a limit of 3, somewhat arbitrarily. There may be a better way to stop a container when another container stops, but it doesn't seem super straightforward. In theory, the test trial should be able to pick up on that, but we would have to figure out how to make that work better (maybe in another PR).
Good question, it was more of a quick test where I just added an Exception in the loop code. Open to ideas for better testing, though.
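For reference, the capped-restart behavior and the quick exception test described above can be sketched in plain Python. This is only an illustration under assumptions, not the code in this PR: `run_with_restarts`, `make_flaky_loop`, and the `max_retries=3` default are hypothetical names that mimic an on-failure policy with a limit of 3 restarts.

```python
from typing import Callable


def run_with_restarts(task: Callable[[], None], max_retries: int = 3) -> int:
    """Run `task`, restarting it after a failure at most `max_retries` times.

    Returns the number of restarts that were needed. Once the retry budget
    is used up, the last error is re-raised and the task stays down.
    """
    restarts = 0
    while True:
        try:
            task()
            return restarts
        except Exception:
            if restarts >= max_retries:
                raise
            restarts += 1


def make_flaky_loop(failures: int) -> Callable[[], None]:
    """Hypothetical stand-in for a processing loop that fails `failures` times."""
    state = {"left": failures}

    def loop() -> None:
        if state["left"] > 0:
            state["left"] -= 1
            # Mimics the manual test: an Exception injected into the loop code.
            raise RuntimeError("injected failure")

    return loop


if __name__ == "__main__":
    # Two injected failures are absorbed by two restarts.
    print(run_with_restarts(make_flaky_loop(2)))
```

A loop that keeps failing past the limit (for example `make_flaky_loop(4)`) raises instead of restarting forever, which is the trade-off discussed above.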
I agree, this is way better than what we had before. Let's worry about that in another PR.
That's what I did, plus processing a few trials with openpose to make sure it works properly. I think that should be enough given it was working on hrnet and seems to work on openpose now.
Addresses #227, #233, #234 (and a related openpose docker issue).
Most testing is straightforward, but testing the on-failure mechanism is trickier. In case it's helpful, these are the steps I took:
- Use `sudo kill -9 <PID>` to send an exit signal to the process running the docker container.
- Use `ps -ef` to list the running processes and find the PID. Two ways to find it:
  1. Get the container ID from `docker ps`. Then, find the process in `ps -ef` that contains that container ID.
  2. In `ps -ef`, there will be processes related to `/mmpose/loop_mmpose.py` and `/openpose/loop_openpose.py`. You can see what processes they depend on as well, and find the PID that way (it will be the same as option 1 if traced correctly).
- Tested `sudo kill` with no trials running.
- Tested with a trial running: used `sudo kill` to stop the mmpose container and cause a processing error. Then, I reprocessed the trial again and let it run to completion.
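If it helps, the PID lookup in the steps above could be scripted roughly as follows. This is a sketch, not part of the PR: it assumes the usual Linux `ps -ef` column layout (UID PID PPID C STIME TTY TIME CMD), and `Proc`, `parse_ps_ef`, `find_by_command`, `ancestors`, and `live_ps_ef` are hypothetical helper names.

```python
import subprocess
from typing import List, NamedTuple


class Proc(NamedTuple):
    pid: int
    ppid: int
    cmd: str


def parse_ps_ef(output: str) -> List[Proc]:
    """Parse `ps -ef` text into (pid, ppid, cmd) records."""
    procs = []
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)  # CMD may itself contain spaces
        if len(fields) == 8:
            procs.append(Proc(int(fields[1]), int(fields[2]), fields[7]))
    return procs


def find_by_command(procs: List[Proc], needle: str) -> List[Proc]:
    """Return processes whose command line contains `needle`,
    e.g. a container ID or a script path such as /mmpose/loop_mmpose.py."""
    return [p for p in procs if needle in p.cmd]


def ancestors(procs: List[Proc], pid: int) -> List[int]:
    """Follow PPID links upward from `pid` to see what it depends on."""
    parent_of = {p.pid: p.ppid for p in procs}
    chain: List[int] = []
    while pid in parent_of and parent_of[pid] not in chain:
        pid = parent_of[pid]
        chain.append(pid)
    return chain


def live_ps_ef() -> str:
    """Capture the current `ps -ef` output (requires `ps` on the host)."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, `find_by_command(parse_ps_ef(live_ps_ef()), "/mmpose/loop_mmpose.py")` lists the candidate processes, and `ancestors(...)` shows the parent chain for a given PID. The chosen PID is then passed to `sudo kill -9 <PID>` by hand, as in the steps above.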