Description
🐛 Describe the bug
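I'm following this tutorial, but the TorchServe container started with docker run does not work properly: every backend worker fails to start, and registering the model returns HTTP 500.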
Error logs
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
2023-07-28T12:40:14,354 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Initializing plugins manager...
2023-07-28T12:40:14,658 [INFO ] main org.pytorch.serve.metrics.configuration.MetricConfiguration - Successfully loaded metrics configuration from /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
2023-07-28T12:40:15,063 [INFO ] main org.pytorch.serve.ModelServer -
Torchserve version: 0.8.1
TS Home: /home/venv/lib/python3.9/site-packages
Current directory: /home/model-server
Temp directory: /home/model-server/tmp
Metrics config path: /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml
Number of GPUs: 0
Number of CPUs: 4
Max heap size: 982 M
Python executable: /home/venv/bin/python
Config file: /home/model-server/config.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8081
Metrics address: http://0.0.0.0:8082
Model Store: /home/model-server/model-store
Initial Models: N/A
Log dir: /home/model-server/logs
Metrics dir: /home/model-server/logs
Netty threads: 32
Netty client threads: 0
Default workers per model: 4
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: false
Allowed Urls: [file://.*|http(s)?://.*]
Custom python dependency for model allowed: false
Enable metrics API: true
Metrics mode: log
Disable system metrics: false
Workflow Store: /home/model-server/model-store
Model config: N/A
2023-07-28T12:40:15,097 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...
2023-07-28T12:40:15,181 [INFO ] main org.pytorch.serve.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
2023-07-28T12:40:15,471 [INFO ] main org.pytorch.serve.ModelServer - Inference API bind to: http://0.0.0.0:8080
2023-07-28T12:40:15,472 [INFO ] main org.pytorch.serve.ModelServer - Initialize Management server with: EpollServerSocketChannel.
2023-07-28T12:40:15,477 [INFO ] main org.pytorch.serve.ModelServer - Management API bind to: http://0.0.0.0:8081
2023-07-28T12:40:15,480 [INFO ] main org.pytorch.serve.ModelServer - Initialize Metrics server with: EpollServerSocketChannel.
2023-07-28T12:40:15,482 [INFO ] main org.pytorch.serve.ModelServer - Metrics API bind to: http://0.0.0.0:8082
Model server started.
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
2023-07-28T12:40:16,282 [ERROR] pool-3-thread-1 org.pytorch.serve.metrics.MetricCollector -
java.io.IOException: Cannot run program "/home/venv/bin/python" (in directory "/home/venv/lib/python3.9/site-packages"): error=0, Failed to exec spawn helper: pid: 47, exit value: 1
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.metrics.MetricCollector.run(MetricCollector.java:44) [model-server.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 47, exit value: 1
at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
at java.lang.ProcessImpl.<init>(ProcessImpl.java:314) ~[?:?]
at java.lang.ProcessImpl.start(ProcessImpl.java:244) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110) ~[?:?]
... 9 more
2023-07-28T12:40:33,226 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model mnist
2023-07-28T12:40:33,229 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model mnist
2023-07-28T12:40:33,230 [INFO ] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelManager - Model mnist loaded.
2023-07-28T12:40:33,239 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelManager - updateModel: mnist, count: 4
2023-07-28T12:40:33,262 [DEBUG] W-9000-mnist_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9000, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2023-07-28T12:40:33,272 [DEBUG] W-9001-mnist_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9001, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
2023-07-28T12:40:33,285 [DEBUG] W-9003-mnist_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9003, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
2023-07-28T12:40:33,300 [DEBUG] W-9002-mnist_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9002, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2023-07-28T12:40:33,296 [ERROR] W-9001-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Failed start worker process
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:179) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.connect(WorkerThread.java:339) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:183) [model-server.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: Cannot run program "/home/venv/bin/python" (in directory "/home/model-server/tmp/models/df29a96231534b05945fa892495ecf6c"): error=0, Failed to exec spawn helper: pid: 59, exit value: 1
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 7 more
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 59, exit value: 1
at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
at java.lang.ProcessImpl.<init>(ProcessImpl.java:314) ~[?:?]
at java.lang.ProcessImpl.start(ProcessImpl.java:244) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 7 more
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
2023-07-28T12:40:33,314 [ERROR] W-9003-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Failed start worker process
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:179) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.connect(WorkerThread.java:339) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:183) [model-server.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: Cannot run program "/home/venv/bin/python" (in directory "/home/model-server/tmp/models/df29a96231534b05945fa892495ecf6c"): error=0, Failed to exec spawn helper: pid: 62, exit value: 1
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 7 more
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 62, exit value: 1
at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
at java.lang.ProcessImpl.<init>(ProcessImpl.java:314) ~[?:?]
at java.lang.ProcessImpl.start(ProcessImpl.java:244) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 7 more
2023-07-28T12:40:33,318 [DEBUG] W-9003-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - W-9003-mnist_1.0 State change null -> WORKER_STOPPED
2023-07-28T12:40:33,319 [INFO ] W-9003-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery start timestamp: 1690548033319
2023-07-28T12:40:33,322 [DEBUG] W-9001-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - W-9001-mnist_1.0 State change null -> WORKER_STOPPED
2023-07-28T12:40:33,286 [ERROR] W-9000-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Failed start worker process
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:179) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.connect(WorkerThread.java:339) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:183) [model-server.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: Cannot run program "/home/venv/bin/python" (in directory "/home/model-server/tmp/models/df29a96231534b05945fa892495ecf6c"): error=0, Failed to exec spawn helper: pid: 54, exit value: 1
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 7 more
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 54, exit value: 1
at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
at java.lang.ProcessImpl.<init>(ProcessImpl.java:314) ~[?:?]
at java.lang.ProcessImpl.start(ProcessImpl.java:244) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 7 more
2023-07-28T12:40:33,324 [INFO ] W-9003-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9003 in 1 seconds.
2023-07-28T12:40:33,325 [INFO ] W-9001-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery start timestamp: 1690548033325
2023-07-28T12:40:33,326 [INFO ] W-9001-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9001 in 1 seconds.
2023-07-28T12:40:33,328 [DEBUG] W-9000-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-mnist_1.0 State change null -> WORKER_STOPPED
2023-07-28T12:40:33,334 [INFO ] W-9000-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery start timestamp: 1690548033334
2023-07-28T12:40:33,335 [INFO ] W-9000-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9000 in 1 seconds.
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
2023-07-28T12:40:33,338 [ERROR] W-9002-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Failed start worker process
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:179) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.connect(WorkerThread.java:339) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:183) [model-server.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: Cannot run program "/home/venv/bin/python" (in directory "/home/model-server/tmp/models/df29a96231534b05945fa892495ecf6c"): error=0, Failed to exec spawn helper: pid: 65, exit value: 1
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 7 more
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 65, exit value: 1
at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
at java.lang.ProcessImpl.<init>(ProcessImpl.java:314) ~[?:?]
at java.lang.ProcessImpl.start(ProcessImpl.java:244) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 7 more
2023-07-28T12:40:33,344 [DEBUG] W-9002-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - W-9002-mnist_1.0 State change null -> WORKER_STOPPED
2023-07-28T12:40:33,345 [INFO ] W-9002-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery start timestamp: 1690548033344
2023-07-28T12:40:33,345 [INFO ] W-9002-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Retry worker: 9002 in 1 seconds.
2023-07-28T12:40:33,345 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelVersionedRefs - Removed model: mnist version: 1.0
2023-07-28T12:40:33,347 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.WorkerThread - W-9003-mnist_1.0 State change WORKER_STOPPED -> WORKER_SCALED_DOWN
2023-07-28T12:40:33,348 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.WorkerThread - W-9002-mnist_1.0 State change WORKER_STOPPED -> WORKER_SCALED_DOWN
2023-07-28T12:40:33,349 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.WorkerThread - W-9001-mnist_1.0 State change WORKER_STOPPED -> WORKER_SCALED_DOWN
2023-07-28T12:40:33,349 [DEBUG] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.WorkerThread - W-9000-mnist_1.0 State change WORKER_STOPPED -> WORKER_SCALED_DOWN
2023-07-28T12:40:33,356 [INFO ] epollEventLoopGroup-3-1 org.pytorch.serve.wlm.ModelManager - Model mnist unregistered.
2023-07-28T12:40:33,371 [INFO ] epollEventLoopGroup-3-1 ACCESS_LOG - /172.17.0.1:38924 "POST /models?model_name=mnist&url=mnist.mar&initial_workers=4 HTTP/1.1" 500 763
2023-07-28T12:40:33,374 [INFO ] epollEventLoopGroup-3-1 TS_METRICS - Requests5XX.Count:1.0|#Level:Host|#hostname:b68f79c2508d,timestamp:1690548033
2023-07-28T12:40:34,367 [DEBUG] W-9001-mnist_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9001, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2023-07-28T12:40:34,366 [DEBUG] W-9002-mnist_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9002, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2023-07-28T12:40:34,366 [DEBUG] W-9000-mnist_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9000, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
2023-07-28T12:40:34,366 [DEBUG] W-9003-mnist_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/home/venv/bin/python, /home/venv/lib/python3.9/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /home/model-server/tmp/.ts.sock.9003, --metrics-config, /home/venv/lib/python3.9/site-packages/ts/configs/metrics.yaml]
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
This command is not for general use and should only be run as the result of a call to
ProcessBuilder.start() or Runtime.exec() in a java application
2023-07-28T12:40:34,422 [ERROR] W-9001-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Failed start worker process
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:179) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.connect(WorkerThread.java:339) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:183) [model-server.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: Cannot run program "/home/venv/bin/python" (in directory "/home/model-server/tmp/models/df29a96231534b05945fa892495ecf6c"): error=0, Failed to exec spawn helper: pid: 75, exit value: 1
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 5 more
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 75, exit value: 1
at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
at java.lang.ProcessImpl.<init>(ProcessImpl.java:314) ~[?:?]
at java.lang.ProcessImpl.start(ProcessImpl.java:244) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 5 more
2023-07-28T12:40:34,422 [ERROR] W-9002-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Failed start worker process
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:179) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.connect(WorkerThread.java:339) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:183) [model-server.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: Cannot run program "/home/venv/bin/python" (in directory "/home/model-server/tmp/models/df29a96231534b05945fa892495ecf6c"): error=0, Failed to exec spawn helper: pid: 72, exit value: 1
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 5 more
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 72, exit value: 1
at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
at java.lang.ProcessImpl.<init>(ProcessImpl.java:314) ~[?:?]
at java.lang.ProcessImpl.start(ProcessImpl.java:244) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 5 more
2023-07-28T12:40:34,422 [ERROR] W-9003-mnist_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker error
org.pytorch.serve.wlm.WorkerInitializationException: Failed start worker process
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:179) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.connect(WorkerThread.java:339) ~[model-server.jar:?]
at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:183) [model-server.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
Caused by: java.io.IOException: Cannot run program "/home/venv/bin/python" (in directory "/home/model-server/tmp/models/df29a96231534b05945fa892495ecf6c"): error=0, Failed to exec spawn helper: pid: 70, exit value: 1
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 5 more
Caused by: java.io.IOException: error=0, Failed to exec spawn helper: pid: 70, exit value: 1
at java.lang.ProcessImpl.forkAndExec(Native Method) ~[?:?]
at java.lang.ProcessImpl.<init>(ProcessImpl.java:314) ~[?:?]
at java.lang.ProcessImpl.start(ProcessImpl.java:244) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110) ~[?:?]
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073) ~[?:?]
at java.lang.Runtime.exec(Runtime.java:594) ~[?:?]
at org.pytorch.serve.wlm.WorkerLifeCycle.startWorker(WorkerLifeCycle.java:161) ~[model-server.jar:?]
... 5 more
Installation instructions
I have torchserve 0.8.1 installed locally, but this error occurs when I run the latest Docker image (pytorch/torchserve:0.8.1-cpu).
Model Packaging
torch-model-archiver --model-name mnist --version 1.0 --model-file examples/image_classifier/mnist/mnist.py --serialized-file examples/image_classifier/mnist/mnist_cnn.pt --handler examples/image_classifier/mnist/mnist_handler.py
config.properties
I didn't pass a custom config.properties.
Versions
Environment headers
Torchserve branch:
torchserve==0.8.1
torch-model-archiver==0.6.0
Python version: 3.8 (64-bit runtime)
Python executable: /Users/lzchen/miniforge3/envs/tjc-main/bin/python
Versions of relevant python libraries:
captum==0.6.0
numpy==1.20.3
psutil==5.9.4
requests==2.28.1
sentence-transformers==2.2.2
sentencepiece==0.1.95
torch==1.10.2
torch-model-archiver==0.6.0
torchserve==0.8.1
torchvision==0.9.0a0
transformers==4.24.0
wheel==0.37.1
torch==1.10.2
**Warning: torchtext not present ..
torchvision==0.9.0a0
**Warning: torchaudio not present ..
Java Version:
OS: Mac OSX 13.4.1 (arm64)
GCC version: N/A
Clang version: 14.0.3 (clang-1403.0.22.14.1)
CMake version: N/A
Versions of npm installed packages:
**Warning: newman, newman-reporter-html markdown-link-check not installed...
Repro instructions
torch-model-archiver --model-name mnist --version 1.0 --model-file examples/image_classifier/mnist/mnist.py --serialized-file examples/image_classifier/mnist/mnist_cnn.pt --handler examples/image_classifier/mnist/mnist_handler.py
mkdir model_store
mv mnist.mar model_store/
docker run --rm -it -p 8080:8080 -p 8081:8081 -p 8082:8082 -v $(pwd)/model_store:/home/model-server/model-store pytorch/torchserve:latest-cpu
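The first line of the error logs is a Docker warning that the image platform (linux/amd64) differs from the host platform (linux/arm64/v8). For anyone triaging similar logs, here is a minimal sketch that pulls the two platform strings out of that warning. The helper name `parse_platform_mismatch` is hypothetical, the regular expression is based only on the warning text quoted in this report, and the sketch does not by itself establish that the mismatch causes the "Failed to exec spawn helper" errors.

```python
import re
from typing import Optional, Tuple

# Pattern modelled on the Docker warning quoted in the error logs of this report.
# It is an assumption that other Docker versions word the warning the same way.
_WARNING_RE = re.compile(
    r"requested image's platform \((?P<image>[^)]+)\) does not match "
    r"the detected host platform \((?P<host>[^)]+)\)"
)


def parse_platform_mismatch(line: str) -> Optional[Tuple[str, str]]:
    """Return (image_platform, host_platform) if the line is the warning, else None."""
    match = _WARNING_RE.search(line)
    if match is None:
        return None
    return match.group("image"), match.group("host")


if __name__ == "__main__":
    # The warning line copied from the logs above.
    warning = (
        "WARNING: The requested image's platform (linux/amd64) does not match "
        "the detected host platform (linux/arm64/v8) and no specific platform "
        "was requested"
    )
    result = parse_platform_mismatch(warning)
    if result is not None:
        image_platform, host_platform = result
        print(f"image={image_platform} host={host_platform}")
```

Running the script on the warning from this report prints the image and host platforms side by side, which makes it easy to spot that the amd64 image is running on an arm64 host.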
Possible Solution
No response