Hello, thank you for the repo, it is very complete. I was able to launch a model on a p2.xlarge EC2 instance on AWS.
However, I'm having performance problems. I have the impression that the GPU is not being used, because I get inference times similar to what I get when running the model on my Mac, which has no GPU.
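For what it's worth, here is a minimal sanity check I can run on the instance to confirm that PyTorch actually sees the GPU (these are generic PyTorch calls, nothing specific to this repo):

```python
# Generic check that the CUDA runtime is visible to PyTorch on the instance.
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device count:", torch.cuda.device_count())
    print("Device name :", torch.cuda.get_device_name(0))
```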
The image encoding command
curl http://127.0.0.1:8080/predictions/sam_vit_h_encode -T slick_example.png takes around 2 minutes to run, as you can see in the following log:
2023-11-08T08:31:23,510 [INFO ] W-9000-sam_vit_h_encode_1.0.0-stdout MODEL_LOG - XXXXX Inference time: 114.42793655395508
I was expecting millisecond-scale inference times.
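To separate TorchServe request overhead from the raw model compute, here is a rough timing sketch of the encoder alone. It assumes the segment-anything package and the standard sam_vit_h_4b8939.pth checkpoint are available on the instance (both are assumptions on my part, not something taken from the logs above):

```python
# Times a single forward pass of the SAM ViT-H image encoder outside TorchServe.
import time
import torch
from segment_anything import sam_model_registry

device = "cuda" if torch.cuda.is_available() else "cpu"
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth").to(device)

x = torch.randn(1, 3, 1024, 1024, device=device)  # the ViT-H encoder expects a 1024x1024 input
with torch.no_grad():
    sam.image_encoder(x)  # warm-up pass (CUDA context init, kernel selection)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.time()
    sam.image_encoder(x)
    if device == "cuda":
        torch.cuda.synchronize()
print(f"image_encoder forward pass: {time.time() - start:.2f} s on {device}")
```

If this standalone pass is also slow, the bottleneck is the model compute itself rather than anything in the serving setup.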
Also, when investigating the logs, I see pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:100.0|#Level:Host,DeviceId:0|#hostname:ac47803e69a1,timestamp:1699394016, so it looks like the GPU is being used.
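To confirm that reading independently of the TorchServe metrics, I can also poll the GPU directly while the curl request is in flight. This sketch assumes the nvidia-ml-py (pynvml) package is installed, which may not be the case by default:

```python
# Polls GPU utilization and memory once per second; run in a second terminal
# while the encode request above is executing.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)
try:
    for _ in range(30):  # sample for ~30 seconds
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"GPU util: {util.gpu}%  "
              f"memory: {mem.used / 2**20:.0f} / {mem.total / 2**20:.0f} MiB")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()
```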