Inference is slow even on a GPU machine #36

@RemyNtshaykolo

Description

Hello, and thank you for the repo, it is very complete. I was able to launch a model on a p2.xlarge EC2 instance on AWS.

I'm having performance problems. I have the impression that the GPU is not being used, because I get inference times similar to those I get when running the model on my Mac, which has no GPU.
The image encoding command

```
curl http://127.0.0.1:8080/predictions/sam_vit_h_encode -T slick_example.png
```

takes around 2 minutes to run, as you can see in the following log:

```
2023-11-08T08:31:23,510 [INFO ] W-9000-sam_vit_h_encode_1.0.0-stdout MODEL_LOG - XXXXX  Inference time:  114.42793655395508
```

I was expecting millisecond-level performance.
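
To isolate whether this is a TorchServe issue or a device-placement issue, a minimal sketch like the one below can run the ViT-H encoder directly, outside of TorchServe, and confirm the model is actually on the GPU. This assumes the standard `segment_anything` package; the checkpoint filename and image path are placeholders for whatever the handler actually loads:

```python
# Minimal sketch to check whether the SAM ViT-H encoder actually runs on the GPU.
# Assumes the standard segment_anything package is installed; the checkpoint
# path and image path below are placeholders.
import time

import cv2
import torch
from segment_anything import SamPredictor, sam_model_registry

device = "cuda" if torch.cuda.is_available() else "cpu"
print("torch sees CUDA:", torch.cuda.is_available())

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to(device)
print("model device:", next(sam.parameters()).device)  # should print cuda:0

predictor = SamPredictor(sam)
image = cv2.cvtColor(cv2.imread("slick_example.png"), cv2.COLOR_BGR2RGB)

# Warm-up pass so CUDA initialization is not counted in the timing.
predictor.set_image(image)

if device == "cuda":
    torch.cuda.synchronize()  # wait for all queued GPU work before timing
start = time.time()
predictor.set_image(image)   # runs the ViT-H image encoder
if device == "cuda":
    torch.cuda.synchronize()
print("encoder time: %.2f s" % (time.time() - start))
```

If the standalone timing is fast but the TorchServe endpoint is slow, the problem is likely in the handler's device placement rather than the hardware.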

Also, when investigating the logs, I see

```
pool-3-thread-1 TS_METRICS - GPUUtilization.Percent:100.0|#Level:Host,DeviceId:0|#hostname:ac47803e69a1,timestamp:1699394016
```

so it looks like the GPU is being used.
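
The `TS_METRICS` reading can be cross-checked independently by polling `nvidia-smi` while the curl request is in flight. Below is a quick, hypothetical helper for that (it is not part of the repo; the poll count and interval are arbitrary):

```python
# Poll GPU utilization and memory via nvidia-smi while the inference
# request runs, to cross-check the TorchServe TS_METRICS reading.
import subprocess
import time

for _ in range(30):
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=utilization.gpu,memory.used",
         "--format=csv,noheader"],
        capture_output=True, text=True,
    ).stdout.strip()
    print(out)
    time.sleep(2)
```

Nonzero memory use with sustained utilization during the request would confirm the encoder is really running on the GPU.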
