GRPC Plugin #2506

Open
Null1515 opened this issue Oct 31, 2024 · 14 comments

Labels
enhancement New feature or request

Comments

@Null1515

Hey there,

First of all, I want to thank you for this amazing piece of art! Your work has been incredibly valuable.

I noticed that there is a gRPC plugin available in the source code, and I’m interested in utilizing it. However, I’m not sure how to go about using it effectively.

Could you please provide some guidance or documentation on how to use the gRPC plugin? Any tips or examples would be greatly appreciated!

Thank you for your help!

Null1515 added the enhancement label on Oct 31, 2024
@frankfliu
Contributor

frankfliu commented Nov 2, 2024

@Null1515

Currently the gRPC plugin is not installed by default. Here is how to install it:

  1. clone djl-serving: git clone https://github.com/deepjavalibrary/djl-serving.git
  2. build the gRPC plugin: cd djl-serving/plugins && ./gradlew jar
  3. copy the plugin jar into your djl-serving installation's plugins folder: cp grpc/build/libs/grpc-0.31.0-SNAPSHOT.jar /usr/local/djl-serving-0.31.0-SNAPSHOT/plugins/
  4. start djl-serving: djl-serving. The gRPC service will listen on port 8082.

You can use inference.proto to create your own gRPC client and run inference against djl-serving. See the example code: https://github.com/deepjavalibrary/djl-serving/blob/master/plugins/grpc/src/test/java/ai/djl/serving/grpc/GrpcTest.java#L69
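
For reference, here is a minimal client sketch. The ManagedChannel part is standard grpc-java; the stub and message class names in the comments are assumptions, so check the code generated from inference.proto (or the GrpcTest linked above) for the actual names:

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class GrpcClientSketch {
    public static void main(String[] args) {
        // Connect to the djl-serving gRPC endpoint (default port 8082).
        ManagedChannel channel =
                ManagedChannelBuilder.forAddress("localhost", 8082).usePlaintext().build();
        try {
            // Hypothetical calls -- use the classes generated from inference.proto:
            // InferenceGrpc.InferenceBlockingStub stub = InferenceGrpc.newBlockingStub(channel);
            // stub.ping(...);     // health check
            // stub.predict(...);  // run inference with model name + payload
        } finally {
            channel.shutdownNow();
        }
    }
}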

@Null1515
Author

Null1515 commented Nov 2, 2024

Hello again,

Thank you for your previous response—I really appreciate your help!

I wanted to follow up on the earlier discussion. I have attempted the steps outlined, but it seems that the gRPC plugin isn't being added successfully. I suspect this may be due to a missing component in its JAR file. Additionally, I plan to use this within a Docker environment, which adds another layer of complexity.

I also have another request regarding inference from a Java client using a BufferedImage. The current implementation requires the use of the ImageIO package to encode the image into a byte array. However, this encoding process can be quite time-consuming. For instance, encoding an image at a resolution of 1920x1080 takes approximately 200ms for PNG and 40ms for BMP formats.

Could you please provide guidance on how to utilize a direct byte array from a BufferedImage with your inference API? Below is the code snippet I’m currently working with:

// Image to direct byte array
WritableRaster raster = image.getRaster();
DataBufferByte data = (DataBufferByte) raster.getDataBuffer();
byte[] dataBuffer = data.getData();

// Direct byte array back to an image
int imageWidth = image.getWidth();
int imageHeight = image.getHeight();
DataBuffer buffer = new DataBufferByte(dataBuffer, dataBuffer.length);

BufferedImage image1;
boolean hasAlpha = image.getColorModel().hasAlpha();
if (hasAlpha) {
    // 4 interleaved bytes per pixel; band offsets must match the source raster's byte layout
    ColorModel cm = new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_sRGB), new int[]{8, 8, 8, 8}, true,
            false, Transparency.TRANSLUCENT, DataBuffer.TYPE_BYTE);
    image1 = new BufferedImage(cm, Raster.createInterleavedRaster(buffer, imageWidth, imageHeight, imageWidth * 4, 4, new int[]{2, 1, 0, 3}, null),
            false, null);
} else {
    // 3 interleaved bytes per pixel; the bits array must have 3 entries when there is no alpha
    ColorModel cm = new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_sRGB), new int[]{8, 8, 8}, false,
            false, Transparency.OPAQUE, DataBuffer.TYPE_BYTE);
    image1 = new BufferedImage(cm, Raster.createInterleavedRaster(buffer, imageWidth, imageHeight, imageWidth * 3, 3, new int[]{2, 1, 0}, null),
            false, null);
}

Thank you for your assistance!

@frankfliu
Contributor

@Null1515

  1. What error message are you seeing? Did you see something like: INFO GrpcServerImpl gRPC bind to port: 127.0.0.1:8082?
  2. Building a Docker image should be pretty straightforward. You can use the djl-serving Docker image as your base image; the Dockerfiles can be found at https://github.com/deepjavalibrary/djl-serving/tree/master/serving/docker, and the images are published to Docker Hub: https://hub.docker.com/r/deepjavalibrary/djl-serving/tags
  3. For image processing, I recommend using ImageFactory; if performance is critical, you can use OpenCVImageFactory (see the sketch below).
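
For example, a minimal sketch of wrapping an existing BufferedImage with DJL's ImageFactory (this assumes the ai.djl.modality.cv API; note that each factory implementation has its own native type for fromImage, so the OpenCV-backed factory may expect a Mat rather than a BufferedImage):

import java.awt.image.BufferedImage;

import ai.djl.modality.cv.Image;
import ai.djl.modality.cv.ImageFactory;

public class ImageFactoryExample {

    // Wrap an in-memory BufferedImage without round-tripping through ImageIO encoding.
    static Image wrap(BufferedImage buffered) {
        // getInstance() returns the default BufferedImageFactory, or the OpenCV-backed
        // factory when the opencv extension is on the classpath.
        return ImageFactory.getInstance().fromImage(buffered);
    }
}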

@Null1515
Author

Null1515 commented Nov 2, 2024

Hello,

I would like to bring to your attention some details from the log file regarding the plugin initialization process:

INFO FolderScanPluginManager scanning for plugins...
INFO FolderScanPluginManager scanning in plug-in folder: /usr/local/djl-serving-0.28.0/plugins
INFO PropertyFilePluginMetaDataReader Plugin found: console/jar:file:/usr/local/djl-serving-0.28.0/plugins/management-console-0.28.0.jar!/META-INF/plugin.definition
INFO PropertyFilePluginMetaDataReader Plugin found: cache-engines/jar:file:/usr/local/djl-serving-0.28.0/plugins/cache-0.28.0.jar!/META-INF/plugin.definition
INFO PropertyFilePluginMetaDataReader Plugin found: static-file-plugin/jar:file:/usr/local/djl-serving-0.28.0/plugins/static-file-plugin-0.28.0.jar!/META-INF/plugin.definition
INFO PropertyFilePluginMetaDataReader Plugin found: secure-mode/jar:file:/usr/local/djl-serving-0.28.0/plugins/secure-mode-0.28.0.jar!/META-INF/plugin.definition
INFO PropertyFilePluginMetaDataReader Plugin found: kserve/jar:file:/usr/local/djl-serving-0.28.0/plugins/kserve-0.28.0.jar!/META-INF/plugin.definition
INFO FolderScanPluginManager Loading plugin: {console/jar:file:/usr/local/djl-serving-0.28.0/plugins/management-console-0.28.0.jar!/META-INF/plugin.definition}
INFO PluginMetaData plugin console changed state to INITIALIZED
INFO FolderScanPluginManager Loading plugin: {static-file-plugin/jar:file:/usr/local/djl-serving-0.28.0/plugins/static-file-plugin-0.28.0.jar!/META-INF/plugin.definition}
INFO PluginMetaData plugin static-file-plugin changed state to INITIALIZED
INFO FolderScanPluginManager Loading plugin: {cache-engines/jar:file:/usr/local/djl-serving-0.28.0/plugins/cache-0.28.0.jar!/META-INF/plugin.definition}
INFO PluginMetaData plugin cache-engines changed state to INITIALIZED
INFO FolderScanPluginManager Loading plugin: {secure-mode/jar:file:/usr/local/djl-serving-0.28.0/plugins/secure-mode-0.28.0.jar!/META-INF/plugin.definition}
INFO PluginMetaData plugin secure-mode changed state to INITIALIZED
INFO FolderScanPluginManager Loading plugin: {kserve/jar:file:/usr/local/djl-serving-0.28.0/plugins/kserve-0.28.0.jar!/META-INF/plugin.definition}
INFO PluginMetaData plugin kserve changed state to INITIALIZED
INFO PluginMetaData plugin console changed state to ACTIVE reason: plugin ready
INFO PluginMetaData plugin static-file-plugin changed state to ACTIVE reason: plugin ready
INFO PluginMetaData plugin cache-engines changed state to ACTIVE reason: plugin ready
INFO PluginMetaData plugin secure-mode changed state to ACTIVE reason: plugin ready
INFO PluginMetaData plugin kserve changed state to ACTIVE reason: plugin ready

While the plugins have been successfully initialized, I want to note that the gRPC plugin is not being added. I will attach the relevant .jar plugin files for your reference.

Additionally, I have a couple of questions:

  1. How can I add the gRPC plugin to the Docker image?
  2. I am currently using FFmpeg along with other packages and would like to use BufferedImage as a source. Can you confirm whether this is a viable way to send my BufferedImage to djl-serving?
Uploading grpc.zip…

@frankfliu
Contributor

@Null1515
You are using 0.28.0, which is not supported. You need to use 0.31.0-SNAPSHOT.

@Null1515
Author

Null1515 commented Nov 3, 2024

Thanks for your last answer!
I wanted to follow up on my previous question about using BufferedImage as a source. I am currently using FFmpeg along with other packages and would like to know whether the OpenCV extension is a viable way to send my BufferedImage to djl-serving. If you have any updates or insights, I’d really appreciate it!

@frankfliu
Contributor

@Null1515

I'm not sure about that. The OpenCV extension we use (which depends on org.openpnp:opencv) might conflict with FFmpeg. For a video use case, I think the best way is to create a video extension similar to https://github.com/deepjavalibrary/djl/tree/master/extensions/audio and implement an ImageFactory based on the org.bytedeco:opencv package; a rough conversion sketch follows.
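
As an illustration of the Mat-to-NDArray conversion such a factory would need, here is a sketch assuming the org.bytedeco:opencv API (not the org.openpnp:opencv API used by the linked implementation) and assuming the Mat stores continuous 8-bit interleaved pixels:

import java.nio.ByteBuffer;

import org.bytedeco.opencv.opencv_core.Mat;

import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.DataType;
import ai.djl.ndarray.types.Shape;

public class MatToNDArray {

    // Copy the raw interleaved (typically BGR) pixels out of a bytedeco Mat and wrap
    // them as an HWC uint8 NDArray; BGR->RGB and HWC->CHW conversion is left to the caller.
    static NDArray toNDArray(NDManager manager, Mat mat) {
        int height = mat.rows();
        int width = mat.cols();
        int channels = mat.channels();
        byte[] buf = new byte[height * width * channels];
        mat.data().get(buf);
        return manager.create(ByteBuffer.wrap(buf), new Shape(height, width, channels), DataType.UINT8);
    }
}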

@Null1515
Author

Null1515 commented Nov 3, 2024

How can I implement my own byte array decoder in djl-serving?

@frankfliu
Contributor

I assume you are using org.bytedeco:ffmpeg. If org.openpnp:opencv does not conflict with org.bytedeco:ffmpeg, you can copy OpenCVImageFactory directly. You can find the implementation for converting a Mat to an NDArray here: https://github.com/deepjavalibrary/djl/blob/master/extensions/opencv/src/main/java/ai/djl/opencv/OpenCVImage.java#L134

You can implement your own Translator in Java using any dependencies you want; a sketch is shown below. You can package your Java code with your model, see: https://docs.djl.ai/master/docs/create_serving_ready_model.html#bundle-your-data-processing-scripts-together-with-model-artifacts
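
For illustration, here is a minimal Translator that consumes raw interleaved BGR bytes instead of an encoded image. This is only a sketch: the fixed frame size, normalization, and pass-through output are placeholders, and real post-processing (e.g. YOLO decoding) would go in processOutput:

import java.nio.ByteBuffer;

import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDList;
import ai.djl.ndarray.types.DataType;
import ai.djl.ndarray.types.Shape;
import ai.djl.translate.Translator;
import ai.djl.translate.TranslatorContext;

public class RawBytesTranslator implements Translator<byte[], NDList> {

    private static final int WIDTH = 1920;  // placeholder: match your frame size
    private static final int HEIGHT = 1080; // placeholder

    @Override
    public NDList processInput(TranslatorContext ctx, byte[] raw) {
        // Wrap the raw HWC uint8 pixels, then convert to the normalized CHW float
        // tensor most vision models expect.
        NDArray array = ctx.getNDManager()
                .create(ByteBuffer.wrap(raw), new Shape(HEIGHT, WIDTH, 3), DataType.UINT8);
        array = array.toType(DataType.FLOAT32, false).div(255f).transpose(2, 0, 1);
        return new NDList(array);
    }

    @Override
    public NDList processOutput(TranslatorContext ctx, NDList list) {
        // Model-specific post-processing goes here; pass the raw output through for the sketch.
        return list;
    }
}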

After packaging your model, you also need to copy all your third-party dependencies into the deps folder in djl-serving:

djl-serving -i org.openpnp:opencv:4.9.0-1
djl-serving -i ai.djl.opencv:opencv:0.30.0
...

@Null1515
Author

Null1515 commented Dec 17, 2024

Hi there,

I have dockerized version 0.31.0-SNAPSHOT and added the gRPC plugin to it. Now, I see the following message:

INFO GrpcServerImpl gRPC bind to port: 127.0.0.1:8082.

I'm running the Docker container with the following command:

sudo docker run -it --runtime=nvidia --gpus all --shm-size 2g -p 8080:8080 -p 8082:8082 serving-gpu:1.2.0

Is there anything wrong with this setup? I encountered an "io.grpc.StatusRuntimeException: UNAVAILABLE: Network closed for unknown reason" error when trying to perform Inference/Ping.

Any tips would be greatly appreciated!

Thank you!

@frankfliu
Contributor

@Null1515

Your gRPC server is listening on localhost (127.0.0.1), which means it can only accept connections from localhost. You need to add the following to config.properties:

grpc_address=0.0.0.0

or you can set an environment variable in the Dockerfile:

ENV SERVING_GRPC_ADDRESS=0.0.0.0
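
Since that variable is read from the environment, you could also pass it when starting the container instead of baking it into the image (the -e flag is standard Docker; whether a run-time override fits your setup is an assumption):

sudo docker run -it --runtime=nvidia --gpus all --shm-size 2g -e SERVING_GRPC_ADDRESS=0.0.0.0 -p 8080:8080 -p 8082:8082 serving-gpu:1.2.0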

@Null1515
Author

Hi again,

Thank you for your help with my earlier mistake; I really appreciate it!

I wanted to ask you something regarding gRPC. I’ve noticed that my inference time hasn’t improved at all. While I understand that REST can have latency issues, I expected gRPC to be faster. Do you have any tips for improving inference latency?

For context, I’m using TorchScript with YOLOv8.

Thanks in advance for your assistance!

@frankfliu
Contributor

@Null1515

This is a common misunderstanding about gRPC vs REST. In general, gRPC's performance advantage comes from protobuf, which generates close-to-optimal serialization code. Much of the time gRPC does perform better, but usually because it is being compared against inefficient hand-coded encodings; automatically generated code can never beat a highly tuned, hand-crafted encoding (we actually found an encoding performance bug in the past that would have been avoided by using gRPC). With DJLServing, the REST API has the same or better performance than gRPC.

The reason we introduced the gRPC plugin is not performance. It's purely to make it easy for existing gRPC users to onboard with DJLServing.

@Null1515
Author

Thank you for the clarification! I appreciate your insights on the differences between gRPC and REST, especially regarding serialization and performance. Before we close this issue, could you share any tips or best practices for improving inference latency? Your expertise would be really helpful!

Thanks again!
