Does DJL support cuda 12.3.1? #2902

Closed

SidneyLann opened this issue Dec 23, 2023 · 30 comments

Labels: enhancement (New feature or request)

@SidneyLann
Contributor

SidneyLann commented Dec 23, 2023

I installed CUDA with cuda_12.3.1_545.23.08_linux.run on CentOS 9, but PtEngine.getInstance().getDevices(1) returns cpu, not gpu. Does DJL support CUDA 12.3.1? Why can't it get the GPU? Thanks.

SidneyLann added the enhancement label on Dec 23, 2023
@frankfliu
Contributor

frankfliu commented Dec 24, 2023

Which DJL version are you using?

Please try DJL 0.26.0-SNAPSHOT; it should work with PyTorch 2.1.1 and CUDA 12.
See: https://docs.djl.ai/master/engines/pytorch/pytorch-engine/index.html#supported-pytorch-versions

@frankfliu
Contributor

PyTorch 2.0.1 or PyTorch 1.13.1 should also work with CUDA 12.3, but you need to explicitly set PYTORCH_FLAVOR=cu118.
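
A minimal sketch of setting it from Java (this assumes the flavor is also honored as a JVM system property, not only as an environment variable — verify for your DJL version); it must happen before the engine loads its native library:

public final class FlavorDemo {
    public static void main(String[] args) {
        // Must be set before Engine.getEngine("PyTorch") triggers native loading.
        System.setProperty("PYTORCH_FLAVOR", "cu118");
        ai.djl.engine.Engine engine = ai.djl.engine.Engine.getEngine("PyTorch");
        System.out.println(engine.getDevices(1)[0].getDeviceType());
    }
}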

@SidneyLann
Contributor Author

SidneyLann commented Dec 24, 2023

[screenshot]

There is no pytorch-native-cu121 in https://oss.sonatype.org/content/repositories/snapshots/ for 0.26.0-SNAPSHOT.

[screenshot]

Does 0.26 have cu11?

@SidneyLann
Contributor Author

SidneyLann commented Dec 24, 2023

I can train YOLOv8 with CUDA 12.3.1 + PyTorch 2.1.2 in Python, but the code below still gets cpu, not gpu, with DJL 0.26:

Engine engine = Engine.getEngine("PyTorch");
Device[] devices = engine.getDevices(1);
LOG.debug("devices[0]={}", devices[0].getDeviceType());

@SidneyLann
Contributor Author

<properties>
	<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
	<maven.compiler.encoding>UTF-8</maven.compiler.encoding>
	<maven.compiler.source>21</maven.compiler.source>
	<maven.compiler.target>21</maven.compiler.target>
	<djl.version>0.26.0-SNAPSHOT</djl.version>
</properties>

<repositories>
	<repository>
		<id>djl.ai</id>
		<url>https://oss.sonatype.org/content/repositories/snapshots/</url>
	</repository>
</repositories>

<dependencyManagement>
	<dependencies>
		<dependency>
			<groupId>ai.djl</groupId>
			<artifactId>bom</artifactId>
			<version>${djl.version}</version>
			<type>pom</type>
			<scope>import</scope>
		</dependency>
	</dependencies>
</dependencyManagement>

<dependencies>
	<dependency>
		<groupId>commons-cli</groupId>
		<artifactId>commons-cli</artifactId>
		<version>1.6.0</version>
	</dependency>
	<dependency>
		<groupId>ai.djl</groupId>
		<artifactId>api</artifactId>
	</dependency>
	<dependency>
		<groupId>ai.djl</groupId>
		<artifactId>model-zoo</artifactId>
	</dependency>
	<dependency>
		<groupId>ai.djl.pytorch</groupId>
		<artifactId>pytorch-model-zoo</artifactId>
	</dependency>
	<dependency>
		<groupId>ai.djl.pytorch</groupId>
		<artifactId>pytorch-engine</artifactId>
		<scope>runtime</scope>
	</dependency>
</dependencies>

Is this enough to use CUDA?

@frankfliu
Contributor

frankfliu commented Dec 24, 2023

Your pom.xml looks good. It should automatically download the PyTorch 2.1.1 native library for CUDA 12.1.
See: https://github.com/deepjavalibrary/djl/blob/master/examples/pom.xml

You should be able to add the following to use the offline native package:

<dependency>
    <groupId>ai.djl.pytorch</groupId>
    <artifactId>pytorch-native-cu121</artifactId>
    <classifier>linux-x86_64</classifier>
    <scope>runtime</scope>
</dependency>
Then run:

cd examples
mvn clean package -DskipTests
mvn exec:java -Dai.djl.default_engine=PyTorch

You should see the native package being resolved:

Downloading from djl.ai: https://oss.sonatype.org/content/repositories/snapshots/ai/djl/pytorch/pytorch-native-cu121/2.1.1/pytorch-native-cu121-2.1.1.pom
Downloading from central: https://repo.maven.apache.org/maven2/ai/djl/pytorch/pytorch-native-cu121/2.1.1/pytorch-native-cu121-2.1.1.pom
Downloaded from central: https://repo.maven.apache.org/maven2/ai/djl/pytorch/pytorch-native-cu121/2.1.1/pytorch-native-cu121-2.1.1.pom (1.3 kB at 8.0 kB/s)
Downloading from djl.ai: https://oss.sonatype.org/content/repositories/snapshots/ai/djl/pytorch/pytorch-native-cu121/2.1.1/pytorch-native-cu121-2.1.1-linux-x86_64.jar
Downloading from central: https://repo.maven.apache.org/maven2/ai/djl/pytorch/pytorch-native-cu121/2.1.1/pytorch-native-cu121-2.1.1-linux-x86_64.jar

@SidneyLann
Contributor Author

oss.sonatype.org has no pytorch-native-cu121-2.1.1-linux-x86_64.jar, but repo.maven.apache.org does have it. However, the code below still can't get the GPU:

String modelFile = AppConfig.HOME_ART + "/model/yolo8x.torchscript";
YoloV8TranslatorFactory yoloV8TranslatorFactory = new YoloV8TranslatorFactory();

Engine engine = Engine.getEngine("PyTorch");
Device[] devices = engine.getDevices(1);
LOG.debug("devices[0]={}", devices[0].getDeviceType());

// Translator arguments for YOLOv8 pre/post-processing
Map<String, Object> arguments = new HashMap<>();
arguments.put("width", 640);
arguments.put("height", 640);
arguments.put("resize", "true");
arguments.put("toTensor", true);
arguments.put("applyRatio", true);
arguments.put("threshold", threshold);

Translator<Image, DetectedObjects> translator =
        yoloV8TranslatorFactory.newInstance(Image.class, DetectedObjects.class, null, arguments);

Criteria<Image, DetectedObjects> criteria = Criteria.builder()
        .setTypes(Image.class, DetectedObjects.class)
        .optModelPath(Paths.get(modelFile))
        .optEngine("PyTorch")
        .optTranslator(translator)
        .optProgress(new ProgressBar())
        .build();

ZooModel<Image, DetectedObjects> model = criteria.loadModel();
Predictor<Image, DetectedObjects> predictor = model.newPredictor();

[screenshot]

@frankfliu
Contributor

I notice you have the following in the log:

Ignore mismatching platform from: jar:file...

This means the detected os/arch/cuda version doesn't match cu121-linux-x86_64.

Can you run the following commands in your environment and see which OS DJL detects:

git clone https://github.com/deepjavalibrary/djl.git
cd djl
./gradlew debugEnv -Dai.djl.default_engine=PyTorch

@SidneyLann
Contributor Author

SidneyLann commented Dec 26, 2023

[screenshot]

Can't build on CentOS 9 + JDK 21.

@SidneyLann
Contributor Author

ai.djl.util.Platform : The bundled library: cu121-linux-x86_64:2.1.1-20231129 doesn't match system: cpu-linux-x86_64:2.1.1
ai.djl.util.Platform : Ignore mismatching platform from: jar:file:/home/sidney/prg/tomcat/webapps/pcng-of-srv-resource/WEB-INF/lib/pytorch-native-cu121-2.1.1-linux-x86_64.jar!/native/lib/pytorch.properties
ai.djl.pytorch.engine.PtEngine : PyTorch graph executor optimizer is enabled, this may impact your inference latency and throughput. See: https://docs.djl.ai/docs/development/inference_performance_optimization.html#graph-executor-optimization

What do these messages mean?

@frankfliu
Contributor

I created a PR to address the JDK 21 issue: #2903. In the meantime, please use JDK 17.

ai.djl.util.Platform : The bundled library: cu121-linux-x86_64:2.1.1-20231129 doesn't match system: cpu-linux-x86_64:2.1.1

The above message means DJL failed to detect the CUDA version. You should see some logs related to CudaUtils if you enable debug logging:

[DEBUG] - cudart library not found.

The following is just a warning that tells you how to tune performance if you need to; the graph executor optimizer may have a negative performance impact for some models.

PyTorch graph executor optimizer is enabled
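
If it does hurt your model, a one-line sketch to disable it before the engine initializes (this assumes the system property name described in the linked doc; verify it against your DJL version):

// Assumed property name, from the inference performance optimization doc.
System.setProperty("ai.djl.pytorch.graph_optimizer", "false");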

@SidneyLann
Contributor Author

[screenshot]

With JDK 17 it has been blocked here for 2 hours now. I have run it several times and it always blocks here, with no CPU time being used. Is that normal?

@SidneyLann
Contributor Author

Can you run the following command in your environment, and see which OS DJL detected:

Is there a simpler way to do this?

@frankfliu
Contributor

Can you check your debug log?

You should see logs related to:

[DEBUG] ai.djl.util.CudaUtils - ...
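
As a simpler check than building DJL, you can query the detection directly from Java. A minimal sketch, assuming ai.djl.util.cuda.CudaUtils exposes getGpuCount() and getCudaVersionString() (verify against your DJL version):

import ai.djl.util.cuda.CudaUtils;

public final class CudaCheck {
    public static void main(String[] args) {
        // If libcudart cannot be loaded, the count is 0 and DJL falls back to CPU.
        int gpuCount = CudaUtils.getGpuCount();
        System.out.println("GPU count: " + gpuCount);
        if (gpuCount > 0) {
            System.out.println("CUDA version: " + CudaUtils.getCudaVersionString());
        }
    }
}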

@SidneyLann
Contributor Author

ai.djl.util.cuda.CudaUtils : cudart library not found.
ai.djl.util.Platform : The bundled library: cu121-linux-x86_64:2.1.1-20231129 doesn't match system: cpu-linux-x86_64:2.1.1
ai.djl.util.Platform : Ignore mismatching platform from: jar:file:/home/sidney/prg/tomcat/webapps/pcng-of-srv-resource/WEB-INF/lib/pytorch-native-cu121-2.1.1-linux-x86_64.jar!/native/lib/pytorch.properties
ai.djl.pytorch.jni.LibUtils : Using cache dir: /home/sidney/.djl.ai/pytorch/2.1.1-cpu-linux-x86_64
ai.djl.pytorch.jni.LibUtils : Loading native library: /home/sidney/.djl.ai/pytorch/2.1.1-cpu-linux-x86_64/libc10.so
ai.djl.pytorch.jni.LibUtils : Loading native library: /home/sidney/.djl.ai/pytorch/2.1.1-cpu-linux-x86_64/libgomp-52f2fd74.so.1
ai.djl.pytorch.jni.LibUtils : Loading native library: /home/sidney/.djl.ai/pytorch/2.1.1-cpu-linux-x86_64/libtorch_cpu.so
ai.djl.pytorch.jni.LibUtils : Loading native library: /home/sidney/.djl.ai/pytorch/2.1.1-cpu-linux-x86_64/libtorch.so
ai.djl.pytorch.jni.LibUtils : Loading native library: /home/sidney/.djl.ai/pytorch/2.1.1-cpu-linux-x86_64/0.26.0-libdjl_torch.so

I can see these logs when running, but you asked me to build the project to see which OS DJL detected? There are no logs related to CudaUtils at compile time, right? There are only the logs below:

Task :engines:mxnet:jnarator:classes UP-TO-DATE
Task :engines:mxnet:jnarator:jar UP-TO-DATE
Task :engines:mxnet:mxnet-engine:jnarator UP-TO-DATE
Task :engines:mxnet:mxnet-engine:compileJava UP-TO-DATE
Task :engines:mxnet:mxnet-engine:processResources UP-TO-DATE
Task :engines:mxnet:mxnet-engine:classes UP-TO-DATE
Task :engines:mxnet:mxnet-engine:jar UP-TO-DATE
Task :engines:mxnet:mxnet-model-zoo:compileJava UP-TO-DATE
Task :engines:mxnet:mxnet-model-zoo:processResources UP-TO-DATE
Task :engines:mxnet:mxnet-model-zoo:classes UP-TO-DATE
Task :engines:mxnet:mxnet-model-zoo:jar UP-TO-DATE
Task :engines:onnxruntime:onnxruntime-engine:compileJava

@frankfliu
Contributor

ai.djl.util.cuda.CudaUtils : cudart library not found.

This means libcudart.so is not in LD_LIBRARY_PATH.

@frankfliu
Contributor

You can use the following command to check if libcudart.so can be found:

ldconfig -p | grep libcudart.so
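
If ldconfig finds it but DJL still reports cudart not found, a tiny diagnostic (a hypothetical helper class, not part of DJL) can show what the JVM process itself sees:

public final class LibPathCheck {
    public static void main(String[] args) {
        // The directory containing libcudart.so must be visible to this process.
        System.out.println("LD_LIBRARY_PATH = " + System.getenv("LD_LIBRARY_PATH"));
        System.out.println("java.library.path = " + System.getProperty("java.library.path"));
    }
}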

@SidneyLann
Contributor Author

[screenshot]
[screenshot]

I don't know why cuda_12.3.1_545.23.08_linux.run did not install the libs, or why Python can still use CUDA. Maybe pip installed the CUDA libraries.

@SidneyLann
Contributor Author

[screenshot]
[screenshot]

I amended .bash_profile and rebooted the OS; still:
CudaUtils : cudart library not found.

@SidneyLann
Contributor Author

SidneyLann commented Dec 27, 2023

[screenshot]

Why does ldconfig show libcudart.so in this folder when the folder does not contain it?!

[screenshot]

And libcudart.so still can't be found.

@frankfliu
Contributor

Can you try a CUDA docker image?

@SidneyLann
Contributor Author

ai.djl.translate.TranslateException: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/ultralytics/nn/tasks.py", line 74, in forward
_35 = (_18).forward(act, _34, )
_36 = (_20).forward((_19).forward(act, _35, ), _29, )
_37 = (_22).forward(_33, _35, (_21).forward(act, _36, ), )
~~~~~~~~~~~~ <--- HERE
return _37
File "code/torch/ultralytics/nn/modules/head.py", line 46, in forward
anchor_points = torch.unsqueeze(CONSTANTS.c0, 0)
lt, rb, = torch.chunk(_14, 2, 1)
x1y1 = torch.sub(anchor_points, lt)
~~~~~~~~~ <--- HERE
x2y2 = torch.add(anchor_points, rb)
c_xy = torch.div(torch.add(x1y1, x2y2), CONSTANTS.c1)

Traceback of TorchScript, original code (most recent call last):
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/utils/tal.py(267): dist2bbox
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/nn/modules/head.py(59): forward
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1508): _slow_forward
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/nn/tasks.py(81): _predict_once
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/nn/tasks.py(60): predict
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/nn/tasks.py(42): forward
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1508): _slow_forward
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/usr/prg/python/3102/lib/python3.10/site-packages/torch/jit/_trace.py(1065): trace_module
/usr/prg/python/3102/lib/python3.10/site-packages/torch/jit/_trace.py(798): trace
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/engine/exporter.py(302): export_torchscript
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/engine/exporter.py(117): outer_func
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/engine/exporter.py(252): __call__
/usr/prg/python/3102/lib/python3.10/site-packages/torch/utils/_contextlib.py(115): decorate_context
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/engine/model.py(328): export
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/cfg/__init__.py(448): entrypoint
/home/sidney/.local/bin/yolo(8):
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

at ai.djl.inference.Predictor.batchPredict(Predictor.java:192) ~[api-0.26.0-SNAPSHOT.jar:na]
at ai.djl.inference.Predictor.predict(Predictor.java:129) ~[api-0.26.0-SNAPSHOT.jar:na]
at com.pcng.resource.service.ArtDetectionService.detect(ArtDetectionService.java:111) ~[classes/:na]
at com.pcng.resource.service.PianoDetectionService.detect(PianoDetectionService.java:31) ~[classes/:na]
at com.pcng.resource.controller.ArtDetectionController.detect(ArtDetectionController.java:45) ~[classes/:na]
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) ~[na:na]
at java.base/java.lang.reflect.Method.invoke(Method.java:580) ~[na:na]
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205) ~[spring-web-6.0.12.jar:6.0.12]
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:150) ~[spring-web-6.0.12.jar:6.0.12]
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118) ~[spring-webmvc-6.0.12.jar:6.0.12]
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:884) ~[spring-webmvc-6.0.12.jar:6.0.12]
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:797) ~[spring-webmvc-6.0.12.jar:6.0.12]
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87) ~[spring-webmvc-6.0.12.jar:6.0.12]
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1081) ~[spring-webmvc-6.0.12.jar:6.0.12]
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:974) ~[spring-webmvc-6.0.12.jar:6.0.12]
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1011) ~[spring-webmvc-6.0.12.jar:6.0.12]
at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:914) ~[spring-webmvc-6.0.12.jar:6.0.12]
at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:590) ~[servlet-api.jar:6.0]
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885) ~[spring-webmvc-6.0.12.jar:6.0.12]
at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:658) ~[servlet-api.jar:6.0]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:205) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) ~[catalina.jar:10.1.13]
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:51) ~[tomcat-websocket.jar:10.1.13]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) ~[catalina.jar:10.1.13]
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(RequestContextFilter.java:100) ~[spring-web-6.0.12.jar:6.0.12]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.0.12.jar:6.0.12]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) ~[catalina.jar:10.1.13]
at org.springframework.web.filter.FormContentFilter.doFilterInternal(FormContentFilter.java:93) ~[spring-web-6.0.12.jar:6.0.12]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.0.12.jar:6.0.12]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) ~[catalina.jar:10.1.13]
at org.springframework.web.filter.ServerHttpObservationFilter.doFilterInternal(ServerHttpObservationFilter.java:109) ~[spring-web-6.0.12.jar:6.0.12]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.0.12.jar:6.0.12]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) ~[catalina.jar:10.1.13]
at org.springframework.boot.web.servlet.support.ErrorPageFilter.doFilter(ErrorPageFilter.java:124) ~[spring-boot-3.1.4.jar:3.1.4]
at org.springframework.boot.web.servlet.support.ErrorPageFilter$1.doFilterInternal(ErrorPageFilter.java:99) ~[spring-boot-3.1.4.jar:3.1.4]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.0.12.jar:6.0.12]
at org.springframework.boot.web.servlet.support.ErrorPageFilter.doFilter(ErrorPageFilter.java:117) ~[spring-boot-3.1.4.jar:3.1.4]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) ~[catalina.jar:10.1.13]
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:201) ~[spring-web-6.0.12.jar:6.0.12]
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:116) ~[spring-web-6.0.12.jar:6.0.12]
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:174) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:149) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:167) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:90) ~[catalina.jar:10.1.13]
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:482) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:115) ~[catalina.jar:10.1.13]
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:93) ~[catalina.jar:10.1.13]
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:673) ~[catalina.jar:10.1.13]
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74) ~[catalina.jar:10.1.13]
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341) ~[catalina.jar:10.1.13]
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:391) ~[tomcat-coyote.jar:10.1.13]
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63) ~[tomcat-coyote.jar:10.1.13]
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:894) ~[tomcat-coyote.jar:10.1.13]
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1740) ~[tomcat-coyote.jar:10.1.13]
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:52) ~[tomcat-coyote.jar:10.1.13]
at org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191) ~[tomcat-util.jar:10.1.13]
at org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659) ~[tomcat-util.jar:10.1.13]
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) ~[tomcat-util.jar:10.1.13]
at java.base/java.lang.Thread.run(Thread.java:1583) ~[na:na]

Caused by: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
File "code/torch/ultralytics/nn/tasks.py", line 74, in forward
_35 = (_18).forward(act, _34, )
_36 = (_20).forward((_19).forward(act, _35, ), _29, )
_37 = (_22).forward(_33, _35, (_21).forward(act, _36, ), )
~~~~~~~~~~~~ <--- HERE
return _37
File "code/torch/ultralytics/nn/modules/head.py", line 46, in forward
anchor_points = torch.unsqueeze(CONSTANTS.c0, 0)
lt, rb, = torch.chunk(_14, 2, 1)
x1y1 = torch.sub(anchor_points, lt)
~~~~~~~~~ <--- HERE
x2y2 = torch.add(anchor_points, rb)
c_xy = torch.div(torch.add(x1y1, x2y2), CONSTANTS.c1)

Traceback of TorchScript, original code (most recent call last):
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/utils/tal.py(267): dist2bbox
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/nn/modules/head.py(59): forward
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1508): _slow_forward
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/nn/tasks.py(81): _predict_once
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/nn/tasks.py(60): predict
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/nn/tasks.py(42): forward
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1508): _slow_forward
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1527): _call_impl
/usr/prg/python/3102/lib/python3.10/site-packages/torch/nn/modules/module.py(1518): _wrapped_call_impl
/usr/prg/python/3102/lib/python3.10/site-packages/torch/jit/_trace.py(1065): trace_module
/usr/prg/python/3102/lib/python3.10/site-packages/torch/jit/_trace.py(798): trace
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/engine/exporter.py(302): export_torchscript
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/engine/exporter.py(117): outer_func
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/engine/exporter.py(252): __call__
/usr/prg/python/3102/lib/python3.10/site-packages/torch/utils/_contextlib.py(115): decorate_context
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/engine/model.py(328): export
/usr/prg/python/3102/lib/python3.10/site-packages/ultralytics/cfg/__init__.py(448): entrypoint
/home/sidney/.local/bin/yolo(8):
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

at ai.djl.pytorch.jni.PyTorchLibrary.moduleRunMethod(Native Method) ~[pytorch-engine-0.26.0-SNAPSHOT.jar:na]
at ai.djl.pytorch.jni.IValueUtils.forward(IValueUtils.java:57) ~[pytorch-engine-0.26.0-SNAPSHOT.jar:na]
at ai.djl.pytorch.engine.PtSymbolBlock.forwardInternal(PtSymbolBlock.java:145) ~[pytorch-engine-0.26.0-SNAPSHOT.jar:na]
at ai.djl.nn.AbstractBaseBlock.forward(AbstractBaseBlock.java:79) ~[api-0.26.0-SNAPSHOT.jar:na]
at ai.djl.nn.Block.forward(Block.java:127) ~[api-0.26.0-SNAPSHOT.jar:na]
at ai.djl.inference.Predictor.predictInternal(Predictor.java:143) ~[api-0.26.0-SNAPSHOT.jar:na]
at ai.djl.inference.Predictor.batchPredict(Predictor.java:183) ~[api-0.26.0-SNAPSHOT.jar:na]
... 63 common frames omitted

After reinstalling CUDA, DJL 0.26 can use the GPU now. However, the above exception is thrown. What's the problem?

@SidneyLann
Contributor Author

SidneyLann commented Dec 27, 2023

[screenshot]

Why does cu121 need the cpu .so libraries?

@frankfliu
Contributor

frankfliu commented Dec 27, 2023

@SidneyLann
Contributor Author

After adding .optOption("mapLocation", "true"), the device is gpu now. TorchScript GPU inference on 26 pics took 320 seconds, but ONNX Runtime on CPU took only 85 seconds. It seems the GPU is not really being used for TorchScript.
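
For reference, a minimal sketch of where that option fits into the Criteria from the earlier snippet (modelFile and translator as defined there; pinning the device with optDevice is an added assumption, not something confirmed above):

Criteria<Image, DetectedObjects> criteria = Criteria.builder()
    .setTypes(Image.class, DetectedObjects.class)
    .optModelPath(Paths.get(modelFile))
    .optEngine("PyTorch")
    // Remap the TorchScript weights onto the device the model is loaded on,
    // instead of the device it was exported from (the cuda:0 vs cpu mismatch above).
    .optOption("mapLocation", "true")
    .optDevice(Device.gpu()) // assumption: explicitly request the GPU
    .optTranslator(translator)
    .optProgress(new ProgressBar())
    .build();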

@frankfliu
Contributor

You can try ONNX Runtime on DJL if you want to (OnnxRuntime doesn't support CUDA 12 yet).

Did you compare the performance on PyTorch (GPU vs CPU)? The performance difference might be related to image pre-processing.

@SidneyLann
Contributor Author

SidneyLann commented Dec 28, 2023

PyTorch GPU took 320 seconds and PyTorch CPU took 86 seconds. Is this because of issue #2899?

@SidneyLann
Contributor Author

When will a build with #2899 merged be available?

@frankfliu
Contributor

You can try our nightly snapshot release: https://docs.djl.ai/docs/get.html#nightly-snapshots

The 0.26.0 release will be available around the end of January.

@SidneyLann
Contributor Author

https://oss.sonatype.org/content/repositories/snapshots/

This repo does not have the update for #2899 yet.
