This repository was archived by the owner on Nov 17, 2023. It is now read-only.
segfault in native code while trying to use CustomOp #11926
Closed
Description
I'm trying to use CustomOp to create a constant, as suggested in #8428.
As soon as I declare that my CustomOp has no inputs (listArguments returns an empty array), binding the symbol fails with a segfault in native code, and I can't find a workaround.
Environment
- Linux, set up using https://raw.githubusercontent.com/apache/incubator-mxnet/1.2.1/ci/docker/install/ubuntu_core.sh (running inside a Docker container)
- MXNet 1.2.1, Scala 2.11.11
Code
class ConstantOp(value: NDArray) extends CustomOp {
  def forward(isTrain: Boolean, req: Array[String], inData: Array[NDArray], outData: Array[NDArray], aux: Array[NDArray]): Unit = {
    val data = value.copyTo(outData(0).context)
    this.assign(outData(0), req(0), data)
    data.dispose()
  }

  def backward(req: Array[String], outGrad: Array[NDArray], inData: Array[NDArray], outData: Array[NDArray], inGrad: Array[NDArray], aux: Array[NDArray]): Unit = {
    throw new Exception(s"Backward not supported by Constant")
  }
}

class ConstantOpProp(needTopGrad: Boolean = false) extends CustomOpProp(needTopGrad) {
  override def listArguments(): Array[String] = Array()

  override def listOutputs(): Array[String] = Array("output")

  override def inferShape(inShape: Array[Shape]): (Array[Shape], Array[Shape], Array[Shape]) = {
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    (Array(), Array(data.shape), null)
  }

  override def inferType(inType: Array[DType]): (Array[DType], Array[DType], Array[DType]) = {
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    (Array(), Array(data.dtype), null)
  }

  override def createOperator(ctx: String, inShapes: Array[Array[Int]],
      inDtypes: Array[Int]): CustomOp = {
    // hacky stuff to workaround the declaration using String
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    new ConstantOp(data)
  }
}

object TestConst {
  Operator.register("constant", new ConstantOpProp())

  val value = NDArray.array(Array(1f), Shape(1))
  val const = Symbol.Custom("constant")()(
    kwargs = Map(
      "op_type" -> "constant",
      // hacky thing to workaround the fact CustomOpProp uses Map[String, String] internally for kwargs
      "value" -> String.copyValueOf(value.serialize().map(_.toChar))
    ))

  val a = Symbol.Variable("a")
  val symbol = a + const
  val e = symbol.bind(Context.defaultCtx, Map(
    "a" -> NDArray.array(Array(10f), Shape(1)))
  )
  e.forward()
  println("outputs=" + e.outputs.mkString(", "))
}
Error:
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f37ec25e7a9, pid=14172, tid=0x00007f379cbf7700
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V [libjvm.so+0x6797a9]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /app/hs_err_pid14172.log
The log file contains:
[...]
Stack: [0x00007f6b28ad3000,0x00007f6b28bd4000], sp=0x00007f6b28bcecd0, free space=1007k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x6797a9]
C [mxnet-scala+0x455dd4] Java_org_apache_mxnet_LibInfo_mxCustomOpRegister::{lambda(char const*, int, char const**, char const**, MXCallbackList*)#1}::operator()(char const*, int, char const**, char const**, MXCallbackList*) const::{lambda(int, int*, unsigned int**, void*)#5}::_FUN(int, {lambda(char const*, int, char const**, char const**, MXCallbackList*)#1}, unsigned int*, unsigned int**)+0x454
C [mxnet-scala+0x6acf4c] mxnet::op::custom::InferShape(nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*)+0x2dc
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j org.apache.mxnet.LibInfo.mxExecutorBindEX(JIII[Ljava/lang/String;[I[II[J[J[I[JJLorg/apache/mxnet/Base$RefLong;)I+0
j org.apache.mxnet.Symbol.bindHelper(Lorg/apache/mxnet/Context;Lscala/collection/Seq;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/immutable/Map;Lorg/apache/mxnet/Executor;)Lorg/apache/mxnet/Executor;+767
j org.apache.mxnet.Symbol.bind(Lorg/apache/mxnet/Context;Lscala/collection/immutable/Map;)Lorg/apache/mxnet/Executor;+38
[...]
Side note
As you can see in the code, I have to serialize NDArrays into strings to pass the data along. That's because the CustomOp implementation declares kwargs as Map[String, String], whereas Symbol.Custom accepts Map[String, Any]. This leads to a strange situation where, at runtime, non-String objects sit behind Java String references, yet they can't be cast back to their real type because of the type system.
Changing the kwargs definition in CustomOp would be welcome.
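To illustrate the side note, here is a minimal sketch in plain Scala (no MXNet dependency). It shows how a Map[String, Any] can end up typed as Map[String, String] through an unchecked cast, and why reading a non-String value then fails. It also compares the char-based encoding used above with a Base64 encoding that keeps the kwargs value plain ASCII. The names KwargsSketch, encode, decode and asStringMap are hypothetical helpers, not MXNet APIs. Whether MXNet performs exactly this cast internally is an assumption, and nothing here is claimed to avoid the segfault.

```scala
import java.util.Base64
import scala.util.Try

object KwargsSketch {
  // Hypothetical helpers (not MXNet APIs): carry a binary payload as ASCII text.
  def encode(bytes: Array[Byte]): String = Base64.getEncoder.encodeToString(bytes)
  def decode(text: String): Array[Byte] = Base64.getDecoder.decode(text)

  // Unchecked cast: it succeeds at runtime because the map's type parameters are erased.
  // Assumption: this mimics how non-String values could hide behind String references.
  def asStringMap(kwargs: Map[String, Any]): Map[String, String] =
    kwargs.asInstanceOf[Map[String, String]]

  def main(args: Array[String]): Unit = {
    val payload = Array[Byte](0, 1, -1, 127, -128)

    // The char-based round trip used in the issue: Byte -> Char -> Byte.
    val viaChars = String.copyValueOf(payload.map(_.toChar)).toCharArray.map(_.toByte)
    println("char round trip preserved bytes: " + viaChars.sameElements(payload))

    // The Base64 round trip: same bytes, but the intermediate string is ASCII only.
    val viaBase64 = decode(encode(payload))
    println("Base64 round trip preserved bytes: " + viaBase64.sameElements(payload))

    // The cast itself does not fail; reading the value as a String does.
    val disguised = asStringMap(Map("value" -> payload))
    val read = Try { val s: String = disguised("value"); s.length }
    println("reading the disguised value failed: " + read.isFailure)
  }
}
```

In the reproduction above, the same idea would mean passing encode(value.serialize()) as the "value" kwarg and calling NDArray.deserialize(decode(this.kwargs("value"))) inside ConstantOpProp; this is untested against MXNet 1.2.1.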