This repository was archived by the owner on Nov 17, 2023. It is now read-only.

segfault in native code while trying to use CustomOp #11926

Closed
@mdespriee

Description

I'm trying to use CustomOp to create a Constant, as suggested in #8428.
As soon as I declare that my CustomOp has no inputs, it fails with a segfault, and I can't find a workaround.

Environment

Code

class ConstantOp(value: NDArray) extends CustomOp {

  def forward(isTrain: Boolean, req: Array[String], inData: Array[NDArray], outData: Array[NDArray], aux: Array[NDArray]): Unit = {
    val data = value.copyTo(outData(0).context)
    this.assign(outData(0), req(0), data)
    data.dispose()
  }

  def backward(req: Array[String], outGrad: Array[NDArray], inData: Array[NDArray], outData: Array[NDArray], inGrad: Array[NDArray], aux: Array[NDArray]): Unit = {
    throw new Exception("Backward not supported by Constant")
  }
}

class ConstantOpProp(needTopGrad: Boolean = false) extends CustomOpProp(needTopGrad) {

  override def listArguments(): Array[String] = Array()

  override def listOutputs(): Array[String] = Array("output")

  override def inferShape(inShape: Array[Shape]): (Array[Shape], Array[Shape], Array[Shape]) = {
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    (Array(), Array(data.shape), null)
  }

  override def inferType(inType: Array[DType]): (Array[DType], Array[DType], Array[DType]) = {
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    (Array(), Array(data.dtype), null)
  }

  override def createOperator(ctx: String, inShapes: Array[Array[Int]],
                              inDtypes: Array[Int]): CustomOp = {
    // hacky stuff to workaround the declaration using String
    val data = NDArray.deserialize(this.kwargs("value").toCharArray.map(_.toByte))
    new ConstantOp(data)
  }
}


object TestConst {
  Operator.register("constant", new ConstantOpProp())

  val value = NDArray.array(Array(1f), Shape(1))
  val const = Symbol.Custom("constant")()(
    kwargs = Map(
      "op_type" -> "constant",
      // hacky thing to workaround the fact CustomOpProp uses Map[String, String] internally for kwargs
      "value" -> String.copyValueOf(value.serialize().map(_.toChar))
    ))

  val a = Symbol.Variable("a")
  val symbol = a + const
  val e = symbol.bind(Context.defaultCtx, Map(
    "a" -> NDArray.array(Array(10f), Shape(1)))
  )

  e.forward()

  println("outputs=" + e.outputs.mkString(", "))
}

Error:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f37ec25e7a9, pid=14172, tid=0x00007f379cbf7700
#
# JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x6797a9]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /app/hs_err_pid14172.log

In the log:

[...]

Stack: [0x00007f6b28ad3000,0x00007f6b28bd4000],  sp=0x00007f6b28bcecd0,  free space=1007k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x6797a9]
C  [mxnet-scala+0x455dd4]  Java_org_apache_mxnet_LibInfo_mxCustomOpRegister::{lambda(char const*, int, char const**, char const**, MXCallbackList*)#1}::operator()(char const*, int, char const**, char const**, MXCallbackList*) const::{lambda(int, int*, unsigned int**, void*)#5}::_FUN(int, {lambda(char const*, int, char const**, char const**, MXCallbackList*)#1}, unsigned int*, unsigned int**)+0x454
C  [mxnet-scala+0x6acf4c]  mxnet::op::custom::InferShape(nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*)+0x2dc

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  org.apache.mxnet.LibInfo.mxExecutorBindEX(JIII[Ljava/lang/String;[I[II[J[J[I[JJLorg/apache/mxnet/Base$RefLong;)I+0
j  org.apache.mxnet.Symbol.bindHelper(Lorg/apache/mxnet/Context;Lscala/collection/Seq;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/Iterable;Lscala/collection/immutable/Map;Lorg/apache/mxnet/Executor;)Lorg/apache/mxnet/Executor;+767
j  org.apache.mxnet.Symbol.bind(Lorg/apache/mxnet/Context;Lscala/collection/immutable/Map;)Lorg/apache/mxnet/Executor;+38

[...]

Side note

As the code shows, I have to serialize NDArrays into strings to pass the data. That's because the CustomOp implementation defines kwargs as Map[String, String], whereas Symbol.Custom accepts Map[String, Any]. This leads to very strange situations where, at runtime, non-String objects sit behind Java String references, yet they can't be cast back because of the type system. Weird.
A change to the definition in CustomOp would be welcome.
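
To make the side note concrete, here is a minimal, MXNet-free sketch of two things: how type erasure lets a Map[String, Any] pass as a Map[String, String] until a value is actually read, and how raw bytes could be carried through a String-only map using Base64 instead of a byte-to-char mapping. The object name KwargsSketch and the helpers encode and decode are hypothetical, introduced only for illustration; they are not part of the MXNet API, and whether the native layer would handle such a string any differently has not been verified here.

```scala
import java.util.Base64
import scala.util.Try

object KwargsSketch {
  // Hypothetical helpers (not MXNet API): bytes <-> ASCII-only text.
  def encode(bytes: Array[Byte]): String =
    Base64.getEncoder.encodeToString(bytes)

  def decode(text: String): Array[Byte] =
    Base64.getDecoder.decode(text)

  def main(args: Array[String]): Unit = {
    // 1. Erasure: the cast is unchecked, so it succeeds even though
    //    the value behind the key is not a String.
    val loose: Map[String, Any] = Map("value" -> Array[Byte](1, 2, 3))
    val strict = loose.asInstanceOf[Map[String, String]]

    // The failure only appears when the value is used as a String.
    val read = Try { val s: String = strict("value"); s }
    println(s"reading as String failed: ${read.isFailure}")

    // 2. Base64: arbitrary bytes (including 0 and negative values)
    //    round-trip through a plain ASCII String.
    val payload = Array[Byte](0, -1, 127, -128, 42)
    val text = encode(payload)
    println(s"round trip ok: ${decode(text).sameElements(payload)}")
  }
}
```

In the reproduction above, this would mean passing something like encode(value.serialize()) as the "value" kwarg and calling decode on it before NDArray.deserialize. That is an assumption about how the pieces fit together, not a tested fix for the segfault.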
