Description
This may be a user error, since I'm relatively new to both Aparapi and OpenCL, but it seems to me that there is a problem related to multidimensional local arrays.
When I'm executing the kernel shown at https://github.com/raner/top.java.matrix/blob/syncleus-aparapi-issue-51/src/main/java/top/java/matrix/fast/TiledFastMatrix.java#L72 on my MacBook Air (Mac OS X 10.11, Intel HD Graphics 6000 with 48 execution units), I'm consistently getting a JVM crash:
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000000011b42650c, pid=88211, tid=5891
#
# JRE version: Java(TM) SE Runtime Environment (8.0_60-b27) (build 1.8.0_60-b27)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.60-b23 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C [libaparapi_x86_643808666101765060573.dylib+0xd50c] KernelArg::setLocalBufferArg(JNIEnv_*, int, int, bool)+0x3c
The relevant snippet in the generated OpenCL code seems to be
int tiledRow = (this->val$TILE_SIZE * tile) + localRow;
int tiledColumn = (this->val$TILE_SIZE * tile) + localColumn;
(&this->tileA[localColumn * this->tileA__javaArrayDimension0])[localRow] = this->val$A[((tiledColumn * this->val$numberOfRows) + row)];
(&this->tileB[localColumn * this->tileB__javaArrayDimension0])[localRow] = this->val$B[((column * this->val$numberOfColumns) + tiledRow)];
barrier(CLK_LOCAL_MEM_FENCE);
for (int repeat = 0; repeat<this->val$TILE_SIZE; repeat++){
    value = value + ((&this->tileA[repeat * this->tileA__javaArrayDimension0])[localRow] * (&this->tileB[localColumn * this->tileB__javaArrayDimension0])[repeat]);
}
barrier(CLK_LOCAL_MEM_FENCE);
I noticed that the generated code refers to the arrays' dimensions as tile…__javaArrayDimension0 in order to calculate the proper index into the two-dimensional array, which, I believe, is what necessitates the earlier invocation of KernelArg::setLocalBufferArg (the frame where the crash occurs). I didn't have time for a deep dive on this issue, but when I change the arrays to one-dimensional arrays and perform the index calculation myself (as shown in raner/top.java.matrix@cb4988d), the code works correctly.
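For readers who want to see the workaround in isolation, here is a minimal plain-Java sketch of the index calculation. It is not the code from the linked commit: the class name FlatTileSketch, the helpers flatIndex and flatten, and the TILE_SIZE value of 4 are all hypothetical, and the sketch assumes square tiles addressed as tile[localColumn][localRow], mirroring the access pattern in the generated OpenCL above. It does not use Aparapi at all; it only illustrates that a one-dimensional array with a manually computed index holds the same data as the two-dimensional tile.

```java
// Hypothetical illustration only; not taken from the linked repository.
final class FlatTileSketch {
    // Assumed tile edge length; the real kernel captures its own TILE_SIZE.
    static final int TILE_SIZE = 4;

    // Maps a (localColumn, localRow) pair to an offset in a flat array,
    // the same arithmetic the generated code performs with
    // tileA__javaArrayDimension0 in place of tileSize.
    static int flatIndex(int localColumn, int localRow, int tileSize) {
        return localColumn * tileSize + localRow;
    }

    // Copies a square 2D tile into a 1D array using flatIndex.
    static float[] flatten(float[][] tile) {
        int size = tile.length; // assumes tile is size x size
        float[] flat = new float[size * size];
        for (int column = 0; column < size; column++) {
            for (int row = 0; row < size; row++) {
                flat[flatIndex(column, row, size)] = tile[column][row];
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        // Fill a 2D tile with values that encode their own coordinates.
        float[][] tile2d = new float[TILE_SIZE][TILE_SIZE];
        for (int column = 0; column < TILE_SIZE; column++) {
            for (int row = 0; row < TILE_SIZE; row++) {
                tile2d[column][row] = column * 10 + row;
            }
        }

        float[] tile1d = flatten(tile2d);

        // Check that both views return the same element everywhere.
        boolean agree = true;
        for (int column = 0; column < TILE_SIZE; column++) {
            for (int row = 0; row < TILE_SIZE; row++) {
                if (tile1d[flatIndex(column, row, TILE_SIZE)] != tile2d[column][row]) {
                    agree = false;
                }
            }
        }
        System.out.println("flat and 2D views agree: " + agree);
    }
}
```

In a kernel, the one-dimensional array would take the place of the two-dimensional local tile, with every tile[localColumn][localRow] access rewritten as a flatIndex-style offset, so the generated code no longer needs a per-dimension size for the local buffer.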