Matrix Triangular Solve GPU op #5010
Conversation
Can one of the admins verify this patch?

@c0g, thanks for your PR! By analyzing the history of the files in this pull request, we identified @tensorflower-gardener, @ebrevdo and @josh11b to be potential reviewers.
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
please don't change whitespace in lines you aren't otherwise modifying
Please undo these changes.
    }

    int64 GetCostPerUnit(const TensorShapes& input_matrix_shapes) const final
    {
you probably need to multiply this by Eigen's AddCost / MulCost to get appropriate estimates
I messed up the formatting on the original op due to changing whitespace. Everything in MatrixTriangularSolveOp is original code from TF.
I'm happy to modify the original code to include AddCost/MulCost.
    {
      double rows = static_cast<double>(input_matrix_shapes[0].dim_size(0));
      double num_rhss = static_cast<double>(input_matrix_shapes[1].dim_size(1));
      double cost = rows * rows * num_rhss;
why are you doing this in double precision?
This is the original TensorFlow op code (see above). Possibly because double can represent larger numbers than int? I doubt anyone will try a matrix solve large enough to overflow a 64-bit int, though. Happy to change to int64.
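For reference, here is a minimal sketch of the kind of estimate being suggested: scale the raw flop count by Eigen's per-element cost constants. This is only a sketch, assuming the enclosing op class context, where Scalar, TensorShapes, and kint64max come from the TensorFlow op framework rather than from this snippet.

```cpp
// Sketch only: GetCostPerUnit scaled by Eigen's AddCost/MulCost, assuming the
// enclosing op class provides Scalar, TensorShapes, and kint64max.
int64 GetCostPerUnit(const TensorShapes& input_matrix_shapes) const final {
  double rows = static_cast<double>(input_matrix_shapes[0].dim_size(0));
  double num_rhss = static_cast<double>(input_matrix_shapes[1].dim_size(1));
  // A triangular solve does roughly one multiply and one add per matrix
  // element per right-hand side, so scale rows^2 * num_rhss by the
  // per-element costs of the scalar type.
  double cost = rows * rows * num_rhss *
                (Eigen::TensorOpCost::AddCost<Scalar>() +
                 Eigen::TensorOpCost::MulCost<Scalar>());
  // Computing in double lets a huge problem saturate at kint64max instead of
  // overflowing; convert back to int64 for the thread-pool scheduler.
  return cost >= static_cast<double>(kint64max) ? kint64max
                                                : static_cast<int64>(cost);
}
```

This also touches on the double-precision question above: the intermediate product can exceed what an int64 holds, so it is clamped before converting back.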
    if __name__ == "__main__":
      tf.test.main()
undo whitespace changes
    } else {
      output.noalias() = triangle.solve(rhs);
    }
    trans = perftools::gputools::blas::Transpose::kNoTranspose;
Can you use longer variable names instead of 'trans', 'lda', 'ldb', etc.? Then in the ThenBlasTrsm call you can do e.g. upper_or_lower /* uplo */, etc., which is easier for other users to read.
    cublas_m, cublas_n, 1.0, matrix_ptr, lda, &out_ptr,
    ldb)
    .ok();
    // LOG(INFO) << blas_launch_status;
remove commented out code
    stream
        ->ThenBlasTrsm(perftools::gputools::blas::Side::kRight, uplo, trans,
                       perftools::gputools::blas::Diagonal::kNonUnit,
                       cublas_m, cublas_n, 1.0, matrix_ptr, lda, &out_ptr,
I think this needs to be Scalar(1) instead of 1.0 (which is a double)
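Putting the two suggestions together (descriptive names and a Scalar(1) alpha), here is a hedged sketch of how the launch could read. The names upper_or_lower, transpose_matrix, colmajor_rows, colmajor_cols, leading_dim_matrix, leading_dim_output, matrix_ptr, out_ptr, and context are illustrative stand-ins for values the op already computes, not the PR's actual identifiers.

```cpp
// Sketch only, inside the GPU op's compute path.  Every lower-case identifier
// here is an illustrative stand-in for a value computed earlier in the op;
// `context` is assumed to be the OpKernelContext*.
namespace blas = perftools::gputools::blas;

bool blas_launch_status =
    stream
        ->ThenBlasTrsm(blas::Side::kRight, upper_or_lower /* uplo */,
                       transpose_matrix /* trans */, blas::Diagonal::kNonUnit,
                       colmajor_rows /* m */, colmajor_cols /* n */,
                       Scalar(1) /* alpha as Scalar, not the double 1.0 */,
                       matrix_ptr, leading_dim_matrix /* lda */,
                       &out_ptr, leading_dim_output /* ldb */)
        .ok();
if (!blas_launch_status) {
  context->SetStatus(errors::Internal("cuBLAS trsm launch failed"));
}
```

With names like these, the short /* uplo */-style tags act as cross-references to the BLAS convention rather than being the only documentation of each argument.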
Thanks for the PR! A few comments.
    import tensorflow as tf


    class MatrixTriangularSolveOpTest(tf.test.TestCase):
This is not being tested with GPU support. You need to modify this line in the BUILD file from tf_py_test to cuda_py_test.
Ah, in that case let's leave it as double. You're right.

Do we still need changes?

Jenkins, test this please.
I have not yet made the requested changes.

@ebrevdo: I reverted to the original file to fix whitespace, then made your other requested changes.

I don't see the revert. Please push?

@drpngx you're right, I have no idea what happened to them. The changes should now be up. Sorry for the lag/commit spam.

Jenkins, test this please.

Seems to be passing now; it only looks to be waiting on verification of @ebrevdo's requested changes.
    double rows = static_cast<double>(input_matrix_shapes[0].dim_size(0));
    double num_rhss = static_cast<double>(input_matrix_shapes[1].dim_size(1));
    double cost = rows * rows * num_rhss;
    double cost = rows * rows * num_rhss *
whitespace?
Sorry! Should be gone now.
ebrevdo left a comment:
small nit; otherwise LGTM.
    #if GOOGLE_CUDA
    #include "tensorflow/core/platform/stream_executor.h"
    #endif // GOOGLE_CUDA
Sorry, one last thing: we need an extra space before the //
done
Jenkins, test this please.
Adds a GPU version of the triangular solver op using cuBLAS trsm.