Matrix API Reference
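Matrix library functions can be divided into three types: value operations, vector (row and column) operations, and matrix operations.
In addition, the matrix library contains helper functions for converting between pipes and matrices, reading and writing matrices, and inspecting their fields and size hints.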
Value operations work over individual values in a matrix, usually transforming them in some way.
# mat.mapValues( mapping function ): Matrix
Maps the values of the matrix to new values using a function
// The new matrix contains the squares of the elements from the original matrix
val squaredMatrix = matrix.mapValues{ value : Int => value * value }
# mat.filterValues( filter function ) : Matrix
Keeps only the values of the matrix for which the function returns true
// The new matrix contains only the positive non-zero elements from the matrix
val filteredMatrix = matrix.filterValues{ _ > 0 }
# mat.binarizeAs[NewValT] : Matrix
Sets every non-zero value of the matrix to the unit element (one) of the given type
// The new matrix contains ones as integers for all non-zero values from the matrix
val binMatrix = matrix.binarizeAs[Int]
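The three value operations above can be pictured on a small in-memory model. The sketch below is plain Scala, not Scalding: the `ValueOpsSketch` object and its helper names are hypothetical, and the sparse matrix is modeled as a map from (row, column) to the stored non-zero value. It only illustrates the intended semantics.

```scala
object ValueOpsSketch {
  // Sparse matrix model (assumption): only non-zero entries are stored,
  // keyed by (row, column).
  type Sparse[V] = Map[(Int, Int), V]

  // Applies f to every stored value, keeping its position.
  def mapValues[V, U](m: Sparse[V])(f: V => U): Sparse[U] =
    m.map { case (pos, v) => pos -> f(v) }

  // Keeps only the entries whose value satisfies the predicate.
  def filterValues[V](m: Sparse[V])(p: V => Boolean): Sparse[V] =
    m.filter { case (_, v) => p(v) }

  // Replaces every stored (non-zero) value with the integer 1.
  def binarizeAsInt[V](m: Sparse[V]): Sparse[Int] =
    m.map { case (pos, _) => pos -> 1 }
}
```

For instance, `ValueOpsSketch.filterValues(m)(_ > 0)` keeps only the positive entries of `m`, mirroring the `filterValues` example above.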
# mat.getRow( rowNumber ) : RowVector
Returns the row indexed with the specified value from the original matrix
// Returns the 3rd row from the matrix
val row = matrix.getRow(3)
// Returns the row from the matrix that is indexed with "France"
val franceRow = matrix.getRow("France")
# matrix.reduceRowVectors{ reduce function } : RowVector
Reduces all row vectors into a single row vector using an associative pairwise aggregation function
// Returns the row vector of all column-wise products of the matrix
// matrix  = 1 1 1
//           2 2 1
//           3 0 1
// rowProd = 6 0 1
val rowProd = matrix.reduceRowVectors { (x,y) => x * y }
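The column-wise product in the comment above can be reproduced with a plain-Scala sketch. This is not Scalding code: the `ReduceRowsSketch` object is hypothetical and uses a dense model in which zeros are stored explicitly, so it shows the arithmetic only, not how a sparse distributed matrix is processed.

```scala
object ReduceRowsSketch {
  // Combines all rows position by position with an associative function.
  def reduceRowVectors(rows: Seq[Vector[Int]])(f: (Int, Int) => Int): Vector[Int] =
    rows.reduce((a, b) => a.zip(b).map { case (x, y) => f(x, y) })

  // The example matrix from the comment above, one Vector per row.
  val matrix: Seq[Vector[Int]] =
    Seq(Vector(1, 1, 1), Vector(2, 2, 1), Vector(3, 0, 1))

  val rowProd: Vector[Int] = reduceRowVectors(matrix)(_ * _) // Vector(6, 0, 1)
  val rowSum: Vector[Int]  = reduceRowVectors(matrix)(_ + _) // Vector(6, 3, 3)
}
```

Summing the row vectors is the same reduction with addition as the pairwise function.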
# matrix.sumRowVectors : RowVector
Reduces all row vectors into the sum row vector
// Returns the row vector of all column-wise sums of the matrix
val rowSum = matrix.sumRowVectors
# matrix.mapRows{ mapping function } : Matrix
Maps each row vector into a new row vector by applying a function to the list of non-zero elements of the original row
// Applies fn to the list of non-zero (column, value) pairs of each row
val mappedMatrix = matrix.mapRows{ fn }
// where fn has the signature:
def fn( list : List[(Int,Double)]) : List[(Int,Double)]
# matrix.topRowElems( numberOfElements ) : Matrix
Returns the matrix containing only the top K elements in each row
// Returns the matrix with the top 10 elements of each row
val topkMatrix = matrix.topRowElems(10)
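As an illustration of the per-row selection, here is a plain-Scala sketch, not Scalding code. The `TopRowElemsSketch` object is hypothetical, each row is modeled as a map from column index to non-zero value, and "top" is assumed to mean the largest values.

```scala
object TopRowElemsSketch {
  // Row model (assumption): row index -> (column index -> non-zero value).
  type Rows = Map[Int, Map[Int, Double]]

  // Keeps, for every row, the k entries with the largest values.
  def topRowElems(rows: Rows, k: Int): Rows =
    rows.map { case (r, cols) =>
      r -> cols.toSeq.sortBy { case (_, v) => -v }.take(k).toMap
    }
}
```

Rows with fewer than k stored elements are returned unchanged in this model.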
# matrix.rowL1Normalize : Matrix
Returns the matrix containing the L1 row-normalized elements of the original matrix
// Returns the adjacency matrix of the follow graph normalized by outdegree
// matrix       = 1    0    1
//                1    1    1
//                1    0    0
// matrixL1Norm = 0.5  0    0.5
//                0.33 0.33 0.33
//                1    0    0
val matrixL1Norm = matrix.rowL1Normalize
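The numbers in the comment above follow from dividing each element by the sum of absolute values in its row. A minimal plain-Scala sketch of that arithmetic, using a hypothetical `RowL1Sketch` object and a dense row model rather than the Scalding API:

```scala
object RowL1Sketch {
  // Divides every element by the L1 norm (sum of absolute values) of its row.
  // All-zero rows are returned unchanged to avoid dividing by zero (assumption).
  def rowL1Normalize(rows: Seq[Vector[Double]]): Seq[Vector[Double]] =
    rows.map { row =>
      val norm = row.map(x => math.abs(x)).sum
      if (norm == 0.0) row else row.map(_ / norm)
    }
}
```

Applied to the adjacency matrix in the comment, the first row `1 0 1` has L1 norm 2 and becomes `0.5 0 0.5`.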
# matrix.rowL2Normalize : Matrix
Returns the matrix containing the L2 row-normalized elements of the original matrix
// Returns the matrix with each row scaled to unit L2 (Euclidean) norm
val matrixL2Norm = matrix.rowL2Normalize
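L2 normalization differs from L1 only in the norm used: each element is divided by the square root of the sum of squares of its row. A plain-Scala sketch under the same assumptions as before (hypothetical `RowL2Sketch` object, dense rows, not the Scalding API):

```scala
object RowL2Sketch {
  // Divides every element by the L2 (Euclidean) norm of its row.
  // All-zero rows are returned unchanged to avoid dividing by zero (assumption).
  def rowL2Normalize(rows: Seq[Vector[Double]]): Seq[Vector[Double]] =
    rows.map { row =>
      val norm = math.sqrt(row.map(x => x * x).sum)
      if (norm == 0.0) row else row.map(_ / norm)
    }
}
```

For example, the row `3 4` has L2 norm 5 and is scaled to `0.6 0.8`.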
# matrix.rowMeanCentering : Matrix
Returns the matrix containing the row-mean centered elements of the original matrix
// Subtracts the row-wise mean from each element of the row
val matrixCentered = matrix.rowMeanCentering
# matrix.rowSizeAveStdev : Matrix
Computes the size, average and standard deviation of each row and returns a k x 3 matrix containing these statistics
// Computes the row stats
val matrixRowStats = matrix.rowSizeAveStdev
def rowColValSymbols : List[Symbol] = List(rowSym, colSym, valSym)
Column vector operations are similar to the row vector ones, where all function names are renamed from row to col and the return type is in general ColVector.
# matrix1 * ( matrix2 ) : Matrix
Computes the product of two matrices
// Computes the product of two matrices
val matrixProd = matrix1 * matrix2
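On a matrix stored as (row, column, value) triples, a product amounts to joining the left matrix's column index with the right matrix's row index and summing the partial products per output cell. The sketch below shows that idea in plain Scala. It is not Scalding's implementation: the `SparseProductSketch` object is hypothetical and everything runs in memory.

```scala
object SparseProductSketch {
  // Sparse matrix model (assumption): (row, column) -> non-zero value.
  type Sparse = Map[(Int, Int), Double]

  def multiply(a: Sparse, b: Sparse): Sparse = {
    // Join a's column index with b's row index and multiply the matching values.
    val partials = for {
      ((i, k), x)  <- a.toSeq
      ((k2, j), y) <- b.toSeq
      if k == k2
    } yield ((i, j), x * y)
    // Sum the partial products that land in the same output cell.
    partials.groupBy(_._1).map { case (pos, ps) => pos -> ps.map(_._2).sum }
  }
}
```

Only cells that receive at least one partial product appear in the result, so the output stays sparse.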
# matrix / scalar(LiteralScalar) : Matrix
Computes the element-wise division of a matrix by a scalar
// Computes the element-wise division of a matrix by 100
val matrixDiv = matrix / 100
# matrix1.elemWiseOp( matrix2 ){ function }
Computes the element-wise merging of two matrices
// Computes the element-wise division of a matrix by another
matrix1.elemWiseOp(matrix2)((x,y) => x/y)
# matrix1 + (matrix2) : Matrix
Computes the sum of two matrices
// Computes the sum of two matrices
val matrixSum = matrix1 + matrix2
# matrix1 - (matrix2) : Matrix
Computes the difference between two matrices
// Computes the difference between two matrices
val matrixDiff = matrix1 - matrix2
# matrix1.hProd( matrix2 ) : Matrix
Computes the element-wise product of two matrices
// Computes the element-wise product of two matrices
val matrixProd = matrix1.hProd( matrix2 )
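Sum, difference and element-wise product can all be seen as instances of one element-wise merge. The plain-Scala sketch below illustrates this with a hypothetical `ElemWiseSketch` object. It is not the Scalding API: missing entries are treated as zero, and dropping zero results to keep the output sparse is an assumption of this model.

```scala
object ElemWiseSketch {
  // Sparse matrix model (assumption): (row, column) -> non-zero value.
  type Sparse = Map[(Int, Int), Double]

  // Merges two matrices cell by cell; a missing cell counts as 0.0.
  def elemWiseOp(a: Sparse, b: Sparse)(f: (Double, Double) => Double): Sparse =
    (a.keySet ++ b.keySet).iterator
      .map(pos => pos -> f(a.getOrElse(pos, 0.0), b.getOrElse(pos, 0.0)))
      .filter { case (_, v) => v != 0.0 } // keep the result sparse
      .toMap

  def sum(a: Sparse, b: Sparse): Sparse   = elemWiseOp(a, b)(_ + _)
  def diff(a: Sparse, b: Sparse): Sparse  = elemWiseOp(a, b)(_ - _)
  def hProd(a: Sparse, b: Sparse): Sparse = elemWiseOp(a, b)(_ * _)
}
```

Element-wise division does not fit this model directly, because a cell missing from the second matrix would mean dividing by zero.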
# matrix1.zip( matrix2/row/column ) : Matrix
Merges the elements of the two matrices, creating a matrix whose values are pair tuples. Similarly, when zipping a matrix with a row or column vector, it zips the values of the vector across all of the rows/columns of the matrix
// Returns the matrix with elements of the type (0, elem2), (elem1, 0) and (elem1, elem2)
// matrix         = 0 1 1
//                  1 2 0
// rowVct         = 1 0 1
// matrixVctPairs = (0, 1) (1, 0) (1, 1)
//                  (1, 1) (2, 0) (0, 1)
val matrixPairs = matrix1.zip( matrix2 )
val matrixVctPairs = matrix.zip( rowVct )
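The pairing shown in the comment above can be reproduced with a plain-Scala sketch. The `ZipSketch` object is hypothetical and uses a dense model, so it also produces (0, 0) pairs, which a sparse representation would not store. It is meant to show the pairing only, not the Scalding API.

```scala
object ZipSketch {
  // Pairs every row of the matrix with the row vector, position by position.
  def zipWithRow(matrix: Seq[Vector[Int]], rowVct: Vector[Int]): Seq[Vector[(Int, Int)]] =
    matrix.map(row => row.zip(rowVct))

  // Pairs two matrices of the same shape, cell by cell.
  def zipMatrices(a: Seq[Vector[Int]], b: Seq[Vector[Int]]): Seq[Vector[(Int, Int)]] =
    a.zip(b).map { case (ra, rb) => ra.zip(rb) }
}
```

Calling `ZipSketch.zipWithRow` on the matrix and row vector from the comment yields the `matrixVctPairs` layout shown there.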
# matrix.nonZerosWith( Scalar )
Combines the scalar on the right with all non-zeros in this matrix
// Similar to zip, but combines the scalar on the right with all non-zeros in this matrix
val matrixWithScalar = matrix.nonZerosWith( 100 )
# matrix.trace : Scalar
Computes the trace of a matrix
// Computes the trace of a matrix
val trace = matrix.trace
# matrix.sum : Scalar
Computes the sum of the elements of a matrix
// Computes the sum of the elements of a matrix
val sum = matrix.sum
# matrix.transpose : Matrix
Computes the transpose of the matrix
// Computes the transpose of the matrix
val matrixTranspose = matrix.transpose
# matrix.diagonal : DiagonalMatrix
Returns the diagonal of the matrix
// Returns the diagonal of the matrix
val matrixDiag = matrix.diagonal
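Trace, sum, transpose and diagonal have simple definitions on a sparse (row, column) -> value model. The following plain-Scala sketch, with a hypothetical `StructureOpsSketch` object, spells them out. It is an in-memory illustration and not the Scalding implementation.

```scala
object StructureOpsSketch {
  // Sparse matrix model (assumption): (row, column) -> non-zero value.
  type Sparse = Map[(Int, Int), Double]

  // Sum of the entries on the main diagonal.
  def trace(m: Sparse): Double =
    m.collect { case ((r, c), v) if r == c => v }.sum

  // Sum of all stored entries.
  def sum(m: Sparse): Double = m.values.sum

  // Swaps the row and column index of every entry.
  def transpose(m: Sparse): Sparse =
    m.map { case ((r, c), v) => (c, r) -> v }

  // Keeps only the entries on the main diagonal.
  def diagonal(m: Sparse): Sparse =
    m.filter { case ((r, c), _) => r == c }
}
```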
# pipe.toMatrix( fields ) : Matrix
Constructs a matrix from a pipe.
// The matrix will contain all of the data from the pipe and will have the names of the row, column and value dimensions 'a, 'b and 'c.
val matrix = pipe.toMatrix('a, 'b, 'c)
# pipe.mapToMatrix( fields ) { mapping function } : Matrix
Constructs a matrix from a pipe by first applying a mapping function
// The new matrix contains the triples from the pipe with squared values
val matrix = pipe.mapToMatrix('a, 'b, 'c){ (x, y, z) => (x, y, z * z) }
# pipe.flatMapToMatrix( fields ) { mapping function } : Matrix
Constructs a matrix from a pipe by first applying a flat-mapping function, which may return zero or more triples per input tuple
// The new matrix contains the triples from the pipe with squared values
val matrix = pipe.flatMapToMatrix('a, 'b, 'c){ (x, y, z) => List((x, y, z * z)) }
# Matrix.readTSV( filename ) : Matrix
Loads a matrix from a TSV file that contains a triple per line
// The matrix contains the data from the TSV file
val newMatrix = Matrix.readTSV( "input" )
# matrix.pipe : RichPipe
Returns the pipe associated with the matrix. The pipe contains tuples of three elements that can be accessed using matrix.rowSym, matrix.colSym, matrix.valSym.
// The new pipe contains all of the triples data in matrix
val newPipe = matrix.pipe
# matrix.pipeAs( toFields ) : RichPipe
Returns the pipe associated with the matrix, with the row, column and value fields renamed to the given field names.
// The new pipe contains all of the triples data in matrix, with fields renamed to 'a, 'b, 'c
val newPipe = matrix.pipeAs('a, 'b, 'c)
# matrix.write( sink ) : Matrix
Writes to a sink and returns the matrix data for further processing.
// Writes the matrix to a TSV sink and returns the matrix for further processing
matrix.write( Tsv( "output" ) )
# matrix.fields : List[Symbol]
Returns the list of the fields of the pipe associated with the matrix: matrix.rowSym, matrix.colSym, matrix.valSym.
// The list contains the names of the row, column and value fields
val pipeFields = matrix.fields
# matrix.hasHint : SizeHint
Returns the size hint information associated with the matrix.
// Retrieves the size hint information of the matrix
val hint = matrix.hasHint
# matrix.withSizeHint : Matrix
Adds a SizeHint to the matrix
// The new matrix has a new SizeHint
val newMatrix = matrix.withSizeHint( 4000, 4000 )