Commit e108b9a

mengxr authored and rxin committed
[SPARK-1260]: faster construction of features with intercept
The current implementation uses `Array(1.0, features: _*)` to construct a new array with intercept. This is not efficient for big arrays because `Array.apply` uses a for loop that iterates over the arguments. `Array.+:` is a better choice here.

Also, I don't see a reason to set initial weights to ones. So I set them to zeros.

JIRA: https://spark-project.atlassian.net/browse/SPARK-1260

Author: Xiangrui Meng <meng@databricks.com>

Closes #161 from mengxr/sgd and squashes the following commits:

b5cfc53 [Xiangrui Meng] set default weights to zeros
a1439c2 [Xiangrui Meng] faster construction of features with intercept
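
For readers unfamiliar with the two constructions named in the message, here is a minimal sketch showing that they build the same array. The object and method names (`PrependIntercept`, `prependWithApply`, `prependWithOp`) are hypothetical and exist only for illustration. The sketch makes no performance measurement; the efficiency claim is the commit author's.

```scala
object PrependIntercept {
  // Old form from the diff: expand the array as varargs into Array.apply.
  def prependWithApply(features: Array[Double]): Array[Double] =
    Array(1.0, features: _*)

  // New form from the diff: the prepend operator on arrays.
  // `1.0 +: features` is the infix spelling of `features.+:(1.0)`.
  def prependWithOp(features: Array[Double]): Array[Double] =
    1.0 +: features

  def main(args: Array[String]): Unit = {
    val features = Array(0.5, -2.0, 3.0)
    val viaApply = prependWithApply(features)
    val viaOp = prependWithOp(features)

    // Both results hold 1.0 followed by the original features.
    println(viaApply.mkString(", "))
    println(viaOp.mkString(", "))
    println(viaApply.sameElements(viaOp))
  }
}
```

Because `+:` ends in a colon, Scala treats it as right-associative, which is why the array appears on the right in infix form while the diff calls it as a method on the array.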
1 parent 79e547f commit e108b9a

1 file changed: +4 -4 lines changed


mllib/src/main/scala/org/apache/spark/mllib/regression/GeneralizedLinearAlgorithm.scala

Lines changed: 4 additions & 4 deletions
@@ -119,7 +119,7 @@ abstract class GeneralizedLinearAlgorithm[M <: GeneralizedLinearModel]
    */
   def run(input: RDD[LabeledPoint]) : M = {
     val nfeatures: Int = input.first().features.length
-    val initialWeights = Array.fill(nfeatures)(1.0)
+    val initialWeights = new Array[Double](nfeatures)
     run(input, initialWeights)
   }
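
As a small illustration of the change in this hunk, the sketch below contrasts the two initializers. It relies on the JVM guarantee that a newly allocated `Array[Double]` is filled with `0.0`. The names `InitialWeights`, `onesWeights`, and `zeroWeights` are hypothetical.

```scala
object InitialWeights {
  // Removed line: every weight starts at 1.0.
  def onesWeights(n: Int): Array[Double] = Array.fill(n)(1.0)

  // Added line: a fresh primitive double array is zero-initialized by the JVM.
  def zeroWeights(n: Int): Array[Double] = new Array[Double](n)

  def main(args: Array[String]): Unit = {
    val n = 4
    println(onesWeights(n).mkString(", "))
    println(zeroWeights(n).mkString(", "))
  }
}
```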

@@ -134,15 +134,15 @@ abstract class GeneralizedLinearAlgorithm[M <: GeneralizedLinearModel]
       throw new SparkException("Input validation failed.")
     }

-    // Add a extra variable consisting of all 1.0's for the intercept.
+    // Prepend an extra variable consisting of all 1.0's for the intercept.
     val data = if (addIntercept) {
-      input.map(labeledPoint => (labeledPoint.label, Array(1.0, labeledPoint.features:_*)))
+      input.map(labeledPoint => (labeledPoint.label, labeledPoint.features.+:(1.0)))
     } else {
       input.map(labeledPoint => (labeledPoint.label, labeledPoint.features))
     }

     val initialWeightsWithIntercept = if (addIntercept) {
-      Array(1.0, initialWeights:_*)
+      initialWeights.+:(1.0)
     } else {
       initialWeights
     }
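
The hunk above prepends one entry to each feature vector and one entry to the weight vector. One way to read this, sketched below under that assumption, is that a constant `1.0` feature lets the first weight act as the intercept inside an ordinary dot product. The helper `dot` and the object `InterceptDot` are hypothetical and are not taken from the Spark source.

```scala
object InterceptDot {
  // Plain dot product over two arrays of equal length.
  def dot(w: Array[Double], x: Array[Double]): Double = {
    require(w.length == x.length, "dimension mismatch")
    var sum = 0.0
    var i = 0
    while (i < w.length) {
      sum += w(i) * x(i)
      i += 1
    }
    sum
  }

  // Prediction with the intercept kept as a separate term.
  def predictExplicit(intercept: Double, w: Array[Double], x: Array[Double]): Double =
    intercept + dot(w, x)

  // Prediction with the intercept folded in: prepend it to the weights
  // and prepend a constant 1.0 to the features, as the diff does.
  def predictAugmented(intercept: Double, w: Array[Double], x: Array[Double]): Double =
    dot(intercept +: w, 1.0 +: x)

  def main(args: Array[String]): Unit = {
    val features = Array(2.0, 3.0)
    val weights = Array(0.5, -1.0)
    val intercept = 4.0
    println(predictExplicit(intercept, weights, features))
    println(predictAugmented(intercept, weights, features))
  }
}
```

Note that in the diff the value prepended to `initialWeights` is still `1.0`, so only the feature weights default to zero; the starting value in the intercept slot is unchanged by this commit.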
