forked from apache/spark
[SPARK-3936] Add aggregateMessages, which supersedes mapReduceTriplets
aggregateMessages enables neighborhood computation similarly to mapReduceTriplets, but it introduces two API improvements:

1. Messages are sent using an imperative interface based on EdgeContext rather than by returning an iterator of messages.
2. Rather than attempting bytecode inspection, the required triplet fields must be explicitly specified by the user by passing a TripletFields object. This fixes SPARK-3936.

Additionally, this PR includes the following optimizations for aggregateMessages and EdgePartition:

1. EdgePartition now stores local vertex ids instead of global ids. This avoids hash lookups when looking up vertex attributes and aggregating messages.
2. Internal iterators in aggregateMessages are inlined into a while loop.

In total, these optimizations were tested to provide a 37% speedup on PageRank (uk-2007-05 graph, 10 iterations, 16 r3.2xlarge machines, sped up from 513 s to 322 s).

Subsumes apache#2815. Also fixes SPARK-4173.

Author: Ankur Dave <ankurdave@gmail.com>

Closes apache#3100 from ankurdave/aggregateMessages and squashes the following commits:

f5b65d0 [Ankur Dave] Address @rxin comments on apache#3054 and apache#3100
1e80aca [Ankur Dave] Add aggregateMessages, which supersedes mapReduceTriplets
194a2df [Ankur Dave] Test triplet iterator in EdgePartition serialization test
e0f8ecc [Ankur Dave] Take activeSet in ExistingEdgePartitionBuilder
c85076d [Ankur Dave] Readability improvements
b567be2 [Ankur Dave] iter.foreach -> while loop
4a566dc [Ankur Dave] Optimizations for mapReduceTriplets and EdgePartition
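The two optimizations named in the commit message (local vertex ids instead of global ids, and a while loop instead of an iterator) can be illustrated with a small, dependency-free sketch. This is not the EdgePartition code from this commit: `LocalIdSketch`, `LocalEdges`, `build`, and `inDegrees` are hypothetical names chosen for illustration. It assumes only that each distinct global vertex id is mapped once to a dense local index, so that per-edge aggregation becomes plain array indexing.

```scala
// Hypothetical sketch: names here are illustrative, not GraphX internals.
object LocalIdSketch {
  type VertexId = Long

  /** Edges stored by dense local index; `local2global(i)` recovers the global id. */
  final case class LocalEdges(
      localSrcIds: Array[Int],
      localDstIds: Array[Int],
      local2global: Array[VertexId])

  /** Assigns each distinct global id a local index in first-seen order. */
  def build(edgeSeq: Seq[(VertexId, VertexId)]): LocalEdges = {
    val edges = edgeSeq.toIndexedSeq
    // Hash lookups happen once per edge endpoint here, at build time only.
    val global2local = scala.collection.mutable.LinkedHashMap.empty[VertexId, Int]
    def localId(g: VertexId): Int = global2local.getOrElseUpdate(g, global2local.size)
    val src = new Array[Int](edges.length)
    val dst = new Array[Int](edges.length)
    var i = 0
    while (i < edges.length) {
      src(i) = localId(edges(i)._1)
      dst(i) = localId(edges(i)._2)
      i += 1
    }
    LocalEdges(src, dst, global2local.keysIterator.toArray)
  }

  /** Counts in-degrees using array indexing in a while loop; no hash lookup per edge. */
  def inDegrees(p: LocalEdges): Map[VertexId, Int] = {
    val agg = new Array[Int](p.local2global.length)
    var i = 0
    while (i < p.localDstIds.length) {
      agg(p.localDstIds(i)) += 1
      i += 1
    }
    // Translate back to global ids only for vertices that received a message.
    p.local2global.indices.collect {
      case l if agg(l) > 0 => p.local2global(l) -> agg(l)
    }.toMap
  }
}
```

In this sketch the hash map is consulted only while building the partition. The aggregation loop touches nothing but arrays, which is the kind of saving the commit message attributes to storing local ids.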
Showing 15 changed files with 766 additions and 376 deletions.
51 changes: 51 additions & 0 deletions
graphx/src/main/scala/org/apache/spark/graphx/EdgeContext.scala
@@ -0,0 +1,51 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.graphx

/**
 * Represents an edge along with its neighboring vertices and allows sending messages along the
 * edge. Used in [[Graph#aggregateMessages]].
 */
abstract class EdgeContext[VD, ED, A] {
  /** The vertex id of the edge's source vertex. */
  def srcId: VertexId
  /** The vertex id of the edge's destination vertex. */
  def dstId: VertexId
  /** The vertex attribute of the edge's source vertex. */
  def srcAttr: VD
  /** The vertex attribute of the edge's destination vertex. */
  def dstAttr: VD
  /** The attribute associated with the edge. */
  def attr: ED

  /** Sends a message to the source vertex. */
  def sendToSrc(msg: A): Unit
  /** Sends a message to the destination vertex. */
  def sendToDst(msg: A): Unit

  /** Converts the edge and vertex properties into an [[EdgeTriplet]] for convenience. */
  def toEdgeTriplet: EdgeTriplet[VD, ED] = {
    val et = new EdgeTriplet[VD, ED]
    et.srcId = srcId
    et.srcAttr = srcAttr
    et.dstId = dstId
    et.dstAttr = dstAttr
    et.attr = attr
    et
  }
}
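To make the imperative style of `EdgeContext` concrete, here is a minimal single-machine sketch of how a send function and a merge function might interact. It is an illustration under stated assumptions, not the implementation added by this commit. `EdgeContextSketch` and `aggregateMessagesLocal` are hypothetical names, the `EdgeContext` inside it is a trimmed stand-in for the class in the diff (without `toEdgeTriplet`), and the `TripletFields` argument described in the commit message is left out entirely.

```scala
// Hypothetical, dependency-free model of the EdgeContext interface shown above.
object EdgeContextSketch {
  type VertexId = Long

  // Trimmed stand-in for the EdgeContext class in the diff.
  abstract class EdgeContext[VD, ED, A] {
    def srcId: VertexId
    def dstId: VertexId
    def srcAttr: VD
    def dstAttr: VD
    def attr: ED
    def sendToSrc(msg: A): Unit
    def sendToDst(msg: A): Unit
  }

  /**
   * Illustrative local driver (an assumption, not the GraphX method): calls
   * `sendMsg` once per edge and combines messages that arrive at the same
   * vertex with `mergeMsg`. Vertices that receive no message are absent
   * from the result.
   */
  def aggregateMessagesLocal[VD, ED, A](
      vertices: Map[VertexId, VD],
      edges: Seq[(VertexId, VertexId, ED)])(
      sendMsg: EdgeContext[VD, ED, A] => Unit,
      mergeMsg: (A, A) => A): Map[VertexId, A] = {
    val acc = scala.collection.mutable.Map.empty[VertexId, A]
    def deliver(id: VertexId, msg: A): Unit =
      acc.update(id, acc.get(id).map(mergeMsg(_, msg)).getOrElse(msg))

    edges.foreach { case (s, d, e) =>
      sendMsg(new EdgeContext[VD, ED, A] {
        val srcId: VertexId = s
        val dstId: VertexId = d
        def srcAttr: VD = vertices(s)
        def dstAttr: VD = vertices(d)
        val attr: ED = e
        def sendToSrc(msg: A): Unit = deliver(s, msg)
        def sendToDst(msg: A): Unit = deliver(d, msg)
      })
    }
    acc.toMap
  }
}
```

A usage example under the same assumptions, summing the edge attributes that arrive at each destination vertex:

```scala
val vertices = Map(1L -> "a", 2L -> "b", 3L -> "c")
val edges = Seq((1L, 2L, 5), (3L, 2L, 7), (2L, 1L, 1))
val sums = EdgeContextSketch.aggregateMessagesLocal[String, Int, Int](vertices, edges)(
  ctx => ctx.sendToDst(ctx.attr), // send zero, one, or several messages per edge
  _ + _)
```

Because the send function returns `Unit` and calls `sendToSrc` or `sendToDst` directly, it does not need to allocate an iterator of messages for each edge, which is the first API improvement listed in the commit message.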