
Commit 739a333

shiv4nsh authored and srowen committed
[SPARK-16911] Fix the links in the programming guide
## What changes were proposed in this pull request?

Fix the broken links in the programming guide to the GraphX migration and understanding-closures sections.

## How was this patch tested?

By running the test cases and checking the links.

Author: Shivansh <shiv4nsh@gmail.com>

Closes #14503 from shiv4nsh/SPARK-16911.

(cherry picked from commit 6c1ecb1)
Signed-off-by: Sean Owen <sowen@cloudera.com>
1 parent 3f8a95b commit 739a333


3 files changed: +1 -106 lines changed


docs/graphx-programming-guide.md

Lines changed: 0 additions & 17 deletions
@@ -67,23 +67,6 @@ operators (e.g., [subgraph](#structural_operators), [joinVertices](#join_operato
[aggregateMessages](#aggregateMessages)) as well as an optimized variant of the [Pregel](#pregel) API. In addition, GraphX includes a growing collection of graph [algorithms](#graph_algorithms) and
[builders](#graph_builders) to simplify graph analytics tasks.

-
-## Migrating from Spark 1.1
-
-GraphX in Spark 1.2 contains a few user facing API changes:
-
-1. To improve performance we have introduced a new version of
-[`mapReduceTriplets`][Graph.mapReduceTriplets] called
-[`aggregateMessages`][Graph.aggregateMessages] which takes the messages previously returned from
-[`mapReduceTriplets`][Graph.mapReduceTriplets] through a callback ([`EdgeContext`][EdgeContext])
-rather than by return value.
-We are deprecating [`mapReduceTriplets`][Graph.mapReduceTriplets] and encourage users to consult
-the [transition guide](#mrTripletsTransition).
-
-2. In Spark 1.0 and 1.1, the type signature of [`EdgeRDD`][EdgeRDD] switched from
-`EdgeRDD[ED]` to `EdgeRDD[ED, VD]` to enable some caching optimizations. We have since discovered
-a more elegant solution and have restored the type signature to the more natural `EdgeRDD[ED]` type.
-
# Getting Started

To get started you first need to import Spark and GraphX into your project, as follows:
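For reference, the removed migration notes above describe the `aggregateMessages` API, which delivers messages through an `EdgeContext` callback rather than by return value. A minimal sketch of that pattern (illustrative only, not part of this diff; it assumes a running `SparkContext` named `sc`):

```scala
import org.apache.spark.graphx.{Graph, VertexRDD}
import org.apache.spark.graphx.util.GraphGenerators

// Build a small random graph to operate on (the graph shape is arbitrary here).
val graph: Graph[Double, Int] =
  GraphGenerators.logNormalGraph(sc, numVertices = 100).mapVertices((id, _) => id.toDouble)

// aggregateMessages sends messages via the EdgeContext callback (sendToSrc/sendToDst)
// instead of returning them, as the deprecated mapReduceTriplets did.
val inDegrees: VertexRDD[Int] = graph.aggregateMessages[Int](
  ctx => ctx.sendToDst(1),  // sendMsg: invoked once per edge
  (a, b) => a + b           // mergeMsg: combine messages arriving at a vertex
)

inDegrees.take(5).foreach(println)
```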

docs/programming-guide.md

Lines changed: 1 addition & 44 deletions
@@ -1097,7 +1097,7 @@ for details.
<tr>
<td> <b>foreach</b>(<i>func</i>) </td>
<td> Run a function <i>func</i> on each element of the dataset. This is usually done for side effects such as updating an <a href="#accumulators">Accumulator</a> or interacting with external storage systems.
-<br /><b>Note</b>: modifying variables other than Accumulators outside of the <code>foreach()</code> may result in undefined behavior. See <a href="#ClosuresLink">Understanding closures </a> for more details.</td>
+<br /><b>Note</b>: modifying variables other than Accumulators outside of the <code>foreach()</code> may result in undefined behavior. See <a href="#understanding-closures-a-nameclosureslinka">Understanding closures </a> for more details.</td>
</tr>
</table>

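The note whose link is fixed above concerns side effects in `foreach()`; a minimal sketch of the accumulator pattern it points to (illustrative only; names and the local master are assumptions):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("AccumulatorSketch"))

// A driver-side accumulator is the supported way to aggregate a value from foreach();
// mutating a captured local variable instead has undefined behavior, because each
// executor works on its own copy of the closure's variables.
val sum = sc.longAccumulator("sum")
sc.parallelize(1 to 10).foreach(x => sum.add(x))  // updates are merged back on the driver
println(sum.value)                                // 55

sc.stop()
```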
@@ -1544,49 +1544,6 @@ and then call `SparkContext.stop()` to tear it down.
Make sure you stop the context within a `finally` block or the test framework's `tearDown` method,
as Spark does not support two contexts running concurrently in the same program.

-# Migrating from pre-1.0 Versions of Spark
-
-<div class="codetabs">
-
-<div data-lang="scala" markdown="1">
-
-Spark 1.0 freezes the API of Spark Core for the 1.X series, in that any API available today that is
-not marked "experimental" or "developer API" will be supported in future versions.
-The only change for Scala users is that the grouping operations, e.g. `groupByKey`, `cogroup` and `join`,
-have changed from returning `(Key, Seq[Value])` pairs to `(Key, Iterable[Value])`.
-
-</div>
-
-<div data-lang="java" markdown="1">
-
-Spark 1.0 freezes the API of Spark Core for the 1.X series, in that any API available today that is
-not marked "experimental" or "developer API" will be supported in future versions.
-Several changes were made to the Java API:
-
-* The Function classes in `org.apache.spark.api.java.function` became interfaces in 1.0, meaning that old
-code that `extends Function` should `implement Function` instead.
-* New variants of the `map` transformations, like `mapToPair` and `mapToDouble`, were added to create RDDs
-of special data types.
-* Grouping operations like `groupByKey`, `cogroup` and `join` have changed from returning
-`(Key, List<Value>)` pairs to `(Key, Iterable<Value>)`.
-
-</div>
-
-<div data-lang="python" markdown="1">
-
-Spark 1.0 freezes the API of Spark Core for the 1.X series, in that any API available today that is
-not marked "experimental" or "developer API" will be supported in future versions.
-The only change for Python users is that the grouping operations, e.g. `groupByKey`, `cogroup` and `join`,
-have changed from returning (key, list of values) pairs to (key, iterable of values).
-
-</div>
-
-</div>
-
-Migration guides are also available for [Spark Streaming](streaming-programming-guide.html#migration-guide-from-091-or-below-to-1x),
-[MLlib](ml-guide.html#migration-guide) and [GraphX](graphx-programming-guide.html#migrating-from-spark-091).
-
-
# Where to Go from Here

You can see some [example Spark programs](http://spark.apache.org/examples.html) on the Spark website.
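The removed Scala migration note above mentions that grouping operations return `(Key, Iterable[Value])` since Spark 1.0. A small illustrative sketch of that return type (assumes a `SparkContext` named `sc`; the sample data is made up):

```scala
import org.apache.spark.rdd.RDD

val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

// Since 1.0, groupByKey yields an Iterable[Value] per key rather than a Seq[Value].
val grouped: RDD[(String, Iterable[Int])] = pairs.groupByKey()

grouped.mapValues(_.sum).collect().foreach(println)  // prints (a,4) and (b,2)
```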

docs/streaming-programming-guide.md

Lines changed: 0 additions & 45 deletions
@@ -2378,51 +2378,6 @@ additional effort may be necessary to achieve exactly-once semantics. There are
***************************************************************************************************
***************************************************************************************************

-# Migration Guide from 0.9.1 or below to 1.x
-Between Spark 0.9.1 and Spark 1.0, there were a few API changes made to ensure future API stability.
-This section elaborates the steps required to migrate your existing code to 1.0.
-
-**Input DStreams**: All operations that create an input stream (e.g., `StreamingContext.socketStream`, `FlumeUtils.createStream`, etc.) now returns
-[InputDStream](api/scala/index.html#org.apache.spark.streaming.dstream.InputDStream) /
-[ReceiverInputDStream](api/scala/index.html#org.apache.spark.streaming.dstream.ReceiverInputDStream)
-(instead of DStream) for Scala, and [JavaInputDStream](api/java/index.html?org/apache/spark/streaming/api/java/JavaInputDStream.html) /
-[JavaPairInputDStream](api/java/index.html?org/apache/spark/streaming/api/java/JavaPairInputDStream.html) /
-[JavaReceiverInputDStream](api/java/index.html?org/apache/spark/streaming/api/java/JavaReceiverInputDStream.html) /
-[JavaPairReceiverInputDStream](api/java/index.html?org/apache/spark/streaming/api/java/JavaPairReceiverInputDStream.html)
-(instead of JavaDStream) for Java. This ensures that functionality specific to input streams can
-be added to these classes in the future without breaking binary compatibility.
-Note that your existing Spark Streaming applications should not require any change
-(as these new classes are subclasses of DStream/JavaDStream) but may require recompilation with Spark 1.0.
-
-**Custom Network Receivers**: Since the release to Spark Streaming, custom network receivers could be defined
-in Scala using the class NetworkReceiver. However, the API was limited in terms of error handling
-and reporting, and could not be used from Java. Starting Spark 1.0, this class has been
-replaced by [Receiver](api/scala/index.html#org.apache.spark.streaming.receiver.Receiver) which has
-the following advantages.
-
-* Methods like `stop` and `restart` have been added to for better control of the lifecycle of a receiver. See
-the [custom receiver guide](streaming-custom-receivers.html) for more details.
-* Custom receivers can be implemented using both Scala and Java.
-
-To migrate your existing custom receivers from the earlier NetworkReceiver to the new Receiver, you have
-to do the following.
-
-* Make your custom receiver class extend
-[`org.apache.spark.streaming.receiver.Receiver`](api/scala/index.html#org.apache.spark.streaming.receiver.Receiver)
-instead of `org.apache.spark.streaming.dstream.NetworkReceiver`.
-* Earlier, a BlockGenerator object had to be created by the custom receiver, to which received data was
-added for being stored in Spark. It had to be explicitly started and stopped from `onStart()` and `onStop()`
-methods. The new Receiver class makes this unnecessary as it adds a set of methods named `store(<data>)`
-that can be called to store the data in Spark. So, to migrate your custom network receiver, remove any
-BlockGenerator object (does not exist any more in Spark 1.0 anyway), and use `store(...)` methods on
-received data.
-
-**Actor-based Receivers**: The Actor-based Receiver APIs have been moved to [DStream Akka](https://github.com/spark-packages/dstream-akka).
-Please refer to the project for more details.
-
-***************************************************************************************************
-***************************************************************************************************
-
# Where to Go from Here
* Additional guides
- [Kafka Integration Guide](streaming-kafka-integration.html)
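The removed receiver migration notes above describe the `Receiver` class and its `store(...)` methods. A minimal sketch of a custom receiver under that API (the class name and the dummy data source are placeholders, not taken from the guide):

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// A toy receiver: it extends Receiver and hands data to Spark with store(...),
// so there is no BlockGenerator to create, start, or stop.
class DummyReceiver extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  def onStart(): Unit = {
    // Receive on a separate thread so onStart() returns immediately.
    new Thread("Dummy Receiver") {
      override def run(): Unit = {
        while (!isStopped()) {
          store("record at " + System.currentTimeMillis())  // replaces explicit block handling
          Thread.sleep(1000)
        }
      }
    }.start()
  }

  def onStop(): Unit = {
    // Nothing to clean up: the receiving thread checks isStopped() and exits on its own.
  }
}
```

Such a receiver would be attached to a stream with `ssc.receiverStream(new DummyReceiver)` on an existing `StreamingContext`.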
