Remove WriteGraph; resolves #439. #441

Merged: 11 commits, Apr 14, 2020
224 changes: 0 additions & 224 deletions src/main/scala/io/archivesunleashed/app/WriteGraph.scala

This file was deleted.

17 changes: 13 additions & 4 deletions src/main/scala/io/archivesunleashed/app/WriteGraphML.scala
@@ -25,9 +25,10 @@ import org.apache.spark.sql.Row
  */
  object WriteGraphML {

- /** Writes graph nodes and edges to file.
+ /** Writes graph nodes and edges to file (rdd).
  *
- * @param rdd RDD of elements in format ((datestring, source, target), count)
+ * @param rdd RDD of elements in format ((CrawlDate, SourceDomain,
+ * DestinationDomain), Frequency)
  * @param graphmlPath output file
  */
  def apply(rdd: RDD[((String, String, String), Int)], graphmlPath: String): Boolean = {
@@ -38,6 +39,12 @@ object WriteGraphML {
  }
  }

+ /** Writes graph nodes and edges to file (df).
+ *
+ * @param ds Array[Row] elements in format (CrawlDate, SrcDomain,
+ * DestDomain, count)
+ * @param graphmlPath output file
+ */
  def apply(ds: Array[Row], graphmlPath: String): Boolean = {
  if (graphmlPath.isEmpty()) {
  false
@@ -48,7 +55,8 @@ object WriteGraphML {

  /** Produces the GraphML output from an RDD of tuples and outputs it to graphmlPath.
  *
- * @param rdd RDD of elements in format ((datestring, source, target), count)
+ * @param rdd RDD of elements in format ((CrawlDate, SourceDomain,
+ * DestinationDomain), Frequency)
  * @param graphmlPath output file
  * @return true on successful run.
  */
@@ -84,7 +92,8 @@ object WriteGraphML {

  /** Produces the GraphML output from an Array[Row] and outputs it to graphmlPath.
  *
- * @param data a Dataset[Row] of elements in format (datestring, source, target, count)
+ * @param data a Dataset[Row] of elements in format (CrawlDate, SrcDomain,
+ * DestDomain, count)
  * @param graphmlPath output file
  * @return true on success.
  */
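
For readers unfamiliar with the element format named in the updated doc comments, here is a minimal sketch of how ((CrawlDate, SourceDomain, DestinationDomain), Frequency) tuples could map onto GraphML nodes and edges. It is not the project's WriteGraphML implementation: the object `GraphMLSketch`, the helpers `escape`, `toGraphML` and `write`, and the `weight`/`crawlDate` key names are all hypothetical, and a plain `Seq` stands in for the Spark `RDD` so the example runs without Spark. Only the empty-path guard returning `false` is taken from the diff above.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Hypothetical sketch, NOT the project's WriteGraphML code.
object GraphMLSketch {
  // Same element shape as the documented RDD:
  // ((CrawlDate, SourceDomain, DestinationDomain), Frequency)
  type Edge = ((String, String, String), Int)

  // Escape characters that are special in XML text and attribute values.
  def escape(s: String): String =
    s.replace("&", "&amp;").replace("<", "&lt;")
      .replace(">", "&gt;").replace("\"", "&quot;")

  // Build a GraphML document: one node per distinct domain, one edge per tuple.
  def toGraphML(edges: Seq[Edge]): String = {
    val nodes = edges.flatMap { case ((_, src, dst), _) => Seq(src, dst) }.distinct
    val nodeXml = nodes.map(n => s"""    <node id="${escape(n)}"/>""")
    val edgeXml = edges.map { case ((date, src, dst), freq) =>
      Seq(
        s"""    <edge source="${escape(src)}" target="${escape(dst)}">""",
        s"""      <data key="weight">$freq</data>""",
        s"""      <data key="crawlDate">${escape(date)}</data>""",
        "    </edge>"
      ).mkString("\n")
    }
    val header = Seq(
      """<?xml version="1.0" encoding="UTF-8"?>""",
      """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">""",
      """  <key id="weight" for="edge" attr.name="weight" attr.type="int"/>""",
      """  <key id="crawlDate" for="edge" attr.name="crawlDate" attr.type="string"/>""",
      """  <graph id="G" edgedefault="directed">"""
    )
    val footer = Seq("  </graph>", "</graphml>")
    (header ++ nodeXml ++ edgeXml ++ footer).mkString("\n")
  }

  // Mirrors the guard shown in the diff: an empty output path yields false.
  def write(edges: Seq[Edge], graphmlPath: String): Boolean =
    if (graphmlPath.isEmpty) {
      false
    } else {
      Files.write(Paths.get(graphmlPath), toGraphML(edges).getBytes(StandardCharsets.UTF_8))
      true
    }

  def main(args: Array[String]): Unit = {
    val sample: Seq[Edge] = Seq((("20200414", "example.com", "example.org"), 3))
    println(toGraphML(sample))
  }
}
```

With an actual `RDD`, the same mapping would presumably run after collecting the elements to the driver; that step is an assumption and is left out here.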