NikolaAndro/PageRank_Hadoop_Spark

PageRank_Hadoop_Spark

Description:

  • This program is written in Scala.

  • Hadoop (HDFS) is used as the storage layer for Spark.

  • The commands.scala file collects all commands so you can copy and paste them into the command prompt at once.

  • The PageRank_complete.scala file contains the problem description and the commented code.

Instructions:


Hadoop Commands (copied from: https://github.com/shudipdatta/Spark_Demo/edit/master/commands.txt)


hadoop fs -mkdir InputFolder  # create the input folder in HDFS

hadoop fs -copyFromLocal '/path/to/your/inputFolder/PR_Data-1copy.txt' InputFolder  # copy PR_Data-1copy.txt into the HDFS InputFolder


Spark Commands


  • Before running the code, change the input path at the beginning of the program so that it points to your input file in HDFS.
  • Execute all commands in the given order.
  • The result is printed to the console.
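
For orientation, here is a minimal sketch of what an iterative PageRank in Spark can look like. It is not the code from PageRank_complete.scala. It assumes that each input line holds one edge as two whitespace-separated page ids ("source target"), that it is pasted into spark-shell (use :paste for multi-line input), and that Spark's default filesystem is the HDFS instance used above. The names pageRank, parseEdges, and printRanks, the 10 iterations, and the damping factor 0.85 are illustrative choices, not values taken from the repository.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

// In spark-shell, getOrCreate() returns the session the shell already started.
val spark = SparkSession.builder.appName("PageRankSketch").getOrCreate()
val sc = spark.sparkContext

// Assumed input format: "source target" per line; malformed lines are skipped.
def parseEdges(lines: RDD[String]): RDD[(String, String)] = {
  lines.flatMap { line =>
    line.trim.split("\\s+") match {
      case Array(src, dst, _*) if src.nonEmpty => Some((src, dst))
      case _                                   => None
    }
  }
}

// Simplified PageRank: every page with outgoing links starts at rank 1.0 and
// splits its rank evenly among its targets in each iteration.
def pageRank(edges: RDD[(String, String)],
             iterations: Int = 10,
             damping: Double = 0.85): RDD[(String, Double)] = {
  // Adjacency lists are reused in every iteration, so keep them cached.
  val links = edges.distinct().groupByKey().cache()
  var ranks = links.mapValues(_ => 1.0)

  for (_ <- 1 to iterations) {
    val contributions = links.join(ranks).values.flatMap {
      case (targets, rank) => targets.map(target => (target, rank / targets.size))
    }
    ranks = contributions
      .reduceByKey(_ + _)
      .mapValues(sum => (1 - damping) + damping * sum)
  }
  ranks
}

// Reads the edge list from the given path and prints pages by descending rank.
def printRanks(inputPath: String): Unit = {
  val ranks = pageRank(parseEdges(sc.textFile(inputPath)))
  ranks.sortBy(_._2, ascending = false).collect().foreach {
    case (page, rank) => println(s"$page\t$rank")
  }
}
```

With the file uploaded as described in the Hadoop commands, calling printRanks("InputFolder/PR_Data-1copy.txt") would print the ranks to the console. Note that in this simplified variant a page with no incoming links receives no contributions and therefore drops out of the result after the first iteration.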

About

Uses Spark to run the PageRank algorithm on large datasets.
