DistributedSystemsGroup/Spark-TPC-DS

Spark-TPC-DS

Spark job for the TPC-DS benchmark.

This code uses the spark-sql-perf library from Databricks: https://github.com/databricks/spark-sql-perf

To compile, put the jar built from the above library in lib/ and then run build/sbt assembly.

To execute, the following arguments must be provided, in this order:

  1. HDFS data location ("/user/test/tpcds-data")
  2. scale factor (10)
  3. HDFS result location ("/user/test/tpcds-results")
  4. number of iterations
  5. queries to execute, one of:
     • impalakit
     • interactive
     • reporting
     • deepAnalytics
     • simple
  6. dsdgenDir
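
As a rough illustration of how the six arguments fit together, the sketch below checks them and prints a spark-submit command line without running it. It assumes the arguments are passed positionally in the order listed above. The jar path, the main class name, the iteration count (3), and the dsdgen directory are hypothetical placeholders, not values taken from this repository; substitute the ones from your own build.

```shell
#!/bin/sh
# Dry-run sketch: validates the six arguments and prints the spark-submit
# command instead of executing it.

# Hypothetical placeholders; neither value comes from this README.
JAR="path/to/assembled.jar"
MAIN_CLASS="your.main.Class"

build_command() {
    if [ "$#" -ne 6 ]; then
        echo "expected 6 arguments, got $#" >&2
        return 1
    fi
    data_dir=$1      # 1. HDFS data location
    scale=$2         # 2. scale factor
    result_dir=$3    # 3. HDFS result location
    iterations=$4    # 4. number of iterations
    queries=$5       # 5. queries to execute
    dsdgen_dir=$6    # 6. dsdgenDir

    # Accept only the names listed in the README.
    case "$queries" in
        impalakit|interactive|reporting|deepAnalytics|simple) ;;
        *) echo "unknown queries name: $queries" >&2; return 1 ;;
    esac

    echo "spark-submit --class $MAIN_CLASS $JAR" \
         "$data_dir $scale $result_dir $iterations $queries $dsdgen_dir"
}

# Example with the README's sample values; 3 iterations and the dsdgen
# directory are illustrative assumptions.
build_command /user/test/tpcds-data 10 /user/test/tpcds-results 3 \
    interactive /opt/tpcds-kit/tools
```

Printing the command rather than running it makes it easy to inspect the argument order before submitting a real job; any cluster options (master, memory, and so on) would go before the jar path.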
