A web-based platform for comparing Apache big data processing frameworks
Hadoop: HadoopFiles/README.md
Storm: StormFiles/StormSetup.md
Samza: SamzaFiles/SamzaSetup.md
Spark: SparkFiles/README.md
Flink: FlinkFiles/README.md
The `exec_api` folder contains a Node.js application that is deployed on each server machine and calls the appropriate shell scripts when invoked.
- Install Node.js on the server.
- Run `npm install` in the `exec_api` folder.
- Run `npm start`; the API will run on port 3000.
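
To illustrate the idea of an API that calls a shell file per framework, here is a minimal sketch. It is not the code in `exec_api`: the script names (`run_hadoop.sh` and so on), the `/run/<framework>` route, and the helper names `resolveScript` and `createApi` are all assumptions made for the example.

```javascript
// Minimal sketch of an "exec API": map a framework name to a shell script
// and run it on request. Script names and the route shape are hypothetical.
const http = require('http');
const path = require('path');
const { execFile } = require('child_process');

// Hypothetical allow-list: framework name -> script file (names assumed).
const SCRIPTS = {
  hadoop: 'run_hadoop.sh',
  storm: 'run_storm.sh',
  samza: 'run_samza.sh',
  spark: 'run_spark.sh',
  flink: 'run_flink.sh',
};

// Return the script path for a framework, or null if it is not allow-listed.
function resolveScript(framework, baseDir = process.cwd()) {
  const key = String(framework).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(SCRIPTS, key)) return null;
  return path.join(baseDir, SCRIPTS[key]);
}

// Build (but do not start) an HTTP server that handles GET /run/<framework>.
function createApi(baseDir = process.cwd()) {
  return http.createServer((req, res) => {
    const match = /^\/run\/([a-z]+)$/i.exec(req.url || '');
    const script = match ? resolveScript(match[1], baseDir) : null;
    if (req.method !== 'GET' || !script) {
      res.writeHead(404, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: 'unknown route or framework' }));
      return;
    }
    // execFile passes the path as an argument, so no request data reaches a shell.
    execFile('sh', [script], (err, stdout, stderr) => {
      res.writeHead(err ? 500 : 200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ ok: !err, stdout, stderr }));
    });
  });
}
```

Under these assumptions, `createApi().listen(3000)` would start the server on the port mentioned above. Looking scripts up in a fixed allow-list, rather than building a command from the URL, keeps a request from running arbitrary files.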
The `userinterface` folder contains a second Node.js application, which can run on localhost to display the test data.
The historical data is not immediately available to a new client, because it is stored in a local database while the servers run.
To see the data we used, you will need to insert the data from `stats_raw.xlsx` manually.
The application uses PostgreSQL, so you will also need to import the SQL file manually.
(For more information, see the Data section.)
- Install Node.js on the machine that will run the site.
- Run `npm install` in the `userinterface` folder to install the packages.
- Run `npm start ./bin/www`; the site will run on port 3000.
- Go to `http://localhost:3000` in a browser to test the interface.
The historical data that we used is available in the `data&graphs/stats_raw.xlsx` file.
It was gathered over 10 runs for each server and records CPU usage and memory usage, each polled once every second.
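
As a rough illustration of once-per-second polling, the sketch below samples CPU and memory usage with Node's built-in `os` module. It is not the collection code used for these measurements; the function names and the way CPU usage is derived (non-idle share of CPU time between two snapshots) are assumptions made for the example.

```javascript
// Sketch: sample CPU and memory usage once per second using Node's os module.
const os = require('os');

// Sum idle and total CPU time (ms) across all cores at this instant.
function cpuTimes() {
  let idle = 0;
  let total = 0;
  for (const cpu of os.cpus()) {
    for (const value of Object.values(cpu.times)) total += value;
    idle += cpu.times.idle;
  }
  return { idle, total };
}

// CPU usage (%) between two snapshots: share of non-idle time in the interval.
function cpuUsagePercent(prev, curr) {
  const totalDelta = curr.total - prev.total;
  if (totalDelta <= 0) return 0;
  const idleDelta = curr.idle - prev.idle;
  return 100 * (1 - idleDelta / totalDelta);
}

// Memory usage (%) as reported by the OS right now.
function memoryUsagePercent() {
  return 100 * (1 - os.freemem() / os.totalmem());
}

// Poll every intervalMs and pass each sample to onSample; returns the timer.
function startPolling(onSample, intervalMs = 1000) {
  let prev = cpuTimes();
  return setInterval(() => {
    const curr = cpuTimes();
    onSample({
      time: new Date().toISOString(),
      cpu: cpuUsagePercent(prev, curr),
      memory: memoryUsagePercent(),
    });
    prev = curr;
  }, intervalMs);
}
```

For example, `const timer = startPolling(console.log)` prints one sample per second until `clearInterval(timer)` is called.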
- Use `data&graphs/gathered_historical_data.sql` to insert the historical data into the database.
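
One possible way to run that import from Node.js is to call the `psql` command-line client, as sketched below. The database name is a placeholder, the helper names are hypothetical, and the sketch assumes `psql` is on the `PATH` with connection settings supplied through the standard `PG*` environment variables.

```javascript
// Sketch: load the historical data SQL file into PostgreSQL via the psql CLI.
const path = require('path');
const { execFile } = require('child_process');

// Build psql arguments; dbName is a placeholder chosen by the caller.
// ON_ERROR_STOP=1 makes psql stop at the first failing statement.
function psqlArgs(dbName, sqlFile) {
  return ['-v', 'ON_ERROR_STOP=1', '-d', dbName, '-f', sqlFile];
}

// Run psql against the SQL file in the repository's data&graphs folder.
function loadHistoricalData(dbName, repoRoot = process.cwd()) {
  const sqlFile = path.join(repoRoot, 'data&graphs', 'gathered_historical_data.sql');
  return new Promise((resolve, reject) => {
    execFile('psql', psqlArgs(dbName, sqlFile), (err, stdout, stderr) => {
      if (err) reject(new Error(stderr || err.message));
      else resolve(stdout);
    });
  });
}
```

Because `execFile` passes the file path as a single argument rather than through a shell, the `&` in `data&graphs` needs no quoting here; when typing the equivalent command in a terminal, the path would need to be quoted.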