Authors:
- Alessandro Conti: AlessandroConti11
- Luca Bancale: sbancuz
License: MIT license
Tags: #Apache-Flink, #big-data, #computer_engineering, #distributed_system, #fault-tollerance, #java, #map_reduce, #protobuf, #polimi.
Politecnico di Milano.
Academic Year: 2023/2024.
090950 - Distributed Systems - professor Cugola Giampaolo Saverio - optional project.
Specification overview:
Implement a distributed dataflow platform for processing large amounts (big data) of key-value pairs, where keys and values are integers.
The platform includes a coordinator and multiple workers running on multiple nodes of a distributed system.
The coordinator accepts dataflow programs specified as an arbitrarily long sequence of the operators defined in the full specification.
The full specification is in Specification/projects_2023-2024
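To make the idea of a dataflow program concrete, here is a minimal single-process sketch in Java. It is not the project's implementation: there is no coordinator, no workers and no fault handling, and the class name `DataflowSketch` and the operator names `map`, `filter` and `reduce` are assumptions chosen for illustration. The real operator set is the one defined in the specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.IntBinaryOperator;
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;
import java.util.function.UnaryOperator;

// Hypothetical, single-process illustration of a dataflow program
// over integer key-value pairs.
final class DataflowSketch {
    // A program is an ordered sequence of stages; each stage turns a list of pairs into a new list.
    private final List<UnaryOperator<List<Map.Entry<Integer, Integer>>>> stages = new ArrayList<>();

    // Assumed operator: apply f to every value, keeping the key.
    public DataflowSketch map(IntUnaryOperator f) {
        stages.add(in -> {
            List<Map.Entry<Integer, Integer>> out = new ArrayList<>();
            for (Map.Entry<Integer, Integer> p : in) {
                out.add(Map.entry(p.getKey(), f.applyAsInt(p.getValue())));
            }
            return out;
        });
        return this;
    }

    // Assumed operator: keep only the pairs whose value satisfies the predicate.
    public DataflowSketch filter(IntPredicate keep) {
        stages.add(in -> {
            List<Map.Entry<Integer, Integer>> out = new ArrayList<>();
            for (Map.Entry<Integer, Integer> p : in) {
                if (keep.test(p.getValue())) {
                    out.add(p);
                }
            }
            return out;
        });
        return this;
    }

    // Assumed operator: combine all values that share a key into one value.
    public DataflowSketch reduce(IntBinaryOperator f) {
        stages.add(in -> {
            Map<Integer, Integer> acc = new TreeMap<>(); // sorted by key for stable output
            for (Map.Entry<Integer, Integer> p : in) {
                acc.merge(p.getKey(), p.getValue(), (a, b) -> f.applyAsInt(a, b));
            }
            List<Map.Entry<Integer, Integer>> out = new ArrayList<>();
            for (Map.Entry<Integer, Integer> e : acc.entrySet()) {
                out.add(Map.entry(e.getKey(), e.getValue()));
            }
            return out;
        });
        return this;
    }

    // Run every stage in order on the input pairs.
    public List<Map.Entry<Integer, Integer>> run(List<Map.Entry<Integer, Integer>> input) {
        List<Map.Entry<Integer, Integer>> data = input;
        for (UnaryOperator<List<Map.Entry<Integer, Integer>>> stage : stages) {
            data = stage.apply(data);
        }
        return data;
    }

    public static void main(String[] args) {
        List<Map.Entry<Integer, Integer>> input =
                List.of(Map.entry(1, 3), Map.entry(2, 4), Map.entry(1, 5));
        List<Map.Entry<Integer, Integer>> result = new DataflowSketch()
                .map(v -> v * 2)      // (1,6) (2,8) (1,10)
                .filter(v -> v > 6)   // (2,8) (1,10)
                .reduce(Integer::sum) // one pair per key
                .run(input);
        System.out.println(result);   // prints [1=10, 2=8]
    }
}
```

In a distributed setting, the coordinator would split such a sequence of stages across the workers. How the pairs are partitioned and how failed tasks are recovered is left to the specification and is not modelled here.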
The steps specified below are suitable for a Unix environment.
- set environment variables in the .env file
- INET_IFACE
- FAULTY_THREADS
- FAULTY_THREADS_SECS_INTERVAL
- FAULT_PROBABILITY
- compile the proto message
./run proto
- run the allocator
- allocates the WorkerManagers
- allocates the Coordinator
./run alloc
- run the client
Optionally, the operations to execute and the files to process can also be specified.
./run client ADDRESS_COORDINATOR_ALLOC OTHER_ALLOC_ADDRESSES
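Before starting the allocator, it can help to confirm that the four variables from the first step are actually set. The Java sketch below does that under two assumptions: that the .env file uses plain `KEY=VALUE` lines, and that `#` starts a comment. The class name `EnvCheck` is hypothetical and is not part of the project.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: checks that the .env file defines the variables listed in the setup steps.
final class EnvCheck {
    // Variable names taken from the setup steps.
    static final List<String> REQUIRED = List.of(
            "INET_IFACE", "FAULTY_THREADS", "FAULTY_THREADS_SECS_INTERVAL", "FAULT_PROBABILITY");

    // Parses KEY=VALUE lines, skipping blank lines, '#' comments and lines without a key
    // (the file format is an assumption).
    static Map<String, String> parse(String text) {
        Map<String, String> env = new HashMap<>();
        for (String line : text.split("\\R")) {
            String t = line.trim();
            int eq = t.indexOf('=');
            if (t.isEmpty() || t.startsWith("#") || eq <= 0) {
                continue;
            }
            env.put(t.substring(0, eq).trim(), t.substring(eq + 1).trim());
        }
        return env;
    }

    // Returns the required variables that are absent or have an empty value.
    static List<String> missing(Map<String, String> env) {
        List<String> out = new ArrayList<>();
        for (String key : REQUIRED) {
            if (env.getOrDefault(key, "").isEmpty()) {
                out.add(key);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // Reads .env from the current directory and reports what is still unset.
        String text = Files.readString(Path.of(".env"));
        System.out.println("missing: " + missing(parse(text)));
    }
}
```

The check only tests for presence. Which values are valid for each variable depends on the project's own scripts and is not covered here.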
Final Evaluation: 4/4