A distributed storage system implementing the Raft consensus algorithm for fault tolerance and high availability.
This project implements a fault-tolerant distributed file system. It leverages the Raft consensus algorithm to ensure data consistency and availability even when some nodes in the system fail.
(To be filled in - e.g., specific capabilities of the file system)
- Fault tolerance via Raft consensus
- Distributed storage
- ...
- Python 3.8 or higher
- pip (Python package installer)
- Create and activate a virtual environment:
```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate

# On Windows:
.\venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

The cluster is managed using the `manage_cluster.py` script. Here are the available commands:

To start all nodes in the cluster:
```bash
python scripts/manage_cluster.py start-all
```

To start a specific node:

```bash
python scripts/manage_cluster.py start <node_id>
```

To stop all nodes in the cluster:

```bash
python scripts/manage_cluster.py stop-all
```

To stop a specific node:

```bash
python scripts/manage_cluster.py stop <node_id>
```

You can also specify the signal to use when stopping nodes:

```bash
python scripts/manage_cluster.py stop-all --sig TERM  # SIGTERM (default)
python scripts/manage_cluster.py stop-all --sig INT   # SIGINT
python scripts/manage_cluster.py stop-all --sig KILL  # SIGKILL
```

To check the status of all nodes:

```bash
python scripts/manage_cluster.py status
```

To view logs for a specific node:

```bash
python scripts/manage_cluster.py logs <node_id>
```

To view a specific number of lines:

```bash
python scripts/manage_cluster.py logs <node_id> -n 50
```

Use the `run_client.py` script to interact with the cluster. The client supports three operations: PUT, GET, and DELETE.
- PUT - Store a value:

```bash
python scripts/run_client.py <server_address> put <key> <value>
```

- GET - Retrieve a value:

```bash
python scripts/run_client.py <server_address> get <key>
```

- DELETE - Remove a value:

```bash
python scripts/run_client.py <server_address> delete <key>
```

- Store a value:

```bash
python scripts/run_client.py localhost:8001 put mykey "Hello, World!"
```

- Retrieve a value:

```bash
python scripts/run_client.py localhost:8001 get mykey
```

- Delete a value:

```bash
python scripts/run_client.py localhost:8001 delete mykey
```

- Store binary data from a file:

```bash
cat myfile.bin | python scripts/run_client.py localhost:8001 put myfile -
```

The cluster configuration is stored in `cluster_config.json`. This file contains settings for:
- Number of nodes
- Base port (starts from 8001)
- Host address
- Data directory locations
- Log directory locations
- PID file locations
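
As a rough illustration of how these settings might be consumed, here is a minimal sketch that loads `cluster_config.json` and derives each node's address. The key names (`host`, `base_port`, `num_nodes`) are assumptions and may not match the actual file:

```python
import json

# NOTE: the key names below are assumptions -- check cluster_config.json for
# the names this project actually uses.
with open("cluster_config.json") as f:
    config = json.load(f)

host = config.get("host", "localhost")     # assumed key
base_port = config.get("base_port", 8001)  # assumed key
num_nodes = config.get("num_nodes", 3)     # assumed key

# Node ports start at the base port and increment per node (8001, 8002, ...).
for node_id in range(1, num_nodes + 1):
    print(f"node {node_id}: {host}:{base_port + node_id - 1}")
```
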
- The client will automatically handle leader redirection if it connects to a follower node (see the sketch after these notes)
- Each command has a timeout of 15 seconds
- The client will retry up to 5 times if redirected to the leader
- Logs for each node are stored in the configured log directory
- PID files are used to track running nodes and are stored in the configured PID directory
- Node ports start from 8001 and increment for each additional node (8001, 8002, 8003, etc.)
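
The redirection and retry behaviour described in these notes roughly follows the pattern sketched below. This is not the actual client code: `send_request` is a hypothetical stand-in for the real RPC call, and the `is_leader`/`leader_hint` response fields are assumptions:

```python
MAX_REDIRECTS = 5        # the client retries up to 5 times when redirected
REQUEST_TIMEOUT = 15     # seconds per command

def execute(address, command, send_request):
    """Send a command to the cluster, following leader redirects.

    `send_request` is a hypothetical callable standing in for the real RPC;
    it is assumed to return a response exposing `is_leader` and `leader_hint`.
    """
    for _ in range(MAX_REDIRECTS):
        response = send_request(address, command, timeout=REQUEST_TIMEOUT)
        if response.is_leader:
            return response
        # A follower answered; retry against the leader it pointed us to.
        address = response.leader_hint
    raise RuntimeError("Exceeded maximum number of leader redirects")
```
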
```
cursor-pyds/
├── proto/                 # Protocol buffer definitions
├── pyproject.toml         # Project metadata and build configuration
├── requirements.txt       # Project dependencies
├── scripts/               # Utility scripts
├── src/                   # Source code
│   └── distributed_fs/    # Main package for the distributed file system
│       ├── generated/     # Generated code from .proto files
│       ├── raft/          # Raft consensus algorithm implementation
│       └── ...            # Other modules
├── tests/                 # Unit and integration tests
└── venv/                  # Python virtual environment (if created)
```
(Information for developers wanting to contribute or understand the codebase)
Beyond the standard installation, for development you might need additional tools, especially for gRPC code generation.
- Ensure `grpcio-tools` is installed. You can typically install it via pip:

```bash
pip install grpcio-tools
```

Consider adding this to a `requirements-dev.txt` if you create one.

(Assuming pytest is used, update if different)

```bash
pytest tests/
```

The project uses gRPC and Protocol Buffers. Interface definitions are located in `.proto` files within the `proto/` directory. If you modify these files, you must regenerate the corresponding Python gRPC stubs and message classes. The generated files are placed in `src/distributed_fs/generated/`.

Command for regenerating gRPC code:

You'll need `grpcio-tools` installed (see above). The following command can be run from the root of the project:
```bash
python -m grpc_tools.protoc \
    -I./proto \
    --python_out=./src/distributed_fs/generated \
    --pyi_out=./src/distributed_fs/generated \
    --grpc_python_out=./src/distributed_fs/generated \
    ./proto/*.proto
```

- `-I./proto`: Specifies the directory where your `.proto` files are located.
- `--python_out`: Specifies the directory for the generated Python message classes.
- `--pyi_out`: Specifies the directory for the generated Python type stub files (`.pyi`).
- `--grpc_python_out`: Specifies the directory for the generated Python gRPC client and server stubs.
- `./proto/*.proto`: Specifies all `.proto` files in the `proto` directory to be processed.

Consider adding this command to a helper script in the `scripts/` directory for convenience.
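
For example, such a helper (hypothetical name `scripts/gen_proto.py`; the paths simply mirror the command above) might invoke `grpc_tools.protoc` programmatically:

```python
"""Hypothetical helper script that regenerates the gRPC code.

Run it from the project root, e.g.: python scripts/gen_proto.py
"""
import glob
import sys

from grpc_tools import protoc

PROTO_DIR = "./proto"
OUT_DIR = "./src/distributed_fs/generated"

# Expand the *.proto wildcard ourselves, since protoc.main() will not.
proto_files = glob.glob(f"{PROTO_DIR}/*.proto")

args = [
    "grpc_tools.protoc",  # placeholder for argv[0]
    f"-I{PROTO_DIR}",
    f"--python_out={OUT_DIR}",
    f"--pyi_out={OUT_DIR}",
    f"--grpc_python_out={OUT_DIR}",
    *proto_files,
]

sys.exit(protoc.main(args))
```
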
Team Members

| Name | Email | Student ID |
| --- | --- | --- |
| D.T.Jayakody | it23286382@my.sliit.lk | it23286382 |
| Salah I | it23195752@my.sliit.lk | it23195752 |
| Samarajeewa B D G M M | it23279070@my.sliit.lk | it23279070 |
| Apiram R | it23444782@my.sliit.lk | it23444782 |