This project implements a distributed system for computing TF-IDF values efficiently, with a user-friendly web interface.

kheder-hassoun/My-Tf-IDF-Distributed-System-Web-based-

TF-IDF-System

Introduction

This project implements a distributed system for computing TF-IDF values, with a web interface for submitting queries. The design covers per-worker file management, communication between nodes, and leader changes during runtime.
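For readers new to the metric, the TF-IDF calculation can be sketched as follows. This is a minimal illustration and not the project's actual code: the class and method names are hypothetical, and the base-10 logarithm IDF variant is an assumption, since this README does not say which formula the workers use.

```java
import java.util.*;

public class TfIdfSketch {
    // Term frequency: occurrences of the term divided by the document's word count.
    static double termFrequency(List<String> words, String term) {
        if (words.isEmpty()) return 0.0;
        long count = words.stream().filter(term::equalsIgnoreCase).count();
        return (double) count / words.size();
    }

    // Inverse document frequency: log10(N / n_t), where n_t is the number of
    // documents containing the term. Returns 0 when no document contains it.
    static double inverseDocumentFrequency(Collection<List<String>> docs, String term) {
        long containing = docs.stream()
                .filter(d -> d.stream().anyMatch(term::equalsIgnoreCase))
                .count();
        return containing == 0 ? 0.0 : Math.log10((double) docs.size() / containing);
    }

    // Scores every document for a single-term query (document name -> TF * IDF).
    static Map<String, Double> score(Map<String, List<String>> docs, String term) {
        double idf = inverseDocumentFrequency(docs.values(), term);
        Map<String, Double> scores = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : docs.entrySet()) {
            scores.put(e.getKey(), termFrequency(e.getValue(), term) * idf);
        }
        return scores;
    }

    public static void main(String[] args) {
        Map<String, List<String>> docs = new LinkedHashMap<>();
        docs.put("a.txt", Arrays.asList("distributed", "search", "system"));
        docs.put("b.txt", Arrays.asList("web", "interface"));
        System.out.println(score(docs, "search"));
    }
}
```

In a distributed setup, the document count and per-term document counts span all workers, so the IDF part generally has to be combined across nodes. How this project does that is not described here.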


🔗 Repository Link

Visit Repository: https://github.com/kheder-hassoun/My-Tf-IDF-Distributed-System-Web-based-


💻 Clone the Repository

git clone https://github.com/kheder-hassoun/My-Tf-IDF-Distributed-System-Web-based-.git


Key Points of the Project

  1. Worker-Specific Files:
    Each worker has its own set of files. This is achieved by specifying the file directory path in the application.properties file.

    • When launching each worker instance, the path to its files must be defined to ensure proper data separation.
  2. Leader (Coordinator):
    The leader does not handle or access the files directly. Its primary role is to forward the query to all available workers for processing.

  3. Communication Between the Leader and Web Server:
    When the leader changes, its port number changes with it. ZooKeeper is used to keep the web server pointed at the current leader.

    • A ZooKeeper node named leader_info stores the port information of the current leader.
    • When the search button is clicked on the web interface, the web server reads this node to get the current leader's connection details and then sends the query to that leader.
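The leader_info lookup described in point 3 can be sketched as below. To keep the example runnable without a ZooKeeper ensemble, the coordination store is hidden behind a small hypothetical `Registry` interface with an in-memory stand-in. With the real ZooKeeper client, the read would presumably map to `getData` and the write to `create` or `setData`. The `/leader_info` path prefix, the payload encoding (port as a UTF-8 decimal string), and all class and method names are assumptions, not the project's code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LeaderLookupSketch {
    // Hypothetical minimal view of the coordination store.
    interface Registry {
        void write(String path, byte[] data);
        byte[] read(String path);
    }

    // In-memory stand-in for ZooKeeper, used only so this sketch can run.
    static class InMemoryRegistry implements Registry {
        private final Map<String, byte[]> nodes = new ConcurrentHashMap<>();
        public void write(String path, byte[] data) { nodes.put(path, data); }
        public byte[] read(String path) { return nodes.get(path); }
    }

    // Node name taken from the README; the leading slash is an assumption.
    static final String LEADER_INFO = "/leader_info";

    // Called by a node when it becomes leader: publish its port.
    static void publishLeaderPort(Registry registry, int port) {
        registry.write(LEADER_INFO, Integer.toString(port).getBytes(StandardCharsets.UTF_8));
    }

    // Called by the web server on every search, so a leader change is
    // picked up on the next query without restarting the web server.
    static String resolveLeaderAddress(Registry registry, String host) {
        byte[] data = registry.read(LEADER_INFO);
        if (data == null) {
            throw new IllegalStateException("No leader registered at " + LEADER_INFO);
        }
        int port = Integer.parseInt(new String(data, StandardCharsets.UTF_8).trim());
        return "http://" + host + ":" + port;
    }

    public static void main(String[] args) {
        Registry registry = new InMemoryRegistry();
        publishLeaderPort(registry, 8081);
        System.out.println(resolveLeaderAddress(registry, "localhost"));
        publishLeaderPort(registry, 8082); // simulated leader change
        System.out.println(resolveLeaderAddress(registry, "localhost"));
    }
}
```

Resolving the address per request rather than caching it is the simplest way to survive a leader change. A watch on the node would avoid the extra read, at the cost of more moving parts.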

How to Run

  1. Set Up Worker Files:

    • Define the file directory path for each worker in the application.properties file.
  2. Start the Leader and Workers:

    • Start the leader first, followed by the workers, ensuring each worker has its file directory configured.
  3. Use the Web Interface:

    • Once the system is running, use the web interface to submit queries.
    • When a search query is submitted, the web server reads the leader_info node in ZooKeeper to fetch the current leader's information and forwards the query to that leader for processing.
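As an illustration of step 1, the sketch below shows how a worker might resolve its file directory from properties content. The property key `worker.documents.path` and the class and method names are hypothetical. Check the project's application.properties for the real key.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class WorkerConfigSketch {
    // Hypothetical key; the project's actual property name may differ.
    static final String DOCS_DIR_KEY = "worker.documents.path";

    // Parses properties-format text, as found in an application.properties file.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the directory this worker should index, failing fast when it is unset.
    static Path documentsDirectory(Properties props) {
        String dir = props.getProperty(DOCS_DIR_KEY);
        if (dir == null || dir.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + DOCS_DIR_KEY);
        }
        return Paths.get(dir.trim());
    }

    public static void main(String[] args) {
        // Each worker instance gets its own value, which keeps the data separated.
        String worker1 = DOCS_DIR_KEY + "=data/worker1\n";
        String worker2 = DOCS_DIR_KEY + "=data/worker2\n";
        System.out.println(documentsDirectory(parse(worker1)));
        System.out.println(documentsDirectory(parse(worker2)));
    }
}
```

Failing at startup when the path is missing is a deliberate choice in this sketch: a worker with no directory would otherwise join the cluster and silently return empty results.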

Additional Notes

  • Make sure ZooKeeper is set up and running before launching the leader, the workers, or the web server.
  • The system is designed for horizontal scalability, allowing you to add more workers as needed.

  • For more details about configuring the system, refer to the configuration files.

kheder hassoun
