Skip to content

pratikbarjatya/multi-document-summarizer

Repository files navigation

Multi-Document Summarizer As a Service

Powered by

image

Table of Contents

About the App

Multi-document Summarizer as a Service is a cloud-based platform that empowers users to efficiently summarize multiple documents, leveraging AI, into concise, coherent summaries.

✅ If you're a busy professional who wants to summarize the news, don't worry; you can use our news model to import and summarize your daily information.

✅ If you're a student with numerous PDFs and documents and don't have time to go through them all, you can use our multi-document summarization model.

✅ If you simply want to save time and avoid unnecessary text, no problem – we've got your back.

Key Features:

  • Source Selection: Choose from a variety of reputable news sources, including The Wall Street Journal, BBC, Europa Press, and more.

  • Article Summaries: View concise summaries of news articles, making it easy to stay informed quickly.

  • Save to Documents: Easily save interesting articles as documents for future reference.

  • Responsive Design: Access the app seamlessly on various devices, from desktops to smartphones.

Technologies Used

Outline the technologies and tools used in your project. Include programming languages, frameworks, libraries, databases, and any other important tech stack components.

  • Python
  • Flask
  • PostgreSQL
  • JavaScript
  • Docker
  • News API
  • HTML/CSS

Project Structure

The current project directory is organized as follows:

root/
├── run.py # Flask application script for startup
├── app.py # Factory pattern
├── requirements.txt # All needed libs
├── blueprints/
├── models.py # Models used in the app
├── docker-compose.yml # Docker configuration
├── summarizer/ # Summarization module
│ └── summarizer.py # Summarization logic
├── templates/ # HTML templates
│ └── auth/ # Login and Register UI
│ └── main/ # Dashboard and main part of the App
├── tests/
├── static/
├── config/ # Factory pattern settings

Getting Started

To run the application, follow these steps:

  1. Ensure you have Flask installed. If not, you can install it using
pip install -r requirements.txt
  1. Modify the summarization logic in summarizer_nltk/summarizer.py to suit your summarization requirements. [Current model used is nltk, more models will be added in the future]

  2. Start the Flask application by running the following command in the root directory:

python run.py
  1. Access the homepage at http://127.0.0.1:5000/ or http://localhost:5000/ and use the register to start.

API Documentation

Table of Contents

  1. Authentication

  2. Documents

  3. News

  4. Summarizer

Authentication

Login

  • URL: /login
  • Method: POST
  • Description: Log in with your username and password.
  • Request:
    • username (string): Your username.
    • password (string): Your password.
  • Response:
    • If successful, it redirects to /dashboard.
    • If unsuccessful, returns an error message with status 401.

Logout

  • URL: /logout
  • Method: GET
  • Description: Log out the user.
  • Response:
    • Clears the user session and redirects to /login.

Register

  • URL: /register
  • Method: POST
  • Description: Register a new user.
  • Request:
    • username (string): Desired username.
    • password (string): Password.
    • email (string): Email address.
  • Response:
    • If successful, redirects to /login?success=true.
    • If the username is already taken, returns an error message with status 400.

Documents

Get All Documents

  • URL: /get-all-documents
  • Method: GET
  • Authentication: Login required.
  • Description: Get all documents for the current user.
  • Response:
    • JSON array of documents with fields:
      • name (string): Document title.
      • dateUploaded (string): Date of upload.
      • id (int): Document ID.
      • content (string): Document content.
      • user (string): Username of the owner.

Delete Document

  • URL: /delete-document/<int:document_id>
  • Method: DELETE
  • Authentication: Login required.
  • Description: Delete a document by its ID for the current user.
  • Response:
    • If successful, returns a success message.
    • If the document does not belong to the user, returns an error message with status 403.
    • If the document is not found, returns an error message with status 404.

Get Document Content

  • URL: /get-document-content/<int:document_id>
  • Method: GET
  • Authentication: Login required.
  • Description: Get the content of a document by its ID for the current user.
  • Response:
    • If successful and the document belongs to the user, returns the document content.
    • If unauthorized, returns an error message with status 403.
    • If the document is not found, returns an error message with status 404.

Upload Document

  • URL: /upload-document
  • Method: POST
  • Authentication: Login required.
  • Description: Upload a document for the current user.
  • Request:
    • Multiple files with names like content0, content1, etc.
    • Optional title for each document: title0, title1, etc.
  • Response:
    • If successful, returns a JSON object with document IDs and a success message.
    • If no valid documents are uploaded, returns an error message with status 400.
    • If no content is provided, returns an error message with status 400.

Upload Document from Link

  • URL: /upload-document-link
  • Method: POST
  • Authentication: Login required.
  • Description: Upload a document from a link for the current user.
  • Request:
    • link (string): URL of the document.
    • Optional title (string): Title of the document (default is "Untitled").
  • Response:
    • If successful, returns a JSON object with the document ID and a success message.
    • If failed to fetch the document content, returns an error message with status 400.

Upload Document from News

  • URL: /upload-document-news
  • Method: POST
  • Authentication: Login required.
  • Description: Upload a document from news for the current user.
  • Request:
    • title (string): Title of the news article.
    • content (string): Content of the news article.
  • Response:
    • If successful, returns a JSON object with the document ID and a success message.
    • If failed to fetch the document content, returns an error message with status 400.

News

Fetch Articles for Outlet

  • URL: /get-news/<string:source>
  • Method: GET
  • Description: Fetch articles for a news outlet.
  • Request:
    • source (string): News outlet source.
  • Response:
    • If successful, returns information about an article:
      • title (string): Article title.
      • description (string): Article content.
      • url (string): Article URL.
      • date (string): Article publication date.

Summarizer

Get All Summaries

  • URL: /get-all-summaries
  • Method: GET
  • Authentication: Login required.
  • Description: Get all summaries for the current user.
  • Response:
    • JSON array of summaries with fields:
      • summary_id (int): Summary ID.
      • generated_summary (string): Generated summary.
      • method (string): Summarization method.

Get Summary Content

  • URL: /get-summary-content/<int:summary_id>
  • Method: GET
  • Authentication: Login required.
  • Description: Get the content of a summary by its ID for the current user.
  • Response:
    • If successful and the summary belongs to the user, returns the summary content.
    • If unauthorized, returns an error message with status 403.
    • If the summary is not found, returns an error message with status 404.

Delete Summary

  • URL: /delete-summary/<int:summary_id>
  • Method: DELETE
  • Authentication: Login required.
  • Description: Delete a summary by its ID for the current user.
  • Response:
    • If successful, returns a success message.
    • If the summary does not belong to the user, returns an error message with status 403.
    • If the summary is not found, returns an error message with status 404.

Summarize Documents

  • URL: /summarize
  • Method: POST
  • Authentication: Login required.
  • Description: Generate a summary for a list of documents.
  • Request:
    • documents (list of strings): List of document content for summarization.
    • summarization_method (string): Chosen summarization method.
  • Response:
    • If successful, returns a success message.

Docker Instructions

Instructions for setting up the app with Docker

Run a terminal inside a folder with Dockerfile and docker-compose.yml

First, make sure you have Docker installed on your machine and that the Dockerfile and docker-compose.yaml, which are provided, are in a directory of your choice.

Run the terminal inside the said directory.

Login to your Docker account and pull images

Open Docker Desktop and run the command:

docker login

To log into your Docker account. Once logged in, run the command:

docker-compose pull

This will pull images from the Docker Hub onto your machine.

Build a Docker container and run it on your local machine

After pulling the images, run the command:

docker-compose up

This will start the application and the proccess is complete

** NOTE: Both images are required for the application to work! Also, make sure that Dockerfile and docker-compose.yaml are in the same directory from which you are running the commands provided in this file **

License

This project is licensed under the Apache License. See LICENSE for more details.

About

Multi-Document Summarizer As a Service

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages