This is a Flask app that uses a fine-tuned BART model (trained on arXiv datasets) to summarize research papers. The goal was to create a simple, easy-to-use tool for researchers to get concise summaries of lengthy papers.
The project is completed! Here's what it includes:
- A functional Flask app to upload research papers and generate summaries.
- A fallback mechanism that formats outputs via a T5 base model for improved readability.
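
The fallback behaviour in the last item can be pictured roughly as follows. This is a minimal sketch and not the app's actual code: `summarize_with_fallback`, `looks_readable`, and the two callables standing in for the BART and T5 models are hypothetical names, and the readability check is an assumed heuristic.

```python
from typing import Callable


def looks_readable(summary: str, min_words: int = 5) -> bool:
    """Assumed heuristic: enough words and ends with sentence punctuation."""
    text = summary.strip()
    return len(text.split()) >= min_words and text.endswith((".", "!", "?"))


def summarize_with_fallback(
    paper_text: str,
    primary: Callable[[str], str],
    formatter: Callable[[str], str],
) -> str:
    """Summarize with the primary model; pass the result through the
    fallback formatter only when it does not look readable."""
    summary = primary(paper_text)
    if looks_readable(summary):
        return summary
    return formatter(summary)
```

In the real app, `primary` would wrap the fine-tuned BART model and `formatter` the T5 base model. Keeping both as plain callables makes the control flow easy to test without loading either model.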
- Python 3.8 or above.
- Virtual environment tools like `conda` or `venv` are highly recommended.
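
If you want to confirm the Python requirement before installing anything, a small check like the one below can help. It is only an illustrative sketch: the function name `python_is_supported` is hypothetical and not part of the repository.

```python
import sys

MIN_VERSION = (3, 8)  # minimum stated in the prerequisites


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        found = sys.version.split()[0]
        sys.exit(f"Python {MIN_VERSION[0]}.{MIN_VERSION[1]}+ is required, found {found}")
    print("Python version OK")
```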
- Clone the repository:
  git clone https://github.com/Firojpaudel/arXiv_Summarizer.git
  cd arXiv_Summarizer
- Set up a virtual environment (recommended):
  Using `conda`:
  conda create -n arxiv_summarizer python=3.10
  conda activate arxiv_summarizer
  Or using `venv`:
  python -m venv venv
  source venv/bin/activate   # Linux/Mac
  venv\Scripts\activate      # Windows
- Install dependencies:
  pip install -r requirements.txt
- Download the fine_tuned_bart model:
  The model is not included in the repository due to its size. Please download it from the following link and place it inside the `fine_tuned_bart/` directory:
- Run the app:
  python app.py
  The app will be hosted locally at http://127.0.0.1:5000/
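
Two things commonly go wrong in the steps above: the model directory is missing or empty, or the server never comes up. The sketch below checks both using only the standard library. The helper names are hypothetical, and the `config.json` check assumes the model was saved in the usual Hugging Face `save_pretrained` layout, which this README does not confirm.

```python
import urllib.error
import urllib.request
from pathlib import Path


def model_dir_ready(model_dir="fine_tuned_bart", required=("config.json",)) -> bool:
    """Return True if model_dir exists and contains every file in `required`.

    The default `required` tuple is an assumption about the model layout.
    """
    path = Path(model_dir)
    return path.is_dir() and all((path / name).is_file() for name in required)


def app_is_up(url="http://127.0.0.1:5000/", timeout=2.0) -> bool:
    """Return True if an HTTP server answers at `url` (any status code counts)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False
```

Run `model_dir_ready()` from the repository root before `python app.py`, and `app_is_up()` from a second terminal once the app has started.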
Here’s a quick preview of the app in action:
🏠 Homepage:
📝 Summarization in Action:
- Once your paper is uploaded, the app gets to work, breaking down complex research into digestible summaries.
![Summarization in Action](/Firojpaudel/arXiv_Summarizer/raw/main/README_images/Summarize.png)
📄 Final Summary Output
- A clear, concise summary is generated for your research paper, formatted in Markdown for readability.
![Generated Summary](/Firojpaudel/arXiv_Summarizer/raw/main/README_images/Summary_generated.jpg)
- While BART works well for summarization, certain outputs lacked consistency due to variations in paper formats.
- The project demonstrated the limitations of fine-tuning on older datasets with diverse formatting styles.
- Future iterations could use a hybrid approach or train on more specialized datasets for better results.
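
One possible way to reduce the format variation mentioned above is to normalize extracted text before it reaches the model. The sketch below is an assumption about what such a step could look like, not something the project currently does; `normalize_paper_text` is a hypothetical name.

```python
import re


def normalize_paper_text(raw: str) -> str:
    """Reduce layout noise in extracted paper text before summarization."""
    text = raw.replace("\r\n", "\n")
    # Rejoin words hyphenated across line breaks ("summar-\nization").
    # Caveat: this also joins genuine compounds that happen to break at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Blank lines mark paragraph breaks; single newlines become spaces.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)
```

A step like this would run on the uploaded paper's text before summarization, so that papers with different line wrapping and spacing reach the model in a more uniform shape.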
arXiv_Summarizer/
│
├── app.py # Main Flask app
├── fine_tuned_bart/ # Directory for the fine_tuned_bart model (must be downloaded separately from Drive)
├── templates/ # HTML templates for the app
├── static/ # Static files (CSS, JS, images)
├── README_images/ # Directory for README images
├── requirements.txt # Dependencies
└── README.md # Project documentation (this file)
This project is licensed under the MIT License. Feel free to use, modify, and distribute it as per the license terms.
We welcome all contributors who want to add value to this project! Whether it's improving summarization quality, refining the interface, or optimizing performance, your contribution matters.
To contribute, follow these steps:
- Fork the repository.
- Create your feature branch:
  git checkout -b feature-name
- Commit your changes:
  git commit -m "Added a cool new feature"
- Push to the branch:
  git push origin feature-name
- Open a pull request.