Persian Author Classification

Project Overview

This project classifies texts by Persian authors using machine learning techniques. Given a text, the goal is to predict its author from the content alone.

Models and Techniques

Text Preprocessing: Tokenization, normalization, and vectorization of Persian text.
Machine Learning Models: Classifiers such as SVM, Naive Bayes, and Random Forest.
Evaluation: Accuracy, Precision, Recall, and F1 Score metrics are used to evaluate the models.
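
As a rough illustration of the preprocessing step listed above, here is a minimal standard-library sketch of normalization, tokenization, and bag-of-words counting for Persian text. It is not the notebook's actual code: the function names (normalize, tokenize, bag_of_words) and the specific normalization rules (mapping Arabic yeh and kaf to their Persian forms, dropping diacritics, keeping the zero-width non-joiner inside words) are assumptions chosen for the example.

```python
import re
import unicodedata
from collections import Counter

# Assumption: map the Arabic code points that commonly appear in scraped
# Persian text to their Persian equivalents.
ARABIC_TO_PERSIAN = str.maketrans({
    "\u064a": "\u06cc",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06a9",  # Arabic kaf -> Persian kaf
})

# A word is a run of letters, optionally joined by the zero-width
# non-joiner (U+200C), which Persian uses inside compound words.
WORD_RE = re.compile(r"[^\W\d_]+(?:\u200c[^\W\d_]+)*")


def normalize(text: str) -> str:
    """Unify character variants, drop diacritics, collapse whitespace."""
    text = unicodedata.normalize("NFC", text).translate(ARABIC_TO_PERSIAN)
    # Category "Mn" covers combining marks such as fatha, kasra, and shadda.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split already-normalized text into word tokens."""
    return WORD_RE.findall(text)


def bag_of_words(text: str) -> Counter:
    """Turn raw text into token counts, a simple form of vectorization."""
    return Counter(tokenize(normalize(text)))


if __name__ == "__main__":
    sample = "\u0643\u0650\u062a\u0627\u0628 \u0648 \u0643\u0650\u062a\u0627\u0628"
    print(bag_of_words(sample))
```

Token counts like these can be fed to any of the classifiers above, usually after reweighting (for example with TF-IDF).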

Results

The project achieved an accuracy of 70% on the test set. Detailed performance metrics are available in report.pdf.
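
For readers unfamiliar with the reported metrics, the following sketch shows how accuracy and macro-averaged precision, recall, and F1 can be computed from true and predicted author labels. It is a plain-Python illustration, not the evaluation code from the notebook. The function name classification_metrics and the toy labels are hypothetical and are not taken from the project's data or results.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return (accuracy, (macro_precision, macro_recall, macro_f1), per_class)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this author, but it was wrong
            fn[truth] += 1  # missed the true author

    def safe_div(num, den):
        return num / den if den else 0.0

    per_class = {}
    for label in labels:
        precision = safe_div(tp[label], tp[label] + fp[label])
        recall = safe_div(tp[label], tp[label] + fn[label])
        f1 = safe_div(2 * precision * recall, precision + recall)
        per_class[label] = (precision, recall, f1)

    accuracy = safe_div(sum(tp.values()), len(y_true))
    # Macro averaging: unweighted mean over authors, so rare authors
    # count as much as frequent ones.
    macro = tuple(
        safe_div(sum(scores[i] for scores in per_class.values()), len(labels))
        for i in range(3)
    )
    return accuracy, macro, per_class


if __name__ == "__main__":
    # Hypothetical labels for illustration only.
    y_true = ["hafez", "hafez", "saadi", "saadi", "rumi"]
    y_pred = ["hafez", "saadi", "saadi", "saadi", "rumi"]
    accuracy, macro, per_class = classification_metrics(y_true, y_pred)
    print(f"accuracy={accuracy:.3f} macro P/R/F1={macro}")
```

Macro averaging is only one choice. When some authors have far more samples than others, weighted averages can give a noticeably different picture.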

Repository Contents

persian_authors_classification.ipynb: Jupyter notebook with the main classification algorithms and model evaluations.
scrapper.ipynb: Jupyter notebook used for scraping textual data from various online sources.
report.pdf: A comprehensive report detailing the methodology, analysis, and results of the project.
src/: Directory containing additional source code and utility scripts supporting the project.

Installation

To set up the project environment:

git clone https://github.com/Amir-Entezari/persian-literature-classifier.git
cd persian-literature-classifier
pip install -r requirements.txt

Usage

To run the classification notebook:

jupyter notebook persian_authors_classification.ipynb

To execute the scraper:

jupyter notebook scrapper.ipynb

Contribution

Contributions to the project are welcome. To contribute, please fork the repository, make your changes, and submit a pull request.

Contact

For questions or feedback, please open an issue in the GitHub repository or contact amirh.entezari@ut.ac.ir.
