
Data analysis of Goodreads books dataset using Python and Jupyter Notebook.


ShadenAli95/Goodreads-Books-Analysis


Goodreads-Books-Analysis

This project focuses on exploring and analyzing the Goodreads Best Books dataset, which contains information about thousands of books, including ratings, popularity, genres, authors, and publishers. The goal is to clean and preprocess the data, handle missing values, remove unnecessary columns, and extract valuable insights that reflect readers’ interests and publishing trends. Additionally, I analyze the distribution of numeric features (such as ratings, votes, and pages) to better understand reading behaviors and book characteristics.
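A quick look at the numeric distributions could be sketched as follows. This is a minimal illustration, not the notebook's actual code: the tiny in-memory sample and the column names `rating`, `numRatings`, and `pages` are assumptions, and the helper `summarize_numeric` is hypothetical.

```python
import pandas as pd

# Hypothetical sample; the real dataset's column names may differ.
books = pd.DataFrame({
    "rating": [4.3, 3.9, 4.1, 4.5],
    "numRatings": [1200, 85, 40000, 310],
    "pages": [320, 180, 540, 260],
})

def summarize_numeric(df, columns):
    """Return count, mean, std, min, quartiles, and max, one row per column."""
    return df[columns].describe().T

summary = summarize_numeric(books, ["rating", "numRatings", "pages"])
print(summary[["mean", "50%", "max"]])
```

Comparing the mean with the median (`50%`) is a cheap way to spot skewed features such as the number of ratings before plotting histograms.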

Objectives

The main objectives of this analysis are to:

  • Prepare the data for analysis through cleaning and preprocessing.
  • Explore relationships between variables such as ratings, popularity, genres, and publishers.
  • Visualize key trends to answer meaningful business questions.
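The cleaning objective might look roughly like the sketch below. It assumes columns named `pages`, `price`, and `coverImg` and uses a hypothetical helper `clean_books`; the project's own steps and column choices may differ.

```python
import pandas as pd

# Hypothetical raw sample with a duplicate row, a non-numeric page count,
# a missing price, and a column that is not needed for analysis.
raw = pd.DataFrame({
    "title": ["A", "B", "B", "C"],
    "pages": ["320", "unknown", "unknown", "410"],
    "price": ["5.99", "7.00", "7.00", None],
    "coverImg": ["a.jpg", "b.jpg", "b.jpg", "c.jpg"],
})

def clean_books(df, drop_cols=("coverImg",), numeric_cols=("pages", "price")):
    """Drop unneeded columns and duplicate rows, then coerce numeric columns."""
    out = df.drop(columns=list(drop_cols), errors="ignore").drop_duplicates()
    for col in numeric_cols:
        if col in out.columns:
            # Values that cannot be parsed as numbers become NaN.
            out[col] = pd.to_numeric(out[col], errors="coerce")
    return out.reset_index(drop=True)

clean = clean_books(raw)
print(clean.isna().sum())  # missing values per column after coercion
```

Missing values are only counted here, not imputed; whether to fill or drop them depends on the question being asked.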

Key Business Questions

  • What price are people most willing to pay for a book, especially for books with higher BBE scores?
  • How many pages are readers typically willing to read, based on interest or popularity?
  • What are the top 5 most frequent languages, authors, genres, awards, and publishers?
  • How many books are part of a series versus standalone books?
  • Which books rank in the top 10 based on popularity, weighted rating, number of ratings, and votes?
  • Which genres and publishers have the highest number of authors in the dataset?
  • Which author has the highest popularity, weighted rating, votes, number of ratings, and five-star reviews?
  • Which publishers contribute the most to the overall BBE score and have the greatest influence on popularity and ratings?
  • What is the most common publisher within each genre and for each award?
  • Which books and publishers have received the most awards?
  • Which genres correlate the most with number of ratings, BBE score, and BBE votes?
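Two of the questions above (most frequent values, and series versus standalone books) can be sketched with plain pandas counts. The sample data, the column names `author`, `series`, and `genres`, the comma-separated genre format, and the helpers `top_n`, `top_n_multi`, and `series_vs_standalone` are all assumptions for illustration; the real dataset may store genres differently and need other parsing.

```python
import pandas as pd

# Hypothetical sample; a missing "series" value is taken to mean a standalone book.
books = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "author": ["X", "Y", "X", "Z"],
    "series": ["Saga #1", None, "Saga #2", None],
    "genres": ["Fantasy, Fiction", "Fiction", "Fantasy", "Romance, Fiction"],
})

def top_n(column, n=5):
    """Most frequent values in a single-valued column."""
    return column.dropna().value_counts().head(n)

def top_n_multi(column, n=5, sep=","):
    """Most frequent values in a column holding several values per row."""
    exploded = column.dropna().str.split(sep).explode().str.strip()
    return exploded.value_counts().head(n)

def series_vs_standalone(df):
    """Count books that belong to a series versus standalone books."""
    in_series = df["series"].notna()
    return {"series": int(in_series.sum()), "standalone": int((~in_series).sum())}

top_authors = top_n(books["author"])
top_genres = top_n_multi(books["genres"])
counts = series_vs_standalone(books)
print(top_authors, top_genres, counts, sep="\n")
```

The same `top_n` call applies to languages and publishers, while multi-valued fields such as awards would go through `top_n_multi`.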

Tools & Skills Used

  • Python (Pandas, NumPy, Matplotlib, Seaborn)
  • Data Cleaning & Preprocessing
  • Exploratory Data Analysis (EDA)
  • Data Visualization
  • Power BI Dashboard Creation
