jpreyesm03/ExtraaLearn-Machine-Learning-Models

🚀 ExtraaLearn: Predicting Potential Customers

Content

This data science and machine learning project uses decision trees, random forests, GridSearchCV, and confusion matrices to build a model that predicts which leads are likely to become paying customers of ExtraaLearn. Recall is the priority metric: it should be as high as possible so that few leads who would convert are missed (false negatives). Recommendations and business insights are drawn from both the exploratory data analysis and the best model. The dataset was provided by MIT IDSS.
See Project on a Website
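
To make the recall objective concrete, here is a minimal sketch of how recall follows from confusion-matrix counts. The labels are made up for illustration (1 is assumed to mean "converted"), and the helper names `confusion_counts` and `recall_from_counts` are hypothetical, not taken from the project code.

```python
def confusion_counts(y_true, y_pred):
    """Count (TP, FP, FN, TN) for binary labels, where 1 means 'converted'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def recall_from_counts(true_positives, false_negatives):
    """Recall = TP / (TP + FN); returns 0.0 when there are no actual positives."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        return 0.0
    return true_positives / actual_positives


# Illustrative labels only, not project data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"recall={recall_from_counts(tp, fn):.2f}")
```

Each false negative is a lead who would have converted but was not flagged, so reducing false negatives raises recall directly.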


🎯 Objectives

  1. 🧠 Develop a predictive model to identify leads likely to convert to paid customers.
  2. 🔍 Understand factors influencing lead conversion.
  3. 📝 Give actionable business recommendations.

📊 Dataset

The dataset includes detailed information about leads and their interactions with ExtraaLearn. Key features:

  • 👤 Demographics: Age, current occupation.
  • 🔗 Interactions: Website visits, time spent, page views, last activity, first interaction.
  • 📢 How the lead found the company: Digital, print, or referral channels.
  • 🎯 Target: Binary classification of lead conversion status.

🛠️ Workflow

  1. ⚙️ Data Preprocessing:

    • 🧹 Clean data (remove duplicates, handle null values).
  2. 📈 Exploratory Data Analysis (EDA):

    • 📊 Visualize trends, correlations, and distributions.
    • 🔍 Identify key features influencing lead conversion.
  3. 🤖 Model Development:

    • Models used:
      • Decision Trees and Random Forest.
    • Evaluate models using metrics such as:
      • Accuracy
      • Precision
      • Recall
      • F1 score 🔵
  4. 🔧 Hyperparameter Tuning:

    • Use GridSearchCV to find optimal parameters.
  5. 📋 Insights & Reporting:

    • Highlight key factors driving conversions.
    • Recommend actionable strategies for improving lead conversion rates.
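
The model development and tuning steps above can be sketched roughly as follows. This is not the project's actual code: it assumes scikit-learn is installed, uses synthetic data from `make_classification` as a stand-in for the leads dataset, and the parameter grids are arbitrary illustrative choices. It tunes both model types with GridSearchCV using `scoring="recall"`, in line with the recall objective.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the leads data (assumption: binary target, 1 = converted).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4,
    weights=[0.7, 0.3], random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Illustrative parameter grids; the project's real grids are not given in this README.
candidates = {
    "decision_tree": (
        DecisionTreeClassifier(random_state=42),
        {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5, 20]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=42),
        {"n_estimators": [50, 100], "max_depth": [3, 5]},
    ),
}

results = {}
for name, (estimator, grid) in candidates.items():
    # Select hyperparameters by cross-validated recall on the training split.
    search = GridSearchCV(estimator, grid, scoring="recall", cv=5)
    search.fit(X_train, y_train)

    # Evaluate the refitted best model on the held-out test split.
    y_pred = search.predict(X_test)
    results[name] = {
        "best_params": search.best_params_,
        "test_recall": recall_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

    print(name, search.best_params_)
    print(classification_report(y_test, y_pred, target_names=["not converted", "converted"]))
```

Optimizing for recall alone can favor models that flag many leads as likely converters, so it is worth checking precision and the confusion matrix alongside it before choosing a final model.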

📚 Libraries and Tools

  • 🛠️ Data Manipulation: Pandas, NumPy
  • 📊 Visualization: Matplotlib, Seaborn
  • 🤖 Machine Learning: scikit-learn, statsmodels
  • 📈 Evaluation: Confusion Matrix, Classification Report (Recall)
