A comprehensive data science project that analyzes SaaS business metrics to determine optimal pricing strategies that maximize customer lifetime value (LTV) while maintaining sustainable customer acquisition costs (CAC) and churn rates.
This project leverages machine learning and statistical analysis to optimize SaaS pricing strategies by:
- Predicting customer churn using ensemble methods
- Forecasting revenue based on pricing scenarios
- Segmenting customers by behavior and value
- Calculating price elasticity of demand
- Recommending tiered pricing structures
Key Achievement: Identified optimal pricing strategy that achieves a 4.76 LTV/CAC ratio while maintaining acceptable churn rates.
SaaS companies face the challenge of setting prices that:
- Maximize revenue and profitability
- Minimize customer churn
- Optimize the LTV-to-CAC ratio
- Remain competitive in the market
This project provides a data-driven framework to solve this multi-objective optimization problem.
- Optimal Price Point: $1,753.88 (34.7% decrease from baseline)
- Expected LTV/CAC Ratio: 4.76 (exceeds healthy threshold of 3.0)
- Price Elasticity: -2.191 (elastic demand)
- Customer Segments: 4 distinct segments identified
- Projected MRR: $8,958,995
โโโ Cleaned Data/ # Processed datasets
โ โโโ cac_ltv_cleaned.csv
โ โโโ ravenstack_cleaned.csv
โ โโโ saas_businesses_cleaned.csv
โ
โโโ Optimization Results/ # Model outputs and recommendations
โ โโโ model_dataset_with_predictions.csv
โ โโโ optimization_summary.csv
โ โโโ price_scenarios.csv
โ โโโ segment_pricing_recommendations.csv
โ
โโโ Visualizations/ # Charts and graphs
โ โโโ customer_segmentation.png
โ โโโ final_recommendation.png
โ โโโ model_performance.png
โ โโโ price_optimization_curves.png
โ
โโโ Raw Project Datasets/ # Source data
โ โโโ Business Startups Data on SAAS products/
โ โโโ CAC-LTV Model Analysis for SaaS Business Insights/
โ โโโ SaaS Subscription & Churn Analytics Dataset/
โ
โโโ Notebooks/
โ โโโ 01_cleaning.ipynb # Data cleaning and preprocessing
โ โโโ 02_modeling.ipynb # Model development and optimization
โ
โโโ business_narrative.txt # Executive summary of findings
โโโ requirements.txt # Python dependencies
โโโ README.md # This file
- Python 3.x - Primary programming language
- Jupyter Notebooks - Interactive development environment
- pandas - Data manipulation and analysis
- numpy - Numerical computing
- scikit-learn - Machine learning models
- matplotlib / seaborn - Data visualization
- Random Forest - Churn prediction (AUC: 0.940)
- Linear Regression - Revenue forecasting (Rยฒ: 1.000)
- K-Means Clustering - Customer segmentation (k=4)
- Log-log Regression - Price elasticity estimation
- Integrated multiple SaaS business datasets
- Cleaned and standardized 4,222 customer records
- Engineered features for model training
- Analyzed pricing distributions and customer behavior
- Identified correlations between price, churn, and LTV
- Visualized key business metrics
- Churn Prediction: Built Random Forest classifier to predict customer churn
- Revenue Forecasting: Developed regression model for revenue projection
- Customer Segmentation: Applied K-Means clustering to identify customer groups
- Price Elasticity: Calculated demand sensitivity to price changes
- Formulated multi-objective optimization problem
- Simulated various pricing scenarios
- Identified optimal price points for each customer segment
- Developed tiered pricing structure
- Created implementation roadmap
- Defined monitoring KPIs
Python 3.8 or higher
pip package manager- Clone the repository:
git clone https://github.com/sanskritidabhade/SaaS-Pricing-Model-Selection-Using-Data-Science-Techniques.git
cd SaaS-Pricing-Model-Selection-Using-Data-Science-Techniques- Create a virtual environment (recommended):
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate- Install required packages:
pip install -r requirements.txt-
Data Cleaning:
- Open
Notebooks/01_cleaning.ipynb - Run all cells to process raw data
- Open
-
Model Training & Optimization:
- Open
Notebooks/02_modeling.ipynb - Execute cells to train models and generate recommendations
- Open
-
View Results:
- Check
Optimization Results/for CSV outputs - Review
Visualizations/for charts and graphs - Read
business_narrative.txtfor executive summary
- Check
| Segment | Size | Characteristics | Recommended Pricing |
|---|---|---|---|
| Premium | 219 | High LTV/CAC, low churn | $2,630.83 |
| Standard | 760 | Balanced metrics | $1,753.88 |
| Growth | 1,538 | High potential | $1,403.11 |
| At-Risk | 1,705 | High churn risk | $1,052.33 |
- Churn Rate Target: โค20%
- LTV/CAC Ratio Target: โฅ3.0
- Model Performance:
- Churn Prediction AUC: 0.940
- Revenue Forecast Rยฒ: 1.000
Implement a four-tier pricing structure tailored to customer segments:
- Premium Tier: $2,630.83 for high-value customers
- Standard Tier: $1,753.88 as primary offering
- Growth Tier: $1,403.11 for price-sensitive market
- Retention Tier: $1,052.33 for at-risk customers
- Months 1-2: A/B test with 20% of new customers
- Months 3-4: Monitor churn, LTV, and conversion metrics
- Months 5-6: Full rollout based on results
- Ongoing: Continuous monitoring and quarterly reviews
- Analysis assumes price as primary driver; other factors (features, support, UX) also influence outcomes
- Real-world A/B testing required for validation
- External market factors not fully captured
- Customer survey validation recommended
- Incorporate competitive pricing data
- Add feature-based value pricing analysis
- Implement real-time churn prediction API
- Develop automated pricing recommendation dashboard
- Include seasonality and market trend analysis
This project is available for educational and research purposes.
Sanskriti Dabhade
- GitHub: @sanskritidabhade
- Data sources: Multiple SaaS business datasets
- Inspired by modern SaaS pricing optimization practices
- Built using open-source data science tools
Note: This analysis provides data-driven recommendations but should be validated through real-world testing before full implementation. Always consider market conditions, competitive landscape, and customer feedback when making pricing decisions.