Repository: Linear Regression Ecommerce
Notebook: Linear Regression Ecommerce.ipynb
Predict Yearly Amount Spent by e-commerce customers using behavioral and membership features. Perform exploratory data analysis, fit a regression model, validate results, and provide business recommendations.
- Model performance (test set):
  - R² = 0.981
  - RMSE ≈ 10.19
  - MAE ≈ 8.43
- Top drivers of spend:
  - Length of Membership (strongest predictor)
  - Time on App
  - Avg. Session Length
  - Time on Website → negligible impact
- Business implication: Loyalty (membership) and mobile app engagement are the biggest levers to grow revenue.
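For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how R², RMSE, and MAE are conventionally computed from actual and predicted values. The function name `regression_metrics` and the sample numbers are illustrative assumptions; they are not taken from the notebook, which presumably relies on scikit-learn's metric functions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R², RMSE, and MAE for paired actual/predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)            # variation left unexplained
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total variation around the mean
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


# Made-up values for illustration only (not the Ecommerce data).
actual = [500.0, 450.0, 600.0, 550.0]
predicted = [510.0, 440.0, 590.0, 560.0]
metrics = regression_metrics(actual, predicted)
```

RMSE and MAE are in the same units as the target, so an RMSE near 10 means typical prediction errors of roughly 10 units of Yearly Amount Spent.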
- Algorithm: Linear Regression (scikit-learn)
- Features used: `['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']`
- Target: `Yearly Amount Spent`
- Train/Test split: 70/30, random_state=42
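The setup above could be reproduced roughly as follows. This is a sketch under stated assumptions, not the notebook's actual code: the helper name `fit_spend_model` and the shape of its return value are hypothetical, while the feature list, target, split ratio, and random seed come from the description above.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

FEATURES = ["Avg. Session Length", "Time on App", "Time on Website", "Length of Membership"]
TARGET = "Yearly Amount Spent"


def fit_spend_model(df):
    """Fit a linear regression on a 70/30 split and report test-set metrics.

    Hypothetical helper: the notebook may organize these steps differently.
    """
    X = df[FEATURES]
    y = df[TARGET]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=42
    )
    model = LinearRegression().fit(X_train, y_train)
    y_pred = model.predict(X_test)
    metrics = {
        "r2": r2_score(y_test, y_pred),
        "rmse": mean_squared_error(y_test, y_pred) ** 0.5,  # square root of MSE
        "mae": mean_absolute_error(y_test, y_pred),
    }
    # Coefficients sorted from largest to smallest, labeled by feature name.
    coefficients = pd.Series(model.coef_, index=FEATURES).sort_values(ascending=False)
    return model, metrics, coefficients
```

With the dataset in place, a call such as `fit_spend_model(pd.read_csv("Ecommerce.csv"))` would return the fitted model, the test metrics, and the ranked coefficients. Because the features are on different scales, raw coefficient size is only a rough guide to importance.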
- Retention drives revenue → Longer memberships strongly correlate with higher spend.
- Mobile app is key → App engagement is a bigger driver than website usage.
- Optimize website funnel → Direct web users toward app for higher conversions.
- Target high-value users → Use the model to segment and prioritize premium offers.
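One way the segmentation idea could look in practice is sketched below. It assumes predicted spend values are already available from the fitted model; the function name `flag_high_value` and the 20% cutoff are hypothetical choices for illustration, not something the notebook specifies.

```python
import numpy as np


def flag_high_value(predicted_spend, top_fraction=0.2):
    """Return a boolean mask for the top share of customers by predicted spend.

    `top_fraction` is an assumed business threshold, not a value from the analysis.
    """
    predicted_spend = np.asarray(predicted_spend, dtype=float)
    cutoff = np.quantile(predicted_spend, 1 - top_fraction)
    return predicted_spend >= cutoff, cutoff


# Made-up predictions for five customers.
mask, cutoff = flag_high_value([100, 200, 300, 400, 500], top_fraction=0.2)
```

The mask can then be used to select the customers who would receive premium offers first.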
- Dataset is small (500 rows), so the results may not generalize.
- Coefficients show correlation, not causation.
- Multicollinearity possible (website/app/session features).
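The multicollinearity concern can be checked with variance inflation factors (VIF). The sketch below computes them with NumPy by regressing each feature on the others; it is an illustrative diagnostic the notebook does not necessarily include, and the function name `variance_inflation_factors` is an assumption.

```python
import numpy as np


def variance_inflation_factors(X):
    """Compute VIF for each column: 1 / (1 - R²) from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Prepend a column of ones so the auxiliary regression has an intercept.
        design = np.column_stack([np.ones(len(X)), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        ss_res = residuals @ residuals
        ss_tot = ((target - target.mean()) ** 2).sum()
        r2 = 1 - ss_res / ss_tot
        # A perfectly collinear column gives r2 == 1 and an infinite VIF.
        vifs.append(1.0 / (1.0 - r2))
    return vifs
```

Passing the four feature columns as a 2-D array yields one value per feature. A VIF near 1 suggests little overlap with the other features; values above roughly 5 to 10 are a common rule-of-thumb warning sign.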
- Clone this repository.
- Ensure `Ecommerce.csv` and the notebook are in the root directory.
- Install dependencies: `pip install pandas matplotlib seaborn scikit-learn scipy jupyter`
- Open and run `Linear Regression Ecommerce.ipynb`.
- `Linear Regression Ecommerce.ipynb`: full analysis notebook
- `Dataset/`: dataset
- `images/`: visualizations for README
- `Ecommerce.csv`: dataset (500 rows)
- Built and validated a regression model to predict annual customer spend; achieved R² = 0.981 and RMSE ≈ $10 on the test set.
- Delivered actionable insights: retention and app engagement as primary drivers of revenue.
- Tools: Python (pandas, scikit-learn, seaborn), EDA, regression modeling, and diagnostics.



