Author: Eric Traccitto
Master’s Student, Information Technology
York University, Canada 🇨🇦
erictrac@my.yorku.ca
This project presents a comprehensive analysis of GenAI adoption and its impact on enterprise productivity across various industries and implementation strategies. The study evaluates the effectiveness of different machine learning approaches for predicting productivity outcomes based on GenAI adoption factors.
The research integrates multiple analytical techniques including exploratory data analysis, clustering, regression modeling, and classification to identify the most effective predictive approach for enterprise productivity forecasting.
Three primary machine learning approaches were benchmarked:
| Approach | Model Type | Key Features | Outcome |
|---|---|---|---|
| Regression Analysis | Linear Regression, Random Forest, XGBoost | Predicts a continuous target variable | Not Suitable |
| Clustering | K-Means | Unsupervised pattern discovery | Useful for Class Creation |
| Classification | Logistic Regression, Random Forest, XGBoost | Multi-class prediction | Most Effective |
Each approach was evaluated on:
- Predictive accuracy
- Model interpretability
- Performance with imbalanced data
The dataset consists of enterprise GenAI adoption metrics across multiple industries and countries.
| File | Description |
|---|---|
| Enterprise_GenAI_Adoption_Impact.csv | Raw enterprise adoption data |
ITEC6310 Final Project/
├── data/
│ ├── Enterprise_GenAI_Adoption_Impact.csv
│ ├── GenAI_Adoption_ML_Ready_CLEAN.csv
│ ├── regression_data_prepared.csv
│ └── [grid search results]
├── notebooks/
│ ├── Benchmarking.ipynb # EDA, clustering, visualizations
│ ├── prediction_analysis.ipynb # Full ML pipeline
│ ├── Tester.ipynb # Additional testing
│ └── [connection utilities]
├── results/
│ ├── lr_grid_search_results.csv
│ ├── rf_grid_search_results.csv
│ ├── xgb_grid_search_results.csv
│ └── [performance metrics]
├── requirements.txt
└── README.md
- Comprehensive feature engineering including efficiency ratios and combination features
- Handling of categorical variables through encoding strategies
- Creation of interaction terms and business-relevant metrics
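The three feature-engineering steps above could look roughly like the following sketch. The column names and values are hypothetical placeholders, not the dataset's actual schema, and the specific ratio and interaction term are illustrative choices rather than the ones used in the notebooks.

```python
import pandas as pd

# Hypothetical columns and values; the real dataset's schema may differ.
df = pd.DataFrame({
    "industry": ["Finance", "Retail", "Finance", "Healthcare"],
    "employees_impacted": [1200, 300, 800, 450],
    "new_roles_created": [12, 3, 16, 9],
    "training_hours": [400.0, 90.0, 320.0, 150.0],
})

# Efficiency ratio: new roles created per 100 impacted employees.
df["roles_per_100_employees"] = (
    100 * df["new_roles_created"] / df["employees_impacted"]
)

# Interaction term: product of two numeric adoption drivers.
df["training_x_roles"] = df["training_hours"] * df["new_roles_created"]

# Encode the categorical column as one-hot indicator columns.
features = pd.get_dummies(df, columns=["industry"], prefix="industry")

print(features.columns.tolist())
```

One-hot encoding is only one possible encoding strategy; ordinal or target encoding may suit high-cardinality columns such as country better.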
- Initial Regression Analysis - Testing continuous prediction models
- Clustering Analysis - Identifying natural groupings in productivity impact
- Classification Modeling - Multi-class prediction of productivity clusters
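A minimal sketch of how the clustering and classification stages might connect, using scikit-learn on synthetic data. It assumes the classes are created by running K-Means on the continuous productivity target alone and that three clusters are used; both are assumptions for illustration, as is the choice of a random forest with balanced class weights.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data: 300 rows, 4 numeric features, one continuous target.
X = rng.normal(size=(300, 4))
productivity = 10 + 3 * X[:, 0] + rng.normal(scale=2.0, size=300)

# Step 1: cluster the continuous target into discrete groups (assumed k=3).
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
classes = kmeans.fit_predict(productivity.reshape(-1, 1))

# Step 2: treat the cluster labels as classes and train a classifier on X.
X_train, X_test, y_train, y_test = train_test_split(
    X, classes, test_size=0.25, stratify=classes, random_state=0
)
clf = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)
clf.fit(X_train, y_train)

# Step 3: score with macro-averaged F1 so every class counts equally.
macro_f1 = f1_score(y_test, clf.predict(X_test), average="macro")
print(f"macro F1: {macro_f1:.2f}")
```

The split is stratified so that each cluster keeps roughly the same share in the training and test sets, which matters when some clusters are small.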
- Regression: R-squared, RMSE
- Classification: F1-macro, precision, recall, accuracy
- Clustering: Cluster interpretability
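For reference, the regression and classification metrics listed above can be computed by hand as in this sketch. It uses only the standard library and toy values; the function names are illustrative, and in practice library implementations such as those in scikit-learn would normally be used instead.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy imbalanced example: a model that always predicts the majority class.
y_true = ["low"] * 8 + ["high"] * 2
y_pred = ["low"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(accuracy)                  # 0.8
print(macro_f1(y_true, y_pred))  # about 0.444
```

The toy example shows why F1-macro is the more informative metric under class imbalance: accuracy looks good at 0.8 even though the minority class is never predicted, while F1-macro drops to about 0.44.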
- Results were plotted using Matplotlib and Pandas.
- Productivity patterns exhibit categorical rather than continuous behavior
- Non-linear relationships dominate the feature space
- Class imbalance significantly impacts model performance
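One common way to counter class imbalance is to weight each class inversely to its frequency. The sketch below uses the "balanced" heuristic, weight = n_samples / (n_classes * class_count), on toy labels. Whether the project computed its class weights this way is an assumption; the helper name is illustrative.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Toy labels: class 0 is common, class 2 is rare.
labels = [0] * 6 + [1] * 3 + [2] * 1

weights = balanced_class_weights(labels)

# Expand to one weight per training row.
sample_weights = [weights[label] for label in labels]

for cls in sorted(weights):
    print(cls, round(weights[cls], 3))
```

Per-row weights like these can typically be passed to an estimator through a `sample_weight` argument at fit time, so that errors on rare classes cost more during training.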
1️⃣ Clone the Repository
git clone https://github.com/erictraccitto/genai-adoption-benchmarking.git
2️⃣ Create a Virtual Environment
python -m venv .venv
source .venv/bin/activate # Mac/Linux
.venv\Scripts\activate # Windows
3️⃣ Install Dependencies
pip install -r requirements.txt
(Optional) Freeze Exact Versions
pip freeze > requirements.txt
This study provides a comparative foundation for understanding how different machine learning approaches perform in predicting enterprise productivity outcomes from GenAI adoption data.
Key Results
- Best Performing Model: XGBoost Classifier with class weighting
- Primary Metric: F1-macro score of ~0.30
- Key Insight: Classification outperforms regression for this problem domain
- Business Impact: Ability to predict enterprise success categories from adoption patterns
- Expand the feature set with additional enterprise metrics
- Test ensemble methods and advanced classification techniques
- Incorporate time-series analysis for longitudinal studies
- Explore interpretable AI methods for business decision support
- Extend to different industry verticals and geographic regions
© 2025 Eric Traccitto — York University