
Data-driven strategy for GenAI integration. Analysis of 100k companies reveals that strategic alignment, not budget size, drives successful AI adoption. Features ML clustering, XGBoost classification, and BERT sentiment analysis.


erictracc/ITEC6310-Final-Project


ITEC6310 Final Project: GenAI Adoption Benchmarking

Author: Eric Traccitto
Master’s Student, Information Technology
York University, Canada 🇨🇦
erictrac@my.yorku.ca


Overview

This project presents a comprehensive analysis of GenAI adoption and its impact on enterprise productivity across various industries and implementation strategies. The study evaluates the effectiveness of different machine learning approaches for predicting productivity outcomes based on GenAI adoption factors.

The research integrates multiple analytical techniques including exploratory data analysis, clustering, regression modeling, and classification to identify the most effective predictive approach for enterprise productivity forecasting.


Systems and Methods Evaluated

Three primary machine learning approaches were benchmarked:

Model Comparison Summary

| Approach | Model Type | Key Features | Outcome |
|---|---|---|---|
| Regression Analysis | Linear Regression, Random Forest, XGBoost | Predicts a continuous target variable | Not Suitable |
| Clustering | K-Means | Unsupervised pattern discovery | Useful for Class Creation |
| Classification | Logistic Regression, Random Forest, XGBoost | Multi-class prediction | Most Effective |

Each system was evaluated on:

  • Predictive accuracy

  • Model interpretability

  • Performance with imbalanced data


Dataset

The dataset consists of enterprise GenAI adoption metrics across multiple industries and countries.

| File | Description |
|---|---|
| Enterprise_GenAI_Adoption_Impact.csv | Raw enterprise adoption data |

Repository Structure

ITEC6310 Final Project/
├── data/
│   ├── Enterprise_GenAI_Adoption_Impact.csv
│   ├── GenAI_Adoption_ML_Ready_CLEAN.csv
│   ├── regression_data_prepared.csv
│   └── [grid search results]
├── notebooks/
│   ├── Benchmarking.ipynb          # EDA, clustering, visualizations
│   ├── prediction_analysis.ipynb   # Full ML pipeline
│   ├── Tester.ipynb                # Additional testing
│   └── [connection utilities]
├── results/
│   ├── lr_grid_search_results.csv
│   ├── rf_grid_search_results.csv
│   ├── xgb_grid_search_results.csv
│   └── [performance metrics]
├── requirements.txt
└── README.md

Methodology

Data Preprocessing

  • Comprehensive feature engineering including efficiency ratios and combination features
  • Handling of categorical variables through encoding strategies
  • Creation of interaction terms and business-relevant metrics
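The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`training_hours`, `employees_impacted`, `new_roles_created`, `industry`) and the toy rows are hypothetical and are not taken from the real CSV.

```python
import pandas as pd

# Toy rows with hypothetical column names (assumed, not from the real dataset).
df = pd.DataFrame({
    "industry": ["Finance", "Retail", "Finance"],
    "training_hours": [1200.0, 300.0, 800.0],
    "employees_impacted": [400, 150, 200],
    "new_roles_created": [10, 3, 8],
})

# Efficiency ratio: training hours per impacted employee.
df["training_per_employee"] = df["training_hours"] / df["employees_impacted"]

# Interaction term: product of two numeric features.
df["training_x_roles"] = df["training_hours"] * df["new_roles_created"]

# One-hot encode the categorical column; one of several possible encodings.
encoded = pd.get_dummies(df, columns=["industry"], prefix="industry")
```

One-hot encoding is only one option here; the source says "encoding strategies" without naming which ones were used.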

Analytical Approach

  • Initial Regression Analysis - Testing continuous prediction models
  • Clustering Analysis - Identifying natural groupings in productivity impact
  • Classification Modeling - Multi-class prediction of productivity clusters
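The second and third steps, where clusters become the classes a classifier predicts, might look like the sketch below. It uses synthetic data and assumes K-Means is run on the continuous productivity value alone with three clusters; the feature set, the value of `k`, and the choice of Logistic Regression are illustrative assumptions rather than the project's confirmed settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data: 300 rows, 4 features, one continuous target.
X = rng.normal(size=(300, 4))
productivity = 10 * X[:, 0] + rng.normal(scale=2.0, size=300)

# Step 1: cluster the continuous target to create discrete class labels.
k = 3  # assumed number of clusters
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
labels = kmeans.fit_predict(productivity.reshape(-1, 1))

# Step 2: train a multi-class classifier to predict the cluster label.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels
)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

Stratifying the split keeps the class proportions similar in the training and test sets, which matters when the clusters are of unequal size.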

Evaluation Metrics

  • Regression: R-squared, RMSE
  • Classification: F1-macro, precision, recall, accuracy
  • Clustering: Cluster interpretability
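For reference, the regression metrics and F1-macro listed above can be written out in plain Python. This is a from-scratch sketch of the standard definitions for illustration; the function names are hypothetical, and the project itself may well compute these through a library such as scikit-learn.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def f1_macro(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted members scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because F1-macro averages the classes without weighting by size, a model that ignores a rare class is penalized as heavily as one that ignores a common class, which is why it is a common choice for imbalanced problems.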

Visualization & Analysis

  • Results were plotted using Matplotlib and Pandas.

Key Findings

  • Productivity patterns exhibit categorical rather than continuous behavior
  • Non-linear relationships dominate the feature space
  • Class imbalance significantly impacts model performance
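One common way to counter class imbalance is to weight each sample inversely to its class frequency. The sketch below implements the "balanced" heuristic, `n_samples / (n_classes * class_count)`, in plain Python. The helper names and the example labels are hypothetical, and the source does not state which weighting scheme the project actually used.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}

def sample_weights(labels):
    """Expand per-class weights into one weight per sample."""
    weights = balanced_class_weights(labels)
    return [weights[label] for label in labels]

# Hypothetical imbalanced labels: 6 "low", 3 "mid", 1 "high".
example_labels = ["low"] * 6 + ["mid"] * 3 + ["high"] * 1
example_weights = balanced_class_weights(example_labels)
```

Per-sample weights of this kind can typically be passed to a classifier's `fit` method through a `sample_weight` argument, so that errors on rare classes count for more during training.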

Installation

1. Clone the Repository

git clone https://github.com/erictracc/ITEC6310-Final-Project.git

2. Create a Virtual Environment

python -m venv .venv
source .venv/bin/activate       # Mac/Linux
.venv\Scripts\activate          # Windows

3. Install Dependencies

pip install -r requirements.txt

(Optional) Freeze Exact Versions

pip freeze > requirements.txt

Research Contribution

This study provides a comparative foundation for understanding how different machine learning approaches perform in predicting enterprise productivity outcomes from GenAI adoption data.

Key Results

  • Best Performing Model: XGBoost Classifier with class weighting
  • Primary Metric: F1-macro score of ~0.30
  • Key Insight: Classification outperforms regression for this problem domain
  • Business Impact: Ability to predict enterprise success categories from adoption patterns

Future Directions

  • Expand feature set with additional enterprise metrics
  • Test ensemble methods and advanced classification techniques
  • Incorporate time-series analysis for longitudinal studies
  • Explore interpretable AI methods for business decision support
  • Extend to different industry verticals and geographic regions

© 2025 Eric Traccitto — York University
