SAS analytics projects showcasing end-to-end data analysis, visualization, regression modeling, statistical testing, PCA, and multicollinearity handling using SASHELP and real-world datasets.

markusndco/Analysis-SAS

📊 SAS Data Analysis & Modeling Projects

This repository contains a collection of end-to-end SAS analytics projects showcasing skills in data preparation, visualization, statistical analysis, and predictive modeling.
Each project applies SAS programming to real-world-style datasets from SASHELP and external sources, covering topics from operational HR analytics to sports statistics, retail data, and nutritional analysis.


📁 Projects Overview

🔹 End-to-End Data Analytics Project

  • Multi-dataset analysis demonstrating SAS statistical capabilities
  • Cholesterol-by-gender visualization using vertical bar charts
  • BMI computation and distribution analysis
  • Correlation analysis for sports performance metrics
  • Simple and multiple regression modeling with diagnostics
  • Multicollinearity detection and backward elimination for model refinement
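The BMI and cholesterol steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: it assumes the data come from `SASHELP.HEART` (height in inches, weight in pounds), and the output dataset name `work.heart_bmi` is hypothetical.

```sas
/* Assumption: SASHELP.HEART is the source; the project may use other data. */
data work.heart_bmi;                      /* hypothetical output name */
    set sashelp.heart;
    /* BMI from imperial units: 703 * pounds / inches squared */
    if Height > 0 and Weight > 0 then
        BMI = 703 * Weight / (Height ** 2);
run;

/* Mean cholesterol by gender as a vertical bar chart */
proc sgplot data=work.heart_bmi;
    vbar Sex / response=Cholesterol stat=mean datalabel;
    yaxis label="Mean cholesterol";
run;

/* Distribution of the computed BMI with a normal overlay */
proc sgplot data=work.heart_bmi;
    histogram BMI;
    density BMI / type=normal;
run;
```

Rows with a missing height or weight keep a missing `BMI`, and both plots skip them.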

🔹 Firefighter Overtime Pay Analysis

  • Created categorical OT pay variables
  • Conducted binomial proportion tests
  • Performed Chi-square tests for independence
  • Used one-way ANOVA with Tukey’s post-hoc comparisons

🔹 Baseball Player Salary Modeling (SASHELP.BASEBALL)

  • Built simple and multiple regression models for salary prediction
  • Evaluated diagnostics (residuals, VIF)
  • Applied stepwise regression to select predictors
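As a rough illustration of this workflow, the sketch below fits a full model with VIFs and then runs stepwise selection on `SASHELP.BASEBALL`. The predictor list and the 0.15 entry/stay thresholds are assumptions chosen for the example, not necessarily the ones used in the project.

```sas
ods graphics on;

proc reg data=sashelp.baseball plots(only)=diagnostics;
    /* Full model: VIF flags predictors that are highly collinear */
    full: model Salary = nHits nHome nRuns nRBI nBB YrMajor / vif;

    /* Stepwise selection over the same (assumed) candidate predictors */
    step: model Salary = nHits nHome nRuns nRBI nBB YrMajor
          / selection=stepwise slentry=0.15 slstay=0.15;
run;
quit;
```

Players with a missing `Salary` are dropped from both fits automatically.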

🔹 Car Price Prediction (SASHELP.CARS)

  • Developed multiple regression models for MSRP
  • Identified and addressed multicollinearity
  • Applied backward elimination to improve model stability
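One way to sketch the multicollinearity check and backward elimination on `SASHELP.CARS` is shown below. The candidate predictors and the 0.05 stay threshold are assumptions for illustration only.

```sas
proc reg data=sashelp.cars;
    /* Full model: VIF and collinearity diagnostics for the candidates */
    full: model MSRP = EngineSize Cylinders Horsepower
                       MPG_City MPG_Highway Weight Wheelbase Length
          / vif collin;

    /* Backward elimination: drop the weakest predictor until all
       remaining terms are significant at the assumed 0.05 level */
    back: model MSRP = EngineSize Cylinders Horsepower
                       MPG_City MPG_Highway Weight Wheelbase Length
          / selection=backward slstay=0.05;
run;
quit;
```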

🔹 Shoe Sales Visualization (SASHELP.SHOES)

  • Created product sales bar charts and sales distribution boxplots
  • Designed 3D pie charts for inventory composition
  • Enhanced graphical output with sorting, color customization, and labels
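The charts listed above could be produced along these lines. This is a hedged sketch: the choice of `Product` as the grouping variable, the fill color, and the chart options are assumptions, and `PROC GCHART` requires SAS/GRAPH to be licensed.

```sas
/* Total sales per product, sorted by descending total */
proc sgplot data=sashelp.shoes;
    vbar Product / response=Sales stat=sum
                   categoryorder=respdesc datalabel
                   fillattrs=(color=CX4682B4);
    yaxis label="Total sales";
run;

/* Distribution of sales within each product */
proc sgplot data=sashelp.shoes;
    vbox Sales / category=Product;
run;

/* 3D pie chart of inventory composition by product */
proc gchart data=sashelp.shoes;
    pie3d Product / sumvar=Inventory type=sum
                    percent=inside slice=outside noheading;
run;
quit;
```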

🔹 PCA Analysis – Pizza Nutritional Data

  • Performed multicollinearity checks
  • Applied Principal Components Analysis (PCA)
  • Compared full vs reduced models for efficiency and interpretability
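The full-versus-reduced comparison might be sketched as follows. The dataset `work.pizza`, the nutrient variable names, the response `cal`, and the choice of three components are all hypothetical, as the README does not specify them.

```sas
ods graphics on;

/* Full model on the raw nutrients; VIF exposes the collinearity */
proc reg data=work.pizza;                 /* hypothetical dataset */
    full: model cal = mois prot fat ash sodium carb / vif;
run;
quit;

/* PCA on the predictors; OUT= adds score columns Prin1-Prin3 */
proc princomp data=work.pizza out=work.pizza_pc n=3 plots=scree;
    var mois prot fat ash sodium carb;
run;

/* Reduced model on the uncorrelated component scores */
proc reg data=work.pizza_pc;
    reduced: model cal = Prin1 Prin2 Prin3;
run;
quit;
```

Because principal component scores are mutually uncorrelated, the reduced model has no multicollinearity among its predictors, at the cost of coefficients that are harder to interpret in terms of the original nutrients.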

🧠 Key Skills Demonstrated

  • Data Preparation: DATA step transformations, recoding variables
  • Statistical Testing: ANOVA, Chi-square, Binomial tests
  • Regression Modeling: simple, multiple, and stepwise regression
  • Dimensionality Reduction: PCA for multicollinearity mitigation
  • Visualization: PROC SGPLOT, PROC GCHART, PROC BOXPLOT
  • Diagnostics & Model Selection: R², p-values, VIF, residual plots

⚙️ Tools & Resources

  • SAS 9.4 / SAS Studio
  • Built-in datasets from the SASHELP library
  • Custom-created datasets for applied problem solving

👤 Author

Aryan Sharma
Data Analytics | SAS Programming | Predictive Modeling


📜 License

This repository is intended for educational and portfolio purposes.
If you use or adapt this work, please give proper credit.
