This repository contains a collection of end-to-end SAS analytics projects showcasing skills in data preparation, visualization, statistical analysis, and predictive modeling.
Each project applies SAS programming to realistic datasets drawn from the SASHELP library and external sources, with topics ranging from operational HR analytics to sports statistics, retail data, and nutritional analysis.
- Multi-dataset analysis demonstrating SAS statistical capabilities
- Cholesterol-by-gender visualization using vertical bar charts
- BMI computation and distribution analysis
- Correlation analysis for sports performance metrics
- Simple and multiple regression modeling with diagnostics
- Multicollinearity detection and backward elimination for model refinement
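
The first items in this list can be sketched roughly as follows. This is an illustration, not the project's actual code: it assumes `SASHELP.HEART` for the cholesterol and BMI steps and `SASHELP.BASEBALL` for the correlation step, neither of which this README names, and the output table `work.heart_bmi` is a hypothetical name.

```sas
/* Assumption: SASHELP.HEART is the source table; the README does not name it. */
data work.heart_bmi;
    set sashelp.heart;
    /* Height is in inches and Weight in pounds, hence the 703 factor. */
    /* Rows with missing or zero Height are left with a missing BMI.   */
    if Height > 0 then BMI = 703 * Weight / (Height**2);
run;

/* Mean cholesterol by gender as a vertical bar chart */
proc sgplot data=work.heart_bmi;
    vbar Sex / response=Cholesterol stat=mean datalabel;
    yaxis label="Mean cholesterol";
run;

/* Distribution of the computed BMI */
proc univariate data=work.heart_bmi;
    var BMI;
    histogram BMI / normal;
run;

/* Assumption: SASHELP.BASEBALL supplies the sports metrics. */
/* Pairwise Pearson correlations for a few batting statistics. */
proc corr data=sashelp.baseball;
    var nHits nHome nRuns nRBI;
run;
```
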
- Created categorical OT pay variables
- Conducted binomial proportion tests
- Performed Chi-square tests for independence
- Used one-way ANOVA with Tukey’s post-hoc comparisons
- Built simple and multiple regression models for salary prediction
- Evaluated diagnostics (residuals, VIF)
- Applied stepwise regression for optimal predictor selection
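
A minimal sketch of the recoding and hypothesis tests listed above, using a small made-up table. The dataset `work.hr`, its variable names, its values, and the overtime cut-offs are all hypothetical and stand in for the project's real data.

```sas
/* Hypothetical HR table; names, values, and cut-offs are illustrative only. */
data work.hr;
    length OT_Cat $6;
    input Dept $ Gender $ OT_Pay Salary;
    /* Recode continuous overtime pay into a categorical variable */
    if missing(OT_Pay) then OT_Cat = ' ';
    else if OT_Pay < 500 then OT_Cat = 'Low';
    else if OT_Pay < 1500 then OT_Cat = 'Medium';
    else OT_Cat = 'High';
    datalines;
Sales F 300 42000
Sales M 1600 51000
Sales F 900 47000
Sales M 450 43000
Ops F 1700 55000
Ops M 1200 52000
Ops F 800 49000
Ops M 2000 58000
IT F 200 61000
IT M 600 64000
IT F 1500 66000
IT M 1000 63000
;
run;

/* Binomial test: is the proportion of the first Gender level equal to 0.5? */
proc freq data=work.hr;
    tables Gender / binomial(p=0.5);
run;

/* Chi-square test of independence between department and OT category */
proc freq data=work.hr;
    tables Dept*OT_Cat / chisq;
run;

/* One-way ANOVA of salary by department with Tukey's post-hoc comparisons */
proc glm data=work.hr;
    class Dept;
    model Salary = Dept;
    means Dept / tukey;
run;
quit;
```

With a table this small, `PROC FREQ` will warn that expected cell counts are too low for a reliable chi-square test; the sketch shows only the syntax.
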
- Developed multiple regression models for MSRP
- Identified and addressed multicollinearity
- Applied backward elimination to improve model stability
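
One way the MSRP modeling steps above might look in `PROC REG`. This assumes the data is `SASHELP.CARS`, which contains an `MSRP` column but is not named in this README; the predictor list and the `slstay=0.05` threshold are illustrative choices.

```sas
/* Assumption: SASHELP.CARS is the source table; predictors are illustrative. */
proc reg data=sashelp.cars;
    /* Full model: the VIF option prints variance inflation factors, */
    /* which flag predictors that are strongly collinear.            */
    full: model MSRP = Horsepower EngineSize Cylinders Weight Wheelbase Length
          / vif;

    /* Backward elimination: start from the full model and drop the   */
    /* least significant predictor until all remaining p-values are   */
    /* below the SLSTAY threshold.                                    */
    reduced: model MSRP = Horsepower EngineSize Cylinders Weight Wheelbase Length
          / selection=backward slstay=0.05;
run;
quit;
```

Comparing the VIF column of the full model with the predictors that survive elimination shows whether the collinear variables were the ones removed.
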
- Created product sales bar charts and sales distribution boxplots
- Designed 3D pie charts for inventory composition
- Enhanced graphical output with sorting, color customization, and labels
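
The charts described above could be produced along these lines. `SASHELP.SHOES` is used here as a stand-in for the retail data, which this README does not identify, and the color and labels are arbitrary.

```sas
/* Assumption: SASHELP.SHOES stands in for the retail dataset. */

/* Total sales per product, sorted by descending total, with a custom */
/* fill color and data labels.                                        */
proc sgplot data=sashelp.shoes;
    vbar Product / response=Sales stat=sum categoryorder=respdesc
                   fillattrs=(color=cx4682B4) datalabel;
    xaxis label="Product";
    yaxis label="Total sales";
run;

/* Distribution of sales within each product category */
proc sgplot data=sashelp.shoes;
    vbox Sales / category=Product;
run;

/* 3D pie chart of inventory composition (PROC GCHART needs SAS/GRAPH) */
proc gchart data=sashelp.shoes;
    pie3d Product / sumvar=Inventory type=sum
                    percent=outside slice=outside;
run;
quit;
```
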
- Performed multicollinearity checks
- Applied Principal Components Analysis (PCA)
- Compared full vs reduced models for efficiency and interpretability
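
A rough sketch of the multicollinearity check, PCA, and full-versus-reduced comparison described above. The table `work.nutrition`, its columns, and its values are invented for illustration, and keeping two components is an arbitrary choice.

```sas
/* Hypothetical nutrition table; all names and values are made up. */
data work.nutrition;
    input Calories Fat SatFat Carbs Sugar;
    datalines;
250 9 3.5 31 6
410 22 9.0 38 12
120 2 0.5 20 8
530 28 11.0 47 9
300 12 4.0 36 14
180 5 1.5 27 10
460 24 8.5 44 15
90 1 0.2 17 5
350 15 6.0 40 18
220 8 2.5 30 7
;
run;

/* Full model with variance inflation factors to check multicollinearity */
proc reg data=work.nutrition;
    full: model Calories = Fat SatFat Carbs Sugar / vif;
run;
quit;

/* PCA on the predictors; OUT= adds the component scores Prin1 and Prin2. */
/* PROC PRINCOMP analyzes the correlation matrix by default.              */
proc princomp data=work.nutrition out=work.pc_scores n=2;
    var Fat SatFat Carbs Sugar;
run;

/* Reduced model: regress the response on the uncorrelated components */
proc reg data=work.pc_scores;
    reduced: model Calories = Prin1 Prin2;
run;
quit;
```

Because the component scores are uncorrelated by construction, the reduced model avoids inflated variances at the cost of coefficients that are harder to interpret in terms of the original variables.
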
- Data Preparation: `DATA` step transformations, recoding variables
- Statistical Testing: ANOVA, Chi-square, Binomial tests
- Regression Modeling: simple, multiple, and stepwise regression
- Dimensionality Reduction: PCA for multicollinearity mitigation
- Visualization: `PROC SGPLOT`, `PROC GCHART`, `PROC BOXPLOT`
- Diagnostics & Model Selection: R², p-values, VIF, residual plots
- SAS 9.4 / SAS Studio
- Built-in datasets from the `SASHELP` library
- Custom-created datasets for applied problem solving
Aryan Sharma
Data Analytics | SAS Programming | Predictive Modeling
This repository is intended for educational and portfolio purposes.
If you use or adapt this work, please give proper credit.