This project processes financial transaction data using PySpark, cleans and enriches it, and prepares a dataset for visualization in Power BI.
It simulates missing amounts, calculates KPIs (cost, profit, profit margin), assigns risk segments, and outputs a clean CSV ready for business intelligence dashboards.

- Automated cleaning and enrichment of messy financial data.
- Simulates missing transaction amounts for realistic KPI calculations.
- Calculates KPIs: Total Sales, Profit, Profit Margin.
- Assigns transactions to risk segments (Low, Medium, High Risk).
- Prepares the dataset for Power BI visualization with a real-world business intelligence layout.
- Ready for GitHub as a Jupyter Notebook with Markdown documentation.
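
The enrichment steps listed above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual PySpark code: the field names (`amount`, `cost`, `profit_margin`), the median-based fill for missing amounts, the 70% cost ratio, and the amount-based risk thresholds are all assumptions chosen for the example.

```python
from statistics import median

# All constants below are illustrative assumptions, not values from the notebook.
ASSUMED_COST_RATIO = 0.70                   # cost assumed to be 70% of the amount
ASSUMED_RISK_THRESHOLDS = (100.0, 1000.0)   # Low < 100 <= Medium < 1000 <= High


def fill_missing_amounts(rows):
    """Replace missing (None) amounts with the median of the known amounts."""
    known = [r["amount"] for r in rows if r["amount"] is not None]
    fallback = median(known) if known else 0.0
    return [
        {**r, "amount": r["amount"] if r["amount"] is not None else fallback}
        for r in rows
    ]


def add_kpis(row):
    """Add cost, profit, and profit margin to one transaction."""
    amount = row["amount"]
    cost = round(amount * ASSUMED_COST_RATIO, 2)
    profit = round(amount - cost, 2)
    margin = round(profit / amount, 4) if amount else 0.0
    return {**row, "cost": cost, "profit": profit, "profit_margin": margin}


def risk_segment(amount):
    """Map an amount to a risk label using the assumed thresholds."""
    low, high = ASSUMED_RISK_THRESHOLDS
    if amount < low:
        return "Low Risk"
    if amount < high:
        return "Medium Risk"
    return "High Risk"


def enrich(rows):
    """Fill missing amounts, then attach KPIs and a risk segment to each row."""
    filled = fill_missing_amounts(rows)
    return [
        {**add_kpis(r), "risk_segment": risk_segment(r["amount"])}
        for r in filled
    ]


transactions = [
    {"id": 1, "amount": 50.0},
    {"id": 2, "amount": None},     # missing amount, filled with the median
    {"id": 3, "amount": 2000.0},
]
enriched = enrich(transactions)
for row in enriched:
    print(row)
```

In PySpark, the same logic would typically be expressed as column expressions (for example with `pyspark.sql.functions.when` for the risk labels) rather than per-row Python functions.
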

Repository structure:

    ├── Financial_KPI_Reporting.ipynb   # Main notebook
    ├── README.md                       # Project documentation
    ├── requirements.txt                # Python dependencies
    ├── Dashboard_image.png             # Static Power BI Dashboard image
    └── Dashboard_demo.md               # Power BI dashboard demo

- Clone this repository:

      git clone https://github.com/yourusername/financial-kpi-reporting.git
      cd financial-kpi-reporting

- Create a virtual environment & install dependencies:

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate
      pip install -r requirements.txt

- Launch Jupyter Notebook:

      jupyter notebook

- Open `Financial_KPI_Reporting_Cleaned_GitHub.ipynb`.
- Follow the cells step-by-step to process and clean your data.
- Export the final dataset and load it into Power BI for visualization.
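
The export step can be sketched with Python's standard `csv` module. This is only an illustration under assumptions: the column names and sample rows are hypothetical, and the notebook may well write the file directly from a Spark or Pandas DataFrame instead. An in-memory buffer stands in for the output file so the snippet runs without touching disk.

```python
import csv
import io


def write_kpi_csv(rows, handle):
    """Write a list of dict rows as CSV with a header row."""
    fieldnames = list(rows[0].keys())
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical enriched rows; the real dataset has its own columns.
rows = [
    {"id": 1, "amount": 50.0, "profit": 15.0, "risk_segment": "Low Risk"},
    {"id": 2, "amount": 2000.0, "profit": 600.0, "risk_segment": "High Risk"},
]

# In-memory stand-in for: open("some_output.csv", "w", newline="")
buffer = io.StringIO()
write_kpi_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

A file written this way can be loaded in Power BI Desktop through its Text/CSV data source.
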

The prepared dataset can be visualized in Power BI to create a Financial Performance & Risk Intelligence Dashboard, featuring:
- KPI Cards: Total Sales, Profit, Avg Profit Margin, Fraud Rate
- Risk & Fraud Analysis
- Profitability & Business Drivers
- Sales Forecasting with What-If Analysis
- Transaction-Level Drilldown
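
Before building the KPI cards, it can help to compute the same figures in Python and compare them with what Power BI shows. The sketch below is an assumption-laden example: the `is_fraud` flag and the other field names are hypothetical, and "Avg Profit Margin" is taken here to be the mean of per-transaction margins, which may differ from the measure defined in the dashboard.

```python
def kpi_cards(rows):
    """Compute headline figures matching the dashboard's KPI cards."""
    total_sales = sum(r["amount"] for r in rows)
    total_profit = sum(r["profit"] for r in rows)
    # Per-transaction margins; rows with a zero amount are skipped.
    margins = [r["profit"] / r["amount"] for r in rows if r["amount"]]
    avg_margin = sum(margins) / len(margins) if margins else 0.0
    fraud_count = sum(1 for r in rows if r["is_fraud"])
    fraud_rate = fraud_count / len(rows) if rows else 0.0
    return {
        "total_sales": round(total_sales, 2),
        "profit": round(total_profit, 2),
        "avg_profit_margin": round(avg_margin, 4),
        "fraud_rate": round(fraud_rate, 4),
    }


# Hypothetical sample rows for illustration only.
sample = [
    {"amount": 100.0, "profit": 30.0, "is_fraud": False},
    {"amount": 200.0, "profit": 50.0, "is_fraud": True},
    {"amount": 50.0, "profit": 10.0, "is_fraud": False},
    {"amount": 150.0, "profit": 60.0, "is_fraud": False},
]
cards = kpi_cards(sample)
print(cards)
```
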

Technologies used:

- Python 3.10
- PySpark
- Pandas
- Jupyter Notebook
- Power BI

This project uses the "Financial Transactions Dataset: Analytics" dataset from Kaggle, licensed under the Apache 2.0 license. Dataset link: https://www.kaggle.com/datasets/computingvictor/transactions-fraud-datasets/data