This project is a general-purpose anomaly detection tool built on unsupervised machine learning and designed to work on any CSV dataset. Whether you are detecting fraud in financial transactions or identifying abnormal behavior in refinery equipment, it lets you clean data, apply Isolation Forest, and flag anomalies, all without labeled data.
✅ Bonus: Includes a Streamlit GUI that allows users to upload their own datasets and detect anomalies interactively.
- 📌 Goal: Automatically detect anomalies (outliers) in tabular data where labels like “Normal” or “Faulty” are not available
- 🧠 ML Algorithm: Isolation Forest (Unsupervised Learning)
- 🛠️ Use Cases:
  - Industrial asset monitoring (refinery sensor logs)
  - Credit card fraud detection
  - Network/server failure prediction
  - IoT device anomaly tracking
- ⚙️ Tech Stack: Python, Pandas, Scikit-learn, Seaborn, Streamlit, Jupyter Notebook
    unsupervised-anomaly-detector/
    ├── streamlit_app.py
    ├── requirements.txt   # Dependencies
    └── README.md          # Project documentation
- User Input: Upload any tabular `.csv` dataset (e.g., equipment logs, transaction records, sensor data).
- Preprocessing: Cleans missing values, scales numerical data.
- Anomaly Detection: Applies Isolation Forest to detect anomalies based on data behavior.
- Output: Labels rows as `"Normal"` or `"Anomaly"` and optionally exports the result.
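
The four steps above can be sketched in a few lines of pandas and scikit-learn. This is a minimal illustration under stated assumptions, not the project's actual code: the function name `detect_anomalies`, the `Anomaly_Label` column, median imputation, and the `contamination=0.05` default are all hypothetical choices made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler


def detect_anomalies(df: pd.DataFrame, contamination: float = 0.05,
                     random_state: int = 42) -> pd.DataFrame:
    """Return a copy of df with an added 'Anomaly_Label' column (hypothetical name)."""
    # Only numeric columns are used; columns with no values at all are dropped.
    numeric = df.select_dtypes(include="number").dropna(axis=1, how="all")
    if numeric.empty:
        raise ValueError("No usable numeric columns found.")

    # Preprocessing: fill missing values with the column median (an assumption),
    # then standardise each column to zero mean and unit variance.
    numeric = numeric.fillna(numeric.median())
    scaled = StandardScaler().fit_transform(numeric)

    # Isolation Forest: fit_predict returns -1 for outliers and 1 for inliers.
    model = IsolationForest(contamination=contamination, random_state=random_state)
    predictions = model.fit_predict(scaled)

    # Output: attach human-readable labels without modifying the input frame.
    result = df.copy()
    result["Anomaly_Label"] = np.where(predictions == -1, "Anomaly", "Normal")
    return result
```

With a file on disk, something like `detect_anomalies(pd.read_csv("your_file.csv"))` would return the original rows plus the label column. Note that `contamination` fixes the share of rows flagged, so it should reflect how many anomalies you roughly expect.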
Tool | Role |
---|---|
Python | Core programming language |
Pandas | Data manipulation |
Scikit-learn | ML algorithm (Isolation Forest) |
Seaborn | Visualizations |
Streamlit | GUI for interactive anomaly detection |
Jupyter | Prototyping and step-by-step learning |
This app allows non-technical users to upload any CSV file and run anomaly detection interactively.
- Upload any `.csv` file
- Auto-cleaning of missing/null values
- Scaling of numeric features
- Anomaly detection using Isolation Forest
- Visual summary of results
- Downloadable output with anomaly labels
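
As a rough illustration of the last two features, the sketch below builds a small results summary and serialises labeled rows to CSV bytes. It is not taken from the app itself: the helper names `summarize_labels` and `to_csv_bytes` and the `Anomaly_Label` column are assumptions made for the example.

```python
import pandas as pd


def summarize_labels(labeled: pd.DataFrame,
                     label_col: str = "Anomaly_Label") -> pd.DataFrame:
    """Count rows per label and express each count as a percentage."""
    counts = labeled[label_col].value_counts()
    percent = (counts / len(labeled) * 100).round(2)
    return pd.DataFrame({"rows": counts, "percent": percent})


def to_csv_bytes(labeled: pd.DataFrame) -> bytes:
    """Serialise the labeled rows as UTF-8 CSV without the index column."""
    return labeled.to_csv(index=False).encode("utf-8")


# Tiny hand-made example; the labels here are illustrative, not model output.
sample = pd.DataFrame({
    "amount": [10.0, 12.5, 9.8, 950.0],
    "Anomaly_Label": ["Normal", "Normal", "Normal", "Anomaly"],
})
summary = summarize_labels(sample)
csv_bytes = to_csv_bytes(sample)
```

In a Streamlit script, bytes like these could be passed as the `data` argument of `st.download_button` with `mime="text/csv"`, and the summary frame could be shown with `st.dataframe`.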
cd streamlit_app
streamlit run app.py
- Clone the Repository
git clone https://github.com/ArshSharan/Unsupervised-Anomaly-Detection.git
cd Unsupervised-Anomaly-Detection
- Install Dependencies
pip install -r requirements.txt
- Run the Notebook
jupyter notebook
# Open `Unsupervised_Detection.ipynb`
- Launch Streamlit App
cd streamlit_app
streamlit run app.py
Domain | Example Use |
---|---|
🔧 Oil & Gas | Detect abnormal equipment behavior using sensor logs |
💳 Finance | Spot fraudulent transactions |
📶 IoT Devices | Monitor for unexpected spikes |
🏭 Manufacturing | Detect process anomalies |
- Add One-Class SVM and Autoencoders for comparison
- Feature importance and SHAP visualizations
- Live data ingestion with MQTT or REST APIs
- Deployment on cloud (Heroku, Streamlit Cloud, etc.)
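
For the first item, one possible shape of the comparison is sketched below using scikit-learn's `OneClassSVM` alongside `IsolationForest`. Nothing here exists in the project yet: the function name `compare_detectors`, the shared 5% outlier fraction, and the RBF kernel settings are assumptions, and autoencoders are not covered.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def compare_detectors(X, expected_outlier_fraction=0.05, random_state=42):
    """Fit both detectors on the same scaled data and report how often they agree."""
    X_scaled = StandardScaler().fit_transform(X)
    detectors = {
        "isolation_forest": IsolationForest(
            contamination=expected_outlier_fraction, random_state=random_state
        ),
        # nu is an upper bound on the fraction of training errors,
        # so it plays a role similar to contamination.
        "one_class_svm": OneClassSVM(
            kernel="rbf", gamma="scale", nu=expected_outlier_fraction
        ),
    }
    # Both estimators return -1 for outliers; convert to boolean flags.
    flags = {name: det.fit_predict(X_scaled) == -1 for name, det in detectors.items()}
    # Share of rows on which the two detectors give the same verdict.
    agreement = float(np.mean(flags["isolation_forest"] == flags["one_class_svm"]))
    return flags, agreement
```

Rows flagged by only one of the two detectors are often the most interesting ones to inspect by hand.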
- Data science learners.
- ML engineers building tools.
- Interns working on anomaly detection projects.
Give it a ⭐ if it helped you or inspired you to build something cool!