This repository contains the full preprocessing pipeline that extracts and transforms Whale Alert transaction data for use in cryptocurrency price prediction models. The pipeline covers filtering, cleaning, sentiment labeling, and feature engineering:
- Filter large transactions (over $500k)
- Normalize and clean transaction logs
- Heuristically label each transaction as Positive, Negative, or Neutral
- Add engineered features such as:
  - USD amount
  - Transfer direction (Inflow/Outflow)
  - Token type (BTC/ETH)
  - Time since last transfer
  - Daily whale count
- Prepare final dataset for use in ML models
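
The steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `preprocess`, the column names (`timestamp`, `symbol`, `amount_usd`, `from_owner_type`, `to_owner_type`), the extra `Other` direction value, and the exchange-based sentiment rule (transfers into an exchange labeled Negative, transfers out labeled Positive) are all assumptions made for the example.

```python
import pandas as pd

MIN_USD = 500_000  # "large transaction" threshold from the list above


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical sketch of the pipeline; column names are assumed."""
    out = df.copy()

    # Clean: parse Unix timestamps, normalize token symbols, drop incomplete rows.
    out["timestamp"] = pd.to_datetime(out["timestamp"], unit="s", utc=True)
    out["symbol"] = out["symbol"].str.upper()
    out = out.dropna(subset=["timestamp", "symbol", "amount_usd"])

    # Filter: keep BTC/ETH transfers above the USD threshold, in time order.
    keep = (out["amount_usd"] > MIN_USD) & out["symbol"].isin(["BTC", "ETH"])
    out = out[keep].sort_values("timestamp").reset_index(drop=True)

    # Transfer direction relative to exchanges (assumed definition).
    to_exchange = out["to_owner_type"].eq("exchange")
    from_exchange = out["from_owner_type"].eq("exchange")
    out["direction"] = "Other"  # neither clearly inflow nor outflow
    out.loc[to_exchange & ~from_exchange, "direction"] = "Inflow"
    out.loc[from_exchange & ~to_exchange, "direction"] = "Outflow"

    # Heuristic sentiment label (assumed rule, see lead-in).
    sentiment_by_direction = {"Inflow": "Negative", "Outflow": "Positive"}
    out["sentiment"] = out["direction"].map(sentiment_by_direction).fillna("Neutral")

    # Seconds since the previous transfer of the same token (NaN for the first).
    previous = out.groupby("symbol")["timestamp"].shift()
    out["secs_since_last"] = (out["timestamp"] - previous).dt.total_seconds()

    # Number of whale transfers on the same UTC calendar day.
    day = out["timestamp"].dt.floor("D")
    out["daily_whale_count"] = out.groupby(day)["timestamp"].transform("count")

    return out


# Tiny made-up sample, only to show the expected input shape.
raw = pd.DataFrame({
    "timestamp": [1700000000, 1700003600, 1700090000, 1700000100],
    "symbol": ["btc", "BTC", "eth", "BTC"],
    "amount_usd": [1_000_000, 750_000, 2_000_000, 100_000],
    "from_owner_type": ["unknown", "exchange", "unknown", "unknown"],
    "to_owner_type": ["exchange", "unknown", "unknown", "exchange"],
})
print(preprocess(raw)[["symbol", "direction", "sentiment", "daily_whale_count"]])
```

In this sample the $100k transfer is dropped by the threshold filter, and the remaining three rows gain the engineered columns. The real labeling heuristic and schema may differ, so treat the names here as placeholders to adapt.
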

Requirements:
- Python 3.8+
- pandas

Install dependencies:

    pip install pandas