A Python application that fetches real-time cryptocurrency prices from CoinGecko API and stores them in a MySQL database using an incremental loading strategy.
- Real-time Price Fetching: Retrieves current cryptocurrency prices from CoinGecko API
- Incremental Data Loading: Smart data loading that updates existing records or inserts new ones
- Minute-level Precision: Stores prices with minute-level timestamps
- Duplicate Prevention: Uses UNIQUE constraints to prevent duplicate entries
- Automated Scheduling: Runs continuous price updates with configurable intervals
- Python 3.7+
- MySQL Server 5.7+
pip install mysql-connector-python requests pandas- Install MySQL Server
- Create a database named
crypto_data - Update connection credentials in the script
-
Clone or download the script
-
Install required packages:
pip install mysql-connector-python requests pandas
-
Configure database connection:
conn = mysql.connector.connect( host="localhost", user="your_username", password="your_password", database="crypto_data" )
-
Get CoinGecko API Key (optional but recommended):
- Visit CoinGecko API
- Replace the API_KEY in the script with your key
The application creates a crypto_prices table with the following structure:
| Column | Type | Description |
|---|---|---|
id |
INT AUTO_INCREMENT | Primary key |
symbol |
VARCHAR(20) | Cryptocurrency symbol (e.g., 'bitcoin') |
ts |
DATETIME(3) | Timestamp with millisecond precision |
price_usd |
DECIMAL(18,8) | Price in USD with high precision |
Unique Constraint: UNIQUE(symbol, ts) prevents duplicate entries for the same symbol at the same minute.
Incremental loading is a data integration pattern that only processes new or changed data since the last update, rather than reloading the entire dataset. This approach offers several benefits:
- Efficiency: Reduces processing time and resource usage
- Data Integrity: Maintains historical data while updating current values
- Scalability: Handles growing datasets without performance degradation
- Cost-Effective: Minimizes API calls and database operations
┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐
│ Fetch Price │───▶│ Check if Exists │───▶│ Update/Insert │
│ from API │ │ (symbol + ts) │ │ Decision │
└─────────────────┘ └──────────────────┘ └─────────────────┘
│ │
▼ ▼
┌──────────────────┐ ┌─────────────────┐
│ Record Exists? │ │ │
│ │ │ │
│ YES ──────────────▶ │ UPDATE │
│ │ │ existing │
│ NO ───────────────▶ │ INSERT new │
└──────────────────┘ └─────────────────┘
-
Timestamp Normalization:
ts_minute = datetime.now().replace(second=0, microsecond=0)
- Rounds timestamp to the nearest minute
- Ensures consistent time-based grouping
-
Existence Check:
cursor.execute("SELECT id FROM crypto_prices WHERE symbol = %s AND ts = %s", (symbol, ts_minute_str))
- Checks if a record already exists for the symbol at the current minute
-
Conditional Logic:
- If EXISTS: Updates the price (handles price changes within the same minute)
- If NOT EXISTS: Inserts a new record
- API Efficiency: Doesn't need to fetch historical data repeatedly
- Storage Optimization: Only stores one price per minute per cryptocurrency
- Real-time Updates: Latest prices overwrite older prices within the same minute
- Data Consistency: Maintains a clean, deduplicated dataset
python crypto_tracker.py-
Change Cryptocurrency:
incremental_load("ethereum") # Instead of bitcoin
-
Adjust Update Frequency:
time.sleep(60) # Update every minute instead of 20 seconds
-
Monitor Multiple Cryptocurrencies:
cryptos = ["bitcoin", "ethereum", "cardano"] for crypto in cryptos: incremental_load(crypto)
--- Starting Incremental Load Demo ---
Inserted: bitcoin at 2024-01-15 10:30:00.000 = $42150.25
Updated: bitcoin at 2024-01-15 10:30:00.000 = $42155.50 (id 1)
Updated: bitcoin at 2024-01-15 10:30:00.000 = $42148.75 (id 1)
Inserted: bitcoin at 2024-01-15 10:31:00.000 = $42160.00
Latest rows in DB:
id symbol ts price_usd
0 2 bitcoin 2024-01-15 10:31:00.000 42160.00000000
1 1 bitcoin 2024-01-15 10:30:00.000 42148.75000000
- Host: MySQL server location
- Port: Default 3306
- Timeout: Connection timeout settings
- Rate Limiting: CoinGecko allows 10-50 calls/minute depending on plan
- Retry Logic: Add exponential backoff for failed requests
- Multiple Endpoints: Support different cryptocurrency data sources
-
Network Issues:
try: resp = requests.get(url, params=params, headers=HEADERS, timeout=10) resp.raise_for_status() except requests.exceptions.RequestException as e: print(f"API request failed: {e}") return None
-
Database Connection:
try: conn.ping(reconnect=True, attempts=3, delay=5) except mysql.connector.Error as e: print(f"Database connection lost: {e}")
- Add support for multiple cryptocurrencies
- Implement data validation and cleansing
- Add logging with different severity levels
- Create data visualization dashboard
- Implement alert system for price thresholds
- Add data export functionality (CSV, JSON)
- Implement data retention policies