Skip to content

It fetches real-time cryptocurrency prices from CoinGecko API and stores them in a MySQL database using an incremental loading strategy.

Notifications You must be signed in to change notification settings

abshek7/LOAD_DEMO

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 

Repository files navigation

Crypto Price Tracker with Incremental Data Loading

A Python application that fetches real-time cryptocurrency prices from CoinGecko API and stores them in a MySQL database using an incremental loading strategy.

Features

  • Real-time Price Fetching: Retrieves current cryptocurrency prices from CoinGecko API
  • Incremental Data Loading: Smart data loading that updates existing records or inserts new ones
  • Minute-level Precision: Stores prices with minute-level timestamps
  • Duplicate Prevention: Uses UNIQUE constraints to prevent duplicate entries
  • Automated Scheduling: Runs continuous price updates with configurable intervals

Requirements

Software Dependencies

  • Python 3.7+
  • MySQL Server 5.7+

Python Packages

pip install mysql-connector-python requests pandas

Database Setup

  1. Install MySQL Server
  2. Create a database named crypto_data
  3. Update connection credentials in the script

Installation

  1. Clone or download the script

  2. Install required packages:

    pip install mysql-connector-python requests pandas
  3. Configure database connection:

    conn = mysql.connector.connect(
        host="localhost",
        user="your_username",
        password="your_password",     
        database="crypto_data"
    )
  4. Get CoinGecko API Key (optional but recommended):

    • Visit CoinGecko API
    • Replace the API_KEY in the script with your key

Database Schema

The application creates a crypto_prices table with the following structure:

Column Type Description
id INT AUTO_INCREMENT Primary key
symbol VARCHAR(20) Cryptocurrency symbol (e.g., 'bitcoin')
ts DATETIME(3) Timestamp with millisecond precision
price_usd DECIMAL(18,8) Price in USD with high precision

Unique Constraint: UNIQUE(symbol, ts) prevents duplicate entries for the same symbol at the same minute.

Incremental Data Loading Concept

What is Incremental Loading?

Incremental loading is a data integration pattern that only processes new or changed data since the last update, rather than reloading the entire dataset. This approach offers several benefits:

  • Efficiency: Reduces processing time and resource usage
  • Data Integrity: Maintains historical data while updating current values
  • Scalability: Handles growing datasets without performance degradation
  • Cost-Effective: Minimizes API calls and database operations

How It Works in This Application

┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Fetch Price   │───▶│  Check if Exists │───▶│  Update/Insert  │
│  from API       │    │  (symbol + ts)   │    │   Decision      │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                 │                        │
                                 ▼                        ▼
                       ┌──────────────────┐    ┌─────────────────┐
                       │   Record Exists? │    │                 │
                       │                  │    │                 │
                       │   YES ──────────────▶ │   UPDATE        │
                       │                  │    │   existing      │
                       │   NO ───────────────▶ │   INSERT new    │
                       └──────────────────┘    └─────────────────┘

Implementation Details

  1. Timestamp Normalization:

    ts_minute = datetime.now().replace(second=0, microsecond=0)
    • Rounds timestamp to the nearest minute
    • Ensures consistent time-based grouping
  2. Existence Check:

    cursor.execute("SELECT id FROM crypto_prices WHERE symbol = %s AND ts = %s", 
                   (symbol, ts_minute_str))
    • Checks if a record already exists for the symbol at the current minute
  3. Conditional Logic:

    • If EXISTS: Updates the price (handles price changes within the same minute)
    • If NOT EXISTS: Inserts a new record

Benefits in This Context

  • API Efficiency: Doesn't need to fetch historical data repeatedly
  • Storage Optimization: Only stores one price per minute per cryptocurrency
  • Real-time Updates: Latest prices overwrite older prices within the same minute
  • Data Consistency: Maintains a clean, deduplicated dataset

Usage

Basic Execution

python crypto_tracker.py

Customization Options

  1. Change Cryptocurrency:

    incremental_load("ethereum")  # Instead of bitcoin
  2. Adjust Update Frequency:

    time.sleep(60)  # Update every minute instead of 20 seconds
  3. Monitor Multiple Cryptocurrencies:

    cryptos = ["bitcoin", "ethereum", "cardano"]
    for crypto in cryptos:
        incremental_load(crypto)

Sample Output

--- Starting Incremental Load Demo ---
Inserted: bitcoin at 2024-01-15 10:30:00.000 = $42150.25
Updated: bitcoin at 2024-01-15 10:30:00.000 = $42155.50 (id 1)
Updated: bitcoin at 2024-01-15 10:30:00.000 = $42148.75 (id 1)
Inserted: bitcoin at 2024-01-15 10:31:00.000 = $42160.00

Latest rows in DB:
   id   symbol                     ts    price_usd
0   2  bitcoin  2024-01-15 10:31:00.000  42160.00000000
1   1  bitcoin  2024-01-15 10:30:00.000  42148.75000000

Configuration Options

Database Configuration

  • Host: MySQL server location
  • Port: Default 3306
  • Timeout: Connection timeout settings

API Configuration

  • Rate Limiting: CoinGecko allows 10-50 calls/minute depending on plan
  • Retry Logic: Add exponential backoff for failed requests
  • Multiple Endpoints: Support different cryptocurrency data sources

Error Handling Recommendations

  1. Network Issues:

    try:
        resp = requests.get(url, params=params, headers=HEADERS, timeout=10)
        resp.raise_for_status()
    except requests.exceptions.RequestException as e:
        print(f"API request failed: {e}")
        return None
  2. Database Connection:

    try:
        conn.ping(reconnect=True, attempts=3, delay=5)
    except mysql.connector.Error as e:
        print(f"Database connection lost: {e}")

Future Enhancements

  • Add support for multiple cryptocurrencies
  • Implement data validation and cleansing
  • Add logging with different severity levels
  • Create data visualization dashboard
  • Implement alert system for price thresholds
  • Add data export functionality (CSV, JSON)
  • Implement data retention policies

About

It fetches real-time cryptocurrency prices from CoinGecko API and stores them in a MySQL database using an incremental loading strategy.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages