-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
298efad
commit 5628fa0
Showing
11 changed files
with
807 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1 +1,49 @@ | ||
# dotlake-sidecar | ||
# dotlake-sidecar | ||
|
||
This repository contains the necessary components to set up and run a data ingestion pipeline for Polkadot-based blockchains using Substrate API Sidecar and a custom block ingest service. | ||
|
||
## Overview | ||
|
||
The dotlake-sidecar project facilitates the extraction and processing of blockchain data from Polkadot-based networks. It uses two main components: | ||
|
||
1. Substrate API Sidecar | ||
2. Custom Block Ingest Service | ||
|
||
These components work together to provide a robust solution for capturing and processing blockchain data. | ||
|
||
## Components | ||
|
||
### 1. Substrate API Sidecar | ||
|
||
[Substrate API Sidecar](https://github.com/paritytech/substrate-api-sidecar) is an open-source REST service that runs alongside Substrate nodes. It provides a way to access blockchain data through a RESTful API, making it easier to query and retrieve information from the blockchain. | ||
|
||
### 2. Custom Block Ingest Service | ||
|
||
The custom block ingest service is a Docker container that processes and stores blockchain data. It is designed to work in conjunction with the Substrate API Sidecar to ingest blocks and related information from the specified blockchain. The service utilizes DuckDB, an embedded analytical database, as part of its data flow: | ||
|
||
1. The service ingests blockchain data from the Substrate API Sidecar. | ||
2. The ingested data is then processed and transformed as needed. | ||
3. The processed data is stored in DuckDB. | ||
|
||
## How It Works | ||
|
||
The system operates using the following workflow: | ||
|
||
1. The Substrate API Sidecar connects to a specified WebSocket endpoint (WSS) of a Substrate-based blockchain node. | ||
2. The Sidecar exposes a RESTful API on port 8080, allowing easy access to blockchain data. | ||
3. The custom block ingest service connects to the same WSS endpoint and begins processing blocks. | ||
4. The ingest service stores the processed data, making it available for further analysis or querying. | ||
|
||
## Usage | ||
|
||
To run the dotlake-sidecar, use the provided `dotlakeIngest.sh` script. This script sets up both the Substrate API Sidecar and the custom block ingest service as Docker containers. | ||
|
||
Example usage: | ||
|
||
If you wish to start the ingest service for polkadot, you can use the following command: | ||
|
||
``` | ||
sh dotlakeIngest.sh --chain polkadot --relay-chain polkadot --wss wss://rpc.ibp.network/polkadot | ||
``` | ||
|
||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
#!/bin/bash | ||
|
||
# Parse command line arguments | ||
while [[ $# -gt 0 ]]; do | ||
key="$1" | ||
case $key in | ||
--chain) | ||
CHAIN="$2" | ||
shift | ||
shift | ||
;; | ||
--relay-chain) | ||
RELAY_CHAIN="$2" | ||
shift | ||
shift | ||
;; | ||
--wss) | ||
WSS="$2" | ||
shift | ||
shift | ||
;; | ||
*) | ||
echo "Unknown option: $1" | ||
exit 1 | ||
;; | ||
esac | ||
done | ||
|
||
# Check if required arguments are provided | ||
if [ -z "$CHAIN" ] || [ -z "$RELAY_CHAIN" ] || [ -z "$WSS" ]; then | ||
echo "Usage: $0 --chain <chain> --relay-chain <relay_chain> --wss <wss_endpoint>" | ||
exit 1 | ||
fi | ||
|
||
# Start Substrate API Sidecar | ||
echo "Starting Substrate API Sidecar..." | ||
docker run -d --rm --read-only -e SAS_SUBSTRATE_URL="$WSS" -p 8080:8080 parity/substrate-api-sidecar:latest | ||
if [ $? -eq 0 ]; then | ||
echo "Substrate API Sidecar started successfully." | ||
else | ||
echo "Failed to start Substrate API Sidecar." | ||
exit 1 | ||
fi | ||
|
||
# Start Block Ingest Service | ||
echo "Starting Block Ingest Service..." | ||
docker run -d --rm -e CHAIN="$CHAIN" -e RELAY_CHAIN="$RELAY_CHAIN" -e WSS="$WSS" -p 8501:8501 eu.gcr.io/parity-data-infra-evaluation/block-ingest:0.1 | ||
if [ $? -eq 0 ]; then | ||
echo "Block Ingest Service started successfully." | ||
else | ||
echo "Failed to start Block Ingest Service." | ||
exit 1 | ||
fi | ||
|
||
echo "Both services are now running. You can access Substrate API Sidecar at http://localhost:8080 and Block Ingest Service at http://localhost:8501" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# Use an official Python runtime as a parent image | ||
FROM python:3.9-slim | ||
|
||
# Set the working directory in the container | ||
WORKDIR /app | ||
|
||
# Copy the current directory contents into the container at /app | ||
COPY . /app | ||
|
||
# Install any needed packages specified in requirements.txt | ||
COPY requirements.txt /app/ | ||
RUN pip install -r requirements.txt | ||
|
||
# Make sure start-ingest.sh is executable | ||
RUN chmod +x start-ingest.sh | ||
|
||
# Run start-ingest.sh when the container launches | ||
CMD ["./start-ingest.sh"] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,96 @@ | ||
import streamlit as st | ||
import argparse | ||
import duckdb | ||
import pandas as pd | ||
import time | ||
from datetime import datetime | ||
from streamlit_autorefresh import st_autorefresh | ||
|
||
# Function to connect to DuckDB | ||
def connect_to_db(db_path='/Users/pranaypatil/data_eng_repos/data-applications/blocks.db'): | ||
return duckdb.connect(db_path, read_only=True) | ||
|
||
def parse_arguments(): | ||
parser = argparse.ArgumentParser(description="Block ingestion script for Substrate-based chains") | ||
parser.add_argument("--chain", required=True) | ||
parser.add_argument("--relay_chain", required=True) | ||
parser.add_argument("--db_path", required=True) | ||
return parser.parse_args() | ||
|
||
|
||
args = parse_arguments() | ||
|
||
st.set_page_config( | ||
page_title="Dotlake Explorer", | ||
page_icon="🚀", | ||
) | ||
|
||
# Run the autorefresh about every 2000 milliseconds (2 seconds) and stop | ||
# after it's been refreshed 100 times. | ||
# count = st_autorefresh(interval=30000, limit=100, key="fizzbuzzcounter") | ||
|
||
placeholder = st.empty() | ||
|
||
chain = args.chain | ||
relay_chain = args.relay_chain | ||
|
||
with placeholder.container(): | ||
|
||
try: | ||
# Connect to DuckDB | ||
conn = connect_to_db(args.db_path) | ||
|
||
query = f""" | ||
SELECT * | ||
FROM blocks_{relay_chain}_{chain} | ||
ORDER BY number DESC | ||
LIMIT 50 | ||
""" | ||
recent_blocks = conn.execute(query).fetchdf() | ||
recent_blocks['timestamp'] = recent_blocks['timestamp'].apply(lambda x: datetime.fromtimestamp(x).strftime("%Y-%m-%d %H:%M:%S") ) | ||
|
||
# Streamlit app | ||
st.title("Dotlake Explorer") | ||
st.sidebar.header("Home") | ||
|
||
# Display the chain and relay chain names | ||
st.markdown(f"**Chain**: *{chain}* **Relay Chain**: *{relay_chain}*") | ||
|
||
# Display the most recent block | ||
st.subheader("Most Recent Block") | ||
|
||
bn, ts = st.columns(2) | ||
latest_block = recent_blocks['number'].iloc[0] | ||
bn.metric("Block Number", latest_block) | ||
|
||
# Get and display the timestamp of the last block | ||
last_timestamp = recent_blocks['timestamp'].iloc[0] | ||
ts.metric("Block Timestamp", f"{last_timestamp} (+UTC)") | ||
|
||
# Create two columns for displaying extrinsics and events metrics | ||
extrinsics_col, events_col = st.columns(2) | ||
|
||
# Display the number of extrinsics in the most recent block | ||
num_extrinsics = len(recent_blocks['extrinsics'].iloc[0]) | ||
extrinsics_col.metric("Extrinsics", num_extrinsics) | ||
|
||
# Calculate the total number of events in the most recent block | ||
num_events = ( | ||
len(recent_blocks['onFinalize'].iloc[0]['events']) + | ||
len(recent_blocks['onInitialize'].iloc[0]['events']) + | ||
sum(len(ex['events']) for ex in recent_blocks['extrinsics'].iloc[0]) | ||
) | ||
|
||
# Display the total number of events | ||
events_col.metric("Events", num_events) | ||
|
||
# Display a table of the most recent blocks | ||
st.header("Recent Blocks") | ||
|
||
st.dataframe(recent_blocks[['number', 'timestamp', 'hash', 'finalized']]) | ||
|
||
# Close the connection | ||
conn.close() | ||
except Exception as e: | ||
st.error("Oops...something went wrong, please refresh the page") | ||
|
Oops, something went wrong.