coinbase/chainsformer

Overview

Chainsformer is an Apache Arrow Flight service built on top of ChainStorage as a stateless adaptor service. It currently supports batch data processing and micro-batch data streaming from the ChainStorage service to the Spark data processing platform.

It aims to provide a set of easy-to-use interfaces that let Spark consumers read and process ChainStorage data on the Spark platform:

  • It defines a standardized block and transaction data schema for each asset class (e.g., EVM assets or Bitcoin).
  • It provides data transformation capability from protobuf to Arrow format.
  • It can be easily scaled up to support higher data throughput.
  • It can be easily integrated via the Chainsformer Spark Connector (https://github.com/coinbase/chainsformer-spark-source) for structured data streaming.

Quick Start

Make sure your local Go version is 1.18 and that protobuf 3.21.12 is installed by running the following commands:

brew install go@1.18
brew unlink go
brew link go@1.18

brew install protobuf@3.21.12
brew unlink protobuf
brew link protobuf

To set up for the first time (only done once):

make bootstrap

Rebuild everything:

make build

Configuration

Environment Variables

Chainsformer depends on the following environment variables to resolve the path of the configuration. The directory structure is as follows: config/chainsformer/{blockchain}/{network}/{environment}.yml.

  • CHAINSFORMER_CONFIG: This env var, in the format of {blockchain}-{network}, determines the blockchain and network managed by the service. The naming is defined in chainstorage/protos/coinbase/c3/common/common.proto.
  • CHAINSFORMER_ENVIRONMENT: This env var controls the {environment} in which the service is deployed. Possible values include production, development, and local (which is also the default value).
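
As an illustration of how these two variables map onto the directory structure above, the following Go sketch resolves a config path. The function name configPath and the choice to split CHAINSFORMER_CONFIG at the first hyphen are assumptions made for this example, not the service's actual implementation. The fallback to ethereum-mainnet mirrors the default mentioned for make server.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// configPath is a hypothetical helper: it maps a "{blockchain}-{network}"
// value and an environment name to the documented config file layout.
func configPath(config, environment string) (string, error) {
	// Assumption: the blockchain name contains no hyphen, so the first
	// hyphen separates blockchain from network.
	blockchain, network, ok := strings.Cut(config, "-")
	if !ok || blockchain == "" || network == "" {
		return "", fmt.Errorf("invalid CHAINSFORMER_CONFIG %q: want {blockchain}-{network}", config)
	}
	// "local" is the documented default environment.
	if environment == "" {
		environment = "local"
	}
	return fmt.Sprintf("config/chainsformer/%s/%s/%s.yml", blockchain, network, environment), nil
}

func main() {
	config := os.Getenv("CHAINSFORMER_CONFIG")
	if config == "" {
		config = "ethereum-mainnet" // default used by `make server` per this README
	}
	path, err := configPath(config, os.Getenv("CHAINSFORMER_ENVIRONMENT"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(path)
}
```

With neither variable set, this sketch prints config/chainsformer/ethereum/mainnet/local.yml.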

Service Configurations

Asset-specific configurations are stored in the config directory of the Chainsformer service repo. The config folder follows the structure ./config/chainsformer/{blockchain}/{network}/base.yml.

New Blockchain Configurations

  • Simply follow the config folder structure to add new configurations for any new blockchains or new networks of existing blockchains.
  • Add new tests in config_test.go.
  • Add new test configs in testapp.go.

Development

Running Chainsformer Server

Clone the Chainsformer service repo:

git clone https://github.com/coinbase/chainsformer.git

Change directory to the Chainsformer service repo:

cd chainsformer

Set up the ChainStorage SDK credentials:

export CHAINSTORAGE_SDK_AUTH_HEADER=cb-nft-api-token
export CHAINSTORAGE_SDK_AUTH_TOKEN=****

To set up Chainsformer for the first time (only done once):

make bootstrap

Rebuild Chainsformer:

make build

Start the Chainsformer service with default CHAINSFORMER_CONFIG=ethereum-mainnet:

make server

Run test client

Query Chainsformer for a range of blocks

go run ./cmd/client --env local --blockchain ethereum --network mainnet --start 0 --end 10 --table blocks

Query Chainsformer for a range of block events

go run ./cmd/client --env local --blockchain ethereum --network mainnet --start 0 --end 10 --table streamed_blocks

Use grpcurl

Query Chainsformer for a range of blocks

Calling the GetSchema API

cmd=$(echo -n '{"table": "blocks"}' | base64)
grpcurl --plaintext -d '{"cmd":'"\"$cmd\""',"type":2}' localhost:9090 arrow.flight.protocol.FlightService.GetSchema

Calling the GetFlightInfo API to partition the data

cmd=$(echo -n '{"batch_query": {"start_height": 0, "end_height": 10, "table": "blocks"}}' | base64)
grpcurl --plaintext -d '{"cmd":'"\"$cmd\""',"type":2}' localhost:9090 arrow.flight.protocol.FlightService.GetFlightInfo

Take one of the tickets returned by the above command:

...
"endpoint": [
    {
      "ticket": {
        "ticket": "eyJiYXRjaF9xdWVyeSI6eyJlbmRfaGVpZ2h0IjoiMTAiLCJ0YWJsZSI6ImJsb2NrcyJ9fQ=="
      }
    }
  ]
...
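
The ticket value is a base64-encoded JSON query, which is why a ticket can also be constructed by hand for a specific partition. A minimal Go sketch that decodes the ticket shown above (decodeTicket is a hypothetical helper name used only for this example):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// decodeTicket returns the JSON query carried inside a base64 ticket.
func decodeTicket(ticket string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(ticket)
	if err != nil {
		return "", fmt.Errorf("decode ticket: %w", err)
	}
	return string(raw), nil
}

func main() {
	// Ticket copied from the GetFlightInfo response excerpt above.
	const ticket = "eyJiYXRjaF9xdWVyeSI6eyJlbmRfaGVpZ2h0IjoiMTAiLCJ0YWJsZSI6ImJsb2NrcyJ9fQ=="

	query, err := decodeTicket(ticket)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(query) // prints {"batch_query":{"end_height":"10","table":"blocks"}}
}
```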

Calling the DoGet API to get data for one of the partitions

grpcurl --plaintext -d '{"ticket": "eyJiYXRjaF9xdWVyeSI6eyJlbmRfaGVpZ2h0IjoiMTAiLCJ0YWJsZSI6ImJsb2NrcyJ9fQ=="}' localhost:9090 arrow.flight.protocol.FlightService.DoGet

Calling the DoGet API to get data of a specific partition

cmd=$(echo -n '{"batch_query":{"start_height":"1", "end_height":"2", "table":"blocks"}}' | base64)
grpcurl --plaintext -d '{"ticket": '"\"$cmd\""'}' localhost:9090 arrow.flight.protocol.FlightService.DoGet

Calling the DoAction API to get the tip in ChainStorage via Chainsformer

grpcurl --plaintext -d '{"type": "TIP"}' localhost:9090 arrow.flight.protocol.FlightService.DoAction | jq '.body | @base64d'

Query Chainsformer for a range of block events

Calling the GetSchema API

cmd=$(echo -n '{"table": "streamed_blocks"}' | base64)
grpcurl --plaintext -d '{"cmd":'"\"$cmd\""',"type":2}' localhost:9090 arrow.flight.protocol.FlightService.GetSchema

Calling the GetFlightInfo API to partition the data

cmd=$(echo -n '{"stream_query": {"start_sequence": 0, "end_sequence": 10, "table": "streamed_blocks"}}' | base64)
grpcurl --plaintext -d '{"cmd":'"\"$cmd\""',"type":2}' localhost:9090 arrow.flight.protocol.FlightService.GetFlightInfo

Take one of the tickets returned by the above command:

...
"endpoint": [
    {
      "ticket": {
        "ticket": "eyJzdHJlYW1fcXVlcnkiOnsic3RhcnRfc2VxdWVuY2UiOiIxIiwiZW5kX3NlcXVlbmNlIjoiMTAiLCJ0YWJsZSI6InN0cmVhbWVkX2Jsb2NrcyJ9fQ=="
      }
    }
  ]
...

Calling the DoGet API to get data for one of the partitions

grpcurl --plaintext -d '{"ticket": "eyJzdHJlYW1fcXVlcnkiOnsic3RhcnRfc2VxdWVuY2UiOiIxIiwiZW5kX3NlcXVlbmNlIjoiMTAiLCJ0YWJsZSI6InN0cmVhbWVkX2Jsb2NrcyJ9fQ=="}' localhost:9090 arrow.flight.protocol.FlightService.DoGet

Calling the DoGet API to get data of a specific partition

cmd=$(echo -n '{"stream_query":{"start_sequence":"1", "end_sequence":"2", "table":"streamed_blocks"}}' | base64)
grpcurl --plaintext -d '{"ticket": '"\"$cmd\""'}' localhost:9090 arrow.flight.protocol.FlightService.DoGet

Calling the DoAction API to get the stream tip in ChainStorage via Chainsformer

grpcurl --plaintext -d '{"type": "STREAM_TIP"}' localhost:9090 arrow.flight.protocol.FlightService.DoAction | jq '.body | @base64d'

Testing

Unit Test

# Run everything
make test

Integration Test

Under development

Functional Test

Under development
