This is an implementation of a sink connector from Apache Kafka to Google BigQuery, built on top of Apache Kafka Connect.
This connector was originally developed by WePay. In late 2020 the project moved to Confluent, with both companies taking on maintenance duties. In 2024, Aiven created its own fork of the Confluent project in order to continue maintaining an open-source, Apache 2.0-licensed version of the connector.
An example connector configuration that reads records with JSON-encoded values from Kafka and writes those values to BigQuery:
```json
{
  "connector.class": "com.wepay.kafka.connect.bigquery.BigQuerySinkConnector",
  "topics": "users, clicks, payments",
  "tasks.max": "3",
  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
  "project": "kafka-ingest-testing",
  "defaultDataset": "kcbq-example",
  "keyfile": "/tmp/bigquery-credentials.json"
}
```
See here for a list of the connector's configuration properties.
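Once a Kafka Connect worker with the connector installed is running, a configuration like the one above can be submitted over the worker's REST API. The following is a minimal sketch, assuming a worker listening on localhost:8083 and the connector name bigquery-sink (both are placeholders, not part of the project):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterConnector {
    public static void main(String[] args) throws Exception {
        // The connector configuration from the example above.
        String config = """
                {
                  "connector.class": "com.wepay.kafka.connect.bigquery.BigQuerySinkConnector",
                  "topics": "users, clicks, payments",
                  "tasks.max": "3",
                  "value.converter": "org.apache.kafka.connect.json.JsonConverter",
                  "project": "kafka-ingest-testing",
                  "defaultDataset": "kcbq-example",
                  "keyfile": "/tmp/bigquery-credentials.json"
                }""";

        // The worker address and connector name here are assumptions for this sketch.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8083/connectors/bigquery-sink/config"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(config))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

The Kafka Connect `PUT /connectors/{name}/config` endpoint creates the connector if it does not exist and updates its configuration otherwise.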
Releases are available under the GitHub Releases tab.
In order to execute the integration tests, the following environment variables must be set:
GOOGLE_APPLICATION_CREDENTIALS - the path to a JSON file that was downloaded when the GCP account key was created.
KCBQ_TEST_BUCKET - the name of the bucket to use for testing.
KCBQ_TEST_DATASET - the name of the dataset to use for testing.
KCBQ_TEST_KEYFILE - the same value as GOOGLE_APPLICATION_CREDENTIALS.
KCBQ_TEST_PROJECT - the name of the project to use.
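As a quick sanity check before launching the tests, a small sketch like the following can verify that each variable above is present; the class name is hypothetical and not part of the project:

```java
import java.util.List;

public class CheckTestEnv {
    public static void main(String[] args) {
        // Variable names are taken from the list above; the check itself is illustrative.
        List<String> required = List.of(
                "GOOGLE_APPLICATION_CREDENTIALS",
                "KCBQ_TEST_BUCKET",
                "KCBQ_TEST_DATASET",
                "KCBQ_TEST_KEYFILE",
                "KCBQ_TEST_PROJECT");

        for (String name : required) {
            String value = System.getenv(name);
            if (value == null || value.isBlank()) {
                System.err.println("Missing required environment variable: " + name);
            } else {
                System.out.println(name + " is set");
            }
        }
    }
}
```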
When running the integration tests through GitHub Actions (for example, from a fork), the following repository secrets must be set instead:
GCP_CREDENTIALS - the contents of the JSON file that was downloaded when the GCP account key was created.
KCBQ_TEST_BUCKET - the name of the bucket to use for the tests.
KCBQ_TEST_DATASET - the name of the dataset to use for the tests.
KCBQ_TEST_PROJECT - the name of the project to use for the tests.