This README explains how to set up and use a Dataflow template that integrates Google Cloud services with Meta's Conversions API.
The GCP to Meta CAPI template encapsulates the logic for reading data from a Google Cloud source (BigQuery or Google Cloud Storage) and sending that data to Meta through the Conversions API. It was developed for events collected in batch (such as offline events) and is not suitable for real-time events. The template and the Meta CAPI connector manage the following data processing steps:
- (1) Read offline events from a BigQuery table or Google Cloud Storage file;
- (2) Parse each record into the Conversions API JSON format;
- (3) Group the parsed records into micro-batches (e.g. 50 events per API call);
- (4) Call the Conversions API and store the output logs (success or failure) in a dead letter table defined by the advertiser.
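
Steps (2) and (3) can be illustrated with a minimal Python sketch. It is not the template's actual code: the helper names `parse_row` and `micro_batches`, the reduced list of user-data fields, and the rule that every remaining column becomes custom data are assumptions made for illustration only.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List

# Assumption: only a few user-data fields are listed, to keep the sketch short.
USER_DATA_FIELDS = {"em", "ph", "external_id"}
TOP_LEVEL_FIELDS = {"event_name", "event_time", "action_source"}


def parse_row(row: Dict[str, Any], separator: str = "|") -> Dict[str, Any]:
    """Step 2: turn one flat source row into a nested Conversions API event."""
    event: Dict[str, Any] = {
        "event_name": row["event_name"],
        "event_time": int(row["event_time"]),
        "action_source": row["action_source"],
        "user_data": {},
        "custom_data": {},
    }
    for key, value in row.items():
        # data_set_id selects the destination and is not part of the event body.
        if key in TOP_LEVEL_FIELDS or key == "data_set_id" or value is None:
            continue
        if key in USER_DATA_FIELDS:
            # Multi-value user-data fields are split on the separator.
            event["user_data"][key] = str(value).split(separator)
        else:
            event["custom_data"][key] = value
    return event


def micro_batches(
    events: Iterable[Dict[str, Any]], size: int = 50
) -> Iterator[List[Dict[str, Any]]]:
    """Step 3: yield lists of at most `size` events."""
    iterator = iter(events)
    while batch := list(islice(iterator, size)):
        yield batch


row = {
    "data_set_id": 350218134519384,
    "event_name": "Purchase",
    "event_time": 1716764400,
    "action_source": "physical_store",
    "value": 150.5,
    "currency": "GBP",
    "em": "hash_a|hash_b",  # placeholder values, not real hashes
}
batches = list(micro_batches(parse_row(r) for r in [row] * 120))
print([len(b) for b in batches])  # [50, 50, 20]
```

With 120 input rows and a batch size of 50, the sketch produces three API payloads, the last one holding the remaining 20 events.
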
- Google Cloud SDK (gcloud) installed and configured;
- Access to Google Cloud Platform (GCP) with necessary permissions.
Enable the necessary APIs by running the following command:
gcloud services enable dataflow compute_component logging storage_component storage_api cloudresourcemanager.googleapis.com artifactregistry.googleapis.com cloudbuild.googleapis.com
Set the environment variables for your GCP project:
export BUCKET="YOUR_TEMPLATE_LOCATION_BUCKET"
export REGION="YOUR_GCP_REGION"
export REPOSITORY="YOUR_GCP_CONTAINER_REPOSITORY"
export PROJECT="YOUR_GCP_PROJECT"
gcloud config set project $PROJECT
Create a Cloud Storage bucket to store the Dataflow template:
gsutil mb gs://$BUCKET
Create an Artifact Registry repository to store the template's container image:
gcloud artifacts repositories create $REPOSITORY \
--repository-format=docker \
--location=$REGION
Build the Dataflow templates using the commands below, one for the BigQuery source and one for the Google Cloud Storage source. The resulting template files are stored in the bucket defined above.
export TEMPLATE_NAME="bigquery_to_meta_conversions_api"
gcloud dataflow flex-template build gs://$BUCKET/$TEMPLATE_NAME.json \
--image-gcr-path "$REGION-docker.pkg.dev/$PROJECT/$REPOSITORY/$TEMPLATE_NAME:latest" \
--sdk-language "PYTHON" \
--flex-template-base-image "PYTHON3" \
--py-path "." \
--metadata-file "${TEMPLATE_NAME}_metadata.json" \
--env "FLEX_TEMPLATE_PYTHON_PY_FILE=$TEMPLATE_NAME.py" \
--env "FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE=requirements.txt" \
--env "FLEX_TEMPLATE_PYTHON_SETUP_FILE=setup.py"
export TEMPLATE_NAME="gcs_to_meta_conversions_api"
gcloud dataflow flex-template build gs://$BUCKET/$TEMPLATE_NAME.json \
--image-gcr-path "$REGION-docker.pkg.dev/$PROJECT/$REPOSITORY/$TEMPLATE_NAME:latest" \
--sdk-language "PYTHON" \
--flex-template-base-image "PYTHON3" \
--py-path "." \
--metadata-file "${TEMPLATE_NAME}_metadata.json" \
--env "FLEX_TEMPLATE_PYTHON_PY_FILE=$TEMPLATE_NAME.py" \
--env "FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE=requirements.txt" \
--env "FLEX_TEMPLATE_PYTHON_SETUP_FILE=setup.py"
- Access to Google Cloud Platform (GCP) with necessary permissions.
- Meta Conversions API access token (https://developers.facebook.com/docs/marketing-api/conversions-api/get-started/). If you serve multiple datasets from the same table, you should generate a system user access token rather than a dataset-scoped access token.
The expected source table format is described below. Consult the following documentation for further details:
- Standard parameters: https://developers.facebook.com/docs/marketing-api/conversions-api/parameters/server-event/
- Custom data fields: https://developers.facebook.com/docs/marketing-api/conversions-api/parameters/custom-data
- User data fields (hashing and normalization requirements): https://developers.facebook.com/docs/marketing-api/conversions-api/parameters/customer-information-parameters
Field name | Type | Mandatory | Multivalue field | Description |
---|---|---|---|---|
data_set_id | INTEGER | Yes | No | Destination data_set_id (e.g. 350218134519384) |
event_name | STRING | Yes | No | Event name (e.g. Purchase) |
event_time | INTEGER | Yes | No | Unix timestamp in seconds indicating when the actual event occurred (e.g. 1716764400) |
action_source | STRING | Yes | No | Should be “physical_store” for Offline events |
order_id | STRING | No | No | Unique transaction or order ID (e.g. order1234) |
value | FLOAT | Yes if event_name = ‘Purchase’ | No | Value of the purchase (e.g. 150.5). Optional for non-Purchase events |
currency | STRING | Yes if event_name = ‘Purchase’ | No | Currency (e.g. GBP), must be a valid ISO 4217 three-letter currency code. Optional for non-Purchase events |
content_ids | STRING | No | Yes | Multiple value, Content IDs/SKUs associated with the event. For multiple values, use a separator (e.g. ABC\|EFG) |
em | STRING | No | Yes | Multiple value, Hashed and normalized email. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
ph | STRING | No | Yes | Multiple value, Hashed and normalized phone number. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
ln | STRING | No | Yes | Multiple value, Hashed and normalized last name. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
fn | STRING | No | Yes | Multiple value, Hashed and normalized first name. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
db | STRING | No | Yes | Multiple value, Hashed and normalized date of birth. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
ct | STRING | No | Yes | Multiple value, Hashed and normalized city. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
zp | STRING | No | Yes | Multiple value, Hashed and normalized zip or postal code. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
st | STRING | No | Yes | Multiple value, Hashed and normalized state. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
country | STRING | No | Yes | Multiple value, Hashed and normalized country code. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
external_id | STRING | No | Yes | Multiple value, External ID such as a loyalty or customer ID (hashing recommended). For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
madid | STRING | No | No | Mobile advertiser ID |
ge | STRING | No | Yes | Multiple value, Hashed and normalized gender. For multiple values, use a separator (e.g. 62a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f62f\|90a14e44f765419d10fea99367361a727c12365e2520f32218d505ed9aa0f33f) |
content_type | STRING | No | No | Should be either “product” or “product_group” depending on the type of contents sent in “content_ids” |
num_items | INTEGER | No | No | Use only with InitiateCheckout events. The number of items that a user tries to buy during checkout. |
Any non-standard property | STRING | No | No | Any non-standard parameter added by the advertiser is treated as a custom property |
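
As an illustration of how an `em` value in the table above could be prepared, the following Python sketch trims and lowercases each email, hashes it with SHA-256, and joins several hashes with the separator. The function names are hypothetical, and only email is covered; other fields such as `ph` have their own normalization rules, described in the user data documentation linked above.

```python
import hashlib
from typing import Iterable


def normalize_and_hash_email(email: str) -> str:
    """Trim whitespace, lowercase, then return the SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_em_field(emails: Iterable[str], separator: str = "|") -> str:
    """Join several hashed emails into one multi-value cell."""
    return separator.join(normalize_and_hash_email(e) for e in emails)


# Hypothetical addresses, for illustration only.
em = build_em_field(["  John.Doe@Example.com ", "jane@example.com"])
print(em.count("|") + 1)  # number of hashed values stored in the cell
```

Hashing after normalization means that `John.Doe@Example.com` and `john.doe@example.com` yield the same 64-character value.
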
Parameter | Type | Required | Scope | Default value | Description |
---|---|---|---|---|---|
BigQuery Source Table | GCP Table ID | Yes | Applicable only for BigQuery template | - | Source table, should be a GCP table ID (bigquery-project:dataset.input_table) |
Google Cloud Storage Source File | GCS Source File URI | Yes | Applicable only for GCS template | - | Source file, should be a GCS source file URI (Example: gs://bucket_name/path_to_file.csv) |
Conversions API access token | STRING | Yes | Applicable for both BigQuery & GCS templates | - | Access Token for accessing Conversions API. Check instructions here: https://developers.facebook.com/docs/marketing-api/conversions-api/get-started/#access-token |
BigQuery Dead Letter Queue table | GCP Table ID | Yes | Applicable for both BigQuery & GCS templates | - | Table in which to store API outputs (success and/or error message, plus the input). Should be a GCP table ID (bigquery-project:dataset.output_table) |
BigQuery Dead Letter Queue filter output | ENUM: (ERROR_ONLY, SUCCESS_ONLY, ALL) | Yes | Applicable for both BigQuery & GCS templates | - | Type of output to log. ERROR_ONLY: logs only outputs with errors; SUCCESS_ONLY: logs only outputs with a successful API response; ALL: logs all outputs, both errors and successes |
Multi value separator | STRING | No | Applicable for both BigQuery & GCS templates | \| | Separator used for fields having multiple values such as emails (em), phones (ph) or content IDs (content_ids) |
Batch Size (number max of events per conversions api call) | INT | No | Applicable for both BigQuery & GCS templates | 50 | Maximum number of rows per API call. If at least one record in a batch is invalid, the entire batch is rejected |
Max Conversions API retry attempts | INT | No | Applicable for both BigQuery & GCS templates | 3 | Maximum number of retries in case of a server error |
Exponential backoff factor for Conversions API retry | FLOAT | No | Applicable for both BigQuery & GCS templates | 2 | Factor by which the wait time grows between retries in case of a server error |
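
To show how the retry and backoff parameters interact, here is a minimal Python sketch of exponential backoff. It is not the template's implementation: the one-second base delay, the `ServerError` exception, and the function names are assumptions chosen for illustration.

```python
import time
from typing import Callable, List


class ServerError(Exception):
    """Hypothetical stand-in for a 5xx response from the API."""


def backoff_delays(
    max_retries: int = 3, factor: float = 2.0, base_seconds: float = 1.0
) -> List[float]:
    """Wait time before each retry: base * factor ** attempt."""
    return [base_seconds * factor ** attempt for attempt in range(max_retries)]


def call_with_retries(
    send: Callable[[], dict],
    max_retries: int = 3,
    factor: float = 2.0,
    base_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `send`, retrying on ServerError with growing wait times."""
    for delay in backoff_delays(max_retries, factor, base_seconds):
        try:
            return send()
        except ServerError:
            sleep(delay)
    # Last attempt: any error now propagates to the caller.
    return send()


print(backoff_delays())  # [1.0, 2.0, 4.0]
```

Under these assumptions, the defaults (3 retries, factor 2) wait 1, 2, and 4 seconds before the first, second, and third retry respectively.
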
Field name | Type | Value |
---|---|---|
metacapisink_timestamp | String | Processing timestamp |
metacapisink_status | String | Status for the event; could be SUCCESS if the event was sent successfully to Meta or ERROR in case of parsing error or if the event was rejected by Meta |
metacapisink_failure_pipeline_step | String | Step where the event failed. It can be _ParseRowToMetaConversionsAPIBody (if the event is missing a required field), _MakeMetaConversionsAPICalls (if the event was rejected by Meta), or null (in case of success) |
metacapisink_output_message | String | API response in case of success, or error message in case of failure. If at least one record in a micro-batch is invalid, the entire batch is rejected and all records in the batch receive the same error message. |
metacapisink_input_element | String | Original input record as a string |
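
Once rows from the dead letter table are loaded into memory, the fields above make it easy to see where failures occur. The Python sketch below counts error rows per pipeline step; the `summarize_failures` helper and the sample rows are hypothetical and only mirror the column names documented in the table.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def summarize_failures(rows: Iterable[Dict[str, Optional[str]]]) -> Counter:
    """Count ERROR rows by the pipeline step in which they failed."""
    return Counter(
        row["metacapisink_failure_pipeline_step"]
        for row in rows
        if row["metacapisink_status"] == "ERROR"
    )


# Hypothetical rows shaped like the dead letter table.
sample_rows = [
    {"metacapisink_status": "SUCCESS", "metacapisink_failure_pipeline_step": None},
    {"metacapisink_status": "ERROR",
     "metacapisink_failure_pipeline_step": "_ParseRowToMetaConversionsAPIBody"},
    {"metacapisink_status": "ERROR",
     "metacapisink_failure_pipeline_step": "_MakeMetaConversionsAPICalls"},
    {"metacapisink_status": "ERROR",
     "metacapisink_failure_pipeline_step": "_MakeMetaConversionsAPICalls"},
]
summary = summarize_failures(sample_rows)
print(summary.most_common())
```

Because a rejected micro-batch marks every record in it with the same error, a high count for `_MakeMetaConversionsAPICalls` may trace back to a small number of invalid rows.
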
gcp-to-conversions-api-dataflow is MIT licensed, as found in the LICENSE file.