Deployment Guide

This guide describes the intended GCP deployment model for the public version of the project.

Deployment Target

The reference deployment uses:

Cloud Run for execution
Cloud Scheduler for daily triggering
Secret Manager for GitHub token storage
BigQuery for storage and reporting

Required Environment Variables

The deployment scripts support the following variables:

Variable	Required	Description	Example
`PROJECT_ID`	yes	GCP project ID	`example-gcp-project`
`REGION`	yes	Cloud Run, Scheduler, and BigQuery location	`us-central1`
`DATASET`	yes	BigQuery dataset name	`github_access_audit`
`ORG`	yes	GitHub organization to export	`example-org`
`SERVICE_NAME`	no	Cloud Run service name	`github-access-sync`
`SECRET_NAME`	no	Secret Manager secret holding the GitHub token	`github-access-token`
`RUNTIME_SA_NAME`	no	Cloud Run runtime service account name	`github-access-sync-sa`
`SCHEDULER_SA_NAME`	no	Cloud Scheduler caller service account name	`github-access-sync-scheduler`
`SCHEDULER_JOB_NAME`	no	Scheduler job name	`github-access-sync-daily`
`SCHEDULER_TIMEZONE`	no	Scheduler timezone	`Etc/UTC`
`SCHEDULER_SCHEDULE`	no	Scheduler cron	`0 8 * * *`
`GH_PAT`	conditionally	GitHub PAT to seed Secret Manager	`github_pat_xxx`
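Before deploying, it can help to confirm that the four required variables are set. The following is a minimal sketch and not part of the project's scripts; `require_vars` is a hypothetical helper name.

```shell
# Hypothetical preflight helper (not part of the project's scripts).
# Returns non-zero and names each variable that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup that works in POSIX sh as well as bash.
    eval "value=\${$name:-}"
    if [ -z "${value}" ]; then
      echo "missing required variable: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

A wrapper could call `require_vars PROJECT_ID REGION DATASET ORG || exit 1` before doing any work.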

IAM Model

Runtime Service Account

The Cloud Run runtime service account needs:

roles/bigquery.jobUser on the project
dataset write access on the target dataset
roles/logging.logWriter on the project
roles/secretmanager.secretAccessor on the GitHub token secret
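The project-level and secret-level bindings above could be granted roughly as follows. This is a sketch under assumptions: the function names are illustrative, the deploy script may grant the bindings differently, and dataset-level write access is managed separately and is not shown.

```shell
# Sketch only: function names are illustrative, not taken from the project.

# Service account emails follow NAME@PROJECT_ID.iam.gserviceaccount.com.
runtime_sa_member() {
  echo "serviceAccount:${RUNTIME_SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
}

grant_runtime_roles() {
  member="$(runtime_sa_member)"
  # Project-level roles: run BigQuery jobs and write logs.
  for role in roles/bigquery.jobUser roles/logging.logWriter; do
    gcloud projects add-iam-policy-binding "${PROJECT_ID}" \
      --member="${member}" --role="${role}"
  done
  # Secret-level role: read only the GitHub token secret.
  gcloud secrets add-iam-policy-binding "${SECRET_NAME}" \
    --project="${PROJECT_ID}" \
    --member="${member}" --role="roles/secretmanager.secretAccessor"
}
```

Binding the accessor role on the secret itself, rather than on the project, keeps the runtime account from reading other secrets.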

Scheduler Caller Service Account

The Cloud Scheduler caller service account needs:

roles/run.invoker on the Cloud Run service

Cloud Scheduler Service Agent

The Cloud Scheduler service agent needs:

roles/iam.serviceAccountTokenCreator on the scheduler caller service account

This is required so Scheduler can mint the OIDC token used to call the private Cloud Run service.
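The two Scheduler-related bindings could be granted roughly as follows. This is a sketch, not the project's script: the helper names are illustrative, and it assumes the caller service account already exists.

```shell
# Sketch only: helper names are illustrative, not taken from the project.

# The Cloud Scheduler service agent is a Google-managed account whose email
# is derived from the project number, not the project ID.
scheduler_agent_member() {
  project_number="$(gcloud projects describe "${PROJECT_ID}" \
    --format='value(projectNumber)')"
  echo "serviceAccount:service-${project_number}@gcp-sa-cloudscheduler.iam.gserviceaccount.com"
}

grant_scheduler_roles() {
  caller_sa="${SCHEDULER_SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
  # Allow the caller service account to invoke the private service.
  gcloud run services add-iam-policy-binding "${SERVICE_NAME}" \
    --project="${PROJECT_ID}" --region="${REGION}" \
    --member="serviceAccount:${caller_sa}" --role="roles/run.invoker"
  # Allow the Scheduler service agent to mint tokens as the caller account.
  gcloud iam service-accounts add-iam-policy-binding "${caller_sa}" \
    --project="${PROJECT_ID}" \
    --member="$(scheduler_agent_member)" \
    --role="roles/iam.serviceAccountTokenCreator"
}
```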

Deployment Sequence

1. Authenticate GCP

gcloud auth login
# PROJECT_ID is exported in step 2; export it first or substitute the project ID here.
gcloud config set project "${PROJECT_ID}"

2. Set Variables

export PROJECT_ID='example-gcp-project'
export REGION='us-central1'
export DATASET='github_access_audit'
export ORG='example-org'
export SERVICE_NAME='github-access-sync'
export SECRET_NAME='github-access-token'
export GH_PAT='your_pat'

3. Deploy

cd <repo-dir>
bash scripts/deploy.sh

The deployment flow:

enables required GCP services
creates the BigQuery dataset if it does not exist
creates service accounts if they do not exist
creates or updates the GitHub token secret
grants required IAM bindings
deploys the Cloud Run service from source
creates or updates the Cloud Scheduler job
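The "create if it does not exist" steps in the flow above can be illustrated with a check-then-create pattern. This is a hypothetical sketch; `scripts/deploy.sh` may implement these steps differently, and the function names are not taken from the project.

```shell
# Hypothetical sketch; scripts/deploy.sh may be implemented differently.

# Create the BigQuery dataset only when `bq show` cannot find it.
ensure_dataset() {
  if ! bq show "${PROJECT_ID}:${DATASET}" >/dev/null 2>&1; then
    bq --location="${REGION}" mk --dataset "${PROJECT_ID}:${DATASET}"
  fi
}

# Create a service account only when `describe` cannot find it.
ensure_service_account() {
  sa_name="$1"
  sa_email="${sa_name}@${PROJECT_ID}.iam.gserviceaccount.com"
  if ! gcloud iam service-accounts describe "${sa_email}" \
      --project="${PROJECT_ID}" >/dev/null 2>&1; then
    gcloud iam service-accounts create "${sa_name}" --project="${PROJECT_ID}"
  fi
}
```

Note that the check treats any failure as "missing", so a permission error on the lookup would surface as a failed create rather than being skipped silently.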

Manual Validation

After deployment, validate the service and job configuration.

Check Cloud Run

gcloud run services describe "${SERVICE_NAME}" \
  --project "${PROJECT_ID}" \
  --region "${REGION}"

Trigger A Manual Sync

SERVICE_URL="$(gcloud run services describe "${SERVICE_NAME}" \
  --project "${PROJECT_ID}" \
  --region "${REGION}" \
  --format='value(status.url)')"

curl -X POST "${SERVICE_URL}/sync"

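This anonymous request succeeds only if the service allows unauthenticated invocations. For a private service, which is the recommended configuration, send an identity token with the request instead.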
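One way to make an authenticated call is to attach an identity token from the active gcloud account. The sketch below assumes that account holds roles/run.invoker on the service; the function names are illustrative.

```shell
# Sketch only: assumes the active gcloud account may invoke the service.

# Print the HTTPS URL of the deployed service.
service_url() {
  gcloud run services describe "${SERVICE_NAME}" \
    --project "${PROJECT_ID}" \
    --region "${REGION}" \
    --format='value(status.url)'
}

# POST to /sync with an identity token in the Authorization header.
trigger_sync() {
  curl --fail --silent --show-error -X POST \
    -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
    "$(service_url)/sync"
}
```

With `--fail`, curl exits non-zero on HTTP errors such as 401 or 403, which makes authentication problems visible in scripts.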

Check Scheduler

gcloud scheduler jobs describe "${SCHEDULER_JOB_NAME}" \
  --project "${PROJECT_ID}" \
  --location "${REGION}"

Check BigQuery Outputs

bq ls "${PROJECT_ID}:${DATASET}"
bq query --use_legacy_sql=false "SELECT * FROM \`${PROJECT_ID}.${DATASET}.sync_runs\` ORDER BY started_at DESC LIMIT 5"

Operational Recommendations

keep the service private
use a dedicated GitHub token for this workload
review sync_skipped_items after failures or unusual volume changes
monitor sync_runs.status, loaded_rows, and skipped_items
start with a test organization or non-sensitive dataset before production rollout
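The monitoring recommendation above could be covered by a query over recent runs. This is a sketch: it assumes `status`, `loaded_rows`, `skipped_items`, and `started_at` are columns of `sync_runs`, and the function names are illustrative.

```shell
# Sketch only: column names are taken from this guide; the actual sync_runs
# schema may contain more fields.
recent_runs_sql() {
  printf 'SELECT started_at, status, loaded_rows, skipped_items FROM `%s.%s.sync_runs` ORDER BY started_at DESC LIMIT %d' \
    "${PROJECT_ID}" "${DATASET}" "${1:-20}"
}

# Run the query; the optional argument is the number of runs to show.
show_recent_runs() {
  bq query --use_legacy_sql=false "$(recent_runs_sql "$@")"
}
```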

Common Failure Modes

Invalid GitHub Token

Symptoms:

exporter fails during token validation
no CSV outputs are produced

Action:

rotate the token
update the secret
rerun the sync
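Updating the secret amounts to adding a new secret version. The sketch below assumes the new token is in `GH_PAT` and that the service reads the latest version of the secret, which depends on how the secret is wired into Cloud Run.

```shell
# Sketch only: assumes GH_PAT holds the new token and the service reads the
# latest secret version.
rotate_github_token() {
  # printf avoids the trailing newline that echo would store in the secret.
  printf '%s' "${GH_PAT}" | gcloud secrets versions add "${SECRET_NAME}" \
    --project="${PROJECT_ID}" --data-file=-
}
```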

BigQuery Permission Failure

Symptoms:

load job errors
view refresh errors

Action:

verify runtime service account IAM
verify dataset-level access entries
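Both checks can be done from the command line. This is a sketch with illustrative helper names; it lists what is currently granted and leaves the comparison against the IAM model to the reader.

```shell
# Sketch only: helper names are illustrative.

# List project-level roles bound to the runtime service account.
runtime_project_roles() {
  gcloud projects get-iam-policy "${PROJECT_ID}" \
    --flatten="bindings[].members" \
    --filter="bindings.members:serviceAccount:${RUNTIME_SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com" \
    --format="value(bindings.role)"
}

# Print dataset metadata, including its "access" entries, as JSON.
dataset_access_entries() {
  bq show --format=prettyjson "${PROJECT_ID}:${DATASET}"
}
```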

Scheduler Invocation Failure

Symptoms:

job exists but sync never starts
HTTP auth errors in Scheduler execution logs

Action:

verify roles/run.invoker on the Cloud Run service
verify the scheduler service agent has token-creator access on the caller service account
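The two bindings can be inspected directly, and the job can be forced to run so the result shows up in the logs without waiting for the schedule. The helper names below are illustrative.

```shell
# Sketch only: helper names are illustrative.

# Show who may invoke the Cloud Run service.
service_invokers() {
  gcloud run services get-iam-policy "${SERVICE_NAME}" \
    --project="${PROJECT_ID}" --region="${REGION}"
}

# Show who may mint tokens for the caller service account.
caller_sa_policy() {
  gcloud iam service-accounts get-iam-policy \
    "${SCHEDULER_SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com" \
    --project="${PROJECT_ID}"
}

# Force an immediate run instead of waiting for the next scheduled time.
run_job_now() {
  gcloud scheduler jobs run "${SCHEDULER_JOB_NAME}" \
    --project="${PROJECT_ID}" --location="${REGION}"
}
```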

GitHub API Throttling Or Availability Issues

Symptoms:

many skipped items
partial loads

Action:

inspect sync_skipped_items
verify token scope and rate limits
reduce concurrency only if you later redesign the exporter
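The first two actions can be sketched as follows. The sketch assumes the token is available in `GH_PAT` and that `sync_skipped_items` is a table or view in the target dataset; its columns are not documented here, so the query selects all of them.

```shell
# Sketch only: helper names are illustrative.

# Query GitHub's rate-limit endpoint with the token in GH_PAT.
github_rate_limit() {
  curl --fail --silent --show-error \
    -H "Authorization: Bearer ${GH_PAT}" \
    -H "Accept: application/vnd.github+json" \
    https://api.github.com/rate_limit
}

# Sample skipped items; the optional argument is the row limit.
skipped_items_sample() {
  bq query --use_legacy_sql=false \
    "SELECT * FROM \`${PROJECT_ID}.${DATASET}.sync_skipped_items\` LIMIT ${1:-20}"
}
```

The rate-limit response reports the remaining quota and the reset time for the token.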
