Skip to content
This repository has been archived by the owner on Jan 29, 2022. It is now read-only.

Latest commit

 

History

History
64 lines (50 loc) · 3.93 KB

CONFIGURATION.md

File metadata and controls

64 lines (50 loc) · 3.93 KB

Airflow configuration

Some DAGs will require a few variables and connections to be configured in order to run properly. Below are listed all the variables available throughout the code base together with their descriptions.

Connections

Connection name Type Description
api_db postgres API database
explorer_db postgres OpenTrials Explorer database
warehouse_db postgres OpenTrials Warehouse database
datastore_http http URL of the S3 bucket to use
datastore_s3 s3 URL of the S3 bucket to use

Variables

Sources variables

These are variables used by their respective sources' collectors and processors. For example, HRA_URL is used by the hra collector. In simpler terms, these are generally related to collecting data.

Variable name Description
DOWNLOAD_DELAY Requests rate limit for scraping
COCHRANE_ARCHIVE_URL Location of the Cochrane data archive
HRA_ENV HRA environment (e.g. production)
HRA_PASS HRA password
HRA_URL HRA URL
HRA_USER HRA user name
ICTRP_PASS ICTRP password
ICTRP_USER ICTRP user name

Utilities

These variables are used for defining general system settings or running data processing tasks.

Variable name Description
DOCKER_API_VERSION Docker API version to use (for compatibility with older runtimes)
ENV Python environment to be used
DOCUMENTCLOUD_PASSWORD DocumentCloud password
DOCUMENTCLOUD_PROJECT DocumentCloud project identifier
DOCUMENTCLOUD_USERNAME DocumentCloud user name
AWS_ACCESS_KEY_ID AWS access key ID
AWS_SECRET_ACCESS_KEY AWS access key
AWS_S3_BUCKET S3 bucket
AWS_S3_REGION S3 region
AWS_S3_CUSTOM_DOMAIN S3 custom domain to use when creating public URLs for stored files

Logging and monitoring

The variables here relate to logging and error monitoring features of (some of) the DAGs.

Variable name Description
LOGGING_URL Papertrail endpoint for logging
COLLECTOR_SENTRY_DSN Sentry endpoint for Collectors
PROCESSOR_SENTRY_DSN Sentry endpoint for Processors