Minimal, production‑ready container image to run FastBCP (parallel export CLI). This setup targets FastBCP ≥ 0.28.0, which supports passing the license inline via `--license "<content>"`.

## Binary required for custom build

The FastBCP binary is not distributed in this repository. Request the Linux x64 build here:
https://www.arpe.io/get-your-fastbcp-trial/
Unzip it and place it at the repository root (next to the `Dockerfile`), then build your own custom image.
- Prerequisites
- Get the binary
- Build
- Run FastBCP
- License (≥ 0.28.0)
- Volumes
- Examples
- Docker Compose
- Performance & networking
- Security tips
- Troubleshooting
- Notes
## Prerequisites

- Docker 24+ (or Podman)
- FastBCP Linux x64 ≥ 0.28.0 binary (for build only)
- Optional: `FastBCP_Settings.json` to mount/copy into `/config` for custom logging settings

## Get the binary

- Request a trial: https://www.arpe.io/get-your-fastbcp-trial/
- Rename the downloaded file to `fastbcp` and ensure it is executable if testing locally: `chmod +x fastbcp`
- Place it at the repository root (beside the `Dockerfile`).
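The rename and `chmod` steps can be sketched as follows; the stub file here stands in for the real unzipped binary (which this repository does not ship):

```bash
# Stub standing in for the unzipped download; replace with the real binary.
printf '#!/bin/sh\necho "fastbcp stub"\n' > FastBCP

mv FastBCP fastbcp   # the name the Dockerfile expects
chmod +x fastbcp     # must be executable
./fastbcp            # local sanity check; prints: fastbcp stub
```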
## Build

```bash
docker build -t fastbcp:latest .
```

## Run FastBCP

This container sets its `ENTRYPOINT` to the `fastbcp` binary; any arguments you pass to `docker run` are forwarded to FastBCP.

```bash
docker run --rm fastbcp:latest --version
docker run --rm fastbcp:latest --help
```

## License (≥ 0.28.0)

Since 0.28.0, pass the license content directly via `--license "…"`.
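A minimal sketch of loading the license into a shell variable (the throwaway file below stands in for a real `FastBCP.lic`); note there are no spaces around `=` and the expansion is quoted so multi-line content survives:

```bash
# Throwaway stand-in for a real license file:
printf 'XXXX-LICENSE-XXXX' > ./FastBCP.lic

licenseContent=$(cat ./FastBCP.lic)   # command substitution, no spaces around "="
echo "$licenseContent"                # prints: XXXX-LICENSE-XXXX

# Actual invocation (prebuilt image assumed):
# docker run --rm aetp/fastbcp:latest --license "$licenseContent" --version
```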
You can also use a prebuilt image on Docker Hub that already includes the binary. You must provide your own license at runtime.

- Docker Hub versions/releases are aligned with the FastBCP versions/releases.

```bash
docker pull aetp/fastbcp:latest
# or a pinned release
docker pull aetp/fastbcp:v0.28.3
```

The Docker image uses the `fastbcp` binary as its entrypoint, so you can run it directly with parameters as defined in the FastBCP documentation.

Get the command-line help:

```bash
docker run --rm aetp/fastbcp:latest
```

Get the version:

```bash
docker run --rm aetp/fastbcp:latest --version
```

Export from SQL Server to Parquet on S3:
```bash
export licenseContent=$(cat ./FastBCP.lic)
docker run --rm \
  -e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
  -e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
  -e AWS_REGION=${AWS_REGION} \
  aetp/fastbcp:latest \
  --connectiontype "mssql" \
  --server "host.docker.internal,1433" \
  --user "FastUser" \
  --password "FastPassword" \
  --database "tpch_test" \
  --query "SELECT * FROM dbo.orders where year(o_orderdate)=1998" \
  --fileoutput "orders.parquet" \
  --directory "s3://aetpftoutput/dockertest/" \
  --paralleldegree 12 \
  --parallelmethod "Ntile" \
  --distributekeycolumn "o_orderkey" \
  --merge false \
  --license "$licenseContent"
```

Good practice: prefer `--env-file`, Docker/Compose/Kubernetes secrets, or managed identities for cloud credentials. Avoid leaving the license content in shell history.
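A sketch of the `--env-file` approach (the file name and placeholder values are assumptions): the credentials live in a permission-restricted file instead of the command line:

```bash
umask 177   # files created below get mode 600 (owner read/write only)
cat > fastbcp.env <<'EOF'
AWS_ACCESS_KEY_ID=AKIAEXAMPLE
AWS_SECRET_ACCESS_KEY=example-secret
AWS_REGION=eu-west-1
EOF

# The credentials then never appear in the command line or shell history:
# docker run --rm --env-file fastbcp.env aetp/fastbcp:latest ...
```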
## Volumes

- `/work` – working directory (container `WORKDIR`)
- `/config` – optional configuration directory (e.g. to store `FastBCP_Settings.json` for custom logging)
- `/data` – target source/exports
- `/logs` – logs directory (ensure that `FastBCP_Settings.json` is configured to write logs to this directory)
## Examples

The exact parameters depend on your source and target settings. The snippets below illustrate the call pattern from Docker in a Linux shell.

1) Export from SQL Server to CSV under `/data`, using a filtered query as source and the Ntile parallelism method
```bash
export licenseContent=$(cat ./FastBCP.lic)
docker run --rm \
  aetp/fastbcp:latest \
  --connectiontype "mssql" \
  --server "host.docker.internal,1433" \
  --user "FastUser" \
  --password "FastPassword" \
  --database "tpch_test" \
  --query "SELECT * FROM dbo.orders where year(o_orderdate)=1998" \
  --fileoutput "orders.csv" \
  --directory "/data/orders/csv" \
  --delimiter "|" \
  --decimalseparator "." \
  --dateformat "yyyy-MM-dd HH:mm:ss" \
  --paralleldegree 12 \
  --parallelmethod "Ntile" \
  --distributekeycolumn "o_orderkey" \
  --merge false \
  --license "$licenseContent"
```

2) PostgreSQL → partitioned Parquet on ADLS, using a filtered query as source and the Ctid parallelism method
```bash
export licenseContent=$(cat ./FastBCP.lic)
export adlscontainer="aetpadlseu"
docker run --rm \
  -e AZURE_CLIENT_ID=${AZURE_CLIENT_ID} \
  -e AZURE_TENANT_ID=${AZURE_TENANT_ID} \
  -e AZURE_CLIENT_SECRET=${AZURE_CLIENT_SECRET} \
  aetp/fastbcp:latest \
  --connectiontype "pgcopy" \
  --server "host.docker.internal:15432" \
  --user "FastUser" \
  --password "FastPassword" \
  --database "tpch" \
  --sourceschema "tpch_10" \
  --sourcetable "orders" \
  --query "SELECT * FROM tpch_10.orders where o_orderdate >= '1998-01-01' and o_orderdate < '1999-01-01'" \
  --fileoutput "orders.parquet" \
  --directory "abfss://${adlscontainer}.dfs.core.windows.net/fastbcpoutput/testdfs/orders" \
  --paralleldegree -2 \
  --parallelmethod "Ctid" \
  --license "$licenseContent"
```

3) Oracle (oraodp) → partitioned Parquet on GCS, using a table as source and the Rowid parallelism method with automatic degree (-2)
```bash
export licenseContent=$(cat ./FastBCP.lic)
export gcsbucket="aetp-gcs-bucket"
# read the GCP credentials JSON content from file into an env var
export GOOGLE_APPLICATION_CREDENTIALS_JSON=$(cat ./gcp-credentials.json)
docker run --rm \
  -e GOOGLE_APPLICATION_CREDENTIALS="${GOOGLE_APPLICATION_CREDENTIALS_JSON}" \
  aetp/fastbcp:latest \
  --connectiontype "oraodp" \
  --server "host.docker.internal:1521/FREEPDB1" \
  --user "TPCH_IN" \
  --password "TPCH_IN" \
  --database "FREEPDB1" \
  --sourceschema "TPCH_IN" \
  --sourcetable "ORDERS" \
  --fileoutput "orders.parquet" \
  --directory "gs://${gcsbucket}/fastbcpoutput/testgs/orders" \
  --parallelmethod "Rowid" \
  --paralleldegree -2 \
  --license "$licenseContent"
```

Available starting from version v0.28.3:
FastBCP supports custom logging configuration through an external Serilog settings file in JSON format. This allows you to control how and where logs are written — to the console, to files, or dynamically per run.
Custom settings files must be mounted into the container under the /config directory.
The following configuration is recommended for most production or Airflow environments. It writes:

- Logs to the console for real-time visibility
- Run summary logs to `/airflow/xcom/return.json` for Airflow integration
- Per-run logs under `/logs`, automatically named with `{LogTimestamp}` and `{TraceId}`
```json
{
  "Serilog": {
    "Using": [
      "Serilog.Sinks.Console",
      "Serilog.Sinks.File",
      "Serilog.Enrichers.Environment",
      "Serilog.Enrichers.Thread",
      "Serilog.Enrichers.Process",
      "Serilog.Enrichers.Context",
      "Serilog.Formatting.Compact"
    ],
    "WriteTo": [
      {
        "Name": "Console",
        "Args": {
          "outputTemplate": "{Timestamp:yyyy-MM-ddTHH:mm:ss.fff zzz} -|- {Application} -|- {runid} -|- {Level:u12} -|- {fulltargetname} -|- {Message}{NewLine}{Exception}",
          "theme": "Serilog.Sinks.SystemConsole.Themes.ConsoleTheme::None, Serilog.Sinks.Console",
          "applyThemeToRedirectedOutput": false
        }
      },
      {
        "Name": "File",
        "Args": {
          "path": "/airflow/xcom/return.json",
          "formatter": "Serilog.Formatting.Compact.CompactJsonFormatter, Serilog.Formatting.Compact"
        }
      },
      {
        "Name": "Map",
        "Args": {
          "to": [
            {
              "Name": "File",
              "Args": {
                "path": "/logs/{logdate}/{sourcedatabase}/log-(unknown)-{LogTimestamp}-{TraceId}.json",
                "formatter": "Serilog.Formatting.Compact.CompactJsonFormatter, Serilog.Formatting.Compact",
                "rollingInterval": "Infinite",
                "shared": false,
                "encoding": "utf-8"
              }
            }
          ]
        }
      }
    ],
    "Enrich": [
      "FromLogContext",
      "WithMachineName",
      "WithProcessId",
      "WithThreadId"
    ],
    "Properties": {
      "Application": "FastBCP"
    }
  }
}
```

Important notes:
- If a target directory (such as `/logs` or `/airflow/xcom`) does not exist, FastBCP automatically creates it.
- The file `/airflow/xcom/return.json` is designed to provide run summaries compatible with Airflow’s XCom mechanism.
You can use the following placeholders to dynamically generate log file names or directories:

| Token Name | Description |
|---|---|
| `{logdate}` | Current date in yyyy-MM-dd format |
| `{logtimestamp}` | Full timestamp of the log entry |
| `{sourcedatabase}` | Name of the source database |
| `{sourceschema}` | Name of the source schema |
| `{sourcetable}` | Name of the source table |
| (unknown) | Name of the file being processed |
| `{runid}` | Run identifier provided in the command line |
| `{traceid}` | Unique trace identifier generated at runtime |
The Docker image declares several volumes to organize data and configuration:
```dockerfile
VOLUME ["/config", "/data", "/work", "/logs"]
```

Your Serilog configuration file (for example, `FastBCP_Settings_Logs_To_Files.json`) must be placed in `/config`, either by mounting a local directory or by using a Docker named volume.

Example:
```bash
docker run --rm \
  -v D:\FastBCP\config:/config \
  -v fastbcp-data:/data \
  -v fastbcp-logs:/logs \
  aetp/fastbcp:latest \
  --settingsfile "/config/FastBCP_Settings_Logs_To_Files.json" \
  --connectiontype "mssql" \
  --server "host.docker.internal,1433" \
  --user "FastUser" \
  --password "FastPassword" \
  --database "tpch_test" \
  --query "SELECT * FROM dbo.orders" \
  --fileoutput "orders.csv" \
  --directory "/data" \
  --paralleldegree 12 \
  --parallelmethod "Ntile" \
  --distributekeycolumn "o_orderkey" \
  --merge false
```

If the `--settingsfile` argument is not provided, FastBCP will use its built-in default logging configuration.
| Volume Path | Description | Access Mode | Typical Usage |
|---|---|---|---|
| `/config` | Contains user-provided configuration files (e.g., Serilog settings) | Read-Only / Read-Many | Shared across multiple containers; not modified |
| `/data` | Input/output data directory | Read-Many / Write-Many | Stores imported or exported data files |
| `/work` | Temporary working directory | Read-Many / Write-Many | Used internally for temporary processing |
| `/logs` | Log output directory (per-run or aggregated logs) | Read-Many / Write-Many | Stores runtime and execution logs |
## Performance & networking

- Place `/data` on fast storage (NVMe) when exporting large datasets locally.
- Tune `--paralleldegree` according to CPU and I/O throughput.
- To reach a DB on the local host from Linux, add `--add-host=host.docker.internal:host-gateway` (or the `extra_hosts` entry in Compose).
- For high‑bandwidth object‑store targets (S3/ADLS/GCS), ensure consistent MTU settings end‑to‑end; consider jumbo frames where appropriate and, if possible, a dedicated endpoint.
## Security tips

- Never commit your license or cloud credentials to source control.
- Prefer Docker/Compose/Kubernetes secrets or environment files (`--env-file`) and managed identities (IAM Role / IRSA / Workload Identity / Managed Identity).
- If no explicit credentials are provided, FastBCP falls back to the classic authentication methods for cloud object stores (default profile, IAM role, environment variables).
## Troubleshooting

- Exec format error → ensure the binary is Linux x64 and executable (`chmod +x fastbcp`).
- Missing `libicu`/`libssl`/`zlib`/`krb5` → the image includes `libicu72`, `libssl3`, `zlib1g`, `libkrb5-3`. If your build requires additional libs, add them via `apt`.
- Permission denied writing under `/data` → ensure the host directory permissions match the container UID (10001).
- DB host not reachable → on Linux, use `--add-host=host.docker.internal:host-gateway` or the Compose `extra_hosts` equivalent.
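For the `/data` permission issue, one hedged approach (UID 10001 as stated above; the `chown` variant needs root on the host):

```bash
mkdir -p ./data ./logs
# Preferred: match the container UID exactly (requires root on the host):
# sudo chown -R 10001:10001 ./data ./logs
# Fallback without root: open up write access for the duration of the run:
chmod -R a+rwX ./data ./logs

# docker run --rm -v "$PWD/data":/data -v "$PWD/logs":/logs aetp/fastbcp:latest ...
```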
## Notes

- This image embeds the proprietary binary; you must provide a valid license (or request a trial license) for it to work. Do not share your private license outside your company.
- OCI labels are set for traceability (source, vendor, license).