This project builds a simple data pipeline for retail data. It performs ETL steps (extraction and data quality checks) with pandas, and Airflow runs the pipeline. MinIO stores XCom objects (such as pandas DataFrames), MySQL stores the results, and PostgreSQL is used for metadata.
The pipeline image is here: docs/images/pipline.png
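As an illustration of the kind of data quality check a pandas step in this pipeline might run, here is a minimal sketch. The function name `run_quality_checks`, the column names (`order_id`, `quantity`, `unit_price`), and the sample rows are hypothetical and are not taken from the project's DAG code.

```python
import pandas as pd


def run_quality_checks(df: pd.DataFrame, key: str, positive_cols: list[str]) -> dict[str, int]:
    """Count the rows that fail each basic quality rule (hypothetical helper)."""
    return {
        # Rows with at least one missing value in any column.
        "rows_with_nulls": int(df.isna().any(axis=1).sum()),
        # Rows whose key repeats the key of an earlier row.
        "duplicate_keys": int(df.duplicated(subset=[key]).sum()),
        # Rows where any listed numeric column is zero or negative.
        "non_positive_values": int((df[positive_cols] <= 0).any(axis=1).sum()),
    }


# Hypothetical retail rows; the column names are illustrative only.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 2, 4],
        "quantity": [3, 0, 5, 2],
        "unit_price": [9.99, 4.50, None, 1.25],
    }
)

report = run_quality_checks(orders, key="order_id", positive_cols=["quantity", "unit_price"])
print(report)
```

A task could fail the run or log a warning when any of these counts is non-zero; which of the two fits depends on how strict the pipeline needs to be.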
- Clone the repo:
git clone <repo-url>
cd <repo-name>
- Put the .env file in the project root (see sample below).
- (Optional) If you have SQL init files for MySQL, put them in the mysql-init/ folder. The MySQL container runs them automatically the first time it initializes its data directory.
- Start Airflow and other services:
- With Astro CLI:
astro dev start
- Or with Docker Compose:
docker-compose up -d
- Open Airflow UI:
- Run or check DAGs (add the Airflow connections first):
- Use the Airflow UI, or run an Airflow CLI command through the Astro CLI:
astro dev run dags trigger <dag_id>
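We use these Airflow connections:
- MinIO (screenshot: docs/images/retail_minio_conn.png)
- MySQL (screenshot: docs/images/retailflow_mysql_conn.png)
All connection screenshots are in the docs/images/ folder.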
We use MinIO as the XCom backend. Pandas DataFrames and other objects are stored in MinIO. Airflow keeps only references in XCom. Screenshot: docs/images/xcom_minio.png
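The idea of "store the object elsewhere, keep only a reference in XCom" can be sketched with a toy model. This is not Airflow's implementation: two plain dicts stand in for the MinIO bucket and the XCom table, and the names `push`, `pull`, `object_store`, and `xcom_table` are hypothetical. The threshold handling follows the documented meaning of the setting (0 sends everything to object storage, -1 keeps everything in the database).

```python
import json

# Toy stand-ins: one dict plays the MinIO bucket, the other Airflow's XCom table.
object_store: dict[str, bytes] = {}
xcom_table: dict[str, dict] = {}


def push(key: str, value, threshold: int = 0) -> None:
    """Serialize a value; offload it when it is larger than the threshold."""
    payload = json.dumps(value).encode("utf-8")
    if threshold < 0 or len(payload) <= threshold:
        # Small (or always-inline) values stay in the XCom table itself.
        xcom_table[key] = {"kind": "inline", "payload": payload}
        return
    path = f"xcom/{key}.json"
    object_store[path] = payload  # the actual bytes live in object storage
    xcom_table[key] = {"kind": "ref", "path": path}  # XCom keeps only a pointer


def pull(key: str):
    """Resolve an XCom entry, following the reference if there is one."""
    entry = xcom_table[key]
    payload = object_store[entry["path"]] if entry["kind"] == "ref" else entry["payload"]
    return json.loads(payload)


rows = [{"order_id": 1, "quantity": 3}, {"order_id": 2, "quantity": 5}]
push("orders", rows)               # threshold 0: always offloaded
push("row_count", 2, threshold=-1)  # threshold -1: always kept inline
print(xcom_table["orders"], pull("orders") == rows)
```

With a threshold of 0, as in the sample .env, even tiny values take the object storage path, which keeps the metadata database small at the cost of one extra read per pull.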
If you want MySQL to run SQL files when it first initializes, put *.sql files in mysql-init/. Docker mounts this folder to /docker-entrypoint-initdb.d automatically.
# MySQL
MYSQL_ROOT_PASSWORD=admin
MYSQL_USER=admin
MYSQL_PASSWORD=admin
MYSQL_DATABASE=retailflow
# MinIO / S3 Airflow connection
AIRFLOW_CONN_MINIO_LOCAL='{
"conn_type":"aws",
"extra":{
"aws_access_key_id":"minioadmin",
"aws_secret_access_key":"minioadmin",
"endpoint_url":"http://minio:9000"
}
}'
# Object Storage Custom XCom Backend
AIRFLOW__CORE__XCOM_BACKEND="airflow.providers.common.io.xcom.backend.XComObjectStorageBackend"
AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_PATH="s3://retail@retail/xcom"
AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_THRESHOLD="0"
AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_COMPRESSION="zip"
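To sanity-check values like the ones above before starting the stack, a small script can parse them with the standard library. This is only a sketch: the values are copied inline from the sample rather than read from a real .env file, and the variable names are hypothetical.

```python
import json
from urllib.parse import urlsplit

# Values copied from the sample .env above (not read from disk in this sketch).
conn_json = """{
  "conn_type": "aws",
  "extra": {
    "aws_access_key_id": "minioadmin",
    "aws_secret_access_key": "minioadmin",
    "endpoint_url": "http://minio:9000"
  }
}"""
xcom_path = "s3://retail@retail/xcom"

# The connection value must be valid JSON; json.loads raises ValueError otherwise.
conn = json.loads(conn_json)
endpoint = urlsplit(conn["extra"]["endpoint_url"])

# The object storage path has the shape <scheme>://<conn_id>@<bucket>/<prefix>.
parts = urlsplit(xcom_path)
path_info = {
    "scheme": parts.scheme,
    "conn_id": parts.username,
    "bucket": parts.hostname,
    "prefix": parts.path.lstrip("/"),
}

print(conn["conn_type"], endpoint.hostname, endpoint.port)
print(path_info)
```

The conn_id part of the path has to match an existing Airflow connection ID. In the sample it is retail, while the AIRFLOW_CONN_MINIO_LOCAL variable defines a connection named minio_local, so a connection called retail presumably has to exist as well (for example, one created in the Airflow UI).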
- You can change MYSQL_PASSWORD or other values to suit your environment.
- To use a different connection or bucket for XCom, change AIRFLOW__COMMON_IO__XCOM_OBJECTSTORAGE_PATH. Its format is s3://<conn_id>@<bucket>/<path>.
- Pipeline image: docs/images/pipline.png
- MinIO connection screenshot: docs/images/retail_minio_conn.png
- MySQL connection screenshot: docs/images/retailflow_mysql_conn.png
- XCom MinIO screenshot: docs/images/xcom_minio.png
- MySQL init SQL files: mysql-init/





