
Commit d6ab375

Merge pull request #1 from tuanavu/development
Create first tutorial dag
2 parents (03b7152 + a28b37c), commit d6ab375

File tree

3 files changed: 198 additions, 1 deletion

- README.md
- dags/tutorial.py
- docker-compose.yml

README.md

Lines changed: 57 additions & 1 deletion

The previous one-line heading (`# airflow-tutorial`) is replaced, and the README now reads:

Airflow tutorial
---

## Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes; a condensed command sequence follows this list.

- Clone this repo
- Install the prerequisites
- Run the service
- Check http://localhost:8080
- Done! :tada:
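
A minimal sketch of those steps, assuming the repo's clone URL follows the usual GitHub pattern for this project:

```
# Clone this repo and enter it
git clone https://github.com/tuanavu/airflow-tutorial.git
cd airflow-tutorial

# With Docker and Docker Compose installed (see Prerequisites), start the service
docker-compose up -d
```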

### Prerequisites

- Install [Docker](https://www.docker.com/)
- Install [Docker Compose](https://docs.docker.com/compose/install/)
- This setup follows the Airflow release published on the [Python Package Index](https://pypi.python.org/pypi/apache-airflow)

### Usage

Run the web service with Docker:

```
docker-compose up -d

# Build the image
# docker-compose up -d --build
```

Check http://localhost:8080/

- `docker-compose logs` - Displays log output
- `docker-compose ps` - Lists containers
- `docker-compose down` - Stops containers
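
To verify from the command line instead of the browser, a quick sketch (assuming the Airflow 1.10 UI is served under `/admin/`, as it is in this release):

```
# Expect an HTTP 200 or a redirect once the webserver is up
curl -I http://localhost:8080/admin/
```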

## Other commands

If you want to run other airflow sub-commands, you can do so like this (a concrete example follows the list):

- `docker-compose run --rm webserver airflow list_dags` - List DAGs
- `docker-compose run --rm webserver airflow test [DAG_ID] [TASK_ID] [EXECUTION_DATE]` - Test a specific task
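
For example, to dry-run the `print_date` task of the `tutorial` DAG added in this commit (the execution date below is an arbitrary illustration):

```
docker-compose run --rm webserver airflow test tutorial print_date 2018-01-01
```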

## Connect to database

If you want to use the Ad Hoc Query feature, make sure you've configured the connection: go to Admin -> Connections, edit "postgres_default", and set these values:

- Host: postgres
- Schema: airflow
- Login: airflow
- Password: airflow
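
You can also reach the database directly with the same credentials; a minimal sketch using the `psql` client bundled in the `postgres:9.6` image:

```
docker-compose exec postgres psql -U airflow -d airflow
```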

## Credits

- [docker-airflow](https://github.com/puckel/docker-airflow/tree/1.10.0-5)

dags/tutorial.py

Lines changed: 104 additions & 0 deletions
```python
# -*- coding: utf-8 -*-
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

"""
### Tutorial Documentation
Documentation that goes along with the Airflow tutorial located
[here](https://airflow.incubator.apache.org/tutorial.html)
"""
from datetime import timedelta

import airflow
from airflow import DAG
from airflow.operators.bash_operator import BashOperator

# These args will get passed on to each operator
# You can override them on a per-task basis during operator initialization
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': airflow.utils.dates.days_ago(2),
    'email': ['airflow@example.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
    # 'queue': 'bash_queue',
    # 'pool': 'backfill',
    # 'priority_weight': 10,
    # 'end_date': datetime(2016, 1, 1),
    # 'wait_for_downstream': False,
    # 'dag': dag,
    # 'adhoc':False,
    # 'sla': timedelta(hours=2),
    # 'execution_timeout': timedelta(seconds=300),
    # 'on_failure_callback': some_function,
    # 'on_success_callback': some_other_function,
    # 'on_retry_callback': another_function,
    # 'trigger_rule': u'all_success'
}

dag = DAG(
    'tutorial',
    default_args=default_args,
    description='A simple tutorial DAG',
    schedule_interval=timedelta(days=1),
)

# t1, t2 and t3 are examples of tasks created by instantiating operators
t1 = BashOperator(
    task_id='print_date',
    bash_command='date',
    dag=dag,
)

t1.doc_md = """\
#### Task Documentation
You can document your task using the attributes `doc_md` (markdown),
`doc` (plain text), `doc_rst`, `doc_json`, `doc_yaml` which gets
rendered in the UI's Task Instance Details page.
![img](http://montcs.bloomu.edu/~bobmon/Semesters/2012-01/491/import%20soul.png)
"""

dag.doc_md = __doc__

t2 = BashOperator(
    task_id='sleep',
    depends_on_past=False,
    bash_command='sleep 5',
    dag=dag,
)

templated_command = """
{% for i in range(5) %}
    echo "{{ ds }}"
    echo "{{ macros.ds_add(ds, 7)}}"
    echo "{{ params.my_param }}"
{% endfor %}
"""

t3 = BashOperator(
    task_id='templated',
    depends_on_past=False,
    bash_command=templated_command,
    params={'my_param': 'Parameter I passed in'},
    dag=dag,
)

t1 >> [t2, t3]
```
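
The final line, `t1 >> [t2, t3]`, makes both `t2` and `t3` downstream of `t1`; it is equivalent to calling `t1.set_downstream(t2)` and `t1.set_downstream(t3)` explicitly. As a quick sanity check that the file parses, you can run it through the interpreter inside the container (the `dags/` folder is mounted at `/usr/local/airflow/dags` by the compose file below); a sketch:

```
# Exits silently if the DAG file has no syntax or import errors
docker-compose run --rm webserver python /usr/local/airflow/dags/tutorial.py
```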

docker-compose.yml

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
1+
version: '3'
2+
services:
3+
postgres:
4+
image: postgres:9.6
5+
environment:
6+
- POSTGRES_USER=airflow
7+
- POSTGRES_PASSWORD=airflow
8+
- POSTGRES_DB=airflow
9+
ports:
10+
- "5432:5432"
11+
12+
webserver:
13+
image: puckel/docker-airflow:1.10.0-5
14+
build:
15+
context: https://github.com/puckel/docker-airflow.git#1.10.0-5
16+
dockerfile: Dockerfile
17+
args:
18+
AIRFLOW_DEPS: gcp_api,s3
19+
restart: always
20+
depends_on:
21+
- postgres
22+
environment:
23+
- LOAD_EX=n
24+
- EXECUTOR=Local
25+
- FERNET_KEY=jsDPRErfv8Z_eVTnGfF8ywd19j4pyqE3NpdUBA_oRTo=
26+
volumes:
27+
- ./dags:/usr/local/airflow/dags
28+
# Uncomment to include custom plugins
29+
# - ./plugins:/usr/local/airflow/plugins
30+
ports:
31+
- "8080:8080"
32+
command: webserver
33+
healthcheck:
34+
test: ["CMD-SHELL", "[ -f /usr/local/airflow/airflow-webserver.pid ]"]
35+
interval: 30s
36+
timeout: 30s
37+
retries: 3
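
The compose file ships a hard-coded `FERNET_KEY` for convenience. If you would rather supply your own, one way to generate a key is sketched below (assuming the `cryptography` package is available in the image, as it is in puckel/docker-airflow):

```
docker-compose run --rm webserver python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
```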
