Closed
Labels
- affected_version:2.1 (Issues Reported for 2.1)
- area:core
- kind:bug (This is clearly a bug)
- priority:medium (Bug that should be fixed before next release but would not block a release)
Description
Apache Airflow version: 2.1.1rc1
Kubernetes version (if you are using kubernetes) (use kubectl version): 1.18
Environment:
- Cloud provider or hardware configuration: AWS EKS
- OS (e.g. from /etc/os-release): Debian GNU/Linux 10
- Kernel (e.g. uname -a): Linux airflow-worker-1 4.14.232-176.381.amzn2.x86_64 #1 SMP Wed May 19 00:31:54 UTC 2021 x86_64 GNU/Linux
- Install tools: https://github.com/airflow-helm/charts
- Others:
What happened:
The task was actually marked as failed, even though the log said it was being marked as up for retry.
[2021-07-06 22:37:25,624] {local_task_job.py:76} ERROR - Received SIGTERM. Terminating subprocesses
[2021-07-06 22:37:25,626] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 2884
[2021-07-06 22:37:25,626] {taskinstance.py:1284} ERROR - Received SIGTERM. Terminating subprocesses.
[2021-07-06 22:37:25,653] {taskinstance.py:1501} ERROR - Task failed with exception
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1157, in _run_raw_task
self._prepare_and_execute_task_with_callbacks(context, task)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1331, in _prepare_and_execute_task_with_callbacks
result = self._execute_task(context, task_copy)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1361, in _execute_task
result = task_copy.execute(context=context)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 150, in execute
return_value = self.execute_callable()
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/operators/python.py", line 161, in execute_callable
return self.python_callable(*self.op_args, **self.op_kwargs)
File "/opt/airflow/dags/repo/dags/v1/sync/sync_segments/udfs/sync_segments.py", line 60, in main
customers = controllers.get_customers_metrics_v2(
File "/opt/airflow/dags/repo/dags/v1/utils/api/controllers.py", line 298, in get_customers_metrics_v2
res = q.all()
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3373, in all
return list(self)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
return self._execute_and_instances(context)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
ret = self._execute_context(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1514, in _handle_dbapi_exception
util.raise_(exc_info[1], with_traceback=exc_info[2])
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/cursor.py", line 638, in execute
ret = self._execute_helper(query, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/cursor.py", line 456, in _execute_helper
ret = self._connection.cmd_query(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/connection.py", line 945, in cmd_query
ret = self.rest.request(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/network.py", line 381, in request
return self._post_request(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/network.py", line 629, in _post_request
ret = self.fetch(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/network.py", line 719, in fetch
ret = self._request_exec_wrapper(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/network.py", line 841, in _request_exec_wrapper
raise e
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/network.py", line 762, in _request_exec_wrapper
return_object = self._request_exec(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/network.py", line 1049, in _request_exec
raise err
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/network.py", line 926, in _request_exec
raw_ret = session.request(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/urllib3/connectionpool.py", line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/urllib3/connectionpool.py", line 440, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.8/http/client.py", line 1344, in getresponse
response.begin()
File "/usr/local/lib/python3.8/http/client.py", line 307, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.8/http/client.py", line 268, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/local/lib/python3.8/socket.py", line 669, in readinto
return self._sock.recv_into(b)
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/urllib3/contrib/pyopenssl.py", line 331, in recv_into
if not util.wait_for_read(self.socket, self.socket.gettimeout()):
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/urllib3/util/wait.py", line 146, in wait_for_read
return wait_for_socket(sock, read=True, timeout=timeout)
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/urllib3/util/wait.py", line 107, in poll_wait_for_socket
return bool(_retry_on_intr(do_poll, timeout))
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/urllib3/util/wait.py", line 43, in _retry_on_intr
return fn(timeout)
File "/home/airflow/.local/lib/python3.8/site-packages/snowflake/connector/vendored/urllib3/util/wait.py", line 105, in do_poll
return poll_obj.poll(t)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1286, in signal_handler
raise AirflowException("Task received SIGTERM signal")
airflow.exceptions.AirflowException: Task received SIGTERM signal
[2021-07-06 22:37:25,658] {taskinstance.py:1544} INFO - Marking task as UP_FOR_RETRY. dag_id=UPDATE_INTERNAL_DAILY_EUROPEZURICH, task_id=SYNC_SEGMENTS_CORE_TASK_1893, execution_date=20210705T220000, start_date=20210706T222237, end_date=20210706T223725
[2021-07-06 22:37:25,718] {process_utils.py:66} INFO - Process psutil.Process(pid=2884, status='terminated', exitcode=1, started='22:22:36') (2884) terminated with exit code 1
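The last frames of the traceback show the exception being raised from inside a signal handler while the task was blocked waiting on a socket. A minimal sketch of that mechanism in plain Python follows. `TaskTerminated` and `run_task` are hypothetical names used only for illustration (they stand in for `AirflowException` and the task body), and the process sends SIGTERM to itself to imitate the worker pod being deleted. This is not Airflow's actual code.

```python
import os
import signal
import time


class TaskTerminated(Exception):
    """Hypothetical stand-in for airflow.exceptions.AirflowException."""


def signal_handler(signum, frame):
    # Raising here unwinds whatever the main thread was executing,
    # which is why the traceback ends inside a blocking poll() call.
    raise TaskTerminated("Task received SIGTERM signal")


def run_task():
    """Run a blocking 'task' that is interrupted by SIGTERM."""
    previous = signal.signal(signal.SIGTERM, signal_handler)
    try:
        # Imitate the worker pod being deleted: send SIGTERM to ourselves.
        os.kill(os.getpid(), signal.SIGTERM)
        # Stands in for the long-running database query.
        time.sleep(1)
        return "success"
    except TaskTerminated as exc:
        return f"failed: {exc}"
    finally:
        # Restore the handler that was installed before the task started.
        signal.signal(signal.SIGTERM, previous)


if __name__ == "__main__":
    print(run_task())
```

The sketch only illustrates why the stack trace points into the Snowflake connector: the task code itself did nothing wrong, it was simply whatever happened to be running when the signal arrived.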
What you expected to happen:
The task should be retried, or at least on_failure_callback should be triggered to let me know it failed.
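For context, a rough sketch of the kind of callback this expectation refers to. The function name `notify_failure`, the message format, and the `fake_context` object are assumptions made for illustration; only the `retries` and `on_failure_callback` operator arguments and the `task_instance` / `exception` context keys come from Airflow itself. The wiring is shown as a comment so the snippet runs without Airflow installed.

```python
from types import SimpleNamespace


def notify_failure(context):
    """Build an alert message from the context passed to the callback."""
    ti = context["task_instance"]
    message = (
        f"Task {ti.dag_id}.{ti.task_id} failed on try {ti.try_number}: "
        f"{context.get('exception')}"
    )
    print(message)  # a real callback would send this to an alerting channel
    return message


# Hypothetical wiring inside a DAG file (not executed here):
#   PythonOperator(
#       task_id="sync_segments",
#       python_callable=main,
#       retries=2,
#       on_failure_callback=notify_failure,
#   )

# Stand-in context for trying the callback outside Airflow.
fake_context = {
    "task_instance": SimpleNamespace(
        dag_id="UPDATE_INTERNAL_DAILY_EUROPEZURICH",
        task_id="SYNC_SEGMENTS_CORE_TASK_1893",
        try_number=1,
    ),
    "exception": RuntimeError("Task received SIGTERM signal"),
}

if __name__ == "__main__":
    notify_failure(fake_context)
```

With a setup like this, the report is that neither outcome was observed: the task was not rescheduled for another try, and the callback did not run.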
How to reproduce it:
- Delete a running worker pod, and you'll see the failure:
kubectl delete pods -n airflow airflow-worker-3 --force
- Or run the workers on AWS spot instances or GCP preemptible instances: the cloud provider reclaims the Kubernetes nodes from time to time, which results in this issue.
Anything else we need to know:
How often does it occur: every time