DockerOperator AWS ECR Login Unauthorised #9707

@gregbrowndev

Apache Airflow version:

1.10.10

Environment:

  • OS (e.g. from /etc/os-release):

NAME=Fedora
VERSION="29 (Workstation Edition)"
ID=fedora
VERSION_ID=29
VERSION_CODENAME=""
PLATFORM_ID="platform:f29"
PRETTY_NAME="Fedora 29 (Workstation Edition)"
ANSI_COLOR="0;34"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:29"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f29/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=29
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=29
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation

  • Kernel (e.g. uname -a):

Linux LAP300.itoworld.internal 4.20.6-200.fc29.x86_64 #1 SMP Thu Jan 31 15:50:43 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools: poetry

What happened:

When using the DockerOperator with a private AWS ECR registry, the registry login fails: the Docker daemon reports a 401 Unauthorized error.

*** Reading local file: /opt/airflow/logs/example_dag/docker_command/2020-07-06T19:16:27.355788+00:00/1.log
[2020-07-06 19:16:34,273] {taskinstance.py:669} INFO - Dependencies all met for <TaskInstance: example_dag.docker_command 2020-07-06T19:16:27.355788+00:00 [queued]>
[2020-07-06 19:16:34,309] {taskinstance.py:669} INFO - Dependencies all met for <TaskInstance: example_dag.docker_command 2020-07-06T19:16:27.355788+00:00 [queued]>
[2020-07-06 19:16:34,310] {taskinstance.py:879} INFO - 
--------------------------------------------------------------------------------
[2020-07-06 19:16:34,310] {taskinstance.py:880} INFO - Starting attempt 1 of 1
[2020-07-06 19:16:34,310] {taskinstance.py:881} INFO - 
--------------------------------------------------------------------------------
[2020-07-06 19:16:34,330] {taskinstance.py:900} INFO - Executing <Task(DockerOperator): docker_command> on 2020-07-06T19:16:27.355788+00:00
[2020-07-06 19:16:34,334] {standard_task_runner.py:53} INFO - Started process 6003 to run task
[2020-07-06 19:16:34,388] {logging_mixin.py:112} INFO - Running %s on host %s <TaskInstance: example_dag.docker_command 2020-07-06T19:16:27.355788+00:00 [running]> f7868fb55087
[2020-07-06 19:16:34,421] {logging_mixin.py:112} INFO - [2020-07-06 19:16:34,421] {base_hook.py:87} INFO - Using connection to: id: docker_ecr. Host: ********.dkr.ecr.eu-west-1.amazonaws.com, Port: None, Schema: None, Login: *********, Password: XXXXXXXX, extra: None
[2020-07-06 19:16:35,190] {logging_mixin.py:112} INFO - [2020-07-06 19:16:35,189] {docker_hook.py:87} ERROR - Docker registry login failed: 500 Server Error: Internal Server Error ("login attempt to https://********.dkr.ecr.eu-west-1.amazonaws.com/v2/ failed with status: 401 Unauthorized")
[2020-07-06 19:16:35,215] {taskinstance.py:1145} ERROR - ('Docker registry login failed: %s', '500 Server Error: Internal Server Error ("login attempt to https://********.dkr.ecr.eu-west-1.amazonaws.com/v2/ failed with status: 401 Unauthorized")')
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/docker/api/client.py", line 261, in _raise_for_status
    response.raise_for_status()
  File "/home/airflow/.local/lib/python3.6/site-packages/requests/models.py", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http+docker://localhost/v1.40/auth

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/hooks/docker_hook.py", line 83, in __login
    reauth=self.__reauth
  File "/usr/local/lib/python3.6/site-packages/docker/api/daemon.py", line 152, in login
    return self._result(response, json=True)
  File "/usr/local/lib/python3.6/site-packages/docker/api/client.py", line 267, in _result
    self._raise_for_status(response)
  File "/usr/local/lib/python3.6/site-packages/docker/api/client.py", line 263, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.6/site-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 500 Server Error: Internal Server Error ("login attempt to https://********.dkr.ecr.eu-west-1.amazonaws.com/v2/ failed with status: 401 Unauthorized")

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/taskinstance.py", line 983, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/operators/docker_operator.py", line 257, in execute
    self.cli = self.get_hook().get_conn()
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/hooks/docker_hook.py", line 72, in get_conn
    self.__login(client)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/hooks/docker_hook.py", line 88, in __login
    raise AirflowException('Docker registry login failed: %s', str(docker_error))
airflow.exceptions.AirflowException: ('Docker registry login failed: %s', '500 Server Error: Internal Server Error ("login attempt to https://********.dkr.ecr.eu-west-1.amazonaws.com/v2/ failed with status: 401 Unauthorized")')
[2020-07-06 19:16:35,221] {taskinstance.py:1202} INFO - Marking task as FAILED.dag_id=example_dag, task_id=docker_command, execution_date=20200706T191627, start_date=20200706T191634, end_date=20200706T191635
[2020-07-06 19:16:44,250] {logging_mixin.py:112} INFO - [2020-07-06 19:16:44,250] {local_task_job.py:103} INFO - Task exited with return code 1
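
A side note on reading the log above: the final message appears as a tuple with an unexpanded `%s`, which seems to follow from the exception being raised with two positional arguments (see the `raise AirflowException(...)` line in the traceback) rather than with a pre-formatted string. The sketch below reproduces that rendering with a stand-in class; `AirflowException` here is a local placeholder, not the real `airflow.exceptions.AirflowException`, and the `detail` string is shortened for illustration.

```python
class AirflowException(Exception):
    """Local stand-in for airflow.exceptions.AirflowException (assumption)."""


detail = "500 Server Error"

# Two positional arguments: str() renders the args tuple, so %s is never expanded.
as_logged = AirflowException("Docker registry login failed: %s", detail)

# Formatting the message first yields a single readable string.
formatted = AirflowException("Docker registry login failed: %s" % detail)

print(str(as_logged))
print(str(formatted))
```

This only affects how the error is displayed; the underlying failure is still the 401 from the registry.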

What you expected to happen:

Docker credentials are retrieved for docker_conn_id, and the DockerOperator is able to log in and pull the image from ECR.

How to reproduce it:

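A minimal DockerOperator that triggers the error looks like this:

# Import path for Airflow 1.10.x
from airflow.operators.docker_operator import DockerOperator

t1 = DockerOperator(
    task_id="docker_command",
    docker_conn_id="docker_ecr",  # Airflow connection holding the registry credentials
    image="**********.dkr.ecr.eu-west-1.amazonaws.com/myimage:latest",
    api_version="auto",
    auto_remove=True,
    command=["python", "-c", 'print("Hello World")'],
    network_mode="bridge",
)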

The error can be reproduced with the following code used by the DockerHook:

import docker

registry = "***********.dkr.ecr.eu-west-1.amazonaws.com"  # Note: does not end in /v2
username = "***********"
password = "***********"

base_url = "unix://var/run/docker.sock"
version = "auto"

client = docker.APIClient(
    base_url=base_url,
    version=version,
    tls=None
)

client.login(
    username=username,
    password=password,
    registry=registry,
    email=None,
    reauth=True
)
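
For comparison, here is a minimal sketch of how ECR login credentials are usually obtained, in case the values stored in the connection are not an ECR authorization token. ECR's `GetAuthorizationToken` call returns a base64-encoded `user:password` pair that is valid for a limited time, and the user part is `AWS`. The sketch assumes boto3 is installed and AWS credentials are configured; the helper names `decode_ecr_token` and `fetch_ecr_login` are hypothetical and not part of Airflow or docker-py.

```python
import base64
from typing import Tuple


def decode_ecr_token(authorization_token: str) -> Tuple[str, str]:
    """Split a base64-encoded "user:password" ECR token into its two parts."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


def fetch_ecr_login(region_name: str) -> Tuple[str, str, str]:
    """Return (username, password, registry endpoint) for ECR in one region.

    Hypothetical helper: needs boto3 and valid AWS credentials at call time.
    """
    import boto3  # imported lazily so the decoder above works without boto3

    ecr = boto3.client("ecr", region_name=region_name)
    auth = ecr.get_authorization_token()["authorizationData"][0]
    username, password = decode_ecr_token(auth["authorizationToken"])
    return username, password, auth["proxyEndpoint"]
```

The returned username and password could then be passed to `client.login(...)` as in the reproduction above. Because the token expires, a value saved once in an Airflow connection would stop working after a while.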

Anything else we need to know:

The problem seems to be in the docker library: the error message shows the registry URL suffixed with '/v2/', but that suffix is not supplied anywhere in the code.

Appears to be related to this issue: aws/aws-cli#4962

Is Airflow's DockerOperator incompatible with v1 ECR? (I wasn't aware there was a v1 or v2 of ECR.)


Labels: kind:bug
