Architecture Overview of Harbor

Steven Zou edited this page Mar 23, 2020 · 11 revisions

Architecture

version: 1.10

As depicted in the above diagram, Harbor comprises the following components, arranged in three layers:

Data Access Layer

k-v storage: backed by Redis; provides data caching and temporarily persists job metadata for the job service.

data storage: multiple storage backends are supported for persisting the data of the registry and ChartMuseum. For details, refer to the storage driver list in the Docker documentation and to the ChartMuseum GitHub repository.

Database: stores the metadata of Harbor models, such as projects, users, roles, replication policies, tag retention policies, scanners, charts, and images. PostgreSQL is used.

Fundamental Services

Proxy: a reverse proxy built on the Nginx server that provides API routing. Harbor components such as core, registry, web portal, and token service all sit behind this reverse proxy. The proxy forwards requests from browsers and Docker clients to the appropriate backend services.
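
As a rough illustration of prefix-based routing in a reverse proxy, the sketch below picks a backend by the longest matching path prefix. The route table, prefixes, and backend names are assumptions made for this example; they are not taken from Harbor's actual Nginx configuration.

```python
from typing import List, Tuple

# Hypothetical route table: (path prefix, backend name).
# These entries are illustrative assumptions, not Harbor's real Nginx rules.
ROUTES: List[Tuple[str, str]] = [
    ("/v2/", "registry"),
    ("/service/", "core"),
    ("/api/", "core"),
    ("/", "portal"),
]


def route(path: str) -> str:
    """Return the backend whose prefix is the longest match for the path."""
    matches = [(prefix, backend) for prefix, backend in ROUTES
               if path.startswith(prefix)]
    if not matches:
        raise ValueError(f"no route for {path!r}")
    # The longest prefix wins, so "/v2/" beats the catch-all "/".
    return max(matches, key=lambda m: len(m[0]))[1]


if __name__ == "__main__":
    for p in ("/v2/", "/service/token", "/api/projects", "/index.html"):
        print(p, "->", route(p))
```

Longest-prefix matching keeps the catch-all route for the portal from shadowing the more specific API and registry routes.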

Core: Harbor’s core functions. It mainly provides the following services:

  • Authentication & Authorization
    • Requests are protected by the authentication service, which can be backed by a local database, AD/LDAP, or OIDC.
    • An RBAC mechanism authorizes the related actions, e.g. pulling or pushing an image.
    • The token service issues a token for every docker push/pull command according to the user’s role in a project. If a request sent from a Docker client carries no token, the Registry redirects the request to the token service.
  • Config Management: covers the management of all system configurations, such as authentication type settings, email settings, and certificates.
  • Project Management: manages the base data and corresponding metadata of the project, which is created to isolate the managed artifacts.
  • Quota Management: manages the quota settings of projects and validates quotas when new pushes happen.
  • Chart Service: proxies chart-related requests to the backend ChartMuseum and provides several extensions to improve the chart management experience.
  • Tag Retention: manages the tag retention policies, and runs and monitors the tag retention processes.
  • Content Trust: adds extensions to the trust capability provided by the backend Notary to support a smooth content trust process. At present, only container images can be signed.
  • Replication: manages the replication policies and registry adapters, and triggers and monitors the concurrent replication processes. Adapters are implemented for the following registries:
    • Distribution (docker registry)
    • Docker Hub
    • Huawei SWR
    • Amazon ECR
    • Google GCR
    • Azure ACR
    • Ali ACR
    • Helm Hub
  • Scan Management: manages multiple configured scanners from different providers, and provides scan summaries and reports for the specified artifacts.
    • The Trivy scanner provided by Aqua Security, the Anchore Engine scanner provided by Anchore, and the Clair scanner sponsored by CoreOS (Red Hat) will be supported.
    • At present, only container images can be scanned.
  • Webhook: a mechanism that propagates artifact status changes in Harbor to the webhook endpoints configured in Harbor. Interested parties can trigger follow-up actions by listening to the related webhook events.
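
The webhook mechanism in the list above can be pictured as a small publish/subscribe hub. The following is a minimal sketch under stated assumptions: the class name `WebhookHub`, the event type `push_image`, the payload layout, and the injected `send` function are all hypothetical and do not reflect Harbor's real event names or payload schema.

```python
import json
from typing import Callable, Dict, List


class WebhookHub:
    """Hypothetical hub that fans out artifact events to subscribed URLs."""

    def __init__(self, send: Callable[[str, str], None]) -> None:
        # `send(url, payload)` performs the delivery; it is injected so the
        # sketch stays free of network calls.
        self._send = send
        self._endpoints: Dict[str, List[str]] = {}

    def subscribe(self, event_type: str, url: str) -> None:
        self._endpoints.setdefault(event_type, []).append(url)

    def publish(self, event_type: str, artifact: str) -> int:
        """Deliver the event to every subscriber; return the delivery count."""
        payload = json.dumps({"type": event_type, "artifact": artifact},
                             sort_keys=True)
        urls = self._endpoints.get(event_type, [])
        for url in urls:
            self._send(url, payload)
        return len(urls)


if __name__ == "__main__":
    hub = WebhookHub(send=lambda url, body: print("POST", url, body))
    hub.subscribe("push_image", "http://ci.example.internal/hook")
    hub.publish("push_image", "library/hello-world:latest")
```

A real deployment would replace the injected `send` with an HTTP POST and add retries; those concerns are left out here to keep the fan-out logic visible.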

Job Service: a general execution queue service that lets other components and services submit asynchronous tasks, which run concurrently, through simple RESTful APIs.
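
To make the idea of an execution queue concrete, here is a minimal in-process sketch built on Python's `concurrent.futures`. The `JobQueue` class, its method names, and the status strings are assumptions for illustration. The actual job service is a separate component reached over REST, which this sketch does not model.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Optional


class JobQueue:
    """Hypothetical queue: submit a task, get an ID, poll for its status."""

    def __init__(self, workers: int = 4) -> None:
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, func: Callable[..., Any], *args: Any) -> str:
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(func, *args)
        return job_id

    def status(self, job_id: str) -> str:
        fut = self._jobs[job_id]
        if fut.done():
            return "error" if fut.exception() else "success"
        return "running" if fut.running() else "pending"

    def result(self, job_id: str, timeout: Optional[float] = None) -> Any:
        # Blocks until the job finishes or the timeout expires.
        return self._jobs[job_id].result(timeout=timeout)

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


if __name__ == "__main__":
    queue = JobQueue(workers=2)
    job = queue.submit(sum, [1, 2, 3])
    print(queue.result(job, timeout=5), queue.status(job))
    queue.shutdown()
```

Returning an ID at submission time is what lets the caller continue immediately and check on the task later.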

Logs: Log collector, responsible for collecting logs of other modules into a single place.

GC Controller: manages the online GC schedule settings, and starts and tracks the GC progress.

Chart Museum: a 3rd party chart repository server providing chart management and access APIs. For more details, see the ChartMuseum project documentation.

Docker Registry: a 3rd party registry server, responsible for storing Docker images and processing Docker push/pull commands. As Harbor needs to enforce access control to images, the Registry will direct clients to a token service to obtain a valid token for each pull or push request.

Notary: a 3rd party content trust server, responsible for securely publishing and verifying content. For more details, see the Notary project documentation.

Consumers

As a standard cloud-native artifact registry, Harbor naturally supports the related clients, such as the docker CLI, the notary client, and helm. Besides those clients, Harbor also provides a web portal for administrators to easily manage and monitor all the artifacts.

Web Portal: a graphical user interface to help users manage images on the Registry

The following two Docker command examples illustrate the interaction between Harbor’s components.

The process of docker login

Suppose Harbor is deployed on a host with IP 192.168.1.10. A user runs the docker command to send a login request to Harbor:

$ docker login 192.168.1.10

After the user enters the required credentials, the Docker client sends an HTTP GET request to the address “192.168.1.10/v2/”. The different containers of Harbor will process it according to the following steps:

(a) First, this request is received by the proxy container listening on port 80. Nginx in the container forwards the request to the Registry container at the backend.

(b) The Registry container has been configured for token-based authentication, so it returns an error code 401, notifying the Docker client to obtain a valid token from a specified URL. In Harbor, this URL points to the token service of Core Services;

(c) When the Docker client receives this error code, it sends a request to the token service URL, embedding username and password in the request header according to basic authentication of HTTP specification;

(d) After this request reaches the proxy container via port 80, Nginx forwards it to the Core container (named the UI container in earlier Harbor releases) according to pre-configured rules. The token service within the Core container receives the request, decodes it, and obtains the username and password;

(e) After getting the username and password, the token service authenticates the user against the data in the Harbor database (PostgreSQL, as listed in the Data Access Layer). When the token service is configured for LDAP/AD authentication, it authenticates against the external LDAP/AD server instead. After successful authentication, the token service returns an HTTP code that indicates success. The HTTP response body contains a token signed with a private key.
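
Steps (b) and (c) depend on the client reading the token service URL from the 401 response. A minimal sketch of that parsing follows, assuming the challenge arrives in a `WWW-Authenticate: Bearer ...` header with quoted `realm` and `service` parameters. The header value, the `/service/token` path, and the `harbor-registry` service name are illustrative assumptions, and the helper names are hypothetical.

```python
import re
from typing import Dict, Optional
from urllib.parse import urlencode


def parse_bearer_challenge(header: str) -> Dict[str, str]:
    """Split 'Bearer key="value",...' into a dict of its parameters."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def token_url(challenge: Dict[str, str], scope: Optional[str] = None) -> str:
    """Build the URL the client calls next to obtain a token."""
    query = {"service": challenge["service"]}
    if scope:
        query["scope"] = scope
    return challenge["realm"] + "?" + urlencode(query)


if __name__ == "__main__":
    # Assumed example value; a real server supplies its own realm and service.
    header = ('Bearer realm="http://192.168.1.10/service/token",'
              'service="harbor-registry"')
    print(token_url(parse_bearer_challenge(header)))
```

For a plain login no scope is requested; a push or pull adds a scope naming the repository and the requested actions.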

At this point, one docker login process has been completed. The Docker client saves the encoded username/password from step (c) locally in a hidden file.
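
The "encoded username/password" is the HTTP basic authentication credential from step (c). The sketch below shows how such a header value is built and decoded; the function names are hypothetical. Note that Base64 is an encoding, not encryption, so anyone who can read the stored value can recover the password.

```python
import base64
from typing import Tuple


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP 'Authorization: Basic ...' header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header: str) -> Tuple[str, str]:
    """Recover (username, password), as the token service does in step (d)."""
    scheme, _, encoded = header.partition(" ")
    if scheme != "Basic":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Only the first colon separates the fields; passwords may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("admin", "example-password")
    print(header)
    print(decode_basic_auth(header))
```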

The process of docker push

(We have omitted proxy forwarding steps. The figure above illustrates communication between different components during the docker push process)

After the user logs in successfully, a Docker image is sent to Harbor with the docker push command:

$ docker push 192.168.1.10/library/hello-world

(a) First, the Docker client repeats a process similar to login: it sends a request to the registry and gets back the URL of the token service;

(b) Subsequently, when contacting the token service, the Docker client provides additional information to request a token for the push operation on the image (library/hello-world);

(c) After receiving the request forwarded by Nginx, the token service queries the database to look up the user’s role and permissions to push the image. If the user has the proper permission, it encodes the information of the push operation, signs it with a private key, and returns the resulting token to the Docker client;

(d) After the Docker client gets the token, it sends a push request to the registry with a header containing the token. Once the Registry receives the request, it decodes the token with the public key and validates its content. The public key corresponds to the private key of the token service. If the registry finds the token valid for pushing the image, the image transfer begins.
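
Steps (c) and (d) can be sketched as a sign-and-verify pair: the token service checks the user's role and signs the granted access with a private key, and the registry validates the signature using only the public key. The sketch below uses textbook RSA over a SHA-256 digest with a toy key built from two small known primes, purely so that it runs on the standard library. It is not secure, and the token layout, role table, and function names are assumptions for illustration; a real deployment would use a standard token format and a vetted cryptography library.

```python
import base64
import hashlib
import json
from typing import Any, Dict, List

# Toy RSA key (NOT secure): two Mersenne primes keep the example short.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                                # public modulus
E = 65537                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))        # private exponent (Python 3.8+)

# Hypothetical role table: role -> actions the role may perform.
ROLE_ACTIONS = {"developer": {"pull", "push"}, "guest": {"pull"}}


def _digest(data: bytes) -> int:
    """SHA-256 digest reduced into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def issue_token(role: str, repository: str, actions: List[str]) -> str:
    """Token service side: check the role, then sign with the private key."""
    if not set(actions) <= ROLE_ACTIONS.get(role, set()):
        raise PermissionError(f"role {role!r} may not {actions} on {repository}")
    claims = json.dumps({"repository": repository, "actions": sorted(actions)},
                        sort_keys=True).encode("utf-8")
    signature = pow(_digest(claims), D, N)
    return base64.urlsafe_b64encode(claims).decode("ascii") + "." + format(signature, "x")


def verify_token(token: str) -> Dict[str, Any]:
    """Registry side: validate the signature using only the public key."""
    body, _, sig_hex = token.partition(".")
    claims = base64.urlsafe_b64decode(body)
    if pow(int(sig_hex, 16), E, N) != _digest(claims):
        raise ValueError("invalid token signature")
    return json.loads(claims)


if __name__ == "__main__":
    token = issue_token("developer", "library/hello-world", ["push", "pull"])
    print(verify_token(token))
```

Because verification needs only `N` and `E`, the registry can validate tokens without ever holding the token service's private key.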