Architecture Overview of Harbor
version: 1.10 (ongoing)
As depicted in the above diagram, Harbor comprises the following components, organized in three layers:
k-v storage: formed by Redis, provides data cache functions and supports temporarily persisting job metadata for the job service.
Data storage: multiple storage backends are supported for data persistence, serving as the backend storage of the registry and ChartMuseum. For details, refer to the storage driver list on the Docker website and the ChartMuseum GitHub repository.
Database: stores the related metadata of Harbor models, like projects, users, roles, replication policies, tag retention policies, scanners, charts, and images. PostgreSQL is adopted.
Proxy: a reverse proxy formed by the Nginx server to provide API routing. Harbor components such as core, registry, web portal, and token service all sit behind this reverse proxy. The proxy forwards requests from browsers and Docker clients to the various backend services.
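The routing role of the proxy can be pictured as a prefix lookup. The following is a minimal sketch, not Harbor's actual Nginx configuration: the path prefixes and backend names in `ROUTES` are hypothetical and chosen only for illustration.

```python
# Hypothetical routing table: (path prefix, backend name).
# These prefixes are assumptions for illustration, not Harbor's real rules.
ROUTES = [
    ("/v2/", "registry"),
    ("/service/", "core"),
    ("/api/", "core"),
    ("/", "portal"),
]


def route(path: str) -> str:
    """Return the backend whose prefix is the longest match for the path."""
    matches = [(prefix, backend) for prefix, backend in ROUTES
               if path.startswith(prefix)]
    if not matches:
        raise ValueError(f"no backend configured for path {path!r}")
    # The longest matching prefix wins, so "/v2/" beats the catch-all "/".
    _, backend = max(matches, key=lambda match: len(match[0]))
    return backend
```

With this table, a Docker client request to `/v2/` would reach the registry, while a browser request to any other path would fall through to the portal.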
Core: Harbor’s core functions, which mainly provide the following services:
- Authentication & Authorization
  - Requests are protected by the authentication service, which can be powered by a local database, AD/LDAP, or OIDC.
  - An RBAC mechanism authorizes the related actions, e.g., pulling or pushing an image.
- Config Management: covers the management of all system configurations, such as authentication type settings, email settings, and certificates.
- Project Management: manages the base data and corresponding metadata of the project, which is created to isolate the managed artifacts.
- Quota Management: manages the quota settings of projects and performs quota validation when new pushes happen.
- Chart Service: proxies chart-related requests to the backend ChartMuseum and provides several extensions to improve the chart management experience.
- Tag Retention: manages the tag retention policies, and performs and monitors the tag retention processes.
- Content Trust: adds extensions to the trust capability provided by the backend Notary to support a smooth content trust process. At present, only container images can be signed.
- Replication: manages the replication policies and registry adapters, triggers and monitors the concurrent replication processes.
- Scan Management: manages the multiple configured scanners adapted by different providers, and provides scan summaries and reports for the specified artifacts. At present, only container images can be scanned.
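To make the RBAC idea in the list above concrete, here is a minimal sketch of a role-to-permission check. The role names and the permissions assigned to them are assumptions made for illustration; they are not taken from Harbor's source code.

```python
from enum import Enum


class Action(Enum):
    PULL = "pull"
    PUSH = "push"


# Hypothetical mapping from project role to permitted actions.
ROLE_PERMISSIONS = {
    "guest": {Action.PULL},
    "developer": {Action.PULL, Action.PUSH},
    "project_admin": {Action.PULL, Action.PUSH},
}


def is_allowed(role: str, action: Action) -> bool:
    """Return True if the given project role may perform the action."""
    # Unknown roles get an empty permission set, i.e. deny by default.
    return action in ROLE_PERMISSIONS.get(role, set())
```

Denying by default for unknown roles keeps the check safe if a role is misspelled or removed.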
Registry: Responsible for storing Docker images and processing Docker push/pull commands. As Harbor needs to enforce access control to images, the Registry will direct clients to a token service to obtain a valid token for each pull or push request.
UI: a graphical user interface to help users manage images on the Registry.
Webhook: a mechanism configured in the Registry so that image status changes in the Registry are propagated to the webhook endpoint of Harbor. Harbor uses webhooks to update logs, initiate replications, and perform some other functions.
Token service: Responsible for issuing a token for every docker push/pull command according to a user’s role in a project. If a request sent from a Docker client carries no token, the Registry redirects the request to the token service.
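The webhook mechanism can be sketched as a small event handler. This is an illustrative sketch only: it assumes the Registry posts a JSON envelope with an `events` list whose entries carry an `action` and a `target.repository`, and the follow-up steps it returns are hypothetical stand-ins for Harbor's logging and replication logic.

```python
import json
from typing import List


def handle_registry_event(body: str) -> List[str]:
    """Turn a registry notification into a list of follow-up steps."""
    envelope = json.loads(body)
    steps = []
    for event in envelope.get("events", []):
        repository = event.get("target", {}).get("repository", "")
        action = event.get("action")
        if action == "push":
            # A push is logged and may start a replication (assumed behavior).
            steps.append(f"log push {repository}")
            steps.append(f"trigger replication {repository}")
        elif action == "pull":
            steps.append(f"log pull {repository}")
    return steps
```

Returning the steps as data, rather than performing them directly, keeps the sketch free of side effects.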
Job services: used for image replication; local images can be replicated (synchronized) to other Harbor instances.
Log collector: Responsible for collecting logs of other modules in a single place.
Each component of Harbor is wrapped as a Docker container. Naturally, Harbor is deployed by Docker Compose.
In the source code (https://github.com/vmware/harbor), the Docker Compose template used to deploy Harbor is located at /Deployer/docker-compose.yml. Opening this template file reveals the 6 container components that make up Harbor:
proxy: Reverse-proxy formed by the Nginx Server.
registry: Container instance created from the official image of Docker distribution.
ui: Core services within the architecture. This container is the main part of Project Harbor.
mysql: Database container created from the official MySQL image.
job services: Replicating images to a remote registry via state machines. Image deletion can also be synchronized to a remote Harbor instance.
log: Container that runs rsyslogd, used for collecting logs from other containers through the log-driver mode.
These containers are linked via DNS service discovery in Docker, so each container can be reached by its name. For the end user, only the service port of the proxy (Nginx) needs to be exposed.
The following two Docker command examples illustrate the interaction between Harbor’s components.
Suppose Harbor is deployed on a host with IP 192.168.1.10. A user runs the docker command to send a login request to Harbor:
$ docker login 192.168.1.10
After the user enters the required credentials, the Docker client sends an HTTP GET request to the address “192.168.1.10/v2/”. The different containers of Harbor will process it according to the following steps:
(a) First, this request is received by the proxy container listening on port 80. Nginx in the container forwards the request to the Registry container at the backend.
(b) The Registry container has been configured for token-based authentication, so it returns an error code 401, notifying the Docker client to obtain a valid token from a specified URL. In Harbor, this URL points to the token service of Core Services;
(c) When the Docker client receives this error code, it sends a request to the token service URL, embedding username and password in the request header according to basic authentication of HTTP specification;
(d) After this request is sent to the proxy container via port 80, Nginx again forwards the request to the UI container according to pre-configured rules. The token service within the UI container receives the request, decodes it, and obtains the username and password;
(e) After getting the username and password, the token service authenticates the user against the data in the MySQL database. When the token service is configured for LDAP/AD authentication, it authenticates against the external LDAP/AD server instead. After a successful authentication, the token service returns an HTTP code that indicates success. The HTTP response body contains a token signed with a private key.
At this point, one docker login process has been completed. The Docker client saves the encoded username/password from step (c) locally in a hidden file.
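Steps (b) and (c) of the login flow can be sketched from the client's point of view. This is a minimal sketch under assumptions: the 401 response is taken to carry a `WWW-Authenticate: Bearer ...` challenge naming the token service, and the realm URL and service name used in the usage example are hypothetical.

```python
import base64
import re


def parse_bearer_challenge(header: str) -> dict:
    """Extract the key="value" parameters from a Bearer challenge header."""
    scheme, _, params = header.partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"unsupported auth scheme: {scheme!r}")
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


def basic_auth_header(username: str, password: str) -> str:
    """Build the HTTP Basic Authorization header value for step (c)."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

For example, given the hypothetical challenge `Bearer realm="http://192.168.1.10/service/token",service="token-service"`, `parse_bearer_challenge` yields the URL the client contacts next, and `basic_auth_header` produces the header it sends along with that request.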
(We have omitted proxy forwarding steps. The figure above illustrates communication between different components during the docker push process)
After the user logs in successfully, a Docker image is sent to Harbor via a docker push command:
# docker push 192.168.1.10/library/hello-world
(a) First, the Docker client repeats a process similar to login: it sends a request to the registry and gets back the URL of the token service;
(b) Subsequently, when contacting the token service, the Docker client provides additional information to apply for a token of the push operation on the image (library/hello-world);
(c) After receiving the request forwarded by Nginx, the token service queries the database to look up the user’s role and permissions to push the image. If the user has the proper permission, it encodes the information of the push operation, signs it with a private key, and returns the resulting token to the Docker client;
(d) After the Docker client gets the token, it sends a push request to the registry with a header containing the token. Once the Registry receives the request, it decodes the token with the public key and validates its content. The public key corresponds to the private key of the token service. If the registry finds the token valid for pushing the image, the image transfer begins.
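Steps (c) and (d) rely on an asymmetric signature: the token service signs with a private key and the registry verifies with the matching public key. The sketch below illustrates only that idea, using a deliberately tiny textbook RSA key pair built from the standard library. It is not secure and is not Harbor's token format; the claim layout and the function names are assumptions for illustration.

```python
import hashlib
import json

# Toy RSA key pair from two small known primes. INSECURE, illustration only.
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent (registry side)
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent (token service side)


def _digest(payload: bytes) -> int:
    """Hash the payload and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def issue_token(user: str, repository: str, actions: list) -> tuple:
    """Token service: encode the granted access and sign it with D."""
    claims = {"sub": user,
              "access": [{"type": "repository", "name": repository,
                          "actions": sorted(actions)}]}
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return payload, pow(_digest(payload), D, N)


def verify_token(payload: bytes, signature: int,
                 repository: str, action: str) -> bool:
    """Registry: check the signature with E, then check the granted access."""
    if pow(signature, E, N) != _digest(payload):
        return False
    claims = json.loads(payload)
    return any(entry["type"] == "repository"
               and entry["name"] == repository
               and action in entry["actions"]
               for entry in claims["access"])
```

A token issued for `library/hello-world` with the `push` action passes `verify_token` for that repository, while a modified payload or a different repository is rejected. A real deployment would use a vetted cryptography library and full-size keys.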