Harbor 2.6 - Jobservice cannot run replicated with RWO PVCs #1320
Comments
We have the same issue after upgrading to Harbor 2.6.
+1
+1
+1
I can confirm this is a valid bug.
Struggling with the same issue, I went for an approach of choosing HA over scan data export, by manually moving the volume from the PVC type:

```yaml
- name: job-scandata-exports
  persistentVolumeClaim:
    claimName: registry-harbor-jobservice-scandata
```

to emptyDir:

```yaml
- name: job-scandata-exports
  emptyDir: {}
```

Very meh (scan data export does not seem to work at all), but at least jobservice can run in HA mode.
+1
Thanks for the report. I will bring this issue up during the next community meeting; please join: https://github.com/goharbor/community/wiki/Harbor-Community-Meetings.
@chlins Yes, we did. I described our environment in the issue description. The possibility of using the database for the job log was also the reason why I proposed using the database for the exports as well.
@maxdanilov Hi, what problem did you encounter with scan data export when you changed the volume to emptyDir?
@Tonkari Are you using Kubernetes on AWS or Azure, or have you configured any affinity/anti-affinity policies? In my environment, after scaling the jobservice replicas up to 3, all 3 pods can become ready because they are scheduled to the same node.
@chlins Yes, we did specify anti-affinity. Having all pods run on the same node is not an option for us, because we exchange nodes fairly often. Also, it only compensates for pod-level failures; node-level failures would be met with downtime. We are also running our cluster in multiple availability zones, and a volume in a single availability zone does not protect us from zone-level failures.
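For reference, our anti-affinity looks roughly like this (a sketch against the chart's `jobservice.affinity` value; the `component: jobservice` pod label is an assumption and may differ per release):

```yaml
# values.yaml excerpt: force jobservice replicas onto different nodes.
jobservice:
  replicas: 3
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              component: jobservice   # assumed pod label; check your release
          topologyKey: kubernetes.io/hostname
```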
@chlins we're running Harbor on GCP, with multiple instances of the jobservice component.
Yes, HA is the problem that this issue should fix. But another question: you mentioned that when you change the volume type to emptyDir, the scan data export function does not work at all. Could you share details of this problem, such as the jobservice logs?
It seems not to complain when exporting project CVEs, but the exported files contain no CVEs.
Same with us. We are running 2.4.x, but upgrading to 2.6 fails to mount or unmount because the PVC has ReadWriteOnce. @Vad1mo Does using S3 for these HA pods solve the upgrade failures?
From the log, the function works; "no CVE found" may be caused by another problem, such as a filter or permissions.
@thangamani-arun The temporary workaround is to change the scandata volume to emptyDir; we'll fix this issue completely in the coming patch.
I don't think that's the case: the export was run by a user with access to everything, and no filters were applied (vulnerabilities were present in the exported repos).
Summary
Jobservice can only run as a single pod when the underlying Kubernetes cluster does not support the ReadWriteMany access mode.
Environment
We are using this chart to deploy Harbor on a Kubernetes cluster that only supports ReadWriteOnce for persistent volumes. Currently we are running 3 replicas of the jobservice component. We are using the database JobLogger and S3 as the storage backend for the registry.
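For reference, the relevant chart values look roughly like this (a sketch; the bucket and region are placeholders):

```yaml
# values.yaml excerpt for the environment described above.
jobservice:
  replicas: 3
  jobLoggers:
    - database        # job logs go to the database, not a shared volume
persistence:
  imageChartStorage:
    type: s3          # registry artifacts are stored in S3
    s3:
      bucket: example-harbor-registry   # placeholder
      region: eu-west-1                 # placeholder
```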
Problem
When installing or upgrading to chart version 1.10.0 (Harbor 2.6), only one jobservice pod becomes available, while the others stay in the "ContainerCreating" phase. This happens because they all try to mount the same new scan data export volume. As a workaround we can only reduce the number of replicas to 1, which we have been trying to avoid.
The chart does not expose any configuration for this volume: if persistence is enabled, a single scan data export volume is always created and used by the deployment.
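To illustrate, the claim rendered on a ReadWriteOnce-only cluster looks roughly like this (a sketch; the exact name and size depend on chart values):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: registry-harbor-jobservice-scandata
spec:
  accessModes:
    - ReadWriteOnce   # a second pod on another node cannot attach this volume
  resources:
    requests:
      storage: 1Gi    # illustrative size
```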
Possible solutions
Separate persistence option for scan data export
Without knowing whether it's necessary to persist the scan data exports, a possible solution would be to add a separate option to disable persistence for this feature, as sketched below. Instead, exports could be stored in the pod's filesystem - long enough to download them - and disappear when the pod gets terminated. Since I have not really dug into the Harbor code, I'm not sure if this is realistic.
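Hypothetically, such an option could look like this (the `scanDataExportsPersistence` key is made up for illustration and does not exist in the chart today; when disabled, the deployment would render the `emptyDir` volume shown in the comment above):

```yaml
# Hypothetical values.yaml option, named for illustration only.
jobservice:
  scanDataExportsPersistence: false   # fall back to an emptyDir volume
```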
Store scan data exports in the database
The exports could be stored in the database. As above, I do not know if this is realistic, especially regarding the size of the generated CSV files.
External storage support for scan data export
Since the registry already uses an S3 bucket for storage, the same could be done for the jobservice component. Whether to use the same bucket with a different path or a separate bucket can be discussed.
Use a StatefulSet for jobservice
The jobservice Deployment could be converted to a StatefulSet, and a volumeClaimTemplate could be added for the scan data exports. For the file JobLogger, the persistentVolumeClaim could remain the same.
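Roughly, that would look like this (a sketch; the image tag, mount path, labels, and sizes are illustrative, not the chart's actual output):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: registry-harbor-jobservice
spec:
  replicas: 3
  serviceName: registry-harbor-jobservice
  selector:
    matchLabels:
      component: jobservice
  template:
    metadata:
      labels:
        component: jobservice
    spec:
      containers:
        - name: jobservice
          image: goharbor/harbor-jobservice   # tag omitted
          volumeMounts:
            - name: job-scandata-exports
              mountPath: /var/scandata_exports   # assumed path; may differ per release
  volumeClaimTemplates:
    - metadata:
        name: job-scandata-exports
      spec:
        accessModes: [ReadWriteOnce]   # one volume per pod, so RWO is fine
        resources:
          requests:
            storage: 1Gi               # illustrative size
```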
Update docs
If none of the above solutions are desirable, this incompatibility should be mentioned in the docs.