-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial working draft of pluggable scanning proposal and working group #82
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,240 @@ | ||
# Proposal: `Pluggable Container Scanning` | ||
|
||
Author: `Zach Hill (zhill)` | ||
|
||
Discussion: [Issue 6234](https://github.com/goharbor/harbor/issues/6234) | ||
|
||
## Abstract | ||
|
||
Add a generic image scanner abstraction layer between Harbor and external image scanners. Scanner integrations are achieved | ||
by implementing an adapter which implements the scanner adapter interface and which is loaded, configured, and invoked by the | ||
a generic scanner job. | ||
|
||
The adapter interface will support the following high-level capabilities: | ||
|
||
1. Scanning images achieved by the scanner fetching image data and returning a vulnerability listing mapping vulnerabilities to artifacts in the image | ||
2. Providing a image check pass/fail recommendation based on the specific capabilities/polices of the scanner (new) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As long as there's a vulnerability list provided by scanner, the Is this a case we think valuable when different scanners report different There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In my experience, a vulnerability listing is the very bare minimum of useful information about an image's content, and often not sufficient for production security practices. Clair provides a very simple scan capability, but others like Anchore and Aqua provide much deeper analysis and capabilities such as secret detection, provenance, configuration checks, and metadata checks on the image content which all factor into acceptance criteria. As I said in comments previously, a scanner may or may not implement this and Harbor/its users are free to ignore/disable such capabilities as desired, but I think it is important to have support for more sophisticated scanners than just vulnerability mappings and returning lists. I will certainly update the doc with more clarity on this position and the central idea behind including this function to facilitate a decision on inclusion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with @zhill . Our company purchased anchore enterprise because we wanted to detect the secret in any docker images as well; meanwhile, it also gives the cyber security team the ability to define some custom policy. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @cafeliker @zhill I see this is a valuable use case. That means we need to make Harbor's security policy flexible to support different attributes for different scanners? |
||
3. Cleanup of data in the scanner when an image is deleted from Harbor (for image lifecycle completeness) (new) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed that the logic can be complex, and I think that can be handled in the plugin configuration, since that has the plugin-specific context, and/or the overall configuration that can be deployment specific. The intent is simply to provide a facility to allow the scanner to be told that an image is no longer in Harbor and thus may be GC'd if the scanner does have some state for the image and if the user wants it to be kept in sync with the registry. Such behavior is optional and a specific plugin may choose to implement the method as a no-op. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @zhill How about Harbor send a webhook to scanner? I hope the scanner adapter or plugin should focus on enabling Harbor to scan artifacts, cleanup the data on the scanner may be out of the scope? |
||
|
||
## Background | ||
|
||
Harbor contains a tightly-coupled interface to Clair for security scans of image content uploaded to the registry. | ||
Some users have other container scanning solutions already, or would like to leverage a solution with other desirable | ||
features without duplicating image scans. It will add convenience and value to Harbor deployments to allow such scanners | ||
to provide results for Harbor in place of the default integration. | ||
|
||
The current implementation for scanner support is included in the following components: | ||
|
||
1. Job service's ClairJob job type for executing scans | ||
2. DAO - Internal data model for storing and retrieving scan results | ||
3. API for both viewing and triggering scans | ||
5. API for receiving notifications from the scanner to trigger new scans | ||
4. Initialization and configuration (including ClairDB init in the DAO layer) | ||
5. policyChecker implementation consumes the vulnerability overview produced by the scanner to make access decisions | ||
|
||
Scans are initiated by the job service due to: | ||
* New image pushed | ||
* Scheduled scan interval | ||
* User API call | ||
* Notification (POST) from Clair to _/service/notifications/clair_ indicating an update available | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will be removed. |
||
|
||
The scan job (ClairJob) interacts with the scanner (Clair) and process the result for hand-off to the DAO layer to be persisted. | ||
The persisted result (ImgScanOverview) is made available via the Harbor API as well as consumed by the policyChecker interceptor if configured. | ||
|
||
The policyChecker interceptor consumes the scan result for comparison with the configured vulnerability-severity-based policy | ||
and determines if specific HTTP requests are to be allowed. | ||
|
||
The Harbor UI also consumes the scan results and visualizes them to the user as well as providing UI components to configure and execute scans. | ||
|
||
## Proposal | ||
|
||
Introduce a layer of abstraction between the scan invocation logic and the scanner-specific logic: an adapter interface for any scanner to be integrated with Harbor. | ||
The existing Clair implementation (ClairJob, etc) will be moved into a Clair adapter invoked by the common scan. | ||
|
||
A ScanJob layer will added that will load, configure, and invoke the appropriate adapter defined in the Harbor configuration. | ||
|
||
The new ImageScannerAdapter interface provides: | ||
|
||
1. Execute a scan on the image data, retrievable by the digest of the image manifest document (aka the "image digest") (same as current impl) | ||
1. Notify the scanner that the image is removed from Harbor. Depending on scanner and configuration this may be a no-op. (new) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My assumption is that the adapter should be minimal and focus on the general workflow for all scanners. So this is probably not part of it. |
||
1. Execute a rescan of an image previously scanned (Note: For some scanners this will be the same as #1, but for others it may differ in what/if data is retrieved.) (new, but largely identical to current impl) | ||
1. Execute a "check" operation for an image against the scanner, if implemented, to provide an evaluation and pass/fail (new) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could the result of "check" part of the scan result? So that the adapter will only support two actions "TriggerScan" and "GetResult". |
||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The scannerAdapter may also need to provide capability to describe itself, to help Harbor consume the heterogeneous scan result? Or is can be encapsulate in |
||
### Accessing Image Data | ||
|
||
The adapter interface passes a reference to the image data to the adapters: the image digest (digest of the image manifest object) and the registry url | ||
necessary to retrieve the data via registry API mechanisms. An example reference would be: internalharborregistry:5000/project/image@sha256:abc123. | ||
From such an reference the scanners may use HTTP GET to download the image manifest that describes the additional data layers needed for analysis. Optimization of | ||
the data retrieval and analysis is left to the individual adapters and scanner implementations as an implementation detail. | ||
|
||
The current implementation of the scanner integration assumes that the scanner has access to the internal registry API without first passing thru the proxy component | ||
that implements access control for external Harbor users. This allows the scanners to fetch image data even when external users cannot retrieve image data due to | ||
access restriction caused by the scan status and/or vulnerability status of the image as configured by the API policyChecker component. As such, either a given | ||
scanner to be integrated with Harbor must have access to internal network interfaces not exposed to external users, or else must implement a proxy that can run | ||
within that network that is exposed externally (presumably with authc/authn) to ensure access to the data regardless of the access status for Harbor users. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We can implement that via "service account"? |
||
|
||
As an example, the current implementation runs Clair in the same docker-compose network so it can access ports that are not exposed to the host itself. Such a model | ||
will work for some scanners, but will require additional implementation work for adapters for scanners that are external to the registry deployment (e.g. Saas or externally hosted to the registry). | ||
|
||
#### Adapter Interface Operations | ||
|
||
* ScanImage | ||
* Description: The base scan operation to process an image's content and return a list of found vulnerabilities, some scan metadata, and a check result (scanner specific) | ||
* Params: image digest, registry URL, pull credential, tags applied to the digest | ||
* Output: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does Harbor use Clair's representation of vulnerability info at the moment? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Essentially, yes. It saves the listing effectively as-is from Clair, but also creates a summary with counts for each severity category that is used by things like visualization in the UI and the API proxy's policyCheck. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It does not have to return the scan result, the scan may be async, so how about returning an uri containing all the information to fetch the scan result? |
||
scan_status: boolean | ||
vulnerabilities: array of vulnerable artifacts in the image | ||
scan_metadata: json object, scanner specific | ||
image_check: boolean (true if image is 'ok' according to scanner implementation and/or policy) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does Harbor currently have this kind of policy decision built in? I am hesitant about it partly because of separation of concerns ("do one thing well") but also because image check policy might be context-dependent. For example,
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This field simply to provides a facility for a scanner to indicate a recommendation. What Harbor does with it is open for discussion and mostly for future work. The check result is simply an evaluation of the image content since that is the only context present for a scanner in the base case. I can definitely see a case where the actual admission decision for an image download by a user is based on identity attributes, properties of the project (the namespace for repositories in Harbor), and other user-configurable metadata. A more robust policy system is an interesting and certainly useful addition, but I think falls outside the scope of this proposal. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A snapshot of Harbor vulnerability checking which designed as one of the project security settings: For different environments, you can set different severity levels restriction. About CVE whitelist, it has been planned to deliver in the next release V1.9 (probably released at end of Aug. ). |
||
|
||
* NotifyImageDeleted | ||
* Description: Explicit call to notify scanner that a specific image manifest has been deleted from Harbor and the scanner may flush any state if necessary. | ||
* Params: image digest, tags, registry URL | ||
* Output: status code (ok or error) | ||
|
||
* RescanImage (since some scanners may need to treat this differently) | ||
* Description: functionally the same as the Scan Image operation, but may occur at different lifecycle events for an image and provided here to support scanners which need to differentiate between a new image scan and a re-scan. | ||
* Params: image digest, registry URL, pull credential, tags applied to the digest | ||
* Output: | ||
scan_status: boolean | ||
vulnerabilities: array of vulnerable artifacts in the image | ||
scan_metadata: json object, scanner specific | ||
image_check: boolean (true if image is 'ok' according to scanner implementation and/or policy) | ||
|
||
* HandleScannerNotification | ||
* Description: Handler for webhooks invoked by the scanner itself. The adapter, with knowledge of the payload, can decide if a new scan task is needed and indicate it in the return value | ||
* Params: notification content | ||
* Output: array of digests to re-scan | ||
|
||
* CheckImage | ||
* Description: A scanner-specific synchronous check of an image. Provided for optional use in interceptors or other cases where Harbor may want an up-to-date check result from the scanner to make an admission decision. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this admission from the perspective of "let this image be stored in the registry" or "let this image be deployed to the cluster"? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The current Harbor implementation is to block API requests in the API proxy component, and those requests are for a download of an image manifest (initiating an image pull) rather than the push of the image. The checks are configurable on a per-project basis in Harbor and also include things like Notary signature checks. See: https://github.com/goharbor/harbor/blob/master/src/core/proxy/interceptors.go |
||
* Params: image digest, image reference (e.g. pullstring if a tag-based manifest fetch) | ||
* Output: check result (boolean) | ||
|
||
#### Configuration | ||
|
||
* adapter selection - configured in the Harbor configuration itself | ||
* adapter configuration, as a json document, specific to the adapter impl. Config key is the adapter id/name. | ||
* Scanner credentials - included in the adapter-specific configuration if necessary. Should be kept in a secret store in encrypted form. | ||
* Details tbd with feedback from harbor maintainers on overall configuration handling | ||
|
||
#### Error Handling | ||
|
||
Adapter implementations of these operations should return errors in standard go mechanisms (not outlined here since the are common to all function calls). However, the objects returned | ||
as errors, should be well defined by the adapter and implement a common interface: | ||
|
||
AdapterError an error normalization abstraction between the adapter framework and adapter implementations: | ||
* Message - string - description of the error, intended to be short for single line presentation | ||
* Detail - string - Additional detail for human consumption that may be longer than a single line and may include remediation information. | ||
* ErrorCode - int - code to uniquely identify the class of error | ||
* CanRetry - bool - an indicator to the caller that the error may be resolved with a retry | ||
a | ||
#### DAO Updates needed: | ||
|
||
Overall, the objective is to require as little DAO change as possible to minimize upgrade and db maintenance impact. | ||
|
||
* Extend the existing result store to include json fields for: | ||
* Augmentation of existing vulnerability listing to be more generic (primarily column/table naming, not schema). | ||
* Scan result metadata from scanner (opaque json object) | ||
* Image check result - a boolean and a json object returned by the scanner | ||
|
||
## Non-Goals | ||
|
||
* Changes to the UI presentation of image vulnerability data (such changes may be a goal of a later work or independent work parallel to this) | ||
* More than one scanner configured for use concurrently in Harbor | ||
|
||
## Rationale | ||
|
||
The proposed approach is intended to provide a generic-enough interface that scanners with different data management and lifecycle designs can still be integrated into the | ||
Harbor and implement only the aspects necessary for that system in the adapter. The basic vulnerability listing format is kept normalized so that clients of the Harbor API have | ||
a well-defined format for consuming scan results, but support for additional scanner-specific metadata will allow scanner-aware clients to gain additional insight into the | ||
scan results/details as available for a specific scanner. | ||
|
||
|
||
This proposal includes 2 new capabilities not found in the current Clair-specific implementation: | ||
1. Full data lifecycle management | ||
1. The current implementation assumes that the admin is manually managing the data lifecycle for scan state in the scanners. That is certainly always an option but | ||
quickly imposes some operational issues with resource provisioning of the scanner (db storage sizes, etc). | ||
1. By including a notification mechanism that an image is logically deleted, the scanner may, based on its configuration and how the user wants it deployed, be able | ||
to clean up its own scan state and maintain some parity with the image state of the Harbor deployment. This is purely a notification mechanism, not an actuation, and the scanner adapters | ||
are free to implement as makes sense, including a no-op. | ||
|
||
1. Scanner-implemented image evaluation with a policy recommendation | ||
1. The addition of an image check function allows scanners with more functionality than just returning simple vulnerability lists to make recommendations | ||
to Harbor on suitability of an image. For end-users, the raw vulnerability list is rarely sufficient for determining if an image is acceptable or not, but | ||
the mechanisms and inputs to the check decision will be scanner specific and must be handled by the adapter implementations themselves. Additional future work in the | ||
policyChecker or new interceptors can leverage the check recommendations from the scanners or utilize the new CheckImage operation to have control over the staleness of | ||
the check result. This check is not intended to replace other policy evaluations but to augment them. | ||
1. This functionality could be abstracted into a separate Policy adapter layer in the future if Harbor intends to build a generic policy layer. However, this proposal attempts to avoid significant changes outside of the image scan flow. | ||
|
||
## Compatibility | ||
|
||
TBD | ||
|
||
## Implementation | ||
|
||
Overall approach is to add an abstraction layer, then extend it | ||
|
||
### Phase 1: Abstract existing Clair implementation into the adapter model | ||
|
||
Add abstraction layer between existing ScanJob logic and the existing ClairJob interface. | ||
|
||
Example interfaces: | ||
|
||
``` | ||
|
||
// The interface that each scanner implementation would implement | ||
// The ScannerAdapter objects should have default constructors to facilitate lazy initialization | ||
|
||
type ScannerAdapter interface { | ||
func configure(conf map[string]interface{}) err | ||
func scanImage(imgReference ImageReference, credentials AccessCredentials) *VulnerabilityReport, *ImageCheckReport, err | ||
func rescanImage(imgReference ImageReference, credentials AccessCredentials) *VulnerabilityReport, *ImageCheckReport, err | ||
func notifyImageDeletion(imgReference ImageReference) err | ||
func checkImage(imgReference ImageReference) PolicyResult, err | ||
} | ||
|
||
type ScanResult struct { | ||
vulnResult VulnerabilityReport | ||
checkResult ImageCheckReport | ||
} | ||
|
||
type ImageReference struct { | ||
registryUrl string | ||
digest string | ||
tag string | ||
} | ||
|
||
// Access credentials for the image content | ||
type AccessCredentials struct { | ||
token string | ||
tokenUrl string | ||
} | ||
|
||
// For now, use the existing vuln format for ease of conversion into the db model | ||
type VulnerabilityReport ClairVulnerabilityEnvelope | ||
|
||
// Optionally implemented, | ||
type ImageCheckReport struct { | ||
imageDigest string | ||
imagePasses boolean //Pass/Fail flag | ||
checkTimestamp time.Time | ||
details map[string]interface{} // Opaque json object for scanner-specific results if implemented by the scanner | ||
} | ||
|
||
``` | ||
|
||
### Phase 2: Add one or more non-Clair adapters | ||
Likely the Anchore adapter and Aqua Microscanner adapters. | ||
|
||
|
||
### Phase 3: Add new image data lifecycle capabilities | ||
|
||
Specifically, the delete lifecycle call, invoked in the proxy handler on a successful response from the backing registry implementation. | ||
|
||
|
||
|
||
## Open issues (if applicable) | ||
|
||
General approach to adapters in Harbor. Expectation is that they will all be in-tree and compiled into the build, but selected at runtime based on configuration, but this is open for discussion as it will | ||
impact the libraries and licenses of software that the adapters themselves as well as software delivery implications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does Harbor already do this kind of policy checking? I would think this is better outsourced to a policy component (e.g. OPA), for separation of concerns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current harbor implementation includes an optional "policyCheck" api interceptor that as I understand it will block image pulls if the image fails the policy check. The current implementation of that check uses the last vuln scan result and uses a simple severity comparison.
My proposal (at least thus far) has intentionally minimal on the impact to the larger Harbor architecture and attempts to bring support for other scanners in with minimal changes to other aspects of the system. I agree there are broader changes possible to support policy-based access gating to images, and the proposal's addition of an explicit image check from the scanner would allow the scanner to provide an input to any such policy evaluation system build into Harbor itself, either by plugin of OPA or another system.
I think what you're suggesting makes sense but could be implemented independently of this proposal and this proposal's changes would fit into that work as an input to the evaluation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lizrice
@zhill is right. Currently, harbor provides an interceptor to do the policy evaluation and checking work. They can be driven by some other tools like OPA in the feature (actually we have started to consider if we should introduce a unique and powerful policy engine to handle the policy related things). But that can be performed in another proposal. In this proposal, as @zhill mentioned, we should define a flexible interface to support the checking process, either by an interceptor or by the OPA, etc..