This repository was archived by the owner on Mar 27, 2024. It is now read-only.

Update README #145

Merged
merged 1 commit into from
Nov 23, 2017
127 changes: 68 additions & 59 deletions README.md
@@ -39,7 +39,7 @@ Download the [container-diff-windows-amd64.exe](https://storage.googleapis.com/c

To use `container-diff analyze` to perform analysis on a single image, you need one Docker image (in the form of an ID, tarball, or URL from a repo). Once you have that image, you can run any of the following analyzers:

```
```shell
container-diff analyze <img> [Run default analyzers]
container-diff analyze <img> --type=history [History]
container-diff analyze <img> --type=file [File System]
@@ -53,7 +53,7 @@ container-diff analyze <img> --type=apt --type=node [Apt and Node]
By default, with no `--type` flag specified, container-diff will run Apt package analysis.

To use container-diff to perform a diff analysis on two images, you need two Docker images (in the form of an ID, tarball, or URL from a repo). Once you have those images, you can run any of the following differs:
```
```shell
container-diff diff <img1> <img2> [Run default differs]
container-diff diff <img1> <img2> --type=history [History]
container-diff diff <img1> <img2> --type=file [File System]
@@ -64,21 +64,23 @@ container-diff diff <img1> <img2> --type=node [Node]

You can similarly run many analyzers at once:

```
```shell
container-diff diff <img1> <img2> --type=history --type=apt --type=node
```

To view the diff of an individual file in two different images, you can use the filename flag in conjunction with the file system diff analyzer.

```
```shell
container-diff diff <img1> <img2> --type=file --filename=/path/to/file
```

## Image Sources

container-diff supports Docker images located in both a local Docker daemon and a remote registry. To explicitly specify a local image, use the `daemon://` prefix on the image name; similarly, for an explicitly remote image, use the `remote://` prefix.

```container-diff diff daemon://modified_debian:latest remote://gcr.io/google-appengine/debian8:latest```
```shell
container-diff diff daemon://modified_debian:latest remote://gcr.io/google-appengine/debian8:latest
```

Additionally, tarballs can be provided to the tool directly. Make sure your file has a valid tar extension (.tar, .tar.gz, .tgz).
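
The naming rules above can be sketched as a small classifier. This is only an illustration of the conventions described in this section, not container-diff's own logic: the function name `classify_image_source` and the `"unspecified"` label are hypothetical, and how the tool resolves a name with neither prefix is not covered here.

```python
# Hypothetical helper for illustration; not part of container-diff.
TAR_EXTENSIONS = (".tar", ".tar.gz", ".tgz")


def classify_image_source(name):
    """Return 'daemon', 'remote', 'tar', or 'unspecified' for an image argument."""
    if name.startswith("daemon://"):
        return "daemon"  # explicitly local image
    if name.startswith("remote://"):
        return "remote"  # explicitly remote image
    if name.endswith(TAR_EXTENSIONS):
        return "tar"  # tarball passed directly to the tool
    # No prefix and no tar extension: this sketch makes no guess.
    return "unspecified"


for arg in ("daemon://modified_debian:latest", "file1.tar"):
    print(arg, "->", classify_image_source(arg))
```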

@@ -96,23 +98,25 @@ For the Google Container Registry, make sure you have the `docker-credential-gcr

To get a JSON version of the container-diff output add a `-j` or `--json` flag.

```container-diff <img1> <img2> -j```

```container-diff <img1> <img2> -e```
```shell
container-diff diff --type=file --json gcr.io/gcp-runtimes/multi-base gcr.io/gcp-runtimes/multi-modified
```

To order files and packages by size (in descending order) when performing file system or package analyses/diffs, add a `-o` or `--order` flag.

```container-diff <img1> <img2> -o```
```shell
container-diff analyze remote://gcr.io/gcp-runtimes/multi-modified --type=pip --order
```


## Analysis Result Format

JSON output for analysis results is in the following format:
```
```json
{
"Image": "foo",
"AnalyzeType": "Apt",
"Analysis": {},
"Analysis": {}
}
```
The possible contents of the `Analysis` field are detailed below.
@@ -127,11 +131,11 @@ The file system analyzer outputs a list of file system contents, including names

### Package Analysis

Package analyzers such as pip, apt, and node inspect the packages installed within the image provided. All package analyses leverage the PackageOutput struct, which contains the version and size for a given package instance (and a potential installation path for a specific instance of a package where multiple versions are allowed to be installed), as detailed below:
```
Package analyzers such as pip, apt, and node inspect the packages installed within the image provided. All package analyses leverage the `PackageOutput` struct, which contains the version and size for a given package instance (and a potential installation path for a specific instance of a package where multiple versions are allowed to be installed), as detailed below:
```go
type PackageOutput struct {
Name string
Path string
Name string
Path string
Version string
Size int64
}
@@ -143,7 +147,6 @@ Single version package analyzers (apt) have the following output structure: `[]P

Here, the `Path` field is omitted because there is only one instance of each package.
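
As a rough sketch of consuming such a result, the snippet below parses a single-version analysis and lists packages from largest to smallest. The sample document is invented for illustration (`libfoo` and `libbar` are placeholder names), and it assumes the JSON keys match the struct field names and that `Size` is serialized as a number, following the `int64` field above.

```python
import json

# Invented sample; the real output of `container-diff analyze --json` may differ.
sample = """
{
  "Image": "foo",
  "AnalyzeType": "Apt",
  "Analysis": [
    {"Name": "libfoo", "Version": "1.0", "Size": 300},
    {"Name": "libbar", "Version": "2.1", "Size": 1200}
  ]
}
"""


def packages_by_size(result):
    """Return the package entries sorted by Size, largest first."""
    return sorted(result["Analysis"], key=lambda pkg: pkg["Size"], reverse=True)


result = json.loads(sample)
ordered = packages_by_size(result)
for pkg in ordered:
    print(pkg["Name"], pkg["Version"], pkg["Size"])
```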


#### Multi Version Package Analysis

Multi version package analyzers (pip, node) have the following output structure: `[]PackageOutput`
@@ -154,68 +157,68 @@ Here, the `Path` field is included because there may be more than one instance o
## Diff Result Format

JSON output for diff results is in the following format:
```
```json
{
"Image1": "foo",
"Image2": "bar",
"DiffType": "Apt",
"Diff": {},
"Diff": {}
}
```
The possible structures of the `Diff` field are detailed below.

### History Diff

The history differ has the following JSON output structure:
The history differ has the following output structure:

```
```go
type HistDiff struct {
Adds []string
Dels []string
Adds []string
Dels []string
}
```

### File System Diff

The file system differ has the following JSON output structure:
The file system differ has the following output structure:

```
```go
type DirDiff struct {
Adds []string
Dels []string
Mods []string
Adds []string
Dels []string
Mods []string
}
```
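
A consumer of this structure might summarize it as follows. This is a minimal sketch under stated assumptions: the sample document and its `"File"` diff type are placeholders, and the keys are assumed to match the struct field names. One thing worth guarding against is that Go's `encoding/json` serializes a nil slice as `null`, so an empty list may arrive as `null` rather than `[]`.

```python
import json

# Invented sample; the "DiffType" value and the paths are placeholders.
sample = """
{
  "Image1": "foo",
  "Image2": "bar",
  "DiffType": "File",
  "Diff": {
    "Adds": ["/etc/new.conf"],
    "Dels": null,
    "Mods": ["/usr/bin/app", "/etc/hosts"]
  }
}
"""


def summarize_dir_diff(result):
    """Count added, deleted, and modified paths, treating null as empty."""
    diff = result["Diff"]
    return {key: len(diff.get(key) or []) for key in ("Adds", "Dels", "Mods")}


summary = summarize_dir_diff(json.loads(sample))
print(summary)
```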

### Package Diffs

Package differs such as pip, apt, and node inspect the packages contained within the images provided. All package differs currently leverage the `PackageInfo` struct, which contains the version and size for a given package instance, as detailed below:
```
```go
type PackageInfo struct {
Version string
Size string
Size string
}
```

#### Single Version Package Diffs

Single version differs (apt) have the following JSON output structure:

```
```go
type PackageDiff struct {
Packages1 []PackageOutput
Packages2 []PackageOutput
InfoDiff []Info
}
```

Packages1 and Packages2 detail which packages exist uniquely in Image1 and Image2, respectively, with package name, version and size info. InfoDiff contains a list of Info structs, each of which contains the package name (which occurred in both images but had a difference in size or version), and the PackageInfo struct for each package instance.
Packages1 and Packages2 detail which packages exist uniquely in Image1 and Image2, respectively, with package name, version and size info. InfoDiff contains a list of Info structs, each of which contains the package name (which occurred in both images but had a difference in size or version), and the PackageInfo struct for each package instance.
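
The comparison described above can be sketched in a few lines. This is not the tool's Go implementation (that lives in `GetMapDiff`); it is an illustrative Python re-statement. The key names inside each `InfoDiff` entry (`Package`, `Info1`, `Info2`) are an assumption borrowed from the multi-version struct, since the fields of `Info` are not listed here, and the package data is invented.

```python
def single_version_diff(pkgs1, pkgs2):
    """Compare two {name: {"Version": ..., "Size": ...}} maps."""
    # Packages present in only one of the two images.
    packages1 = [{"Name": n, **pkgs1[n]} for n in sorted(pkgs1) if n not in pkgs2]
    packages2 = [{"Name": n, **pkgs2[n]} for n in sorted(pkgs2) if n not in pkgs1]
    # Packages present in both images whose version or size differs.
    info_diff = [
        {"Package": n, "Info1": pkgs1[n], "Info2": pkgs2[n]}
        for n in sorted(pkgs1)
        if n in pkgs2 and pkgs1[n] != pkgs2[n]
    ]
    return {"Packages1": packages1, "Packages2": packages2, "InfoDiff": info_diff}


# Invented package data for illustration.
image1 = {
    "curl": {"Version": "7.38", "Size": "300"},
    "git": {"Version": "2.1", "Size": "500"},
    "vim": {"Version": "8.0", "Size": "900"},
}
image2 = {
    "curl": {"Version": "7.52", "Size": "310"},
    "git": {"Version": "2.1", "Size": "500"},
    "nano": {"Version": "2.2", "Size": "120"},
}
diff = single_version_diff(image1, image2)
print(diff)
```

Here `git` is identical in both images, so it appears in none of the three lists.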

#### Multi Version Package Diffs

The multi version differs (pip, node) support processing images which may have multiple versions of the same package. Below is the JSON output structure:

```
```go
type MultiVersionPackageDiff struct {
Packages1 []PackageOutput
Packages2 []PackageOutput
@@ -225,11 +228,11 @@ type MultiVersionPackageDiff struct {

Packages1 and Packages2 detail which packages exist uniquely in Image1 and Image2, respectively, with package name, installation path, version and size info. InfoDiff here is expanded to allow for multiple versions to be associated with a single package. In this case, a package of the same name is considered to differ between two images when there exist one or more instances of it installed in one image but not the other (i.e. have a unique version and/or size).

```
```go
type MultiVersionInfo struct {
Package string
Info1 []PackageInfo
Info2 []PackageInfo
Info1 []PackageInfo
Info2 []PackageInfo
}
```
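
One way to read the rule above is sketched below, assuming each image is represented as a `{name: {path: {"Version", "Size"}}}` map. The function name and the sample data are hypothetical, and the real logic in the Go utilities may differ in detail, for example in how installation paths are taken into account.

```python
def multi_version_info_diff(pkgs1, pkgs2):
    """Build MultiVersionInfo-like entries for packages present in both images."""
    diffs = []
    for name in sorted(set(pkgs1) & set(pkgs2)):
        infos1 = list(pkgs1[name].values())
        infos2 = list(pkgs2[name].values())
        # Instances (version and size pairs) found in one image but not the other.
        only1 = [info for info in infos1 if info not in infos2]
        only2 = [info for info in infos2 if info not in infos1]
        if only1 or only2:
            diffs.append({"Package": name, "Info1": only1, "Info2": only2})
    return diffs


# Invented data: image1 carries an extra, older copy of "requests".
image1 = {
    "requests": {
        "/usr/lib/python2.7": {"Version": "2.18", "Size": "100"},
        "/env/lib": {"Version": "2.10", "Size": "90"},
    },
    "six": {"/usr/lib/python2.7": {"Version": "1.10", "Size": "30"}},
}
image2 = {
    "requests": {"/usr/lib/python2.7": {"Version": "2.18", "Size": "100"}},
    "six": {"/usr/lib/python2.7": {"Version": "1.10", "Size": "30"}},
}
info_diff = multi_version_info_diff(image1, image2)
print(info_diff)
```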

@@ -241,7 +244,7 @@ Tarballs provided directly to the tool must be in the Docker format (i.e. have a

## Example Run

```
```shell
$ container-diff diff gcr.io/google-appengine/python:2017-07-21-123058 gcr.io/google-appengine/python:2017-06-29-190410 --type=apt --type=node --type=pip

-----AptDiffer-----
@@ -271,7 +274,8 @@ Packages found only in gcr.io/google-appengine/python:2017-06-29-190410: None
Version differences: None

```
```

```shell
$ container-diff diff file1.tar file2.tar --type=file --filename=go/src/app/file.txt
Starting diff on images file1.tar and file2.tar, using differs: [file]
Retrieving image file2.tar from source Tar Archive
@@ -304,7 +308,7 @@ This is a file

## Example Run with JSON post-processing
The following example demonstrates how one might selectively display diff output so that version differences are ignored, only package absence or presence is shown, and the packages present in only one image are sorted by size in descending order. A small piece of the JSON being post-processed can be seen below:
```
```json
[
{
"DiffType": "AptDiffer",
@@ -320,12 +324,16 @@ The following example demonstrates how one might selectively display the output
"libmpdec2": {
"Version": "2.4.1-1",
"Size": "275"
},
...
}
}
}
}
]
```
The post-processing script used for this example is below:

```import sys, json
```python
import sys, json

def main():
data = json.loads(sys.stdin.read())
@@ -343,7 +351,7 @@ def main():
for package in diff['Packages2']:
Size = package['Size']
img2packages.append((str(package), int(str(Size))))

img1packages = reversed(sorted(img1packages, key=lambda x: x[1]))
img2packages = reversed(sorted(img2packages, key=lambda x: x[1]))

@@ -361,7 +369,7 @@ if __name__ == "__main__":
```

Given the above Python script to post-process the JSON output, you can produce the following behavior:
```
```shell
container-diff gcr.io/gcp-runtimes/multi-base gcr.io/gcp-runtimes/multi-modified -a -j | python pyscript.py

Only in image1
Expand All @@ -387,33 +395,34 @@ Feel free to develop your own analyzer leveraging the utils currently available.

In order to quickly make your own analyzer, follow these steps:

1. Add your analyzer identifier to the flags in [root.go](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/cmd/root.go)
2. Determine if you can use existing analyzing or diffing tools. If you can make use of existing tools, you then need to construct the structs to feed into the tools by getting all of the packages for each image or the analogous quality to be analyzed. To determine if you can leverage existing tools, think through these questions:
1. Determine if you can use existing analyzing or diffing tools. If you can make use of existing tools, you then need to construct the structs to feed into the tools by getting all of the packages for each image or the analogous quality to be analyzed. To determine if you can leverage existing tools, think through these questions:
- Are you trying to analyze packages?
- Yes: Does the relevant package manager support different versions of the same package on one image?
- Yes: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]map[string]PackageInfo`. Use `GetMultiVerisonMapDiff` to diff map objects. See [nodeDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/nodeDiff.go#L33) or [pipDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/pipDiff.go#L23) for examples.
- No: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]PackageInfo`. Use `GetMapDiff` to diff map objects. See [aptDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/aptDiff.go#L29).
- No: Look to [History](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/differs/historyDiff.go) and [File System](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/differs/fileDiff.go) differs as models for diffing.
- Yes: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]map[string]util.PackageInfo`. Use [`GetMultiVersionMapDiff`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/util/package_diff_utils.go#L119-L126) to diff map objects. See [`differs/node_diff.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/node_diff.go#L49-L93) or [`differs/pip_diff.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/pip_diff.go#L48-L111) for examples.
- No: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]util.PackageInfo`. Use [`GetMapDiff`](https://github.com/GoogleCloudPlatform/container-diff/blob/31cec2304b54ae6ae444ccde4382b113d8e06097/util/package_diff_utils.go#L110-L117) to diff map objects. See [`differs/apt_diff.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/apt_diff.go#L29).
- No: Look to [History](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/history_diff.go) and [File System](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/file_diff.go) differs as models for diffing.

3. Write your analyzer driver in the `differs` directory, such that you have a struct for your analyzer type and two methods for that analyzer: `Analyze` for single image analysis and `Diff` for comparison between two images:
2. Write your analyzer driver in the `differs` directory, such that you have a struct for your analyzer type and two methods for that analyzer: `Analyze` for single image analysis and `Diff` for comparison between two images:

```
```go
type YourAnalyzer struct {}

func (a YourAnalyzer) Analyze(image utils.Image) (utils.Result, error) {...}
func (a YourAnalyzer) Diff(image1, image2 utils.Image) (utils.Result, error) {...}
func (a YourAnalyzer) Analyze(image util.Image) (util.Result, error) {...}
func (a YourAnalyzer) Diff(image1, image2 util.Image) (util.Result, error) {...}
```
The image arguments passed to your analyzer contain the path to the unpacked tar representation of the image, as well as certain configuration information (e.g. environment variables upon image creation and image history).

If using existing package tools, you should create the appropriate structs (e.g. `SingleVersionPackageAnalyzeResult` or `SingleVersionPackageDiffResult`) to analyze or diff. Otherwise, create your own structs which should yield information to fill an AnalyzeResult or DiffResult as the return type for Analyze() and Diff(), respectively, and should implement the `Result` interface, as in the next step.

4. Create a struct following the `Result` interface by implementing the following two methods.
```
GetStruct() interface{}
OutputText(diffType string) error
```
3. Create a struct following the [`Result`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/util/analyze_output_utils.go#L27-L30) interface by implementing the following two methods.

This is where you define how your analyzer should output for a human readable format (`OutputText`) and as a struct which can then be written to a `.json` file. See [diff_output_utils.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/utils/diff_output_utils.go) and [analyze_output_utils.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/analyze_output_utils.go).
```go
type Result interface {
OutputStruct() interface{}
OutputText(resultType string) error
}
```

5. Add your analyzer to the `analyses` map in [differs.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/differs.go#L22) with the corresponding Analyzer struct as the value.
This is where you define how your analyzer should output for a human readable format (`OutputText`) and as a struct which can then be written to a `.json` file. See [`util/diff_output_utils.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/util/diff_output_utils.go) and [`util/analyze_output_utils.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/util/analyze_output_utils.go).

4. Add your analyzer to the `Analyzers` map in [`differs/differs.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/differs.go#L44-L50) with the corresponding Analyzer struct as the value.