This repository was archived by the owner on Mar 27, 2024. It is now read-only.

Commit d6c17e0

Merge pull request #145 from ezkl/improve-analyzer-docs
Update README
2 parents 0031c88 + 34461ae commit d6c17e0

1 file changed

+68
-59
lines changed

README.md

Lines changed: 68 additions & 59 deletions
@@ -39,7 +39,7 @@ Download the [container-diff-windows-amd64.exe](https://storage.googleapis.com/c
3939

4040
To use `container-diff analyze` to perform analysis on a single image, you need one Docker image (in the form of an ID, tarball, or URL from a repo). Once you have that image, you can run any of the following analyzers:
4141

42-
```
42+
```shell
4343
container-diff analyze <img> [Run default analyzers]
4444
container-diff analyze <img> --type=history [History]
4545
container-diff analyze <img> --type=file [File System]
@@ -53,7 +53,7 @@ container-diff analyze <img> --type=apt --type=node [Apt and Node]
5353
By default, with no `--type` flag specified, container-diff will run Apt package analysis.
5454

5555
To use container-diff to perform a diff analysis on two images, you need two Docker images (in the form of an ID, tarball, or URL from a repo). Once you have those images, you can run any of the following differs:
56-
```
56+
```shell
5757
container-diff diff <img1> <img2> [Run default differs]
5858
container-diff diff <img1> <img2> --type=history [History]
5959
container-diff diff <img1> <img2> --type=file [File System]
@@ -64,21 +64,23 @@ container-diff diff <img1> <img2> --type=node [Node]
6464

6565
You can similarly run many analyzers at once:
6666

67-
```
67+
```shell
6868
container-diff diff <img1> <img2> --type=history --type=apt --type=node
6969
```
7070

7171
To view the diff of an individual file in two different images, you can use the filename flag in conjunction with the file system diff analyzer.
7272

73-
```
73+
```shell
7474
container-diff diff <img1> <img2> --type=file --filename=/path/to/file
7575
```
7676

7777
## Image Sources
7878

7979
container-diff supports Docker images located in both a local Docker daemon and a remote registry. To explicitly specify a local image, use the `daemon://` prefix on the image name; similarly, for an explicitly remote image, use the `remote://` prefix.
8080

81-
```container-diff diff daemon://modified_debian:latest remote://gcr.io/google-appengine/debian8:latest```
81+
```shell
82+
container-diff diff daemon://modified_debian:latest remote://gcr.io/google-appengine/debian8:latest
83+
```
8284

8385
Additionally, tarballs can be provided to the tool directly. Make sure your file has a valid tar extension (.tar, .tar.gz, .tgz).
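
The source rules above can be sketched as a small helper for scripts that wrap the tool. This is only an illustration of the prefixes and extensions described in this README, not container-diff's own logic; the function name `classify_image_source` and the `"unspecified"` label are hypothetical.

```python
# Hypothetical helper: mirrors the README's description only,
# not container-diff's actual Go implementation.
TAR_EXTENSIONS = (".tar", ".tar.gz", ".tgz")


def classify_image_source(image):
    """Return 'daemon', 'remote', 'tarball', or 'unspecified' for an image argument."""
    if image.startswith("daemon://"):
        return "daemon"
    if image.startswith("remote://"):
        return "remote"
    if image.endswith(TAR_EXTENSIONS):
        return "tarball"
    # No explicit prefix or tar extension: this sketch does not guess
    # how the tool resolves such names.
    return "unspecified"


print(classify_image_source("daemon://modified_debian:latest"))
print(classify_image_source("file1.tar"))
```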
8486

@@ -96,23 +98,25 @@ For the Google Container Registry, make sure you have the `docker-credential-gcr
9698

9799
To get a JSON version of the container-diff output add a `-j` or `--json` flag.
98100

99-
```container-diff <img1> <img2> -j```
100-
101-
```container-diff <img1> <img2> -e```
101+
```shell
102+
container-diff diff --type=file --json gcr.io/gcp-runtimes/multi-base gcr.io/gcp-runtimes/multi-modified
103+
```
102104

103105
To order files and packages by size (in descending order) when performing file system or package analyses/diffs, add a `-o` or `--order` flag.
104106

105-
```container-diff <img1> <img2> -o```
107+
```shell
108+
container-diff analyze remote://gcr.io/gcp-runtimes/multi-modified --type=pip --order
109+
```
106110

107111

108112
## Analysis Result Format
109113

110114
JSON output for analysis results is in the following format:
111-
```
115+
```json
112116
{
113117
"Image": "foo",
114118
"AnalyzeType": "Apt",
115-
"Analysis": {},
119+
"Analysis": {}
116120
}
117121
```
118122
The possible contents of the `Analysis` field are detailed below.
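
A minimal sketch of reading this format from a script follows. It assumes the JSON keys match the field names shown above and accepts either a single result object or a list of them, since the exact top-level shape may vary; `summarize_analysis` is a hypothetical name.

```python
import json


def summarize_analysis(raw):
    """Return (image, analyze type, number of entries) for each result in the JSON text."""
    data = json.loads(raw)
    # Assumption: the output may be one object or a list of objects.
    results = data if isinstance(data, list) else [data]
    summary = []
    for result in results:
        # An empty or null "Analysis" field counts as zero entries.
        analysis = result.get("Analysis") or []
        summary.append((result["Image"], result["AnalyzeType"], len(analysis)))
    return summary


sample = '{"Image": "foo", "AnalyzeType": "Apt", "Analysis": {}}'
print(summarize_analysis(sample))  # [('foo', 'Apt', 0)]
```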
@@ -127,11 +131,11 @@ The file system analyzer outputs a list of file system contents, including names
127131

128132
### Package Analysis
129133

130-
Package analyzers such as pip, apt, and node inspect the packages installed within the image provided. All package analyses leverage the PackageOutput struct, which contains the version and size for a given package instance (and a potential installation path for a specific instance of a package where multiple versions are allowed to be installed), as detailed below:
131-
```
134+
Package analyzers such as pip, apt, and node inspect the packages installed within the image provided. All package analyses leverage the `PackageOutput` struct, which contains the version and size for a given package instance (and a potential installation path for a specific instance of a package where multiple versions are allowed to be installed), as detailed below:
135+
```go
132136
type PackageOutput struct {
133-
Name string
134-
Path string
137+
Name string
138+
Path string
135139
Version string
136140
Size int64
137141
}
@@ -143,7 +147,6 @@ Single version package analyzers (apt) have the following output structure: `[]P
143147

144148
Here, the `Path` field is omitted because there is only one instance of each package.
145149

146-
147150
#### Multi Version Package Analysis
148151

149152
Multi version package analyzers (pip, node) have the following output structure: `[]PackageOutput`
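
As a sketch of working with a multi-version `[]PackageOutput` list after it has been decoded from JSON, the snippet below groups instances by package name to find packages installed more than once. It assumes the JSON keys match the struct field names (`Name`, `Path`, `Version`, `Size`); `group_versions` and the sample entries are made up for illustration.

```python
from collections import defaultdict


def group_versions(packages):
    """Map each package name to its list of (path, version, size) instances."""
    grouped = defaultdict(list)
    for pkg in packages:
        # "Path" is only expected from multi-version analyzers, so default to "".
        grouped[pkg["Name"]].append((pkg.get("Path", ""), pkg["Version"], pkg["Size"]))
    return dict(grouped)


# Made-up sample entries, not real analyzer output.
sample = [
    {"Name": "pkg-a", "Path": "/env1/lib", "Version": "1.0", "Size": 100},
    {"Name": "pkg-a", "Path": "/env2/lib", "Version": "2.0", "Size": 120},
    {"Name": "pkg-b", "Path": "/env1/lib", "Version": "0.3", "Size": 40},
]

# Keep only the packages that appear in more than one instance.
multi = {name: inst for name, inst in group_versions(sample).items() if len(inst) > 1}
print(multi)
```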
@@ -154,68 +157,68 @@ Here, the `Path` field is included because there may be more than one instance o
154157
## Diff Result Format
155158

156159
JSON output for diff results is in the following format:
157-
```
160+
```json
158161
{
159162
"Image1": "foo",
160163
"Image2": "bar",
161164
"DiffType": "Apt",
162-
"Diff": {},
165+
"Diff": {}
163166
}
164167
```
165168
The possible structures of the `Diff` field are detailed below.
166169

167170
### History Diff
168171

169-
The history differ has the following JSON output structure:
172+
The history differ has the following output structure:
170173

171-
```
174+
```go
172175
type HistDiff struct {
173-
Adds []string
174-
Dels []string
176+
Adds []string
177+
Dels []string
175178
}
176179
```
177180

178181
### File System Diff
179182

180-
The file system differ has the following JSON output structure:
183+
The file system differ has the following output structure:
181184

182-
```
185+
```go
183186
type DirDiff struct {
184-
Adds []string
185-
Dels []string
186-
Mods []string
187+
Adds []string
188+
Dels []string
189+
Mods []string
187190
}
188191
```
189192

190193
### Package Diffs
191194

192195
Package differs such as pip, apt, and node inspect the packages contained within the images provided. All package differs currently leverage the `PackageInfo` struct, which contains the version and size for a given package instance, as detailed below:
193-
```
196+
```go
194197
type PackageInfo struct {
195198
Version string
196-
Size string
199+
Size string
197200
}
198201
```
199202

200203
#### Single Version Package Diffs
201204

202205
Single version differs (apt) have the following JSON output structure:
203206

204-
```
207+
```go
205208
type PackageDiff struct {
206209
Packages1 []PackageOutput
207210
Packages2 []PackageOutput
208211
InfoDiff []Info
209212
}
210213
```
211214

212-
Packages1 and Packages2 detail which packages exist uniquely in Image1 and Image2, respectively, with package name, version and size info. InfoDiff contains a list of Info structs, each of which contains the package name (which occurred in both images but had a difference in size or version), and the PackageInfo struct for each package instance.
215+
Packages1 and Packages2 detail which packages exist uniquely in Image1 and Image2, respectively, with package name, version and size info. InfoDiff contains a list of Info structs, each of which contains the package name (which occurred in both images but had a difference in size or version), and the PackageInfo struct for each package instance.
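
A hedged sketch of consuming a decoded `PackageDiff` from a script: it lists the package names unique to each image and counts the `InfoDiff` entries. It assumes the JSON keys match the struct field names above and that empty lists may be encoded as `null`; `unique_packages` is a hypothetical name, and the fields of `Info` are deliberately not inspected because they are not spelled out here.

```python
def unique_packages(diff):
    """Return (names only in image 1, names only in image 2, count of changed packages)."""
    # "or []" tolerates a missing key or a JSON null in place of an empty list.
    names1 = sorted(p["Name"] for p in diff.get("Packages1") or [])
    names2 = sorted(p["Name"] for p in diff.get("Packages2") or [])
    changed = len(diff.get("InfoDiff") or [])
    return names1, names2, changed


# Made-up sample diff, not real tool output.
sample = {
    "Packages1": [{"Name": "pkg-b", "Version": "1.0", "Size": 10}],
    "Packages2": None,
    "InfoDiff": [],
}
print(unique_packages(sample))  # (['pkg-b'], [], 0)
```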
213216

214217
#### Multi Version Package Diffs
215218

216219
The multi version differs (pip, node) support processing images which may have multiple versions of the same package. Below is the JSON output structure:
217220

218-
```
221+
```go
219222
type MultiVersionPackageDiff struct {
220223
Packages1 []PackageOutput
221224
Packages2 []PackageOutput
@@ -225,11 +228,11 @@ type MultiVersionPackageDiff struct {
225228

226229
Packages1 and Packages2 detail which packages exist uniquely in Image1 and Image2, respectively, with package name, installation path, version and size info. InfoDiff here is expanded to allow for multiple versions to be associated with a single package. In this case, a package of the same name is considered to differ between two images when there exist one or more instances of it installed in one image but not the other (i.e. have a unique version and/or size).
227230

228-
```
231+
```go
229232
type MultiVersionInfo struct {
230233
Package string
231-
Info1 []PackageInfo
232-
Info2 []PackageInfo
234+
Info1 []PackageInfo
235+
Info2 []PackageInfo
233236
}
234237
```
235238

@@ -241,7 +244,7 @@ Tarballs provided directly to the tool must be in the Docker format (i.e. have a
241244

242245
## Example Run
243246

244-
```
247+
```shell
245248
$ container-diff diff gcr.io/google-appengine/python:2017-07-21-123058 gcr.io/google-appengine/python:2017-06-29-190410 --type=apt --type=node --type=pip
246249

247250
-----AptDiffer-----
@@ -271,7 +274,8 @@ Packages found only in gcr.io/google-appengine/python:2017-06-29-190410: None
271274
Version differences: None
272275

273276
```
274-
```
277+
278+
```shell
275279
$ container-diff diff file1.tar file2.tar --type=file --filename=go/src/app/file.txt
276280
Starting diff on images file1.tar and file2.tar, using differs: [file]
277281
Retrieving image file2.tar from source Tar Archive
@@ -304,7 +308,7 @@ This is a file
304308

305309
## Example Run with JSON post-processing
306310
The following example demonstrates how one might selectively display the output of their diff, such that version differences are ignored and only package absence/presence is displayed and the packages present in only one image are sorted by size in descending order. A small piece of the JSON being post-processed can be seen below:
307-
```
311+
```json
308312
[
309313
{
310314
"DiffType": "AptDiffer",
@@ -320,12 +324,16 @@ The following example demonstrates how one might selectively display the output
320324
"libmpdec2": {
321325
"Version": "2.4.1-1",
322326
"Size": "275"
323-
},
324-
...
327+
}
328+
}
329+
}
330+
}
331+
]
325332
```
326333
The post-processing script used for this example is below:
327334

328-
```import sys, json
335+
```python
336+
import sys, json
329337

330338
def main():
331339
data = json.loads(sys.stdin.read())
@@ -343,7 +351,7 @@ def main():
343351
for package in diff['Packages2']:
344352
Size = package['Size']
345353
img2packages.append((str(package), int(str(Size))))
346-
354+
347355
img1packages = reversed(sorted(img1packages, key=lambda x: x[1]))
348356
img2packages = reversed(sorted(img2packages, key=lambda x: x[1]))
349357

@@ -361,7 +369,7 @@ if __name__ == "__main__":
361369
```
362370

363371
Given the above Python script to post-process JSON output, you can produce the following behavior:
364-
```
372+
```shell
365373
container-diff gcr.io/gcp-runtimes/multi-base gcr.io/gcp-runtimes/multi-modified -a -j | python pyscript.py
366374

367375
Only in image1
@@ -387,33 +395,34 @@ Feel free to develop your own analyzer leveraging the utils currently available.
387395

388396
In order to quickly make your own analyzer, follow these steps:
389397

390-
1. Add your analyzer identifier to the flags in [root.go](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/cmd/root.go)
391-
2. Determine if you can use existing analyzing or diffing tools. If you can make use of existing tools, you then need to construct the structs to feed into the tools by getting all of the packages for each image or the analogous quality to be analyzed. To determine if you can leverage existing tools, think through these questions:
398+
1. Determine if you can use existing analyzing or diffing tools. If you can make use of existing tools, you then need to construct the structs to feed into the tools by getting all of the packages for each image or the analogous quality to be analyzed. To determine if you can leverage existing tools, think through these questions:
392399
- Are you trying to analyze packages?
393400
- Yes: Does the relevant package manager support different versions of the same package on one image?
394-
- Yes: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]map[string]PackageInfo`. Use `GetMultiVerisonMapDiff` to diff map objects. See [nodeDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/nodeDiff.go#L33) or [pipDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/pipDiff.go#L23) for examples.
395-
- No: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]PackageInfo`. Use `GetMapDiff` to diff map objects. See [aptDiff.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/aptDiff.go#L29).
396-
- No: Look to [History](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/differs/historyDiff.go) and [File System](https://github.com/GoogleCloudPlatform/container-diff/blob/ReadMe/differs/fileDiff.go) differs as models for diffing.
401+
- Yes: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]map[string]util.PackageInfo`. Use [`GetMultiVersionMapDiff`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/util/package_diff_utils.go#L119-L126) to diff map objects. See [`differs/node_diff.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/node_diff.go#L49-L93) or [`differs/pip_diff.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/pip_diff.go#L48-L111) for examples.
402+
- No: Implement `getPackages` to collect all versions of all packages within an image in a `map[string]util.PackageInfo`. Use [`GetMapDiff`](https://github.com/GoogleCloudPlatform/container-diff/blob/31cec2304b54ae6ae444ccde4382b113d8e06097/util/package_diff_utils.go#L110-L117) to diff map objects. See [`differs/apt_diff.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/apt_diff.go#L29).
403+
- No: Look to [History](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/history_diff.go) and [File System](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/file_diff.go) differs as models for diffing.
397404

398-
3. Write your analyzer driver in the `differs` directory, such that you have a struct for your analyzer type and two methods for that analyzer: `Analyze` for single image analysis and `Diff` for comparison between two images:
405+
2. Write your analyzer driver in the `differs` directory, such that you have a struct for your analyzer type and two methods for that analyzer: `Analyze` for single image analysis and `Diff` for comparison between two images:
399406

400-
```
407+
```go
401408
type YourAnalyzer struct {}
402409

403-
func (a YourAnalyzer) Analyze(image utils.Image) (utils.Result, error) {...}
404-
func (a YourAnalyzer) Diff(image1, image2 utils.Image) (utils.Result, error) {...}
410+
func (a YourAnalyzer) Analyze(image util.Image) (util.Result, error) {...}
411+
func (a YourAnalyzer) Diff(image1, image2 util.Image) (util.Result, error) {...}
405412
```
406413
The image arguments passed to your analyzer contain the path to the unpacked tar representation of the image, as well as certain configuration information (e.g. environment variables upon image creation and image history).
407414

408415
If using existing package tools, you should create the appropriate structs (e.g. `SingleVersionPackageAnalyzeResult` or `SingleVersionPackageDiffResult`) to analyze or diff. Otherwise, create your own structs which should yield information to fill an AnalyzeResult or DiffResult as the return type for Analyze() and Diff(), respectively, and should implement the `Result` interface, as in the next step.
409416

410-
4. Create a struct following the `Result` interface by implementing the following two methods.
411-
```
412-
GetStruct() interface{}
413-
OutputText(diffType string) error
414-
```
417+
3. Create a struct following the [`Result`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/util/analyze_output_utils.go#L27-L30) interface by implementing the following two methods.
415418

416-
This is where you define how your analyzer should output for a human readable format (`OutputText`) and as a struct which can then be written to a `.json` file. See [diff_output_utils.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/utils/diff_output_utils.go) and [analyze_output_utils.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/analyze_output_utils.go).
419+
```go
420+
type Result interface {
421+
OutputStruct() interface{}
422+
OutputText(resultType string) error
423+
}
424+
```
417425

418-
5. Add your analyzer to the `analyses` map in [differs.go](https://github.com/GoogleCloudPlatform/container-diff/blob/master/differs/differs.go#L22) with the corresponding Analyzer struct as the value.
426+
This is where you define how your analyzer should output for a human readable format (`OutputText`) and as a struct which can then be written to a `.json` file. See [`util/diff_output_utils.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/util/diff_output_utils.go) and [`util/analyze_output_utils.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/util/analyze_output_utils.go).
419427

428+
4. Add your analyzer to the `Analyzers` map in [`differs/differs.go`](https://github.com/GoogleCloudPlatform/container-diff/blob/0031c88993c9ac019e2d404815ef50c652d8d010/differs/differs.go#L44-L50) with the corresponding Analyzer struct as the value.
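
The map-diffing idea behind step 1 (single-version case) can be illustrated in Python as follows. This is a conceptual sketch only: it is not the Go `GetMapDiff` implementation, and the function name `map_diff`, its return shape, and the sample package data are assumptions made for illustration.

```python
def map_diff(packages1, packages2):
    """Diff two name -> package-info maps.

    Returns (only in first, only in second, present in both but different).
    """
    only1 = {name: info for name, info in packages1.items() if name not in packages2}
    only2 = {name: info for name, info in packages2.items() if name not in packages1}
    changed = {
        name: (packages1[name], packages2[name])
        for name in packages1
        if name in packages2 and packages1[name] != packages2[name]
    }
    return only1, only2, changed


# Made-up package maps, shaped like name -> {"Version", "Size"}.
image1 = {
    "pkg-a": {"Version": "1.0", "Size": "10"},
    "pkg-b": {"Version": "2.0", "Size": "20"},
}
image2 = {
    "pkg-b": {"Version": "2.1", "Size": "20"},
    "pkg-c": {"Version": "0.1", "Size": "5"},
}

only1, only2, changed = map_diff(image1, image2)
print(sorted(only1), sorted(only2), sorted(changed))  # ['pkg-a'] ['pkg-c'] ['pkg-b']
```

A multi-version differ would need a nested map (name, then installation path or version) in place of the flat one used here.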
