Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

image: Refactor to use cas/ref engines instead of walkers #159

Closed
wants to merge 13 commits into from

Conversation

wking
Copy link
Contributor

@wking wking commented Jun 17, 2016

Abstract out the details of the image-layout format, since the
validators and unpackers shouldn't care about that sort of backend
stuff. This should make it easy to build backends based on zip files,
HTTP, the Docker registry format, etc. See #140.

I ran out of time on the final commit, but wanted to push this in case
folks had time for early feedback. I should have the last commit
fixed up today, but there may be some more rerolling while I do that.

Ping @s-urbaniak.

@wking wking changed the title WIP: image: Refactor to use cas/ref engines instead of walkers image: Refactor to use cas/ref engines instead of walkers Jun 17, 2016
@wking
Copy link
Contributor Author

wking commented Jun 17, 2016

I haven't tested all the old commands yet, but 9252deeaba65b0 fixes the obvious-to-me-from-reading-the-code issues, so I've removed the WIP label.

@wking wking force-pushed the cas-api branch 2 times, most recently from 7efffe9 to b4482da Compare June 17, 2016 18:08
@wking
Copy link
Contributor Author

wking commented Jun 17, 2016

I'd forgotten to remove image/walkers.go now that we don't need it anymore. Done with aba65b0b4482da.

}
}

func (state *casGetCmd) Run(cmd *cobra.Command, args []string) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just call this cmd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Mon, Jun 20, 2016 at 05:30:34AM -0700, Sergiusz Urbaniak wrote:

+func (state *casGetCmd) Run(cmd *cobra.Command, args []string) {

just call this cmd

There are a few things in that line. Do you mean ‘Run’ → ‘cmd’?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being inprecise. I meant the state variable, I'd call this cmd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Mon, Jun 20, 2016 at 02:09:20PM -0700, Sergiusz Urbaniak wrote:

+func (state *casGetCmd) Run(cmd *cobra.Command, args []string) {

Sorry for being inprecise. I meant the state variable, I'd call this cmd.

What do you want me to use for *cobra.Command?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, sorry, it's 11pm here in old Europe, my eyes are starting to blur ;-) In this case I'd recommend calling it getCmd, or even cgc which is quite idiomatic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Mon, Jun 20, 2016 at 02:19:47PM -0700, Sergiusz Urbaniak wrote:

+func (state *casGetCmd) Run(cmd *cobra.Command, args []string) {

Haha, sorry, it's 11pm here in old Europe, my eyes are starting to
blur ;-) In this case I'd recommend calling it getCmd, or even
cgc which is quite idiomatic.

getCmd would cause collisions between the current casGetCmd and
refsGetCmd. Do you really prefer types ‘cgc’ and ‘rgc’ to casGetCmt
and refsGetCmd? I can reroll (it's not my project ;), but the short
versions are a bit dense for my taste ;).

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking
Copy link
Contributor Author

wking commented Sep 2, 2016

I've pushed c0711397e96d03, rebasing around #212, #229, and #230
(master has been active in the past few days ;). I'm still waiting on
maintainer feedback for:

On Tue, Aug 30, 2016 at 11:08:41PM -0700, W. Trevor King wrote 1:

The main outstanding issue from @stevvooe's recent review is whether I
need to reroll the tar handling to use the filesystem for scratch
space instead of using memory for scratch space [2 and many other
recent comments ;)]. I don't think we need to put in the work to make
tar writing more performant 3, but I'll put in the time if it's a
blocker for landing this PR.


[2]: #159 (comment)
3: #159 (comment)

I'm expecting something like one of:

a. “We agree that the current PR's tar handling isn't performant, but
that it's not broken for constructing small image-layout tarballs
from scratch, for append-only CAS operations, for new-key ref
operations, or for ‘overwriting the same key with larger serialized
descriptor JSON’ ref operations [2]. That's enough for the
tar-backed engines, and we'll accept it for this PR. We realize
that there is a chance of a corrupted tarball if a writer crashes
(discussed in 469bbb6) and are ok with that because nobody should
be using the tar-writer in production if they aren't prepared to
catch crashed writes and start over from scratch.”,

b. “You need to use temporary files to avoid holding blobs in memory
while you hash their contents and to avoid holding tarballs in
memory while you rewrite them. Using TempFile 3 with the default
TempDir 4 is fine for this. We realize that moving a blob from a
temporary-disk file into a tarball is not an atomic operation, and
that copying from TempDir over the original tar path may require
futher copies [5]. We also realize that this will slow the
read-only use-case, because readers will have to stat or re-open
the file to detect changes made by other process [6]. We still
want you to use temporary files.”, or

c. “Everything from (b), except you cannot rewrite the whole tar file
when appending a new entry. You must seek to before the trailing
two zero-byte records and write into the existing tar file. We
realize that this leaves a window where the content of the tar file
at the original tar path is incomplete, and accept the risk of
corruption if a writer crashes (see (a)) in return for the
efficiency gains from avoiding (b)'s temporary tar file.”.

[2]: Is there really no truncate(2) in Go's high-level file ops? More
in 469bbb6.

[5]: E.g. if TempFile and the original tar path are on different
filesystems.
[6]: Both readers and writers currently seek back to the beginning of
the file without re-opening it.

wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
And implement that interface for tarballs based on the specs
image-layout.  I plan on adding other backends later, but this is
enough for a proof of concept.

Also add a new oci-refs command so folks can access the new read
functionality from the command line.

The Engine.List interface uses a callback instead of returning
channels or a slice.  Benefits vs. returning a slice of names:

* There's no need to allocate a slice for the results, so calls with
  large (or negative) 'size' values can be made without consuming
  large amounts of memory.

* The name collection and processing can happen concurrently, so:
  * We don't waste cycles collecting names we won't use.
  * Slow collection can happen in the background if/when the consumer
    is blocked on something else.

The benefit of using callbacks vs. returning name and error channels
(as discussed in [1]) is more of a trade-off.  Stephen Day [2] and JT
Olds [3] don't like channel's internal locks.  Dave Cheney doesn't
have a problem with them [4].  Which approach is more efficient for a
given situation depends on how expensive it is for the engine to find
the next key and how expensive it is to act on a returned name.  If
both are expensive, you want goroutines in there somewhere to get
concurrent execution, and channels will help those goroutines
communicate.  When either action is fast (or both are fast), channels
are unnecessary overhead.  By using a callback in the interface, we
avoid baking in the overhead.  Folks who want concurrent execution can
initialize their own channel, launch List in a goroutine, and use the
callback to inject names into their channel.

In a subsequent commit, I'll replace the image/walker.go functionality
with this new API.

I'd prefer casLayout for the imported package, but Stephen doesn't
want camelCase for package names [5].

[1]: https://blog.golang.org/pipelines
[2]: opencontainers/image-spec#159 (comment)
[3]: http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/
[4]: https://groups.google.com/d/msg/golang-nuts/LM648yrPpck/idyupwodAwAJ
     Subject: Re: [go-nuts] Re: "Go channels are bad and you should feel bad"
     Date: Wed, 2 Mar 2016 16:04:13 -0800 (PST)
     Message-Id: <c8e4433a-53c0-4ee6-9dc5-98f62eea06d2@googlegroups.com>
[5]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
@wking
Copy link
Contributor Author

wking commented Sep 16, 2016

Moved to opencontainers/image-tools#5.

@wking wking closed this Sep 16, 2016
wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
And implement that interface for tarballs based on the specs
image-layout.  I plan on adding other backends later, but this is
enough for a proof of concept.

Also add a new oci-refs command so folks can access the new read
functionality from the command line.

The Engine.List interface uses a callback instead of returning
channels or a slice.  Benefits vs. returning a slice of names:

* There's no need to allocate a slice for the results, so calls with
  large (or negative) 'size' values can be made without consuming
  large amounts of memory.

* The name collection and processing can happen concurrently, so:
  * We don't waste cycles collecting names we won't use.
  * Slow collection can happen in the background if/when the consumer
    is blocked on something else.

The benefit of using callbacks vs. returning name and error channels
(as discussed in [1]) is more of a trade-off.  Stephen Day [2] and JT
Olds [3] don't like channel's internal locks.  Dave Cheney doesn't
have a problem with them [4].  Which approach is more efficient for a
given situation depends on how expensive it is for the engine to find
the next key and how expensive it is to act on a returned name.  If
both are expensive, you want goroutines in there somewhere to get
concurrent execution, and channels will help those goroutines
communicate.  When either action is fast (or both are fast), channels
are unnecessary overhead.  By using a callback in the interface, we
avoid baking in the overhead.  Folks who want concurrent execution can
initialize their own channel, launch List in a goroutine, and use the
callback to inject names into their channel.

In a subsequent commit, I'll replace the image/walker.go functionality
with this new API.

I'd prefer casLayout for the imported package, but Stephen doesn't
want camelCase for package names [5].

[1]: https://blog.golang.org/pipelines
[2]: opencontainers/image-spec#159 (comment)
[3]: http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/
[4]: https://groups.google.com/d/msg/golang-nuts/LM648yrPpck/idyupwodAwAJ
     Subject: Re: [go-nuts] Re: "Go channels are bad and you should feel bad"
     Date: Wed, 2 Mar 2016 16:04:13 -0800 (PST)
     Message-Id: <c8e4433a-53c0-4ee6-9dc5-98f62eea06d2@googlegroups.com>
[5]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
And implement that interface for tarballs based on the specs
image-layout.  I plan on adding other backends later, but this is
enough for a proof of concept.

Also add a new oci-refs command so folks can access the new read
functionality from the command line.

The Engine.List interface uses a callback instead of returning
channels or a slice.  Benefits vs. returning a slice of names:

* There's no need to allocate a slice for the results, so calls with
  large (or negative) 'size' values can be made without consuming
  large amounts of memory.

* The name collection and processing can happen concurrently, so:
  * We don't waste cycles collecting names we won't use.
  * Slow collection can happen in the background if/when the consumer
    is blocked on something else.

The benefit of using callbacks vs. returning name and error channels
(as discussed in [1]) is more of a trade-off.  Stephen Day [2] and JT
Olds [3] don't like channel's internal locks.  Dave Cheney doesn't
have a problem with them [4].  Which approach is more efficient for a
given situation depends on how expensive it is for the engine to find
the next key and how expensive it is to act on a returned name.  If
both are expensive, you want goroutines in there somewhere to get
concurrent execution, and channels will help those goroutines
communicate.  When either action is fast (or both are fast), channels
are unnecessary overhead.  By using a callback in the interface, we
avoid baking in the overhead.  Folks who want concurrent execution can
initialize their own channel, launch List in a goroutine, and use the
callback to inject names into their channel.

In a subsequent commit, I'll replace the image/walker.go functionality
with this new API.

I'd prefer casLayout for the imported package, but Stephen doesn't
want camelCase for package names [5].

[1]: https://blog.golang.org/pipelines
[2]: opencontainers/image-spec#159 (comment)
[3]: http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/
[4]: https://groups.google.com/d/msg/golang-nuts/LM648yrPpck/idyupwodAwAJ
     Subject: Re: [go-nuts] Re: "Go channels are bad and you should feel bad"
     Date: Wed, 2 Mar 2016 16:04:13 -0800 (PST)
     Message-Id: <c8e4433a-53c0-4ee6-9dc5-98f62eea06d2@googlegroups.com>
[5]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 16, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 17, 2016
And implement that interface for tarballs based on the specs
image-layout.  I plan on adding other backends later, but this is
enough for a proof of concept.

Also add a new oci-refs command so folks can access the new read
functionality from the command line.

The Engine.List interface uses a callback instead of returning
channels or a slice.  Benefits vs. returning a slice of names:

* There's no need to allocate a slice for the results, so calls with
  large (or negative) 'size' values can be made without consuming
  large amounts of memory.

* The name collection and processing can happen concurrently, so:
  * We don't waste cycles collecting names we won't use.
  * Slow collection can happen in the background if/when the consumer
    is blocked on something else.

The benefit of using callbacks vs. returning name and error channels
(as discussed in [1]) is more of a trade-off.  Stephen Day [2] and JT
Olds [3] don't like channel's internal locks.  Dave Cheney doesn't
have a problem with them [4].  Which approach is more efficient for a
given situation depends on how expensive it is for the engine to find
the next key and how expensive it is to act on a returned name.  If
both are expensive, you want goroutines in there somewhere to get
concurrent execution, and channels will help those goroutines
communicate.  When either action is fast (or both are fast), channels
are unnecessary overhead.  By using a callback in the interface, we
avoid baking in the overhead.  Folks who want concurrent execution can
initialize their own channel, launch List in a goroutine, and use the
callback to inject names into their channel.

In a subsequent commit, I'll replace the image/walker.go functionality
with this new API.

I'd prefer casLayout for the imported package, but Stephen doesn't
want camelCase for package names [5].

[1]: https://blog.golang.org/pipelines
[2]: opencontainers/image-spec#159 (comment)
[3]: http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/
[4]: https://groups.google.com/d/msg/golang-nuts/LM648yrPpck/idyupwodAwAJ
     Subject: Re: [go-nuts] Re: "Go channels are bad and you should feel bad"
     Date: Wed, 2 Mar 2016 16:04:13 -0800 (PST)
     Message-Id: <c8e4433a-53c0-4ee6-9dc5-98f62eea06d2@googlegroups.com>
[5]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Sep 17, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Oct 11, 2016
And implement that interface for tarballs based on the specs
image-layout.  I plan on adding other backends later, but this is
enough for a proof of concept.

Also add a new oci-refs command so folks can access the new read
functionality from the command line.

The Engine.List interface uses a callback instead of returning
channels or a slice.  Benefits vs. returning a slice of names:

* There's no need to allocate a slice for the results, so calls with
  large (or negative) 'size' values can be made without consuming
  large amounts of memory.

* The name collection and processing can happen concurrently, so:
  * We don't waste cycles collecting names we won't use.
  * Slow collection can happen in the background if/when the consumer
    is blocked on something else.

The benefit of using callbacks vs. returning name and error channels
(as discussed in [1]) is more of a trade-off.  Stephen Day [2] and JT
Olds [3] don't like channel's internal locks.  Dave Cheney doesn't
have a problem with them [4].  Which approach is more efficient for a
given situation depends on how expensive it is for the engine to find
the next key and how expensive it is to act on a returned name.  If
both are expensive, you want goroutines in there somewhere to get
concurrent execution, and channels will help those goroutines
communicate.  When either action is fast (or both are fast), channels
are unnecessary overhead.  By using a callback in the interface, we
avoid baking in the overhead.  Folks who want concurrent execution can
initialize their own channel, launch List in a goroutine, and use the
callback to inject names into their channel.

In a subsequent commit, I'll replace the image/walker.go functionality
with this new API.

I'd prefer casLayout for the imported package, but Stephen doesn't
want camelCase for package names [5].

[1]: https://blog.golang.org/pipelines
[2]: opencontainers/image-spec#159 (comment)
[3]: http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/
[4]: https://groups.google.com/d/msg/golang-nuts/LM648yrPpck/idyupwodAwAJ
     Subject: Re: [go-nuts] Re: "Go channels are bad and you should feel bad"
     Date: Wed, 2 Mar 2016 16:04:13 -0800 (PST)
     Message-Id: <c8e4433a-53c0-4ee6-9dc5-98f62eea06d2@googlegroups.com>
[5]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Oct 11, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.  This replaces the simpler interface from
image/reader.go with something more robust.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Nov 18, 2016
And implement that interface for tarballs based on the specs
image-layout.  I plan on adding other backends later, but this is
enough for a proof of concept.

Also add a new oci-refs command so folks can access the new read
functionality from the command line.

The Engine.List interface uses a callback instead of returning
channels or a slice.  Benefits vs. returning a slice of names:

* There's no need to allocate a slice for the results, so calls with
  large (or negative) 'size' values can be made without consuming
  large amounts of memory.

* The name collection and processing can happen concurrently, so:
  * We don't waste cycles collecting names we won't use.
  * Slow collection can happen in the background if/when the consumer
    is blocked on something else.

The benefit of using callbacks vs. returning name and error channels
(as discussed in [1]) is more of a trade-off.  Stephen Day [2] and JT
Olds [3] don't like channel's internal locks.  Dave Cheney doesn't
have a problem with them [4].  Which approach is more efficient for a
given situation depends on how expensive it is for the engine to find
the next key and how expensive it is to act on a returned name.  If
both are expensive, you want goroutines in there somewhere to get
concurrent execution, and channels will help those goroutines
communicate.  When either action is fast (or both are fast), channels
are unnecessary overhead.  By using a callback in the interface, we
avoid baking in the overhead.  Folks who want concurrent execution can
initialize their own channel, launch List in a goroutine, and use the
callback to inject names into their channel.

In a subsequent commit, I'll replace the image/walker.go functionality
with this new API.

I'd prefer casLayout for the imported package, but Stephen doesn't
want camelCase for package names [5].

[1]: https://blog.golang.org/pipelines
[2]: opencontainers/image-spec#159 (comment)
[3]: http://www.jtolds.com/writing/2016/03/go-channels-are-bad-and-you-should-feel-bad/
[4]: https://groups.google.com/d/msg/golang-nuts/LM648yrPpck/idyupwodAwAJ
     Subject: Re: [go-nuts] Re: "Go channels are bad and you should feel bad"
     Date: Wed, 2 Mar 2016 16:04:13 -0800 (PST)
     Message-Id: <c8e4433a-53c0-4ee6-9dc5-98f62eea06d2@googlegroups.com>
[5]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
wking added a commit to wking/image-tools that referenced this pull request Nov 18, 2016
The validation/unpacking code doesn't really care what the reference
and CAS implemenations are.  And the new generic interfaces in
image/refs and image/cas will scale better as we add new backends than
the walker interface.  This replaces the simpler interface from
image/reader.go with something more robust.

The old tar/directory distinction between image and imageLayout is
gone.  The new CAS/refs engines don't support directory backends yet
(I plan on adding them once the engine framework lands), but the new
framework will handle tar/directory/... detection inside
layout.NewEngine (and possibly inside a new (cas|refs).NewEngine when
we grow engine types that aren't based on image-layout).

Also replace the old methods like:

  func (d *descriptor) validateContent(r io.Reader) error

with functions like:

  validateContent(ctx context.Context, descriptor *specs.Descriptor, r io.Reader) error

to avoid local types that duplicate the image-spec types.  This saves
an extra instantiation for folks who want to validate (or whatever) a
specs.Descriptor they have obtained elsewhere.

I'd prefer casLayout and refsLayout for the imported packages, but
Stephen doesn't want camelCase for package names [1].

[1]: opencontainers/image-spec#159 (comment)

Signed-off-by: W. Trevor King <wking@tremily.us>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants