image/cas: Implement Engine.Put
This is a bit awkward.  For writing a tar entry, we need to know both
the name and size of the file ahead of time.  The implementation in
this commit accomplishes that by reading the Put content into a
buffer, hashing and sizing the buffer, and then calling
WriteTarEntryByName to create the entry.  With a filesystem-backed CAS
engine, we could avoid the buffer by writing the file to a temporary
location with rolling hash and size tracking and then renaming the
temporary file to the appropriate path.

WriteTarEntryByName itself has awkward buffering to avoid dropping
anything onto disk.  It reads through its current file and writes the
new tar into a buffer, and then writes that buffer back over its
current file.  There are a few issues with this:

* It's a lot more work than you need if you're just appending a new
  entry to the end of the tarball.  But writing the whole file into a
  buffer means we don't have to worry about the trailing blocks that
  mark the end of the tarball; that's all handled transparently for us
  by the Go implementation.  And this implementation doesn't have to
  be performant (folks should not be using tarballs to back
  write-heavy engines).

* It could leave you with a corrupted tarball if the caller dies
  mid-overwrite.  Again, I expect folks will only ever write to a
  tarball when building a tarball for publishing.  If the caller dies,
  you can just start over.  Folks looking for a more reliable
  implementation should use a filesystem-backed engine.

* It could leave you with dangling bytes at the end of the tarball.  I
  couldn't find a Go invocation to truncate the file.  Go does have an
  ftruncate(2) wrapper [1], but it doesn't seem to be exposed at the
  io.Reader/io.Writer/... level.  So if you write a shorter file with
  the same name as the original, you may end up with some dangling
  bytes.

cas.Engine.Put protects against excessive writes with a Get guard;
after hashing the new data, Put tries to Get it from the tarball and
only writes a new entry if it can't find an existing entry.  This also
protects the CAS engine from the dangling-bytes issue.

The 0666 file modes and 0777 directory modes rely on the caller's
umask to appropriately limit user/group/other permissions for the
tarball itself and any content extracted to the filesystem from the
tarball.
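
As a small worked example of that reliance, the permission bits that survive are the requested mode with the umask bits cleared.  `effectiveMode` is a hypothetical helper used only to show the arithmetic; the real masking is done by the kernel when files are created.

```go
package main

import (
	"fmt"
	"os"
)

// effectiveMode (hypothetical) mirrors what the kernel does on file
// creation: every bit set in umask is cleared from the requested mode.
func effectiveMode(requested, umask os.FileMode) os.FileMode {
	return requested &^ umask
}

func main() {
	fmt.Printf("%04o\n", effectiveMode(0666, 0022)) // prints "0644"
	fmt.Printf("%04o\n", effectiveMode(0777, 0022)) // prints "0755"
	fmt.Printf("%04o\n", effectiveMode(0666, 0077)) // prints "0600"
}
```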

The trailing slash manipulation (stripping before comparison and
injecting before creation) is based on part of libarchive's
description of old-style archives [2]:

  name
    Pathname, stored as a null-terminated string.  Early tar
    implementations only stored regular files (including hardlinks to
    those files).  One common early convention used a trailing "/"
    character to indicate a directory name, allowing directory
    permissions and owner information to be archived and restored.

and POSIX ustar archives [3]:

  name, prefix
    ... The standard does not require a trailing / character on
    directory names, though most implementations still include this
    for compatibility reasons.
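
A simplified sketch of that manipulation, assuming entry names start with "./" as this commit requires.  `parentDirs` and `missingDirs` are hypothetical helpers that use a set lookup rather than the in-place slice removal in WriteTarEntryByName.

```go
package main

import (
	"fmt"
	"strings"
)

// parentDirs lists the directories that must exist for name, without
// trailing slashes.  For "./blobs/sha256/abc" it returns "./blobs" and
// "./blobs/sha256".
func parentDirs(name string) []string {
	components := strings.Split(name, "/")
	var parents []string
	for i := 2; i < len(components); i++ {
		parents = append(parents, strings.Join(components[:i], "/"))
	}
	return parents
}

// missingDirs returns the parent directories of name that have no entry
// in existing.  Trailing slashes are stripped before comparison, so
// "./blobs" and "./blobs/" both count, and injected before creation.
func missingDirs(name string, existing []string) []string {
	seen := make(map[string]bool)
	for _, entry := range existing {
		seen[strings.TrimRight(entry, "/")] = true
	}
	var missing []string
	for _, parent := range parentDirs(name) {
		if !seen[parent] {
			missing = append(missing, parent+"/")
		}
	}
	return missing
}

func main() {
	existing := []string{"./blobs/", "./refs"}
	fmt.Println(missingDirs("./blobs/sha256/abc", existing)) // prints "[./blobs/sha256/]"
}
```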

[1]: https://golang.org/pkg/syscall/#Ftruncate
[2]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#old-style-archive-format
[3]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#posix-ustar-archives

Signed-off-by: W. Trevor King <wking@tremily.us>
wking committed Sep 2, 2016
1 parent c92ec50 commit 469bbb6
Showing 5 changed files with 259 additions and 3 deletions.
1 change: 1 addition & 0 deletions cmd/oci-image-tool/cas.go
@@ -28,6 +28,7 @@ func newCASCmd(stdout io.Writer, stderr *log.Logger) *cobra.Command {
}

cmd.AddCommand(newCASGetCmd(stdout, stderr))
cmd.AddCommand(newCASPutCmd(stdout, stderr))

return cmd
}
90 changes: 90 additions & 0 deletions cmd/oci-image-tool/cas_put.go
@@ -0,0 +1,90 @@
// Copyright 2016 The Linux Foundation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package main

import (
"fmt"
"io"
"log"
"os"

"github.com/opencontainers/image-spec/image/cas/layout"
"github.com/spf13/cobra"
"golang.org/x/net/context"
)

type casPutCmd struct {
stdout io.Writer
stderr *log.Logger
path string
}

func newCASPutCmd(stdout io.Writer, stderr *log.Logger) *cobra.Command {
state := &casPutCmd{
stdout: stdout,
stderr: stderr,
}

return &cobra.Command{
Use: "put PATH",
Short: "Write a blob to the store",
Long: "Read a blob from stdin, write it to the store, and print the digest to stdout.",
Run: state.Run,
}
}

func (state *casPutCmd) Run(cmd *cobra.Command, args []string) {
if len(args) != 1 {
if err := cmd.Usage(); err != nil {
state.stderr.Println(err)
}
os.Exit(1)
}

state.path = args[0]

err := state.run()
if err != nil {
state.stderr.Println(err)
os.Exit(1)
}

os.Exit(0)
}

func (state *casPutCmd) run() (err error) {
ctx := context.Background()

engine, err := layout.NewEngine(ctx, state.path)
if err != nil {
return err
}
defer engine.Close()

digest, err := engine.Put(ctx, os.Stdin)
if err != nil {
return err
}

n, err := fmt.Fprintln(state.stdout, digest)
if err != nil {
return err
}
if n < len(digest) {
return fmt.Errorf("wrote %d of %d bytes", n, len(digest))
}

return nil
}
2 changes: 1 addition & 1 deletion image/cas/layout/main.go
@@ -28,7 +28,7 @@ import (
// NewEngine instantiates an engine with the appropriate backend (tar,
// HTTP, ...).
func NewEngine(ctx context.Context, path string) (engine cas.Engine, err error) {
-file, err := os.Open(path)
+file, err := os.OpenFile(path, os.O_RDWR, 0)
if err != nil {
return nil, err
}
30 changes: 28 additions & 2 deletions image/cas/layout/tar.go
@@ -15,10 +15,14 @@
package layout

import (
"bytes"
"crypto/sha256"
"encoding/hex"
"errors"
"fmt"
"io"
"io/ioutil"
"os"
"strings"

"github.com/opencontainers/image-spec/image/cas"
@@ -47,8 +51,30 @@ func NewTarEngine(ctx context.Context, file ReadWriteSeekCloser) (eng cas.Engine

// Put adds a new blob to the store.
func (engine *TarEngine) Put(ctx context.Context, reader io.Reader) (digest string, err error) {
-// FIXME
-return "", errors.New("TarEngine.Put is not supported yet")
data, err := ioutil.ReadAll(reader)
if err != nil {
return "", err
}

size := int64(len(data))
hash := sha256.Sum256(data)
hexHash := hex.EncodeToString(hash[:])
algorithm := "sha256"
digest = fmt.Sprintf("%s:%s", algorithm, hexHash)

_, err = engine.Get(ctx, digest)
if err == os.ErrNotExist {
targetName := fmt.Sprintf("./blobs/%s/%s", algorithm, hexHash)
reader = bytes.NewReader(data)
err = layout.WriteTarEntryByName(ctx, engine.file, targetName, reader, &size)
if err != nil {
return "", err
}
} else if err != nil {
return "", err
}

return digest, nil
}

// Get returns a reader for retrieving a blob from the store.
139 changes: 139 additions & 0 deletions image/layout/tar.go
@@ -16,11 +16,15 @@ package layout

import (
"archive/tar"
"bytes"
"encoding/json"
"errors"
"fmt"
"io"
"io/ioutil"
"os"
"strings"
"time"

"github.com/opencontainers/image-spec/specs-go"
"golang.org/x/net/context"
@@ -57,6 +61,141 @@ func TarEntryByName(ctx context.Context, reader io.ReadSeeker, name string) (hea
}
}

// WriteTarEntryByName reads content from reader into an entry at name
// in the tarball at file, replacing a previous entry with that name
// (if any). The current implementation avoids writing a temporary
// file to disk, but risks leaving a corrupted tarball if the program
// crashes mid-write.
//
// To add an entry to a tarball (with Go's interface) you need to know
// the size ahead of time. If you set the size argument,
// WriteTarEntryByName will use that size in the entry header (and
// Go's implementation will check to make sure it matches the length
// of content read from reader). If unset, WriteTarEntryByName will
// copy reader into a local buffer, measure its size, and then write
// the entry header and content.
func WriteTarEntryByName(ctx context.Context, file io.ReadWriteSeeker, name string, reader io.Reader, size *int64) (err error) {
var buffer bytes.Buffer
tarWriter := tar.NewWriter(&buffer)

components := strings.Split(name, "/")
if components[0] != "." {
return fmt.Errorf("tar name entry does not start with './': %q", name)
}

var parents []string
for i := 2; i < len(components); i++ {
parents = append(parents, strings.Join(components[:i], "/"))
}

_, err = file.Seek(0, os.SEEK_SET)
if err != nil {
return err
}

tarReader := tar.NewReader(file)
found := false
for {
select {
case <-ctx.Done():
return ctx.Err()
default:
}

var header *tar.Header
header, err = tarReader.Next()
if err == io.EOF {
break
} else if err != nil {
return err
}

dirName := strings.TrimRight(header.Name, "/")
for i, parent := range parents {
if dirName == parent {
parents = append(parents[:i], parents[i+1:]...)
break
}
}

if header.Name == name {
found = true
err = writeTarEntry(ctx, tarWriter, name, reader, size)
} else {
err = tarWriter.WriteHeader(header)
if err != nil {
return err
}
_, err = io.Copy(tarWriter, tarReader)
}
if err != nil {
return err
}
}

if !found {
now := time.Now()
for _, parent := range parents {
header := &tar.Header{
Name: parent + "/",
Mode: 0777,
ModTime: now,
Typeflag: tar.TypeDir,
}
err = tarWriter.WriteHeader(header)
if err != nil {
return err
}
}
err = writeTarEntry(ctx, tarWriter, name, reader, size)
if err != nil {
return err
}
}

err = tarWriter.Close()
if err != nil {
return err
}

_, err = file.Seek(0, os.SEEK_SET)
if err != nil {
return err
}
// FIXME: truncate file

_, err = buffer.WriteTo(file)
return err
}

func writeTarEntry(ctx context.Context, writer *tar.Writer, name string, reader io.Reader, size *int64) (err error) {
if size == nil {
var data []byte
data, err = ioutil.ReadAll(reader)
if err != nil {
return err
}
reader = bytes.NewReader(data)
_size := int64(len(data))
size = &_size
}
now := time.Now()
header := &tar.Header{
Name: name,
Mode: 0666,
Size: *size,
ModTime: now,
Typeflag: tar.TypeReg,
}
err = writer.WriteHeader(header)
if err != nil {
return err
}

_, err = io.Copy(writer, reader)
return err
}

// CheckTarVersion walks a tarball pointed to by reader and returns an
// error if oci-layout is missing or has unrecognized content.
func CheckTarVersion(ctx context.Context, reader io.ReadSeeker) (err error) {