Skip to content
This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

Parameterization: OS/Arch/build tags #291

Closed
sdboyer opened this issue Mar 6, 2017 · 8 comments
Closed

Parameterization: OS/Arch/build tags #291

sdboyer opened this issue Mar 6, 2017 · 8 comments

Comments

@sdboyer
Copy link
Member

sdboyer commented Mar 6, 2017

To this point, dep has completely sidestepped the question of build tags. This is a big thing we need to deal with, and may affect the structure of lock and manifest files.

The basic issue: whereas the toolchain today operates within a concrete build environment (for any given run, there is one GOOS, one GOARCH, and a finite set of applied build tags), dep produces more broadly useful results when it computes not just the current environment, but across potential build environments - ideally, across all of them. In essence, the goal is that even if you're running dep on darwin/amd64, the lock file it produces should have sufficient information to be able to construct a vendor tree for linux/386, or freebsd/sparc64, or whatever (assuming they're any different).

Compare these examples - first, a lock.json now indicating three subpackages of a project are used:

        {
            "name": "golang.org/x/sys",
            "branch": "master",
            "revision": "e48874b42435b4347fc52bdee0424a52abc974d7",
            "packages": ["unix"]
        }

And one where the requirement for those subpackages is parameterized:

        {
            "name": "golang.org/x/sys",
            "branch": "master",
            "revision": "e48874b42435b4347fc52bdee0424a52abc974d7",
            "parameterized-packages": [
                {
                    "os": "solaris",
                    "packages": ["unix"]
                }
            ]
        }

With this additional information, it's immediately evident that the source code in golang.org/x/sys/unix - a rather large repository - isn't needed unless we need to build for solaris. Without this information, the tool would have to redo static analysis of the code in order to figure it out, which is just wasteful. Also, note that this example is not only real, but common; it results from relying on the very popular github.com/sirupsen/logrus - project.

Taking this approach means cross-compiling is possible by default. That's already something the go toolchain aims for, which strongly suggests that we're on the right track, and dep's strong default behavior should be to create portable lock files.

Now, if you/your project isn't cross-compiling or testing different os/arch combinations itself, then lock parameterization might seem a bit misleading - how do we know a lock's solution works on a different concrete build environment? Well, we don't, of course, but that's the wrong way to think about it; we don't know that a different os/arch would work today, either, and if a user on a different os/arch has a build that doesn't work, it's still strictly better if that build is reproducible than not.


Achieving lock parameterization first requires that gps start supporting a notion of it in the first place. There's a solid chunk of work to do on this:

  • The solver has to keep track of whether a given package is globally included, or only in a certain parameterized context - Keep tags on selected deps so we know the context of their inclusion sdboyer/gps#44
  • The parser in gps.ListPackages() currently lumps all imports together in the returned PackageTree's PackageOrErrs, regardless of build tag context. It needs to be refactored to return a parameterized view of a given directory's go code (no issue created yet)
  • Following on the preceding, PackageTree.ToReachMap() also needs to parameterize its returned information (no issue created yet)
  • "A parameterized view" sounds nice, but it could actually be tricky to create a concise representation of a directory when accounting for negated build tags (e.g. // +build !debug, because negative-matching against a non-finite set (build tags) is nasty. We may just have to preserve such negations directly in the lock.

Once the basic support for this is in place, we can also think about the ways that project authors can indicate that they want to ignore a particular os/arch/build tag set, or (an even stronger statement) that they do not work with a certain tag set.

@kardianos
Copy link

This "feature" appears to be contradicting someone wanting to check-in the vendor folder and them wanting the vendor folder to be useful across platforms.

Is that correct? Or am I misinterpreting this?

@sdboyer
Copy link
Member Author

sdboyer commented Mar 7, 2017

@kardianos ach, if that's your takeaway, then i probably need to wordsmith some more - sorry, this is a complex thing, and my first pass at a summary is rambly and probably needs more examples.

The baseline behavior here is what you describe - vendor/ gets populated with a dependency set that's complete for all os/arch. This feature is primarily about putting enough metadata into the lock that users could choose to ignore certain combinations of os/arch when deciding what to put into vendor. But the default behavior doesn't change: a plain dep ensure will populate vendor/ with the complete dep set entailed by all tag combos.

@kardianos
Copy link

@sdboyer Okay, thanks for the clarification.

To repeat what you said for clarification, this is a proposal to project the build tags from the go files (transversed from the local project go source outwards to leaf dependencies, noting each build tag set as they are encountered) into the lock file. That way if you are fetching sources on say a CI builder you can say "dep ensure -specific-os-arch" (or equiv) and it will not require incrementally inspecting the files during fetch and not fetch too much. Does that sound about right?

@sdboyer
Copy link
Member Author

sdboyer commented Mar 7, 2017

@kardianos eeeexactly. For those not committing vendor/, keeping your CI from downloading more deps than it needs to is a big plus.

Another use case would be, say, annotating visualizations with this information - e.g., #271

@kardianos
Copy link

@sdboyer Okay, thanks for the clarification. :) That does make sense in those use cases.

@nightlyone
Copy link

Please mind things like appengine here, too.

Google Appengine projects that share code with non-appengine projects are notoriously difficult to vendor. This includes having code available in the vendored path for build tags //+build appengine and also //+build !appengine along with their dependency chains excluding all the import path hierarchies "appengine" and "appengine_internal".

@sdboyer
Copy link
Member Author

sdboyer commented Mar 8, 2017

@nightlyone There's some special handling in place for appengine already, but this is a good point, and does remind me...

We need to compile a list of build tag that have some clear, established meaning.

@sdboyer
Copy link
Member Author

sdboyer commented Apr 29, 2017

I meant to update this a while back, but didn't have a chance. Catching up.

I'm now pretty sure that we can bump this until later, as it will be possible to introduce this information into the manifest and lock in a manner that's backwards-compatible. At the very least, it means this isn't necessary for the purposes of stabilizing the files.

The OP describes the mechanism for doing so with the lock - we can retain the existing packages key for the default/unconditional case, then add in a new subsection to record packages whose inclusion is somehow conditional. From a file format perspective, at least, this is an accretion-only change, so the only thing we'd have to worry about is if folks end up writing other, random data to those keys later on. I plan to open up another issue about that problem more generally, though.

I'm less clear on exactly what the manifest changes will look like, but am still reasonably sure that any necessary changes can be restricted to accretion-only changes.

Bottom line, I think this is a can we can kick safely down the road a bit.

wking added a commit to wking/openshift-installer that referenced this issue Feb 21, 2019
And zounds of other dependencies (Prometheus!).  We aren't actually
using most of these, but dep does not currently support pruning by
build tag [1], and a few of the packages slop some unrelated stuff
together.  For example, Prometheus comes in via:

  github.com/containers/image/docker
    github.com/docker/distribution/registry/client
      github.com/docker/distribution/registry/storage/cache
        github.com/docker/distribution/metrics
          github.com/docker/go-metrics
            github.com/prometheus/client_golang/prometheus

and BoltDB comes in via:

  github.com/containers/image/pkg/blobinfocache
    github.com/boltdb/bolt

we don't use either the storage backend or the BoltDB blob-info cache,
but dep isn't checking at that level of granularity.  Ideally the
upstream repositories would restructure to split these out into
separate packages, but until then, just commit all the cruft dep
brings in.

Generated with:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0-31-g73b3afe
   build date  : 2019-02-08
   git hash    : 73b3afe
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false

[1]: golang/dep#291
wking added a commit to wking/openshift-installer that referenced this issue Mar 8, 2019
And zounds of other dependencies (Prometheus!).  We aren't actually
using most of these, but dep does not currently support pruning by
build tag [1], and a few of the packages slop some unrelated stuff
together.  For example, Prometheus comes in via:

  github.com/containers/image/docker
    github.com/docker/distribution/registry/client
      github.com/docker/distribution/registry/storage/cache
        github.com/docker/distribution/metrics
          github.com/docker/go-metrics
            github.com/prometheus/client_golang/prometheus

and BoltDB comes in via:

  github.com/containers/image/pkg/blobinfocache
    github.com/boltdb/bolt

we don't use either the storage backend or the BoltDB blob-info cache,
but dep isn't checking at that level of granularity.  Ideally the
upstream repositories would restructure to split these out into
separate packages, but until then, just commit all the cruft dep
brings in.

Generated with:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0-31-g73b3afe
   build date  : 2019-02-08
   git hash    : 73b3afe
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false

[1]: golang/dep#291
wking added a commit to wking/openshift-installer that referenced this issue Mar 8, 2019
And zounds of other dependencies (Prometheus!).  We aren't actually
using most of these, but dep does not currently support pruning by
build tag [1], and a few of the packages slop some unrelated stuff
together.  For example, Prometheus comes in via:

  github.com/containers/image/docker
    github.com/docker/distribution/registry/client
      github.com/docker/distribution/registry/storage/cache
        github.com/docker/distribution/metrics
          github.com/docker/go-metrics
            github.com/prometheus/client_golang/prometheus

and BoltDB comes in via:

  github.com/containers/image/pkg/blobinfocache
    github.com/boltdb/bolt

we don't use either the storage backend or the BoltDB blob-info cache,
but dep isn't checking at that level of granularity.  Ideally the
upstream repositories would restructure to split these out into
separate packages, but until then, just commit all the cruft dep
brings in.

Generated with:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0-31-g73b3afe
   build date  : 2019-02-08
   git hash    : 73b3afe
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false

[1]: golang/dep#291
wking added a commit to wking/openshift-installer that referenced this issue Mar 8, 2019
And zounds of other dependencies (Prometheus!).  We aren't actually
using most of these, but dep does not currently support pruning by
build tag [1], and a few of the packages slop some unrelated stuff
together.  For example, Prometheus comes in via:

  github.com/containers/image/docker
    github.com/docker/distribution/registry/client
      github.com/docker/distribution/registry/storage/cache
        github.com/docker/distribution/metrics
          github.com/docker/go-metrics
            github.com/prometheus/client_golang/prometheus

and BoltDB comes in via:

  github.com/containers/image/pkg/blobinfocache
    github.com/boltdb/bolt

we don't use either the storage backend or the BoltDB blob-info cache,
but dep isn't checking at that level of granularity.  Ideally the
upstream repositories would restructure to split these out into
separate packages, but until then, just commit all the cruft dep
brings in.

Generated with:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0-31-g73b3afe
   build date  : 2019-02-08
   git hash    : 73b3afe
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false

[1]: golang/dep#291
wking added a commit to wking/openshift-installer that referenced this issue Mar 9, 2019
And zounds of other dependencies (Prometheus!).  We aren't actually
using most of these, but dep does not currently support pruning by
build tag [1], and a few of the packages slop some unrelated stuff
together.  For example, Prometheus comes in via:

  github.com/containers/image/docker
    github.com/docker/distribution/registry/client
      github.com/docker/distribution/registry/storage/cache
        github.com/docker/distribution/metrics
          github.com/docker/go-metrics
            github.com/prometheus/client_golang/prometheus

and BoltDB comes in via:

  github.com/containers/image/pkg/blobinfocache
    github.com/boltdb/bolt

we don't use either the storage backend or the BoltDB blob-info cache,
but dep isn't checking at that level of granularity.  Ideally the
upstream repositories would restructure to split these out into
separate packages, but until then, just commit all the cruft dep
brings in.

Generated with:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0-31-g73b3afe
   build date  : 2019-02-08
   git hash    : 73b3afe
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false

[1]: golang/dep#291
wking added a commit to wking/openshift-installer that referenced this issue Mar 10, 2019
And zounds of other dependencies.  Use Gopkg.toml's 'ignored' [1] to
exclude, BoltDB, which comes in via:

  github.com/containers/image/pkg/blobinfocache
    github.com/boltdb/bolt

and github.com/docker/distribution/registry/storage/cache, which comes
in via:

  github.com/containers/image/docker
    github.com/docker/distribution/registry/client
      github.com/docker/distribution/registry/storage/cache

because we don't use either the storage backend or the BoltDB
blob-info cache, but dep isn't checking at that level of granularity.
Runc was being pulled in from a few places for runc/libcontainer/user,
but we apparently don't need that either.  Ideally the upstream
repositories would restructure to split these out into separate
packages (and/or dep will grow support for pruning by build tag [2]),
but until then, we can manually prune via 'ignored'.

Also ignore mtrmac/gpgme, because we only need signature verification
and we don't want to use CGO (because we want to be portable to other
operating systems, and we only need verification support, not signing
support [3]).

There may be more fat we can prune as well, but I got tired of looking
and gave up, even though ~50k lines of new code is pretty embarassing
for what is effectively just a handful of HTTPS requests and an
OpenPGP check.

Generated (after manually editing Gopkg.toml) with:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0-31-g73b3afe
   build date  : 2019-02-08
   git hash    : 73b3afe
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false

[1]: https://golang.github.io/dep/docs/Gopkg.toml.html#ignored
[2]: golang/dep#291
[3]: containers/image#268
wking added a commit to wking/openshift-installer that referenced this issue Mar 12, 2019
And zounds of other dependencies.  Use Gopkg.toml's 'ignored' [1] to
exclude, BoltDB, which comes in via:

  github.com/containers/image/pkg/blobinfocache
    github.com/boltdb/bolt

and github.com/docker/distribution/registry/storage/cache, which comes
in via:

  github.com/containers/image/docker
    github.com/docker/distribution/registry/client
      github.com/docker/distribution/registry/storage/cache

because we don't use either the storage backend or the BoltDB
blob-info cache, but dep isn't checking at that level of granularity.
Ideally the upstream repositories would restructure to split these out
into separate packages (and/or dep will grow support for pruning by
build tag [2]), but until then, we can manually prune via 'ignored'.

Also ignore mtrmac/gpgme, because we only need signature verification
and we don't want to use CGO (because we want to be portable to other
operating systems, and we only need verification support, not signing
support [3]).

There may be more fat we can prune as well, but I got tired of looking
and gave up, even though ~50k lines of new code is pretty embarassing
for what is effectively just a handful of HTTPS requests and an
OpenPGP check.

Generated (after manually editing Gopkg.toml) with:

  $ dep ensure

using:

  $ dep version
  dep:
   version     : v0.5.0-31-g73b3afe
   build date  : 2019-02-08
   git hash    : 73b3afe
   go version  : go1.10.3
   go compiler : gc
   platform    : linux/amd64
   features    : ImportDuringSolve=false

[1]: https://golang.github.io/dep/docs/Gopkg.toml.html#ignored
[2]: golang/dep#291
[3]: containers/image#268
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants