Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement: import filtering #543

Open
mbolivar-nordic opened this issue Sep 29, 2021 · 16 comments
Open

Enhancement: import filtering #543

mbolivar-nordic opened this issue Sep 29, 2021 · 16 comments
Labels
Partial imports Incomplete or changing imports are much more complicated than you think

Comments

@mbolivar-nordic
Copy link
Contributor

mbolivar-nordic commented Sep 29, 2021

As of west 0.11.x, you cannot combine groups with import in a single project definition. The documentation says:

As a restriction, no project may use both import: and groups:. (This avoids some edge cases whose semantics are difficult to specify.)

If you try, you'll hit the error handling block that begins here:

if imp and groups:

This issue describes:

  • a motivating use case for relaxing this restriction
  • a rationale for why I introduced the restriction in the first place
  • a proposal for how to remove the restriction and deal with the resulting difficulties
  • a chicken bit I would put in place to save us in case the whole idea is a great big mistake

Use case

Consider a CI maintainer for a downstream Zephyr-based software distribution that provides additional platform support on top of the base OS.

As that CI maintainer, I (the hypothetical CI maintainer, not necessarily @mbolivar-nordic) would like to associate a set of test repositories with the manifest repository for my Zephyr distribution. I would like to do this by including some repositories containing test cases in my west manifest.

This will allow me to conveniently version the test repository revisions alongside the revisions of the code being tested. That allows me to maintain different branches of the same test case repositories to exercise different branches of the accompanying downstream distribution. Furthermore, putting my test repositories into west allows my platform developers to more easily reproduce results such as build errors in my test applications by running west update to get the test code.

I would like to adapt the continuous integration overrides pattern from the west documentation for my use case. This will also allow me to exercise changes that have to go into my test and platform repositories at the same time, which may occur for example if a Zephyr API used by both types of repository changes. I would like to adapt the documented pattern as follows.

First, I do not want developers to have to clone these test repositories by default into the workspace for several potential reasons:

  • they are not relevant to day-to-day development
  • they contain proprietary information
  • they contain large amounts of binary test data or other artifacts

I therefore would like the test repositories to be inactive projects by default.

Second, I do not want the names or locations of the test repositories to be made public as they are also considered proprietary. (This is not incompatible with an open source platform, e.g. sqlite is another OSS project that has a proprietary test suite.)

I therefore would like to organize my manifests like this:

# my-manifest-repo/west.yml
manifest:
  group-filter: [-proprietary-tests]
  projects:
    # ... import zephyr + add my special sauce platform/application repositories
  self:
    import: submanifests

# my-manifest-repo/submanifests/99-tests.yaml
manifest:
  projects:
    - name: my-main-test-repo
      url: https://git.mycompany.com/my-main-test-repo
      revision: SOME-SHA
      groups: [proprietary-tests]
      import: true

# my-main-test-repo/west.yml
manifest:
  # ...
  projects:
    - name: my-test-repo-1
      groups: [proprietary-tests]
    - name: my-test-repo-2
      groups: [proprietary-tests]
    - name: my-test-repo-3
      groups: [proprietary-tests]
    # ...
    - name: my-test-repo-N
      groups: [proprietary-tests]

This would accomplish my goals:

  1. I can use different revision fields for my-main-test-repo in different release branches of my platform to keep parallel test branches alive at the same time
  2. I can dynamically generate my-manifest-repo/submanifests/00-test-overrides.yml for individual pull requests to selectively override any test repository or platform repository, allowing me to test workspaces where changes to test and platform repositories must be exercised in concert
  3. My developers and users can still run west update using my-manifest-repo without getting the tests
  4. The names and locations of the test case repositories themselves are not public because my-main-test-repo has a generic name and is access controlled
  5. My developers can enable the proprietary-tests group and run west update to get an environment that matches my CI, in order to reproduce errors that require test repositories to build and run

However, I can't do this because west won't let me do this snippet which combines groups and import:

      groups: [proprietary-tests]
      import: true

This is a showstopper for me.

Rationale for the restriction

For context, I (@mbolivar-nordic this time) was the main designer and the implementer for both the import: and groups: features; import: came first.

When working on groups:, I found that specifying the semantics for the combination of the two ran into some edge cases that I found difficult to specify without a good motivating use case that would help guide the design. I therefore decided to simply forbid it for convenience, and wait to see if anything would come along that would light the way forward.

Here's an example I cooked up that shows the seemingly paradoxical results that can arise from naive interpretations of what to do:

# manifest-repo/west.yml
manifest:
  projects:
    - name: submanifest-repo
      groups: [foo]
      import: true

# submanifest-repo/west.yml
manifest:
  group-filter: [-foo]
  projects:
    - name: proj1
    - name: proj2

I might say "submanifest-repo is obviously active because foo is enabled in manifest-repo/west.yml." But then if west imports submanifest-repo/west.yml, I end up with foo disabled in the resolved manifest, because group-filter values from imported manifests affect the resolved manifest. This makes submanifest-repo inactive, because its only group is disabled.

A paradox:

  • if submanifest-repo is considered active and we import from it, then it's inactive, for reasons described above
  • but if it's considered inactive, then we would never import the [-foo] group filter, so then... it's active, actually?

What about proj1 and proj2 then? Should they be "deleted" from the resolved manifest somehow, or are they still OK to include as active projects in situations like this?

Proposal for removing the restriction

I think the above use case is a valid one and I would like to propose that west enables it. However, we have to figure out how to resolve the above difficulties.

I think we should go ahead and perform the import:, even if the project has groups:, as long as the available information indicates that the project is active. So in the example from the above rationale, we should import submanifest-repo unless the workspace's manifest.group-filter configuration option has disabled group foo, because manifest-repo/west.yml on its own would treat submanifest-repo as an active project since not all its groups are disabled by default.

Then, after resolving the complete manifest, we should do a postprocessing check that all projects with an import: remain active projects after all the imports are done. (A project with no groups is active by definition, which is why forbidding the combination of the two sidestepped these difficulties.) If so, the resulting resolved manifest seems to be internally consistent to me. If not, we skip the paradox by throwing an exception from the west.manifest.Manifest constructor, preventing users from constructing this type of self-contradictory (even Rusellian) Manifest object.

This behavior would handle the main use case just fine, since the CI maintainer is not going to do something strange like disable the proprietary-tests group from my-main-test-repo/west.yml. So the workspace works as intended with and without that group enabled: if it's enabled, the test projects are active. If it's disabled (the default), they are inactive, and that includes my-main-test-repo itself. Users and developers get what they want by default (i.e. no test repositories), but it's easy to make test repositories into active projects by enabling a single group.

The above error handling might force west update to clone some additional projects that it would ordinarily not have in order to compute the resolved manifest, especially in the case of multiple levels of imports. That in turn could lead to permissions errors if the repositories are access controlled. So be it.

Chicken bit

The above proposal is a bit risky since we only have one use case, so I think this should be an experimental behavior we can rip out later. Furthermore, I think it should require explicit "opt-in" support. So I am also proposing a new boolean manifest.experimental-group-import configuration option that must be enabled to turn on this new behavior.

Therefore the CI environment would have to do something like this in order to opt in:

west config --global manifest.experimental-group-import true
west init -m my-manifest-repo-url my-workspace
cd my-workspace
west config manifest.group-filter +proprietary-tests
west update

If manifest.experimental-group-import is disabled, the restriction remains in place and west update fails with the existing error message. Otherwise we proceed as above.

The maintainers of CI environments have many options available for controlling exactly the version of west used for testing and their setup scripts, so this allows us to get some real world experience without fully committing to the above behavior if we run into unanticipated consequences.

@DatGizmo
Copy link

Would a new filter flag for imports be less prone to the confusion you described in the "Rational restricrions"?
So instead of reusing the group filter use a new "import-filter" flag.

@carlescufi
Copy link
Member

Thanks for the detailed explanation, it makes it quite clear why this will be useful. No objections at all from my side to the principle of the feature or the proposed restriction removal.

@mbolivar-nordic
Copy link
Contributor Author

So instead of reusing the group filter use a new "import-filter" flag.

What would the semantics of this be?

@mbolivar-nordic
Copy link
Contributor Author

I updated the rationale section a bit to try to make it clearer following an offline discussion with @marc-hb .

@marc-hb
Copy link
Collaborator

marc-hb commented Oct 1, 2021

So instead of reusing the group filter use a new "import-filter" flag.

I wish this comment had been at the top so I had read it first. I don't know if a new import-filter tag is the solution but I feel like there's at least no possibly shorter way to summarize the problem.

Please let me rephrase and try to summarize too and correct me. After a lot of time getting familiar again with these features I came to the same conclusion: I think what is desired here could be called "import filtering". Correct?

Up to now filtering has allowed the distribution of manifests that include private ("opt-in") repos, however the current implementation still forces the disclosure of the location and name of ALL repos; active or not. Hence the desire to filter not just "leaf" repos but higher level imports too.

HOWEVER the current filter resolution is based on a complete view of the entire, resolved manifest that includes ALL imports! So no surprise a paradox arises!

So the proposal in the description is to add a new, first and dynamic import filtering pass performed at import time. Then keep the older and now second repo filtering pass working after imports as before with just some additional sanity checks added. This is effectively overloading the group-filter syntax with two closely interrelated but different features, so I'm not surprised by the suggestion of a new import-filter tag.

@marc-hb
Copy link
Collaborator

marc-hb commented Oct 1, 2021

This is effectively overloading the group-filter syntax with two closely interrelated but different features

The current proposal seems to gloss over the fact that a "project" combines two very different things in one:

  1. a git repo
  2. a symbolic link to other manifest(s), a.k.a an import

The current group-filter feature is very well defined for git repos and the proposal wants to extend it to imports but the big problem is that 1. and 2. are very different. A git repo is obviously very different from a symbolic link to other manifest(s) but that's not the main issue; the main issue is that filtering git repos requires by design since #481 (which I didn't like...) to import all manifests first, so no surprise it clashes with filtering manifests! Despite the lack of detail, I strongly suspect that's where the import-filter suggestion came from: the desire to separate these different types of filters.

I also think these two types of filters should be better separated to reduce confusion enough and make documentation possible. But I'd like to suggest a different angle. Instead of sharing groups between repos and symbolic links and having two sorts of filters (I'm merely guessing about import-filter here), how about sharing filters and have two types of groups instead ? Groups of git repos on one hand (no change here) and groups of import on the other hand. import: already has many filtering possibilities: just import filtering based on groups.

Unlike filtering repo groups, filtering import groups is distinct and "dynamic" meaning it does NOT require importing everything first which would obviously defeat its own purpose. It obviously happens before repo filtering. This should remove any paradox. For instance:

# manifest-repo/west.yml
manifest:
  projects:
    - name: submanifest-repo
      groups: [foo]
      import:
        - groups: [foo]       # NEW !
        - ...

# submanifest-repo/west.yml
manifest:
  group-filter: [-foo]
  projects:
    - name: proj1
    - name: proj2

With west config manifest.group-filter -- +foo, the import is performed. Then, the second repo filtering pass is performed unchanged and every repo is active.

With west config manifest.group-filter -- -foo, the import is blocked and submanifest-repo/west.yml is never seen by west. submanifest-repo.git is also inactive because it is in group foo too.

With west config -D manifest.group-filter, the import is performed. Then, the second repo filtering pass is performed unchanged and every repo is inactive because of -foo.

No paradox because git repos and imports don't share groups.

@DatGizmo
Copy link

DatGizmo commented Oct 1, 2021

marc-hb more or less already explained what I meant, but will do so as well, as I'm looking at this more from the user / CI side.

Variant A:

# manifest-repo/west.yml
manifest:
  projects:
    [ ... ]
  self:
    import: submanifests

# submanifests/test_repo.yml
manifest:
  import-filter: [-test_repo, -test_repo2]
  projects:
    - name: test-repo
      [ ... ]
      import-group: test_repo 
    - name: test-repo2
      [ ... ]
      import-group: test_repo2

Here the import would be done on a project level.
But I feel this is too close to the group setup and imho to low level.

Variant B:

# manifest-repo/west.yml
manifest:
  projects:
    [ ... ]
  self:
    import: submanifests

# submanifests/test_repo.yml
manifest:
  import-group: test_repo
  import-filter: [-test_repo]
  projects:
    [ ... ]

In this variant all the information is given by the yml file in the submanifests folder, which means that individual files in the submanifests folder can be selected.

Variant C:

# manifest-repo/west.yml
manifest:
  import-filter: [-test_repos]
  projects:
    [ ... ]
  self:
    import: 
      path: submanifests
      import-group: test_repos

In this setup you can select if the whole submanifests folder should be imported.

The selection via cmd-line would be similar to the group-filter

west update --import-filter=+test_repos

@mbolivar-nordic
Copy link
Contributor Author

Thanks for great feedback! I agree with the general idea that we are dealing with two different things and it is probably worth writing them down separately in the manifest file.

With west config -D manifest.group-filter, the import is performed.

@marc-hb Unfortunately, I don't think this addresses the motivating use case. Relevant snippet from the description:

    - name: my-main-test-repo
      url: https://git.mycompany.com/my-main-test-repo
      revision: SOME-SHA
      groups: [proprietary-tests]
      import: true

We cannot perform the import with an empty/missing group filter (i.e. the default group filter for a fresh workspace) because git.mycompany.com is a private server. That breaks all the external users who will never have access to this server.

Variant A:

@DatGizmo I agree with your assessment that we should not go with this one.

Variant B:

What I find interesting about this variant is that it seems to allow each submanifest to declare the groups it would like to be in. But I think this may have a similar problem around access control. The manifest-repo/west.yml file here says:

  self:
    import: submanifests

So 'submanifests' is a directory in the same repository and we definitely have access to it.

But if manifest-repo/west.yml had instead said:

manifest:
  projects:
    - name: my-private-repository
      url: https://git.mycompany.com/my-private-repository
      import: true

then this variant would seem not to work for similar reasons as @marc-hb 's.

Variant C:

I think this is the best proposal so far, because it seems to allow each manifest to put imports into groups, and also allows it to define a filter on what imports should be performed, by group. I wonder how it would work across imports, though. Like if three different manifest files each define the same import group, are these shared? Just trying to mentally work through some more details.

@mbolivar-nordic mbolivar-nordic changed the title Enhancement: allowing 'groups:' + 'import:' Enhancement: import filtering Oct 1, 2021
@gregshue
Copy link

gregshue commented Oct 4, 2021

if three different manifest files each define the same import group, are these shared?

This is just another example of a global namespace that must be managed across the entire extended Zephyr ecosystem (along with Kconfig symbols, C preprocessor symbols, include pathnames, linker symbols, shell commands, settings IDs, testcase tags, ...).

@il-steffen
Copy link

I think I have a similar use case and wanted to see if there are any known fixes/workarounds.

I have a project that uses west and uses Zephyr RTOS as an optional example use case. To enable the Zephyr example, the best option seems to 'import' the zephyrproject. But this forces me to always include Zephyr, even for users that do not know and understand Zephyr. My current solution is to optionally add the zephyr import via a local submanifest folder.

Another complication is that west import always seems to look at the declared git revision and ignores any local modifications. This makes it hard to customize imported manifests, and to test changes to a manifest.

@marc-hb
Copy link
Collaborator

marc-hb commented Feb 8, 2022

But this forces me to always include Zephyr, even for users that do not know and understand Zephyr.

Note it's never required to download/clone everything in the manifest. On that topic see

Another complication is that west import always seems to look at the declared git revision and ignores any local modifications.

Yes this is documented here:
https://docs.zephyrproject.org/latest/guides/west/manifest.html#example-1-2-rolling-release-zephyr-downstream

It’s also important to understand that west ignores your working tree’s zephyr/west.yml entirely when resolving imports.

I believe (@mbolivar-nordic ?) this is to keep a reasonable level of complexity. Here's a somewhat related example that shows how complexity can spiral out of control really fast:

@il-steffen
Copy link

il-steffen commented Feb 14, 2022

Note it's never required to download/clone everything in the manifest. On that topic see #519

Could you elaborate? I see a proposal and discussion here but no implementation and nothing in west update --help. (edit: to clarify - I know about groups and "west update $project" option, but my question here is specifically on projects that have an import line. In this case west complains that I cannot use group and import commands on same project.)

Another complication is that west import always seems to look at the declared git revision and ignores any local modifications.

Yes this is documented here: https://docs.zephyrproject.org/latest/guides/west/manifest.html#example-1-2-rolling-release-zephyr-downstream

I believe (@mbolivar-nordic ?) this is to keep a reasonable level of complexity. Here's a somewhat related example that shows how complexity can spiral out of control really fast: #548 (comment)

I don't see how this would get more complex when taking the local worktree status. I would argue it is a lot less surprising if results actually match the files I see and modified..

@marc-hb
Copy link
Collaborator

marc-hb commented Feb 14, 2022

Could you elaborate?

It's very simple: never ever use west update. Instead, always use: west update only what I need

@marc-hb
Copy link
Collaborator

marc-hb commented Feb 14, 2022

I don't see how this would get more complex when taking the local worktree status. I would argue it is a lot less surprising if results actually match the files I see and modified..

This is not about the least surprise for the user. It's about the complexity of the internal implementation meaning bugs, corner cases and maybe... even more surprising behaviors in the end.

I think @mbolivar-nordic explained this once somewhere but I couldn't find it sorry.

@marc-hb
Copy link
Collaborator

marc-hb commented May 3, 2023

@marc-hb Unfortunately, I don't think this addresses the motivating use case. Relevant snippet from the description:

That's a different example. Your example has [-proprietary-tests] at the top, mine had not.

@marc-hb
Copy link
Collaborator

marc-hb commented May 3, 2023

I forgot to say: @mbolivar-nordic submitted

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Partial imports Incomplete or changing imports are much more complicated than you think
Projects
None yet
Development

No branches or pull requests

6 participants