Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove randomness in package uniqueness ordering #461

Merged
merged 1 commit into from
Oct 18, 2023

Conversation

m1kola
Copy link
Member

@m1kola m1kola commented Oct 16, 2023

Description

Partly addresses #459. I would still want to keep the issue open so we can get rid of prettyUnsatMessage function.

Reviewer Checklist

  • API Go Documentation
  • Tests: Unit Tests (and E2E Tests, if appropriate)
  • Comprehensive Commit Messages
  • Links to related GitHub Issue(s)

Signed-off-by: Mikalai Radchuk <mradchuk@redhat.com>
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 16, 2023
@m1kola m1kola marked this pull request as ready for review October 16, 2023 10:12
@m1kola m1kola requested a review from a team as a code owner October 16, 2023 10:12
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 16, 2023
@codecov
Copy link

codecov bot commented Oct 16, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (e001a0f) 83.80% compared to head (0f0c312) 83.82%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #461      +/-   ##
==========================================
+ Coverage   83.80%   83.82%   +0.01%     
==========================================
  Files          23       23              
  Lines         846      847       +1     
==========================================
+ Hits          709      710       +1     
  Misses         94       94              
  Partials       43       43              
Flag Coverage Δ
e2e 65.99% <100.00%> (+0.04%) ⬆️
unit 76.61% <100.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...rnal/resolution/variablesources/crd_constraints.go 100.00% <100.00%> (ø)

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@m1kola m1kola mentioned this pull request Oct 16, 2023
4 tasks
@@ -43,7 +44,9 @@ func (g *CRDUniquenessConstraintsVariableSource) GetVariables(ctx context.Contex
// todo(perdasilva): better handle cases where a provided gvk is not found
// not all packages will necessarily export a CRD

pkgToBundleMap := map[string]map[deppy.Identifier]struct{}{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be more straightforward to use map[string]sets.Set[deppy.Identifier]{} and simply append a sorted list from the set. That would give a defined order to the output, whereas now it seems to mirror the order of the variables in the input - is the order of the variables somehow useful or do we just need something deterministic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the order of the variables somehow useful or do we just need something deterministic?

Order of dependencies in bundle variable is important in some places. E.g. when we want to upgrade we sort dependencies in order of prefrence (which is just by versions at the moment).

I believe here specifically is not that important as we enforce uniqueness, but we still want to be deterministic so we can avoid doing this at some point.

It might be more straightforward to use map[string]sets.Set[deppy.Identifier]{} and simply append a sorted list from the set.

It increase time complexity to O(n log n) and space will still be O(n) in both cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space will still be O(n) in both cases.

You can remove bundleIDs and packageOrder variables. Regardless of the memory efficiency, it's just easier to read since there are fewer things in play.

Copy link
Member Author

@m1kola m1kola Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think going into log linear time is not worth the readability improvement.

Also It seems to me that we will still need:

  • A slice for package names so we can to perform ordering on it
  • A slice or a map of slices for bundle ids so we can perform ordering on it
  • A loop or two to convert a map into slices before sorting

Or am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How large is n? I imagine the change in runtime has no practical implication whatsoever.

Or am I missing something?

If you have:

packageOrder := []string{}
pkgToBundles := map[string]sets.Set[deppy.Identifier]{}

// ...
for _, packageName := range packageOrder {
    variables = append(variables, olmvariables.NewBundleUniquenessVariable(varID, pkgToBundles[packageName].List()...))

I believe that would be sufficient, no? In any case, it's a small point and not worth belaboring.

Your transformation seems fine, but adding bundleIDs does make this slightly harder to follow, while allowing for new behavior - previously if a bundle ID existed for more than one package, it would be added to the list of both. Now there's a global uniqueness constraint. Is that intentional? Similarly, by tracking bundleOrder separately (as opposed to extracting the keys of pkgToBundles) there's space for errors if the two are out of sync. If you don't think the duplication of the source of truth there and some of the new edges are worth worrying about, that's totally acceptable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it won't be sufficient since we also need to ensure that we do not have duplicate items in packageOrder which can happen (multiple packages depend on the package name).

Just to verify, the current logic already ensures that packages and bundles are not duplicated in the resolver even if multiple catalogs provide the same information? For example, if I created a Catalog named foo that had an image ref to someimage:foo and a Catalog named bar that had the exact same image ref of someimage:foo - what would happen in resolution if I asked to install a package that exists in the someimage:foo catalog image?

I'm definitely lacking domain knowledge regarding resolution, but I do think it is important to point out that catalogd doesn't perform any validation as to whether or not FBC content is repeated across multiple catalogs. As long as this is accounted for in resolution I think that should be fine though.

Copy link
Member Author

@m1kola m1kola Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to verify, the current logic already ensures that packages and bundles are not duplicated in the resolver even if multiple catalogs provide the same information?

I do think it is important to point out that catalogd doesn't perform any validation as to whether or not FBC content is repeated across multiple catalogs. As long as this is accounted for in resolution I think that should be fine though.

OLMv1 (at least at the moment) allows the same package to come from multiple catalogs. If you have a package foo in catalog1 and in catalog2 it will treat them as the same package and I belive will allow upgrade of of foo from version listed in catalog1 to version listed in catalog2.

However in this specific code we ensure that resolution only allows 1 bundle which belongs to pacakge foo. So even if you install catalog1 and catalog2 which is direct copy - we should be good.

For example, if I created a Catalog named foo that had an image ref to someimage:foo and a Catalog named bar that had the exact same image ref of someimage:foo - what would happen in resolution if I asked to install a package that exists in the someimage:foo catalog image?

Did you mean that someimage:foo is the ref of the catalog image? If yes - we should be good as we only allow one bundle per pacakge in resolution.

However I fear the case when there are pacakges package1 and package2 which have bundles which share the same bundle image. I think this and this piece of code (they work together) might allow upgrading from package1 to package2 and I think we don't want that.

E.g. when catalogs (single catalog or two different catalogs) have something like this:

package1 bundles:

  • image:tag1 - version 1.0.0

package2 bundles:

  • image:tag1 - version 1.0.0
  • image:tag2 - version 1.0.1

In this case, I think, when you install package1 at version 1.0.0 and then upgrade - our installed bundle might end up being the one from package2 since we search bundes by image references. And installed bundle is driving successors.

I looked into how to mitigate this in #442, but it didn't feel like the right solutiion.

I want to get back to this.

P.S: Just noticed that we do not have an isue to track this - I'll create one shortly. Edit: created #466

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the same bundle image.

Is the bundle uniqueness based on the bundle image or the "name" of the bundle? would there be an issue if two separate packages specified a bundle with the same "name" but referenced different images?

Copy link
Contributor

@grokspawn grokspawn Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Late to the party. A couple of comments:

  1. we definitely want to allow (probably forever) the concept that a package may be provided by one or more catalogs, and it would be possible for an admin to 'walk' versions of the package across catalogs. We will eventually need to resolve collisions where the same package/bundle version is provided by multiple catalogs, but not right now.
  2. we absolutely have the assumption that the tree formed by package+bundle+version branching will be unique. It would make sense that sometime we need to resolve any collisions, but doesn't have to be now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the bundle uniqueness based on the bundle image or the "name" of the bundle?

Bundle image issue I mentioned above is the different part of the process (how we determine currently installed bundles for calculating possible successors). Sorry if I added confusion by mentioning unrelated thing: someone you wrote in the previous comment reminded me of this.

Here, where we create global uniqueness constraints for bundles, we are saying that "in our solution, bundle foo must have at most 1 bundle from the list of catalog-foo-bundlename.v1.0.0, catalog-foo-bundlename.v1.0.1, etc`. where things in the list are bundle variable IDs.

This brings us to this:

would there be an issue if two separate packages specified a bundle with the same "name" but referenced different images?

It will not be an issue. For resolution we use variable IDs and they must be unique across whole cluster. When it comes to bundles - currently we construct variable ID based on catalog name, package name and bundle name:

func BundleVariableID(bundle *catalogmetadata.Bundle) deppy.Identifier {
return deppy.Identifier(
fmt.Sprintf("%s-%s-%s", bundle.CatalogName, bundle.Package, bundle.Name),
)
}

In OLMv0 (and until recently in OLMv1) we also had a channel as part of ID, but it is not necessary since we filter out unrelated bundles before we created a variable ID for them bundle and before we give them to the resolver.

@stevekuznetsov
Copy link
Member

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 17, 2023
@dtfranz dtfranz added this pull request to the merge queue Oct 18, 2023
Merged via the queue into operator-framework:main with commit 7020ae2 Oct 18, 2023
12 checks passed
@m1kola m1kola deleted the package_uniqueness_ordering branch October 19, 2023 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants