-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove randomness in package uniqueness ordering #461
Remove randomness in package uniqueness ordering #461
Conversation
Signed-off-by: Mikalai Radchuk <mradchuk@redhat.com>
Codecov ReportAll modified lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #461 +/- ##
==========================================
+ Coverage 83.80% 83.82% +0.01%
==========================================
Files 23 23
Lines 846 847 +1
==========================================
+ Hits 709 710 +1
Misses 94 94
Partials 43 43
Flags with carried forward coverage won't be shown. Click here to find out more.
☔ View full report in Codecov by Sentry. |
@@ -43,7 +44,9 @@ func (g *CRDUniquenessConstraintsVariableSource) GetVariables(ctx context.Contex | |||
// todo(perdasilva): better handle cases where a provided gvk is not found | |||
// not all packages will necessarily export a CRD | |||
|
|||
pkgToBundleMap := map[string]map[deppy.Identifier]struct{}{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be more straightforward to use map[string]sets.Set[deppy.Identifier]{}
and simply append a sorted list from the set. That would give a defined order to the output, whereas now it seems to mirror the order of the variables in the input - is the order of the variables somehow useful or do we just need something deterministic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the order of the variables somehow useful or do we just need something deterministic?
Order of dependencies in bundle variable is important in some places. E.g. when we want to upgrade we sort dependencies in order of prefrence (which is just by versions at the moment).
I believe here specifically is not that important as we enforce uniqueness, but we still want to be deterministic so we can avoid doing this at some point.
It might be more straightforward to use map[string]sets.Set[deppy.Identifier]{} and simply append a sorted list from the set.
It increase time complexity to O(n log n) and space will still be O(n) in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space will still be O(n) in both cases.
You can remove bundleIDs
and packageOrder
variables. Regardless of the memory efficiency, it's just easier to read since there are fewer things in play.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think going into log linear time is not worth the readability improvement.
Also It seems to me that we will still need:
- A slice for package names so we can to perform ordering on it
- A slice or a map of slices for bundle ids so we can perform ordering on it
- A loop or two to convert a map into slices before sorting
Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How large is n
? I imagine the change in runtime has no practical implication whatsoever.
Or am I missing something?
If you have:
packageOrder := []string{}
pkgToBundles := map[string]sets.Set[deppy.Identifier]{}
// ...
for _, packageName := range packageOrder {
variables = append(variables, olmvariables.NewBundleUniquenessVariable(varID, pkgToBundles[packageName].List()...))
I believe that would be sufficient, no? In any case, it's a small point and not worth belaboring.
Your transformation seems fine, but adding bundleIDs
does make this slightly harder to follow, while allowing for new behavior - previously if a bundle ID existed for more than one package, it would be added to the list of both. Now there's a global uniqueness constraint. Is that intentional? Similarly, by tracking bundleOrder
separately (as opposed to extracting the keys of pkgToBundles
) there's space for errors if the two are out of sync. If you don't think the duplication of the source of truth there and some of the new edges are worth worrying about, that's totally acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it won't be sufficient since we also need to ensure that we do not have duplicate items in packageOrder which can happen (multiple packages depend on the package name).
Just to verify, the current logic already ensures that packages and bundles are not duplicated in the resolver even if multiple catalogs provide the same information? For example, if I created a Catalog
named foo
that had an image ref to someimage:foo
and a Catalog
named bar
that had the exact same image ref of someimage:foo
- what would happen in resolution if I asked to install a package that exists in the someimage:foo
catalog image?
I'm definitely lacking domain knowledge regarding resolution, but I do think it is important to point out that catalogd doesn't perform any validation as to whether or not FBC content is repeated across multiple catalogs. As long as this is accounted for in resolution I think that should be fine though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to verify, the current logic already ensures that packages and bundles are not duplicated in the resolver even if multiple catalogs provide the same information?
I do think it is important to point out that catalogd doesn't perform any validation as to whether or not FBC content is repeated across multiple catalogs. As long as this is accounted for in resolution I think that should be fine though.
OLMv1 (at least at the moment) allows the same package to come from multiple catalogs. If you have a package foo
in catalog1
and in catalog2
it will treat them as the same package and I belive will allow upgrade of of foo
from version listed in catalog1
to version listed in catalog2
.
However in this specific code we ensure that resolution only allows 1 bundle which belongs to pacakge foo
. So even if you install catalog1
and catalog2
which is direct copy - we should be good.
For example, if I created a
Catalog
namedfoo
that had an image ref tosomeimage:foo
and aCatalog
namedbar
that had the exact same image ref ofsomeimage:foo
- what would happen in resolution if I asked to install a package that exists in thesomeimage:foo
catalog image?
Did you mean that someimage:foo
is the ref of the catalog image? If yes - we should be good as we only allow one bundle per pacakge in resolution.
However I fear the case when there are pacakges package1
and package2
which have bundles which share the same bundle image. I think this and this piece of code (they work together) might allow upgrading from package1
to package2
and I think we don't want that.
E.g. when catalogs (single catalog or two different catalogs) have something like this:
package1
bundles:
image:tag1
- version1.0.0
package2
bundles:
image:tag1
- version1.0.0
image:tag2
- version1.0.1
In this case, I think, when you install package1
at version 1.0.0
and then upgrade - our installed bundle might end up being the one from package2
since we search bundes by image references. And installed bundle is driving successors.
I looked into how to mitigate this in #442, but it didn't feel like the right solutiion.
I want to get back to this.
P.S: Just noticed that we do not have an isue to track this - I'll create one shortly. Edit: created #466
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same bundle image.
Is the bundle uniqueness based on the bundle image or the "name" of the bundle? would there be an issue if two separate packages specified a bundle with the same "name" but referenced different images?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Late to the party. A couple of comments:
- we definitely want to allow (probably forever) the concept that a package may be provided by one or more catalogs, and it would be possible for an admin to 'walk' versions of the package across catalogs. We will eventually need to resolve collisions where the same package/bundle version is provided by multiple catalogs, but not right now.
- we absolutely have the assumption that the tree formed by package+bundle+version branching will be unique. It would make sense that sometime we need to resolve any collisions, but doesn't have to be now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the bundle uniqueness based on the bundle image or the "name" of the bundle?
Bundle image issue I mentioned above is the different part of the process (how we determine currently installed bundles for calculating possible successors). Sorry if I added confusion by mentioning unrelated thing: someone you wrote in the previous comment reminded me of this.
Here, where we create global uniqueness constraints for bundles, we are saying that "in our solution, bundle foo
must have at most 1 bundle from the list of catalog-foo-bundlename.v1.0.0, catalog-foo-bundlename.v1.0.1, etc`. where things in the list are bundle variable IDs.
This brings us to this:
would there be an issue if two separate packages specified a bundle with the same "name" but referenced different images?
It will not be an issue. For resolution we use variable IDs and they must be unique across whole cluster. When it comes to bundles - currently we construct variable ID based on catalog name, package name and bundle name:
operator-controller/internal/resolution/variables/bundle.go
Lines 63 to 67 in c9d387e
func BundleVariableID(bundle *catalogmetadata.Bundle) deppy.Identifier { | |
return deppy.Identifier( | |
fmt.Sprintf("%s-%s-%s", bundle.CatalogName, bundle.Package, bundle.Name), | |
) | |
} |
In OLMv0 (and until recently in OLMv1) we also had a channel as part of ID, but it is not necessary since we filter out unrelated bundles before we created a variable ID for them bundle and before we give them to the resolver.
/lgtm |
Description
Partly addresses #459. I would still want to keep the issue open so we can get rid of
prettyUnsatMessage
function.Reviewer Checklist