Skip to content

Inject cross-resource references from provider metadata#12

Merged
ulucinar merged 5 commits into
crossplane:mainfrom
ulucinar:fix-7
Jul 27, 2022
Merged

Inject cross-resource references from provider metadata#12
ulucinar merged 5 commits into
crossplane:mainfrom
ulucinar:fix-7

Conversation

@ulucinar

@ulucinar ulucinar commented Jun 14, 2022

Copy link
Copy Markdown
Collaborator

Description of your changes

Fixes #7

I have:

  • Read and followed Crossplane's contribution process.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR if necessary.

How has this code been tested

For a azurerm_disk_encryption_set, here are the associated Terraform registry examples (together with its dependencies):

data "azurerm_client_config" "current" {}

resource "azurerm_resource_group" "example" {
  name     = "example-resources"
  location = "West Europe"
}

resource "azurerm_key_vault" "example" {
  name                        = "des-example-keyvault"
  location                    = azurerm_resource_group.example.location
  resource_group_name         = azurerm_resource_group.example.name
  tenant_id                   = data.azurerm_client_config.current.tenant_id
  sku_name                    = "premium"
  enabled_for_disk_encryption = true
  purge_protection_enabled    = true
}

resource "azurerm_key_vault_key" "example" {
  name         = "des-example-key"
  key_vault_id = azurerm_key_vault.example.id
  key_type     = "RSA"
  key_size     = 2048

  depends_on = [
    azurerm_key_vault_access_policy.example-user
  ]

  key_opts = [
    "decrypt",
    "encrypt",
    "sign",
    "unwrapKey",
    "verify",
    "wrapKey",
  ]
}

resource "azurerm_key_vault_access_policy" "example-disk" {
  key_vault_id = azurerm_key_vault.example.id

  tenant_id = azurerm_disk_encryption_set.example.identity.0.tenant_id
  object_id = azurerm_disk_encryption_set.example.identity.0.principal_id

  key_permissions = [
    "Get",
    "WrapKey",
    "UnwrapKey"
  ]
}

resource "azurerm_key_vault_access_policy" "example-user" {
  key_vault_id = azurerm_key_vault.example.id

  tenant_id = data.azurerm_client_config.current.tenant_id
  object_id = data.azurerm_client_config.current.object_id

  key_permissions = [
    "get",
    "create",
    "delete"
  ]
}

resource "azurerm_disk_encryption_set" "example" {
  name                = "des"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  key_vault_key_id    = azurerm_key_vault_key.example.id

  identity {
    type = "SystemAssigned"
  }
}

Before this PR, we generate the following example manifest for the resource:

apiVersion: compute.azure.upbound.io/v1beta1
kind: DiskEncryptionSet
metadata:
  name: example
spec:
  forProvider:
    identity:
    - type: SystemAssigned
    keyVaultKeyId: ${azurerm_key_vault_key.example.id}
    location: West Europe
    resourceGroupNameRef:
      name: example
  providerConfigRef:
    name: example

This PR produces the following without a manual reference configuration to vault keys:

apiVersion: compute.azure.upbound.io/v1beta1
kind: DiskEncryptionSet
metadata:
  name: example
spec:
  forProvider:
    identity:
    - type: SystemAssigned
    keyVaultKeyIdRef:
      name: example
    location: West Europe
    resourceGroupNameRef:
      name: example

and keyVaultKeyId is now a reference to the ID of a vault key:

type DiskEncryptionSetParameters struct {
...
	// +crossplane:generate:reference:type=github.com/upbound/official-providers/provider-azure/apis/keyvault/v1beta1.Key
	// +crossplane:generate:reference:extractor=github.com/upbound/upjet/pkg/pipeline.ExtractResourceID()
	// +kubebuilder:validation:Optional
	KeyVaultKeyID *string `json:"keyVaultKeyId,omitempty" tf:"key_vault_key_id,omitempty"`

	// +kubebuilder:validation:Optional
	KeyVaultKeyIDRef *v1.Reference `json:"keyVaultKeyIdRef,omitempty" tf:"-"`

	// +kubebuilder:validation:Optional
	KeyVaultKeyIDSelector *v1.Selector `json:"keyVaultKeyIdSelector,omitempty" tf:"-"`

The cross-resource reference to the vault key is deduced from the provider metadata.

@ulucinar ulucinar force-pushed the fix-7 branch 2 times, most recently from d5bf1bf to 22f1d53 Compare June 15, 2022 12:18

@muvaf muvaf left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed offline, the mental model of Upjet/Terrajet for provider authors is that some defaults are set and then users would override them to their liking - which could be through common configs or individual configs and then that'd be the final state of the configuration; there wouldn't be meaningful defaulting of information after that point.

In this case, that would translate into pre-populating the References field of the config.Resource in NewProvider function so that the information is available and can be overridden by configurations provided by the user. Right now, it seems that the reference inference is part of ExampleGenerator pipeline and it manipulates the user's config.Resource objects after they handed them over to Upjet. Even though it doesn't override entries user provided, it makes meaningful changes and does not allow users to customize whatever is produced. I think we should really try to keep to stick with the current architecture for this mechanism as well to keep it cohesive and easy to follow from provider author perspective (or start a discussion issue to change it).

As a general approach, I'd suggest starting small and with many assumptions to see how much coverage we're able to get and then discuss whether the remaining part is worth automating and where would be the best place to automate it, i.e. Upjet vs providers. For instance, we merged the other PR with value resolver that recursively scans other examples to find the static value, like location from resource_group that's TF-style referenced, but we don't know how prevalent this was and whether it's a case that we can generalize with a heuristic in providers; maybe just scanning for reference paths that end with .id would get us quite far that we don't need to resolve the static values in Upjet at all and get away with a few common configs. It's also hard to give feedback as reviewer to suggest a whole revamp for a PR that clearly has so much effort gone into and been there for a while.

Now, we're building on top of that mechanism to see whether a reference is cross-resource or static by trying to run it and marking it as cross-reference if it cannot be resolved to a static value. This is definitely more robust than having assumptions - definitely not suggesting to replace it with assumptions - but it's also more complex and we have the ability to programmatically extract statistics (like this or this) to see if it's worth adding that complexity. For example, if it takes us from 40% to 80%, then we'd allocate way more time to do it without having to take shortcuts but if it's from 90 to 95, then it may not be worth adding it. Not that we don't have some individual convictions around how frequent we encounter a case, but I've been surprised multiple times in that regard when I actually calculated because of the scale we're dealing with. Overall, I think taking progressive steps, if possible, helps to shorten the effort->value cycle and let us make better decisions/bets in other related discussions even though it has its own iteration cost.

Comment thread pkg/config/provider.go Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, we have discussed to use TF resource name instead of the CRD type path in Reference struct to avoid having a separate mapper since everything we need to construct the type path is available; ModulePath from Provider object and Name, Kind, Version, Group all from the individual configuration objects. Has there been a blocker that would prevent us from going down that path?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed the TypeMapper interface as discussed.

Comment thread pkg/config/resource.go Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this now since we control all providers using Upjet, no need for backward compatibility or deprecation process.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also agree the new higher level API Reference.TerraformName is less error-prone and we can remove the Reference.Type field. I propose to do so in a further PR to decrease the impact of this PR on the existing providers as Type is part of the configuration interface employed in those providers.

Comment thread pkg/config/resource.go Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resource struct represents this string with Name field. Would it make sense to either use Name, TerraformName to make it easier for folks to understand what's expected here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Renamed as TerraformedName.

@ulucinar

Copy link
Copy Markdown
Collaborator Author

Hi @muvaf,
Thank you for the review. As we have discussed, we now prepopulate referencers in config.NewProvider. Also to prevent circular dependencies, I had to move reference resolver to the config package.

@ulucinar ulucinar requested a review from muvaf June 20, 2022 07:00
@ytsarev

ytsarev commented Jun 21, 2022

Copy link
Copy Markdown
Member

@ulucinar make reviewable fails for me when I am trying to use upjet with this change and tip of the main branch of official-providers/provider-azure

✗ make reviewable
13:45:41 [ .. ] generating provider schema for hashicorp/azurerm 3.8.0
13:45:44 [ OK ] generating provider schema for hashicorp/azurerm 3.8.0
13:45:44 [ .. ] verify dependencies have expected content
all modules verified
13:45:44 [ OK ] go modules dependencies verified
13:45:45 [ .. ] go generate darwin_arm64
# github.com/upbound/official-providers/provider-azure/config
../config/provider.go:150:25: cannot use config.WithShortName("azure") (type config.ProviderOption) as type string in argument to config.NewProvider
apis/generate.go:30: running "go": exit status 2

@ulucinar

Copy link
Copy Markdown
Collaborator Author

@ulucinar make reviewable fails for me when I am trying to use upjet with this change and tip of the main branch of official-providers/provider-azure

✗ make reviewable
13:45:41 [ .. ] generating provider schema for hashicorp/azurerm 3.8.0
13:45:44 [ OK ] generating provider schema for hashicorp/azurerm 3.8.0
13:45:44 [ .. ] verify dependencies have expected content
all modules verified
13:45:44 [ OK ] go modules dependencies verified
13:45:45 [ .. ] go generate darwin_arm64
# github.com/upbound/official-providers/provider-azure/config
../config/provider.go:150:25: cannot use config.WithShortName("azure") (type config.ProviderOption) as type string in argument to config.NewProvider
apis/generate.go:30: running "go": exit status 2

config.NewProvider needs the rootDir with this PR. Could you please check the following branch for an example to consume the new API:
https://github.com/ulucinar/official-providers/tree/upjet-fix-7
https://github.com/ulucinar/official-providers/blob/7bea91631701cf99eeb496d4c1b16d654761089a/provider-azure/config/provider.go#L148

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>

@muvaf muvaf left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ulucinar ! I don't want to block this PR much longer given that we merged other pieces that are foundational to this one, like #19 and #27 . Opened a couple of issues to address later.

Comment thread pkg/config/provider.go Outdated

// ReferenceResolver injects cross-resource references across the resources
// of this Provider.
type ReferenceResolver interface {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you feel strongly about the reference resolver name? When I read it, the first thing comes to my mind is that it works on the references to resolve them to the real value whereas what actual implementation does is to actually infer & build references. Maybe something like ReferenceBuilder or ReferenceExtractor or ReferenceConfigurator or ReferenceInjector?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Renamed config.ReferenceResolver as config.ReferenceInjector.

Comment thread pkg/config/provider.go Outdated
if err := p.loadMetadata(metadata); err != nil {
panic(errors.Wrap(err, "cannot load provider metadata"))
}
for i, refResolver := range p.refResolvers {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a chance that the whole reference extraction could happen in a common resource configurator, which would let us not increase the number of concepts we are exposed to. But not for this PR, so opened #39 to track testing it.

Comment thread pkg/config/resource.go
// which will be referenced. The supplied resource name is
// converted to a type name of the corresponding CRD using
// the configured TerraformTypeMapper.
TerraformName string

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we had discussed this at one point and realized TerraformName is all we need. Is there a reason to keep the Type field? If it's about backward compatibility, we can take a pass in all official providers to change all to use TerraformName since they are the sole users of Upjet.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this would require having access to map[string]*config.Resources in the CRD builder. Opened #40 to track this separately from this PR.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is something we had better pursue. I think the new TerraformName introduced with this PR is a higher level API (than requiring the referred kind name as a string with its full package) but we need some modifications in the builder as you mentioned. Let's do it with a future PR.

Comment thread pkg/config/resource.go
// SkipReferencesTo configures attributes for which
// references will not be injected. Expected format
// for each element is <Terraform resource name>.<attribute name>.
SkipReferencesTo []string

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we seen cases where we need to skip references? For similar operations, we manipulate the schema in a configurator, which could be an option here, i.e. to remove the reference from metadata object in a configurator.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, one common case from provider-azure is that the example manifests contain references to azurerm_resource_group.location where the dependent resource does not necessarily reside in the same location. Being able to do a provider-wide configuration helps here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using existing configurators?

Comment thread pkg/config/provider.go Outdated

// refResolvers is an ordered list of `ReferenceResolver`s for
// injecting references across this Provider's resources.
refResolvers []ReferenceResolver

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a different possible implementation in mind?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced this interface mainly for resolving circular imports but one idea could be to have something like what we have in provider-azure in terrajet as a configuration option so that other providers can configure their common references. We can discuss some ideas in #39.

@@ -0,0 +1,155 @@
/*

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole reference package needs unit tests. It's really hard to track what's going on and who does what otherwise given that we have to build recursion. But I know the whole work has taken longer and longer, so I opened #41 to track this.

Comment thread pkg/registry/reference/resolver.go Outdated

func (rr *Resolver) resolveReferences(params map[string]interface{}, resolutionContext map[string]*PavedWithManifest) ([]string, error) { // nolint:gocyclo
var resolvedParams []string
for k, v := range params {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a long and recursive for loop with switches, so using longer names would be more helpful. See https://github.com/crossplane/crossplane/blob/master/CONTRIBUTING.md#use-descriptive-variable-names-sparingly

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Renamed them as paramName and paramValue.

Comment thread pkg/resource/extractor.go

// ExtractParamPath extracts the value of `sourceAttr`
// from `spec.forProvider` allowing nested parameters.
func ExtractParamPath(sourceAttr string) xpref.ExtractValueFn {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add an example input here as comment?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you.

ulucinar added 2 commits July 26, 2022 18:44
….InjectReferences

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
- Refactor pipeline.transformFields

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
- Rename reference.Resolver as reference.Injector
- Rename config.WithReferenceResolvers as config.WithReferenceInjectors

Signed-off-by: Alper Rifat Ulucinar <ulucinar@users.noreply.github.com>
@ulucinar ulucinar merged commit 397a7de into crossplane:main Jul 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Inject Cross-resource References using Scraped Provider Metadata

3 participants