SPLAT-1160: AWS - Support Wavelength Zones with edge pool #7369

Merged
merged 8 commits into from
Nov 29, 2023

Conversation

mtulio
Contributor

@mtulio mtulio commented Jul 25, 2023

This PR adds support for automatically creating network resources in Wavelength Zones when the zone names are defined in the edge compute pool of install-config.yaml.

Enhancement: openshift/enhancements#1510

@mtulio
Contributor Author

mtulio commented Jul 25, 2023

/test all

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 25, 2023
@openshift-ci
Contributor

openshift-ci bot commented Jul 25, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@mtulio
Contributor Author

mtulio commented Jul 25, 2023

/test all

Local tests with a full IPI install are passing; now collecting CI signal.

$ cat lz-p2-318/install-config.yaml-bkp 
apiVersion: v1
publish: External
baseDomain: devcluster.openshift.com
metadata:
  name: "lz-p2-318"
platform:
  aws:
    region: us-east-1
networking:
  machineNetwork:
  - cidr: 10.0.0.0/19
controlPlane:
  hyperthreading: Enabled
  name: master
  platform:
    aws:
      zones: [us-east-1a,us-east-1b]
compute:
- name: worker
  platform:
    aws:
      zones: [us-east-1c,us-east-1d]
- name: edge
  platform:
    aws:
      zones:
      - us-east-1-scl-1a
      - us-east-1-wl1-was-wlz-1

$ oc get machines -n openshift-machine-api
NAME                                                 PHASE     TYPE         REGION      ZONE                      AGE
lz-p2-318-z6fst-edge-us-east-1-scl-1a-t9nvg          Running   c5.2xlarge   us-east-1   us-east-1-scl-1a          48m
lz-p2-318-z6fst-edge-us-east-1-wl1-was-wlz-1-j99x9   Running   r5.2xlarge   us-east-1   us-east-1-wl1-was-wlz-1   48m
lz-p2-318-z6fst-master-0                             Running   m6i.xlarge   us-east-1   us-east-1a                54m
lz-p2-318-z6fst-master-1                             Running   m6i.xlarge   us-east-1   us-east-1b                54m
lz-p2-318-z6fst-master-2                             Running   m6i.xlarge   us-east-1   us-east-1a                54m
lz-p2-318-z6fst-worker-us-east-1c-kgs2p              Running   m6i.xlarge   us-east-1   us-east-1c                48m
lz-p2-318-z6fst-worker-us-east-1c-tcc4g              Running   m6i.xlarge   us-east-1   us-east-1c                48m
lz-p2-318-z6fst-worker-us-east-1d-5n7bd              Running   m6i.xlarge   us-east-1   us-east-1d                48m

$ oc get nodes
NAME                          STATUS   ROLES                  AGE   VERSION
ip-10-0-10-194.ec2.internal   Ready    worker                 44m   v1.27.3+e8b13aa
ip-10-0-11-216.ec2.internal   Ready    worker                 45m   v1.27.3+e8b13aa
ip-10-0-14-227.ec2.internal   Ready    worker                 46m   v1.27.3+e8b13aa
ip-10-0-2-45.ec2.internal     Ready    control-plane,master   54m   v1.27.3+e8b13aa
ip-10-0-25-67.ec2.internal    Ready    edge,worker            10m   v1.27.3+e8b13aa
ip-10-0-27-51.ec2.internal    Ready    edge,worker            35m   v1.27.3+e8b13aa
ip-10-0-3-95.ec2.internal     Ready    control-plane,master   55m   v1.27.3+e8b13aa
ip-10-0-4-232.ec2.internal    Ready    control-plane,master   55m   v1.27.3+e8b13aa

@mtulio
Contributor Author

mtulio commented Jul 26, 2023

/test tf-fmt
/test golint
/test unit
/test e2e-aws-ovn
/test e2e-aws-ovn-localzones

@mtulio mtulio changed the title WIP|Spike: Supporting AWS Wavelength Zones in the edge compute pool SPLAT-1045: spike: Supporting AWS Wavelength Zones in the edge compute pool Jul 26, 2023
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 26, 2023
@openshift-ci-robot
Contributor

openshift-ci-robot commented Jul 26, 2023

@mtulio: This pull request references SPLAT-1045 which is a valid jira issue.

In response to this:

https://issues.redhat.com/browse/SPLAT-1045

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mtulio mtulio force-pushed the edge-aws-wavelength-zone branch 2 times, most recently from 9f66e51 to 47ec84f Compare July 27, 2023 00:20
@mtulio
Contributor Author

mtulio commented Jul 27, 2023

/test tf-fmt
/test e2e-aws-ovn-localzones

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 19, 2023
@mtulio mtulio force-pushed the edge-aws-wavelength-zone branch from 47ec84f to aef6033 Compare October 27, 2023 19:43
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 27, 2023
@mtulio mtulio changed the title SPLAT-1045: spike: Supporting AWS Wavelength Zones in the edge compute pool SPLAT-1160: spike: Supporting AWS Wavelength Zones in the edge compute pool Oct 27, 2023
@openshift-ci-robot
Contributor

openshift-ci-robot commented Oct 27, 2023

@mtulio: This pull request references SPLAT-1160 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

In response to this:

https://issues.redhat.com/browse/SPLAT-1045

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot
Contributor

openshift-ci-robot commented Oct 27, 2023

@mtulio: This pull request references SPLAT-1160 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.15.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@mtulio
Contributor Author

mtulio commented Oct 27, 2023

/test e2e-aws-ovn-localzones
/test e2e-aws-ovn-shared-vpc-localzones
/test e2e-aws-ovn

@mtulio
Contributor Author

mtulio commented Oct 28, 2023

Both Local Zone jobs have been permanently failing for a long time for a similar reason: the pod monitor tests.

The initial investigation indicates that those tests do not tolerate edge nodes. I opened an investigation into it:

cc @rvanderp3 @patrickdillon

@mtulio mtulio changed the title SPLAT-1160: spike: Supporting AWS Wavelength Zones in the edge compute pool SPLAT-1160: Supporting AWS Wavelength Zones edge compute pool Oct 28, 2023
@mtulio mtulio changed the title SPLAT-1160: Supporting AWS Wavelength Zones edge compute pool SPLAT-1160: Supporting AWS Wavelength Zones with edge pool Oct 28, 2023
@mtulio mtulio changed the title SPLAT-1160: Supporting AWS Wavelength Zones with edge pool SPLAT-1160: AWS - Support Wavelength Zones with edge pool Oct 28, 2023
@mtulio
Contributor Author

mtulio commented Oct 30, 2023

Both Local Zone jobs have been permanently failing for a long time for a similar reason: the pod monitor tests.

The initial investigation indicates that those tests do not tolerate edge nodes. I opened an investigation into it:

cc @rvanderp3 @patrickdillon

A bug has been opened to fix the Local Zone e2e tests: https://issues.redhat.com/browse/OCPBUGS-22703

@mtulio
Contributor Author

mtulio commented Nov 3, 2023

The fix for the monitor tests has merged, so the permanent failure should be resolved:

/test e2e-aws-ovn-localzones

The installer discovers the instance type offerings per zone from a
supported set of instance types; when none of them is available in a
zone, it emits a warning.

The edge compute pool differs from regular zones when selecting EC2
instance types: edge zones may not share the same type, so each zone
can end up with a different EC2 instance type, depending on the
offerings in that zone.

The warning for an edge zone shows which zone has no instance type from
the supported list, to help the user update the machine set manifest at
a later stage.
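
The per-zone selection described above can be sketched roughly as follows. This is an illustration only, not the installer's code: the function name `pick_instance_types`, the `PREFERRED_TYPES` list and its order, and the shape of the offerings input are all assumptions made for the example.

```python
import warnings

# Hypothetical preference list, in priority order. The real supported set
# is defined by the installer and is not shown in this thread.
PREFERRED_TYPES = ["m6i.xlarge", "m5.xlarge", "c5.2xlarge", "r5.2xlarge"]


def pick_instance_types(offerings_by_zone, preferred=PREFERRED_TYPES):
    """Return {zone: instance_type}, using the first preferred type offered per zone.

    offerings_by_zone maps a zone name to the set of instance types offered
    there. Zones with no preferred type on offer are left out of the result
    and reported with a warning, so the caller can fix the machine set later.
    """
    selected = {}
    for zone, offered in offerings_by_zone.items():
        match = next((t for t in preferred if t in offered), None)
        if match is None:
            warnings.warn(f"no supported instance type offered in zone {zone}")
            continue
        selected[zone] = match
    return selected
```

Because the choice is made per zone, two edge zones in the same pool can legitimately end up with different instance types, as in the `oc get machines` output earlier in this thread.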
Add Terraform automation to create subnets and route table
associations in AWS Wavelength Zones.

AWS Wavelength Zones are identified as edge zones by the installer.

Wavelength Zones do not support NAT Gateways. For that reason,
Terraform creates only the subnet and its association with the
route table of the parent zone, when one exists; otherwise
the first private route table is used in the association.
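
A minimal sketch of that fallback rule, assuming the "parent" route table is looked up by the Wavelength Zone's parent availability zone. The function name and the `(zone, route_table_id)` input shape are hypothetical; the actual logic lives in the Terraform modules of this PR.

```python
def pick_route_table(parent_zone, private_route_tables):
    """Choose the private route table for a Wavelength Zone subnet.

    private_route_tables is an ordered list of (zone, route_table_id) pairs.
    Prefer the table that belongs to the parent zone; if there is none,
    fall back to the first private route table.
    """
    if not private_route_tables:
        raise ValueError("no private route table available")
    for zone, route_table_id in private_route_tables:
        if zone == parent_zone:
            return route_table_id
    return private_route_tables[0][1]
```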

The subnets in Wavelength Zones will be created only when the zone
names are supplied in the install-config.yaml in the edge compute pool.
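
As an illustration of that condition, the zone names can be read from an already-parsed install-config like this. The helper name `edge_zones` is hypothetical, and the sketch does not tell Local Zones and Wavelength Zones apart; how the installer makes that distinction is not shown in this thread.

```python
def edge_zones(install_config):
    """Return the zone names listed under the 'edge' compute pool.

    install_config is the install-config.yaml content parsed into a dict.
    An empty list means no edge subnets would be requested.
    """
    zones = []
    for pool in install_config.get("compute") or []:
        if pool.get("name") != "edge":
            continue
        aws = (pool.get("platform") or {}).get("aws") or {}
        zones.extend(aws.get("zones") or [])
    return zones
```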

AWS Wavelength requires a different type of gateway for ingress/egress
traffic to and from the zone: the Carrier Gateway.

When the installer creates the VPC, Terraform creates:
- the Carrier Gateway, attached to the VPC
- the public edge route table
- a public subnet in the Wavelength Zone, associated with the public edge
  route table
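
The resource list above can be pictured as a small plan. This is only a sketch of how the pieces relate, not the Terraform from this PR: the resource names, the dict layout, one public subnet per zone, and the default route pointing at the Carrier Gateway are all assumptions made for the example.

```python
def plan_wavelength_public_network(vpc_id, wavelength_zones):
    """Return an ordered list of resources to create for the given VPC."""
    carrier_gateway = {"type": "carrier_gateway", "name": "cagw", "vpc": vpc_id}
    route_table = {
        "type": "route_table",
        "name": "public-edge",
        "vpc": vpc_id,
        # Assumption: the public edge route table sends default traffic
        # through the Carrier Gateway.
        "default_route_target": carrier_gateway["name"],
    }
    resources = [carrier_gateway, route_table]
    for zone in wavelength_zones:
        resources.append({
            "type": "subnet",
            "name": f"public-{zone}",
            "vpc": vpc_id,
            "zone": zone,
            "route_table": route_table["name"],
        })
    return resources
```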

The installer does not create Machine Set configuration to launch edge
nodes in public subnets, but the user can do so at install time. A
separate feature is required to support this in the MAPI AWS provider;
it is not covered here and does not block the full automation delivered
by the installer.
@mtulio mtulio force-pushed the edge-aws-wavelength-zone branch from fd1c008 to 3594add Compare November 22, 2023 21:58
@mtulio
Contributor Author

mtulio commented Nov 22, 2023

@r4f4 Thanks for the suggestion on handling the case where the Carrier Gateway resource is not found in the destroy flow. Fixed.

Destroy the Carrier Gateway, handling the NotFound error when the
resource is not present.
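
The idea of tolerating a missing resource during destroy can be sketched as below. `NotFoundError` and `delete_fn` are hypothetical stand-ins for the cloud API error and delete call; the actual installer code and the exact AWS error code are not shown in this thread.

```python
class NotFoundError(Exception):
    """Hypothetical stand-in for the cloud API's 'resource not found' error."""


def delete_carrier_gateway(delete_fn, gateway_id):
    """Delete a Carrier Gateway, treating 'not found' as already deleted.

    Returns True when the delete call succeeded and False when the gateway
    was already gone. Any other error propagates to the caller.
    """
    try:
        delete_fn(gateway_id)
    except NotFoundError:
        return False
    return True
```

Treating "not found" as success keeps the destroy flow idempotent, so a retried or partially completed destroy does not fail on a gateway that was removed earlier.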
@mtulio mtulio force-pushed the edge-aws-wavelength-zone branch from 3594add to 8612574 Compare November 22, 2023 23:05
Contributor

@r4f4 r4f4 left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 22, 2023
@mtulio
Contributor Author

mtulio commented Nov 23, 2023

/test e2e-aws-ovn-localzones

@mtulio
Contributor Author

mtulio commented Nov 23, 2023

@yunjiang29 would you mind taking a look at the pre-merge tests (SPLAT-1243)?
/assign @yunjiang29

@patrickdillon
Contributor

/approve

Contributor

openshift-ci bot commented Nov 28, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: patrickdillon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 28, 2023
@mtulio
Contributor Author

mtulio commented Nov 29, 2023

/test govet
/test golint
/test gofmt
/test e2e-aws-ovn

@mtulio
Contributor Author

mtulio commented Nov 29, 2023

/test images
/test aro-unit

@mtulio
Contributor Author

mtulio commented Nov 29, 2023

/test okd-images
/test okd-scos-images
/test okd-verify-codegen

@mtulio
Contributor Author

mtulio commented Nov 29, 2023

/test okd-unit

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 8ee36b7 and 2 for PR HEAD 8612574 in total

Contributor

openshift-ci bot commented Nov 29, 2023

@mtulio: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-e2e-aws-ovn 2f399f9ce19959ae9be4cdcfd51dbe6697cf2b3e link false /test okd-e2e-aws-ovn
ci/prow/okd-scos-e2e-aws-ovn 2f399f9ce19959ae9be4cdcfd51dbe6697cf2b3e link false /test okd-scos-e2e-aws-ovn
ci/prow/e2e-aws-ovn-shared-vpc-wavelengthzones fd1c008400865f4041cd54db2448c0a5e11491b5 link false /test e2e-aws-ovn-shared-vpc-wavelengthzones

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD cfadd7c and 1 for PR HEAD 8612574 in total

@patrickdillon patrickdillon merged commit 153837a into openshift:master Nov 29, 2023
@mtulio mtulio deleted the edge-aws-wavelength-zone branch November 29, 2023 17:59
@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-installer-altinfra-container-v4.15.0-202311292108.p0.g153837a.assembly.stream for distgit ose-installer-altinfra.
All builds following this will include this PR.

r4f4 added a commit to r4f4/installer that referenced this pull request Dec 15, 2023
Implement Wavelength zone support as it was done for the terraform
provisioning in openshift#7369.