Add availability zone support for masters #3864
dd473b8 to 55837cf
Codecov Report
@@            Coverage Diff             @@
##           master    #3864      +/-   ##
==========================================
+ Coverage   56.25%   56.49%   +0.23%
==========================================
  Files         109      109
  Lines       16494    16516      +22
==========================================
+ Hits         9279     9330      +51
+ Misses       6406     6378      -28
+ Partials      809      808       -1
pkg/acsengine/defaults.go
Outdated
@@ -242,7 +242,7 @@ func setPropertiesDefaults(cs *api.ContainerService, isUpgrade, isScale bool) (b

setStorageDefaults(properties)
setExtensionDefaults(properties)
setVMSSDefaults(properties)
setVMSSDefaultsForAgents(properties)
Do we have a HasVMSSAgentPool-type method? It'd be more expressive to do:
if properties.HasVMSSAgentPool() {
    setVMSSDefaultsForAgents(properties)
}
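For illustration, a predicate of the kind the reviewer asks about could look like the following minimal sketch. The trimmed-down `Properties` and `AgentPoolProfile` types, the `AvailabilityProfile` field, and the `VirtualMachineScaleSets` constant are assumptions made for the example, not the PR's code.

```go
package main

import "fmt"

// VirtualMachineScaleSets is the availability profile value assumed here
// to mark a VMSS-backed agent pool.
const VirtualMachineScaleSets = "VirtualMachineScaleSets"

// AgentPoolProfile and Properties are hypothetical, trimmed-down stand-ins
// for the api package types; only the fields this sketch needs are kept.
type AgentPoolProfile struct {
	Name                string
	AvailabilityProfile string
}

type Properties struct {
	AgentPoolProfiles []*AgentPoolProfile
}

// HasVMSSAgentPool reports whether at least one agent pool is backed by a
// virtual machine scale set. It is safe to call on a nil receiver.
func (p *Properties) HasVMSSAgentPool() bool {
	if p == nil {
		return false
	}
	for _, pool := range p.AgentPoolProfiles {
		if pool != nil && pool.AvailabilityProfile == VirtualMachineScaleSets {
			return true
		}
	}
	return false
}

func main() {
	props := &Properties{AgentPoolProfiles: []*AgentPoolProfile{
		{Name: "pool1", AvailabilityProfile: "AvailabilitySet"},
		{Name: "pool2", AvailabilityProfile: VirtualMachineScaleSets},
	}}
	// The guard makes the intent explicit at the call site.
	if props.HasVMSSAgentPool() {
		fmt.Println("applying VMSS defaults for agents")
	}
}
```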
pkg/acsengine/defaults.go
Outdated
@@ -582,10 +582,26 @@ func setMasterProfileDefaults(a *api.Properties, isUpgrade bool) {
if a.MasterProfile.HTTPSourceAddressPrefix == "" {
    a.MasterProfile.HTTPSourceAddressPrefix = "*"
}
// Set VMSS Defaults for Masters
Is there a reason we're not doing this in a func, like the VMSS defaults flow for agents above?
pkg/api/types.go
Outdated
@@ -715,6 +717,17 @@ func (p *Properties) HasVirtualMachineScaleSets() bool {
    return false
}

// HasAllZonesAgentPools will return true if all of the agent pools have zones
Let's go full Java and change this func name to HasZonesForAllAgentPools
:'(
FYI, got an E2E flake on 1.8 (noting it here so that if we get another flake or two we can start to debug more deeply).
@jackfrancis @tariq1890 ready for another round of review
pkg/api/vlabs/validate_test.go
Outdated
    agentProfiles []*AgentPoolProfile
    expectedErr   string
}{
    {
        name:                "Master profile with zones version",
        orchestratorVersion: "1.11.3",
We should use orchestratorRelease in unit tests whenever possible, because individual patch versions get deprecated as we add new ones (we now support the two latest patches of each minor version).
pkg/acsengine/defaults.go
Outdated
    a.MasterProfile.SinglePlacementGroup = helpers.PointerToBool(api.DefaultSinglePlacementGroup)
}
if a.MasterProfile.SinglePlacementGroup == helpers.PointerToBool(false) {
    a.MasterProfile.StorageProfile = api.ManagedDisks
Is this going to silently override the user's choice if they select a storage account? Should we validate and return an error if the user selects storageAccount together with singlePlacementGroup false?
I personally think this is too much detail for end users, so we should set it for them, but I am OK with either. What would you recommend?
The foundation of acs-engine is that it is highly customizable so I think we should assume that anything that is exposed to the user can and will be customized by users. If we want to set a value for the users we shouldn't expose it to them in the apimodel. I think a simple validation check should be enough in this case. That's my recommendation but I'll let you make the call.
Made it a validation instead of a default.
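As a rough sketch of the "validate instead of default" approach agreed on in this thread, the check could look like the code below. The `MasterProfile` stand-in, the constant values, the function name `validateMasterStorage`, and the error text are all hypothetical. Note that the sketch dereferences the `*bool` instead of comparing it with a freshly allocated pointer, because `==` on two pointers compares addresses, not the values they point to.

```go
package main

import (
	"errors"
	"fmt"
)

// Storage profile values assumed for this sketch.
const (
	StorageAccount = "StorageAccount"
	ManagedDisks   = "ManagedDisks"
)

// MasterProfile is a hypothetical, trimmed-down stand-in for the real type.
type MasterProfile struct {
	StorageProfile       string
	SinglePlacementGroup *bool
}

// validateMasterStorage rejects the combination the reviewer was worried
// about instead of silently rewriting the user's storage choice.
func validateMasterStorage(m *MasterProfile) error {
	if m == nil {
		return nil
	}
	// Dereference only after the nil check; an unset value is not an error.
	if m.SinglePlacementGroup != nil && !*m.SinglePlacementGroup &&
		m.StorageProfile == StorageAccount {
		return errors.New("singlePlacementGroup=false requires the ManagedDisks storage profile")
	}
	return nil
}

func main() {
	disabled := false
	m := &MasterProfile{StorageProfile: StorageAccount, SinglePlacementGroup: &disabled}
	if err := validateMasterStorage(m); err != nil {
		fmt.Println("validation error:", err)
	}
}
```

With this shape, a user who sets both values gets an explicit error at validation time, and a user who leaves `singlePlacementGroup` unset is unaffected.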
pkg/acsengine/defaults.go
Outdated
if a.MasterProfile.SinglePlacementGroup == helpers.PointerToBool(false) {
    a.MasterProfile.StorageProfile = api.ManagedDisks
}
if a.MasterProfile.HasAvailabilityZones() {
same here for overriding LB defaults
Same here.
now it's a validation instead of default
@@ -186,22 +186,29 @@
},
{{end}}
{
"apiVersion": "[variables('apiVersionDefault')]",
"apiVersion": "2017-08-01",
Any reason for using 2017-08-01 and not a more recent version like 2018-04-01 that we already use for accelerated networking?
In any case, we have a backlog item to rationalize the default version (#3835), so the version will be bumped soon. If the version needs to be 2017-08-01 and is reused in multiple places, I would recommend creating a variable for it.
pkg/acsengine/defaults.go
Outdated
    a.MasterProfile.SinglePlacementGroup = helpers.PointerToBool(api.DefaultSinglePlacementGroup)
}
if a.MasterProfile.HasAvailabilityZones() && (a.OrchestratorProfile.KubernetesConfig == nil || a.OrchestratorProfile.KubernetesConfig.LoadBalancerSku == "") {
    a.OrchestratorProfile.KubernetesConfig.LoadBalancerSku = "Standard"
If a.OrchestratorProfile.KubernetesConfig == nil is true, the assignment on this line would cause a nil pointer dereference.
At this point this should never be true (we should be setting KubernetesConfig before we reach this line), but the logic of the if statement is contradictory.
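One way to remove the contradiction is to make sure the config struct exists before writing to it. The sketch below uses hypothetical trimmed-down `OrchestratorProfile` and `KubernetesConfig` types and a hypothetical function name; it illustrates the pattern and is not the code that was merged.

```go
package main

import "fmt"

// KubernetesConfig and OrchestratorProfile are hypothetical stand-ins
// holding only the field this sketch needs.
type KubernetesConfig struct {
	LoadBalancerSku string
}

type OrchestratorProfile struct {
	KubernetesConfig *KubernetesConfig
}

// defaultLoadBalancerSku sets the SKU to "Standard" when zones are in use
// and the user has not chosen one. The config struct is allocated first,
// so the assignment can never dereference a nil pointer.
func defaultLoadBalancerSku(o *OrchestratorProfile, hasZones bool) {
	if o == nil || !hasZones {
		return
	}
	if o.KubernetesConfig == nil {
		o.KubernetesConfig = &KubernetesConfig{}
	}
	if o.KubernetesConfig.LoadBalancerSku == "" {
		o.KubernetesConfig.LoadBalancerSku = "Standard"
	}
}

func main() {
	o := &OrchestratorProfile{} // KubernetesConfig deliberately left nil
	defaultLoadBalancerSku(o, true)
	fmt.Println(o.KubernetesConfig.LoadBalancerSku)
}
```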
pkg/acsengine/defaults.go
Outdated
    profile.StorageProfile = api.ManagedDisks
}
if profile.HasAvailabilityZones() {
if profile.HasAvailabilityZones() && (a.OrchestratorProfile.KubernetesConfig == nil || a.OrchestratorProfile.KubernetesConfig.LoadBalancerSku == "") {
ditto
pkg/api/vlabs/validate.go
Outdated
@@ -274,6 +277,9 @@ func (a *Properties) validateOrchestratorProfile(isUpdate bool) error {
    return errors.Errorf("loadBalancerSku is only available in Kubernetes version %s or greater; unable to validate for Kubernetes version %s",
        minVersion.String(), o.OrchestratorVersion)
}
if *a.OrchestratorProfile.KubernetesConfig.ExcludeMasterFromStandardLB == false {
Let's use the helper function helpers.IsFalseBoolPointer here instead of dereferencing the pointer, which might be nil.
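The point of a helper like this is that it folds the nil check and the dereference into one call. The sketch below shows what such a helper might look like; the function bodies are assumed for illustration and are not quoted from the helpers package.

```go
package main

import "fmt"

// IsFalseBoolPointer reports whether b is non-nil and points to false.
// Assumed body: a sketch of the nil-safe check, not the library source.
func IsFalseBoolPointer(b *bool) bool {
	return b != nil && !*b
}

// PointerToBool returns a pointer to a copy of v (assumed body).
func PointerToBool(v bool) *bool {
	return &v
}

func main() {
	var unset *bool
	explicitFalse := PointerToBool(false)

	// A nil pointer is "not set", which is different from "set to false".
	fmt.Println(IsFalseBoolPointer(unset)) // false
	fmt.Println(IsFalseBoolPointer(explicitFalse)) // true

	// Comparing two pointers compares addresses, not pointed-to values,
	// so a freshly allocated pointer never equals an existing one.
	fmt.Println(explicitFalse == PointerToBool(false)) // false
}
```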
pkg/api/vlabs/validate.go
Outdated
        return errors.New("the node count and the number of availability zones provided can result in zone imbalance. To achieve zone balance, each zone should have at least 2 nodes or more")
    }
}
if a.OrchestratorProfile.KubernetesConfig != nil && a.OrchestratorProfile.KubernetesConfig.LoadBalancerSku != "Standard" {
&& a.OrchestratorProfile.KubernetesConfig.LoadBalancerSku != ""
A defined KubernetesConfig with no LoadBalancerSku set should not fail.
pkg/acsengine/defaults.go
Outdated
func setVMSSDefaults(a *api.Properties) {
// setVMSSDefaultsForMasters
func setVMSSDefaultsForMasters(a *api.Properties) {
    if a.MasterProfile.Count > 100 {
The only supported values for MasterProfile.Count are 1, 3 and 5, so this is unreachable code.
pkg/api/vlabs/types.go
Outdated
// IsClusterAllAvailabilityZones returns true if the cluster contains AZs for all agents and masters profiles
func (p *Properties) IsClusterAllAvailabilityZones() bool {
    isAll := p.MasterProfile != nil && p.MasterProfile.HasAvailabilityZones()
Does this mean that this function should return false if MasterProfile is nil but all agents have AZs? Maybe it should be an || instead.
What do you think about making this a single line?
return (p.MasterProfile == nil || p.MasterProfile.HasAvailabilityZones()) && p.HasZonesForAllAgentPools()
This function should return true only if every profile in the cluster has AZs, so if MasterProfile is nil it should return false.
In reply to "What do you think about making this a single line?" with
return (p.MasterProfile == nil || p.MasterProfile.HasAvailabilityZones()) && p.HasZonesForAllAgentPools()
how about
return p.MasterProfile != nil && p.MasterProfile.HasAvailabilityZones() && p.HasZonesForAllAgentPools()
So that means it won't be possible to use AZ for agent pool only (AKS) clusters? The way I see this if a cluster only has agents and all its agents use AZ then all the cluster nodes use AZ. If that's not the logic we are trying to achieve let's rename the function to something like MastersAndAgentsUseAvailabilityZones
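The two readings debated in this thread differ only when MasterProfile is nil. The sketch below puts them side by side under hypothetical, trimmed-down types. `MastersAndAgentsUseAvailabilityZones` is the name suggested above; `AllNodesUseAvailabilityZones` is a name invented for this example, and the empty-pool behavior of `HasZonesForAllAgentPools` is an assumption.

```go
package main

import "fmt"

// Hypothetical, trimmed-down stand-ins for the api package types.
type MasterProfile struct{ AvailabilityZones []string }

type AgentPoolProfile struct{ AvailabilityZones []string }

type Properties struct {
	MasterProfile     *MasterProfile
	AgentPoolProfiles []*AgentPoolProfile
}

func (m *MasterProfile) HasAvailabilityZones() bool { return len(m.AvailabilityZones) > 0 }

func (a *AgentPoolProfile) HasAvailabilityZones() bool { return len(a.AvailabilityZones) > 0 }

// HasZonesForAllAgentPools returns true only if there is at least one
// agent pool and every pool defines zones (assumed behavior).
func (p *Properties) HasZonesForAllAgentPools() bool {
	if len(p.AgentPoolProfiles) == 0 {
		return false
	}
	for _, ap := range p.AgentPoolProfiles {
		if !ap.HasAvailabilityZones() {
			return false
		}
	}
	return true
}

// MastersAndAgentsUseAvailabilityZones is the strict reading: a master
// profile must exist and use zones, and so must every agent pool.
func (p *Properties) MastersAndAgentsUseAvailabilityZones() bool {
	return p.MasterProfile != nil && p.MasterProfile.HasAvailabilityZones() &&
		p.HasZonesForAllAgentPools()
}

// AllNodesUseAvailabilityZones is the lenient reading: a cluster without
// a master profile qualifies as long as all of its agent pools use zones.
func (p *Properties) AllNodesUseAvailabilityZones() bool {
	return (p.MasterProfile == nil || p.MasterProfile.HasAvailabilityZones()) &&
		p.HasZonesForAllAgentPools()
}

func main() {
	agentsOnly := &Properties{AgentPoolProfiles: []*AgentPoolProfile{
		{AvailabilityZones: []string{"1", "2"}},
	}}
	fmt.Println(agentsOnly.MastersAndAgentsUseAvailabilityZones()) // false
	fmt.Println(agentsOnly.AllNodesUseAvailabilityZones())         // true
}
```

For clusters that do define a master profile the two functions agree, so the choice only matters for agent-pool-only clusters.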
On "AZ for agent pool only (AKS) clusters": that is the next thing we want to focus on after we have enabled AZ for masters and agents as it is done currently.
On "The way I see this if a cluster only has agents and all its agents use AZ then all the cluster nodes use AZ": this PR currently has validations to ensure all profiles (masters and agents) use AZ in the cluster. We do not allow a mix of AZ and non-AZ when creating a new cluster. Zones need to be explicitly configured, hence the requirement to ask the user to configure AZ for both masters and agents. Let me know if you have other questions or recommendations.
To avoid confusion, I have added the following for the availabilityZones feature, as documented in clusterdefinition.md. Please let me know if any of this is not clear or can be improved.
To protect your cluster from datacenter-level failures, you can enable the Availability Zones feature for your cluster by configuring "availabilityZones" for the master profile and all of the agentPool profiles in the cluster definition. This feature only applies to Kubernetes clusters version 1.12+. Supported values are arrays of strings, each representing a supported availability zone in a region for your subscription, e.g. "availabilityZones": ["1","2"] means zone 1 and zone 2 can be used. To get supported zones for a region in your subscription, run az vm list-skus --location centralus --query "[?name=='Standard_DS2_v2'].[locationInfo, restrictions]" -o table. You should see values like 'zones': ['2', '3', '1'] appear in the first column. If NotAvailableForSubscription appears in the output, then create an Azure support ticket to enable zones for that region. NOTE: To ensure high availability, each profile must define at least two nodes per zone, e.g. an agent pool profile with 2 zones, "availabilityZones": ["1","2"], must have at least 4 nodes total with "count": 4. When "availabilityZones" is configured, the "loadBalancerSku" will default to Standard, as Standard LoadBalancer is required for availability zones.
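The "at least two nodes per zone" rule in the NOTE above reduces to a simple count check. A minimal sketch follows; the function name and signature are hypothetical, while the error text is the one quoted from validate.go earlier in this review.

```go
package main

import (
	"errors"
	"fmt"
)

// validateZoneBalance enforces the documented rule that a profile using
// availability zones needs at least two nodes per configured zone.
// The name and signature are hypothetical.
func validateZoneBalance(count int, zones []string) error {
	if len(zones) > 0 && count < 2*len(zones) {
		return errors.New("the node count and the number of availability zones provided can result in zone imbalance. To achieve zone balance, each zone should have at least 2 nodes or more")
	}
	return nil
}

func main() {
	zones := []string{"1", "2"}
	fmt.Println(validateZoneBalance(4, zones) == nil) // 2 zones x 2 nodes: accepted
	fmt.Println(validateZoneBalance(3, zones) == nil) // one zone would hold a single node: rejected
}
```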
a6c2da3 to 3ebf079
3ebf079 to 7151e10
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: CecileRobertMichon, ritazh
* add support for k8s v1.12.0-rc.1 (Azure#3872) * Adding DeleteApp func to AzureClient and returning appObjectID in CreateApp (Azure#3869) * Update AKS base image to 0.15.0 (Azure#3870) * Disable AKS VHD for sovereign clouds (Azure#3874) * disable outbound internet check (Azure#3878) * Optimizing template conditional blocks in K8s templates (Azure#3871) * Enforce windows password complexity requirements in acs-engine client… (Azure#3854) * Enforce windows password complexity requirements in acs-engine client. Azure#2407 Added Windows Agent password complexity check as per guidelines in https://docs.microsoft.com/en-us/windows/security/threat-protection/security-policy-settings/password-must-meet-complexity-requirements. This will prevent generation of arm templates if password complexity for windows vm does not meet the complexity requirements. * Addressing review comments: 1). Adding negative test cases to ensure passwords whose complexity is not met are getting rejected. 2). Error message reworded to convey the password complexity requirements enforced currently by the implemented regex * Adding two test cases - password with 0 length and password same as username * Replace deprecated Azure SDK method calls (Azure#3881) * Adding test case for Generate Cluster ID (Azure#3879) * Add availability zone support for masters (Azure#3864) * use dockerhub.akscn.io in mooncake (Azure#3887) * Update docs for AZ (Azure#3886) * Update docs for AZ * Address comments * E2E - enable focused tests (Azure#3885) * re-enable CSE 50 (Azure#3892) * add support for k8s v1.12.0-rc.2 (Azure#3893) See https://github.com/kubernetes/kubernetes/releases/tag/v1.12.0-rc.2 * Handle iterated subtest execution correctly (Azure#3894) * freshen ubuntu image (Azure#3898) * Actually allow cloudprovider rate limit / backoff disabling (Azure#3891) * Extracting property values to make ARM output variables accessible (Azure#3877) * Fixing unreported gosimple lints (Azure#3901) * Update azure-sdk-for-go 
to v21.0.0 (Azure#3884) * Update azure-sdk-for-go to v20.2.0 See https://github.com/Azure/azure-sdk-for-go/releases/tag/v20.2.0 * Update azure-sdk-for-go to v21.0.0 See https://github.com/Azure/azure-sdk-for-go/releases/tag/v21.0.0 * Windows dns connectivity - e2e tests (Azure#3760) * Add shortcuts for some common command-line arguments (Azure#3904) * Add zip package to VHD image (Azure#3912) * Add zip package to VHD image * Alphabetize packages to install with apt-get * Update AKS base image to 0.16.0 (Azure#3913) * Stop ginkgo tests after first failure (Azure#3922) * Perform JSON escaping of strings (Azure#3919) * removed duplication of shellQuote function and added test cases. (Azure#3927) * Change 'windowsVersion' to 'imageVersion' in docs for deploying specific windows version (Azure#3928) * Add support for Kubernetes 1.9.11 (Azure#3934) See https://github.com/kubernetes/kubernetes/releases/tag/v1.9.11 * Simplify some upgrader version cases (Azure#3924) * Use `echo -n` to skip adding newline to external command output (Azure#3940) * Add warning message for VMSS master deployments (Azure#3936) * Add Kubernetes 1.12.0 to VHD image (Azure#3942) * Migrating Get Addon By Name and Get Container Index By Name methods (Azure#3938) * Fix accidentally shadowed variable in upgrade cluster. (Azure#3943) * Docs Docs Docs! Adding windowsAgent apimodel parameters (Azure#3939) * change default value for osImage (Azure#3944) * Add support for Kubernetes 1.12.0 (Azure#3918) * ip-masq-agent as addon (Azure#3916) * Update AKS base image to 0.17.0 (Azure#3949) Adds support binaries for Kubernetes 1.12.0 and 1.9.11. 
* Move utility methods to the helper package (Azure#3948) * exit 3 means resource group doesn’t exist (Azure#3954) * AKS distro is for Kubernetes only (Azure#3951) * use westus2 for swarm tests (Azure#3956) * add basic distro tests for swarm, swarmmode, dcos (Azure#3957) * Update go-dev tools image for go 1.11.1 (Azure#3947) * Refactor VM prefix to template functions (Azure#3925) * Migrating cloud spec config to api package (Azure#3953) * Accelerated networking for Windows (Azure#3908) * Add support for Kubernetes 1.12.1 (Azure#3963) * Cleanup Packer directory after VHD build (Azure#3964) * can't move the same file twice (Azure#3965) * sudo sudo sudo (Azure#3967) * retain existing AKS SNAT implementation (Azure#3966) * create cgroups needed by kubelet's --system-reserved and --kube-reserved flags (Azure#3915) * Dont set default distro when OSType is Windows (Azure#3950) * vmss needs systemConf too (Azure#3970) * update AKS VHD image to ver 0.18.0 (Azure#3969) * Fix urls to gofi.sh (Azure#3973) * Strengthen unit tests for cluster ID (Azure#3972) * optimize customData payload by removing comments (Azure#3971) * bump default from 1.8 to 1.10 (Azure#3946) * Using local rand object to generate cluster ids (Azure#3978) * Update node-labels to 1.6+ standard (Azure#3980) * E2E: retry kubectl delete job (Azure#3981) * E2E: actually fail when no InternalIP, ssh master tweaks, delete retries (Azure#3982) * 1.12 uses coredns (Azure#3987) * Refactor: Moving set defaults logic from package acsengine to package api (Azure#3974) * Enable the kubelet-monitor systemd unit (Azure#3983) * k8s component tests should happen before api tests (Azure#3991) * add kubernetes1.12 example (Azure#3992) * gosimple fixes (Azure#3993) * Azure CNI v1.0.12 (Azure#3989) * bump etcd version (Azure#3975) * swarmm = swarm mode (Azure#3995) * Update apiversion to make it consistent in k8s templates (Azure#3909) * E2E: set stability iterations to 10 by default (Azure#3997) * kube-dns 1.14.13 for k8s 1.8 
and up (Azure#4004) * update kubernetes-dashboard to 1.10 (Azure#4005) * only schedule coredns pods on a linux node (Azure#4014) * add coredns image reference to components versions map (Azure#3998) * Remove redundant exechealthz references (Azure#4012) * health-monitor script doesn’t require docker (Azure#4028) * Updating the tag for omsagent container to use the latest production tag (Azure#4015) * Replace docker images with the official releases. (Azure#4026) * Fix linter errors reported by gosimple (Azure#4031) * Split Windows setup scripts, prepare for cleanup and multiple CRI (Azure#3994) * Image version bump (Azure#4033) * Doc style, minor updates pass (Azure#4017) * acsengine and deploy pass * Clean up the main README * pass over the kubernetes walkthrough doc - I think maybe just azure specific bits should stay here? * Review changes * Fix typo in prometheus-grafana-k8s extension (Azure#4039) * Add support for Kubernetes 1.10.9 (Azure#4040) See https://github.com/kubernetes/kubernetes/releases/tag/v1.10.9 * Add support for Kubernetes v1.13.0-alpha.1 (Azure#4036) * Fix the Authorization and ManagedIdentity api versions (Azure#4048) * schedule ip-masq-agent on masters (Azure#4049) * delay docker and kubelet health monitors for 30 mins (Azure#4050) * Don't block on Kubernetes installation cleanup operations (Azure#4056) * update to latest AKS VHD image (Azure#4054) * set default masterSubnet value for custom VNET (Azure#4058) * Updating oms agent tag to use the latest tag released (Azure#4059) * Don't test k8s 1.8 or 1.9 in CircleCI (Azure#4061) * Don't test k8s 1.8 or 1.9 in CircleCI * Add k8s 1.13 jobs to build_and_test_master task Also reordered the jobs so that maintainers are less likely to forget about adding both Linux and Windows jobs. 
* Azure CNI 1.0.12 should be in VHD image (Azure#4067) * 16.04:latest by default for Ubuntu distro flows (Azure#4068) * E2E: enable pod-svc connection test (Azure#4062) * E2E: reuse long-running apache pod, HPA stabilization (Azure#4073) * Update doc: keyvault-flexvol addon default flag (Azure#4072) * use latest AKS VHD (Azure#4074) * E2E: general hardening (Azure#4079) * test scale down as well (Azure#4087) * remove unused nsg for AKS (Azure#4085) * fix error log message (Azure#4088) * Fix issue where kubernetesDashboard params weren't being added despite e enabling the dashboard addon (Azure#4084) * use latest tag for flexvol versions (Azure#4091) * set FailureActions for docker, kubelet, kubeproxy (Azure#3905) * no more default stability test iterations (Azure#4095) * Update vmss master EncryptionWithExternalKms with userassignedidentity (Azure#4082) * update azure-npm to v1.0.13 (Azure#4094) * apt lock hygiene (Azure#4081) * Add userassignedidentity for EncryptionWithExternalKms (Azure#4089) * use azk8s.cn instead of akscn.io (Azure#4099) * Fix calico for k8s 1.12 (Azure#4090) * enable user-configurable Azure CNI URL (Azure#4097) * Fix standard lb with vmss master (Azure#4101) * Don't require vm tags (Azure#4100) * Moby container runtime (Azure#3896) * minor template optimization in kubernetesmastervarsvmss.t (Azure#4112) * Fix potential nil pointer dereference when VM tags are empty (Azure#4117) * add resilience to nvidia driver install/config (Azure#4113) * don’t timeout for apt (Azure#4121) * only install GPU if docker-engine (Azure#4122) * Make --profiling user configurable (Azure#4114) * suppressing sensitive openssl output (Azure#4123) * Configure Docker Version on Windows (Azure#4119) Tests passed. merging. thanks! 
* disable kubelet health monitor (Azure#4127) * Add support for Kubernetes v1.13.0-alpha.2 (Azure#4128) See https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.13.md#v1130-alpha2 * Add support for Kubernetes 1.11.4 (Azure#4130) See https://github.com/kubernetes/kubernetes/releases/tag/v1.11.4 * adjust pagefile size (Azure#4098) * Add support for Kubernetes 1.12.2 (Azure#4131) See https://github.com/kubernetes/kubernetes/releases/tag/v1.12.2 * Add external custom yaml for manifests (Azure#4092) * add AKSDockerEngine distro (Azure#4120) * Adding DOCKER_API_VERSION workaround (Azure#4141) * Adding DOCKER_API_VERSION workaround * Fix extra character added * restore exechealthz references (Azure#4145) * re-use ILB test deployment (Azure#4147) * Enable k8s features by default (Azure#4133) * Enable k8s features by default * Add test * Optimize CSE + FeatureFlags option for run in background (Azure#4104) * add VHD images w/ k8s 1.11.4 and k8s 1.12.2 (Azure#4146) * use china mirror in binary downloading (Azure#4137) * Make windows binary url configurable (Azure#4103) * Move the role assignment to the ARM template and fix api versions (Azure#4032) * Merging kubernetesmastervarsvmss into kubernetesmastervars (Azure#4116) * virtualNetworkName is needed for vmss masters (Azure#4159) * vmss masters listen on firstConsecutiveStaticIP (Azure#4162) * rationalize addons/kube-system e2e checks (Azure#4166) * vmss masters customvnet dependson lb (Azure#4167) * Remove unreachable NSG code (Azure#4164) * move k8s specific params to params_k8s.go (Azure#4156) * delete empty file (Azure#4180) * add Skip functionality for skipped tests (Azure#4181) * Output kernel version during VHD build (Azure#4176) * updated VHDs for aks and aks-docker-engine distros (Azure#4178) * distinct outbound test for mooncake clusters (Azure#4169) * update dependencies to point to latest k8s api release (Azure#4157) * Pre-pull hyperkube in VHD (Azure#4174) * use gcr.azk8s.cn for ip-masq-agent on 
Azure China (Azure#4190) * remove unnecessary bytes (Azure#4187) * cleanUpContainerImages (Azure#4195) * Add AKS container images to VHD build script (Azure#4194) * Update ip assignment and cert gen for vmss masters (Azure#4193) * Azure CNI v1.0.13 (Azure#4197) * Enable upgrade to next supported Kubernetes version (Azure#3968) * reduce customData overhead via streamlined boilerplate (Azure#4183) * Fix 1.8 cluster config (Azure#4200) * More validations for custom vnet and vmss masters (Azure#4199) * fix cilium cluster config (Azure#4202) * Calico support for azure-vnet-ipam (Azure#4154) * Update VHD image to 2018.11.06 (Azure#4201) * move kubeserviceCidr params to windowsparams tpl (Azure#4203) * Only cleanup AKS container images if cluster is not a hosted master cluster (Azure#4204) * 2 units of errata (Azure#4205) * setting default Images in addon defaults instead of params_k8s.go (Azure#4208) * mount xtables lock in proxy (Azure#4210) * test outbound for URLs that we know we need (Azure#4211) * imagePullPolicy: IfNotPresent for all versioned containers (Azure#4212) * don’t save _output as artifacts (Azure#4214) * fix standard lb (Azure#4217) * ensure N series clusters get aks-docker-engine (Azure#4221) * ensure addon image is overwritten during upgrade (Azure#4224) * Update to Docker 18.09 for Windows (Azure#4223) * ensure validate-dns job doesn’t already exist before creating (Azure#4230) * remove empty customdata yml file (Azure#4231) * adding back in double quotes one at a time (Azure#4235) * azure npm addon has differently named pods (Azure#4237) * E2E: ensure long-running-apache hpa doesn’t already exist before creating (Azure#4232) * Adding c:\tmp as needed to pass Kubernetes tests (Azure#4240) * up image to 1108 (Azure#4239) * Add no outbound internet feature flag (Azure#4222) * update azure-const.sh with new location of azure constants python file (Azure#4247) * Tigera Technical Advisory TTA-2018-001 (Azure#4244) * Enable pre-rendering of Container addons 
(Azure#4218) * Make orchestrator command Windows aware (Azure#4142) * Enable multiple Windows vmss agent pools - refactor pool names (Azure#3907) * consistent use of kubernetes image base (Azure#4233) * remove extraneous sed statements for mooncake (Azure#4253) * Add exechealthz to 1.12/13 section as 1.11 or earlier (Azure#4252) * append bug means we aren’t cleaning up! (Azure#4255) * *string PrincipalID needs to be nil-guarded (Azure#4258) * fix retrycmd_if_failure: $retries should be $r (Azure#4263) * Add DockerEngine feature flag (Azure#4262) * updating azureconst and adding PB6 skus (Azure#4265) * Fix outbound connection check for master VMSS (Azure#4267) * use mcr repos and disable smb flexvol addon (Azure#4266) * bash func definition needs () without “function” (Azure#4269) * Replace docker engine feature flag by existing cloud spec (Azure#4270) * remove DockerEngine FeatureFlag (Azure#4275) * E2E: rationalize node check + kube-system check, no kms (Azure#4273) * install gpu drivers before extracting hyperkube (Azure#4276) * [docs] Add documentation for GPU w/ docker-engine (Azure#4268) * Windows e2e scale up / down test Fixes#3632 (Azure#4264) * remove dead code. (Azure#4282) * remove one extra english paragraph in zh-cn readme. 
(Azure#4281) * rollback k8s client-go deps to v7.0.0 (Azure#4291) * Fix issue caused by updating azure.json (Azure#4279) * enable typha and add horizontal autoscaler (Azure#4290) * feat(perf): Invoke-WebRequest much slower then browser download (Azure#4294) * Set progresspreference to avoid progress bar and speed up downloads (Azure#4300) * Ensure we do have an error before testing it (Azure#4301) * update client-go to v9 (Azure#4296) * Update to Azure-CNI v1.0.14 (Azure#4297) * Make AvailabilitySet profile for master use Availability Zones (Azure#4286) * Updates from aks-engine spike (Azure#4302) * Fix prow set up * e2e changes * removing openshift artifacts * accelerated networking rationalization, with tests * remove additional sed statements for ip-masq addons * Update go-dev tools image for go 1.11.2 * remove unused azconst methods * add support PB6 vm skus * update azure_const unit test * update tiller versions in the recent versions of kubernetes * VSTS VHD pipeline hosted ubuntu pool * azureconst cruft * scale: persist scale down in api model * Add support for Kubernetes 1.11.5 * Fix docker-engine install in VHD pipeline * remove IsOpenShift from E2E * replace premature aks-engine reference * make validate-headers doesn’t exist, revert rename * all outbound checks are retried (Azure#4304) * fix bunch of warnings for arm templates. 
(Azure#4285) * Adding doc on how to set Azure CNI versions (Azure#4293) * Support Windows Server 2019 and make it default (Azure#4299) * fix malformed clusterautoscaler yaml bug (Azure#4322) * update kubernetes api to 1.12.3 (Azure#4315) * Prune non-go files from vendoring (Azure#4320) * Prune non-go files from vendoring * Work around errors from gosimple linter * Bump cluster-autoscaler to recommended version for 1.11.5 (Azure#4314) * Add kubelet system-reserved on Windows (Azure#3999) * Add system-reserved on Windows * Remove extra quotes * Add system-reserved on Windows * Update to match usage in #69960 * Re-escape quotes * Just add system reserved as planned at 2Gb * Bump VHD version to 2018.11.28 (Azure#4323) * Add test for docker based workflow (ContainerInventory) (Azure#4198) * Add copyright headers to source files (Azure#4324) * Add Copyright header * Add Copyright header to more files * Rearrange finicky package comments and enforce validate-headers in CI * Remove some extraneous diffs * Rename Makefile target to be more descriptive * deprecation notice (Azure#4335) * Use 2018.12.03 VHD images (Azure#4333) * we need newline (Azure#4341) * [BUG] orchestratorVersion should not get changed for ACS scale apiVersion 2017-07-01 (Azure#4346) * Enable Azure CNI 1.0.15 (Azure#4361) * clarified docs (Azure#4362) * chore: add config for "stale" bot service (Azure#4364) * chore: add config for "stale" bot service * fix: make PRs stale after a week This repo is deprecated and shouldn't be getting any PRs. * fix: rename stale config to have ".yml" extension
What this PR does / why we need it:
Adds availability zones support for masters
Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #1919
Special notes for your reviewer:
If applicable:
Release note:
cc @khenidak @feiskyer