PmTLS and tproxy improvements with failover and L7 traffic mgmt for k8s #17624

Merged · 7 commits · Jun 10, 2023

Changes from 3 commits
69 changes: 69 additions & 0 deletions build-support/scripts/oss-ent-shared-file-drift-detection.sh
@@ -0,0 +1,69 @@
#!/usr/bin/env bash
# Copyright (c) HashiCorp, Inc.
# SPDX-License-Identifier: MPL-2.0

set -euo pipefail

unset CDPATH

cd "$(dirname "$0")" # build-support/scripts
cd ../.. # <ROOT>

if [[ ! -f GNUmakefile ]] || [[ ! -f go.mod ]]; then
    echo "not in root consul checkout: ${PWD}" >&2
    exit 1
fi

GIT_BRANCH="${GIT_BRANCH:-main}"

readonly oss_branch="oss/${GIT_BRANCH}"
readonly ent_branch="origin/${GIT_BRANCH}"

echo "=============="
echo "OSS ${GIT_BRANCH}: $(git show-ref "${oss_branch}")"
echo "ENT ${GIT_BRANCH}: $(git show-ref "${ent_branch}")"
echo "=============="

# compute files in oss
readonly oss_files=$(git ls-tree --name-only -r "${oss_branch}")

set +e

echo "Changelog differences (all changelog entries should be in oss and synced to enterprise):"
echo "=============="
git diff "${oss_branch}..${ent_branch}" --numstat -- ':.changelog'
echo "=============="

echo "Files that are different in ENT than in OSS:"
echo " git diff ${oss_branch}..${ent_branch} --numstat -- [elided]"
echo "=============="
git diff "${oss_branch}..${ent_branch}" --numstat -- ${oss_files} \
    ':!.github' \
    ':!*/.gitignore' \
    ':!.gitignore' \
    ':!.release' \
    ':!build-support' \
    ':!Dockerfile' \
    ':!GNUmakefile' \
    ':!*/go.mod' \
    ':!*/go.sum' \
    ':!go.mod' \
    ':!go.sum'
echo "=============="

echo "Actual diff follows:"
echo " git diff ${oss_branch}..${ent_branch} -- [elided]"
echo "=============="
git diff "${oss_branch}..${ent_branch}" -- ${oss_files} \
    ':!.github' \
    ':!*/.gitignore' \
    ':!.gitignore' \
    ':!.release' \
    ':!build-support' \
    ':!Dockerfile' \
    ':!GNUmakefile' \
    ':!*/go.mod' \
    ':!*/go.sum' \
    ':!go.mod' \
    ':!go.sum'
exit 0
45 changes: 45 additions & 0 deletions website/content/docs/connect/failover/index.mdx
@@ -0,0 +1,45 @@
---
layout: docs
page_title: Failover configuration overview
description: Learn about failover strategies and service mesh features you can implement to route traffic if services become unhealthy or unreachable, including sameness groups, prepared queries, and service resolvers.
---

# Failover overview

Services in your mesh may become unhealthy or unreachable for many reasons, but you can mitigate some of the effects of infrastructure issues by configuring Consul to automatically route traffic to and from failover service instances. This topic provides an overview of the failover strategies you can implement with Consul.

## Service failover strategies in Consul

There are several methods for implementing failover strategies between datacenters in Consul. You can adopt one of the following strategies based on your deployment configuration and network requirements:

- Configure the `Failover` stanza in a service resolver configuration entry to explicitly define which services should fail over and the targeting logic they should follow.
- Make a prepared query for each service that you can use to automate geo-failover.
- Create a sameness group to identify partitions with identical namespaces and service instance names to establish default failover targets.

The following table compares these strategies in deployments with multiple datacenters to help you determine the best approach for your service:

| Failover Strategy | Supports WAN Federation | Supports Cluster Peering | Multi-Datacenter Failover Strength | Multi-Datacenter Usage Scenario |
| :---------------: | :---------------------: | :----------------------: | :--------------------------------- | :------------------------------ |
| `Failover` stanza | &#9989; | &#9989; | Enables more granular logic for failover targeting | Configuring failover for a single service or service subset, especially for testing or debugging purposes |
| Prepared query | &#9989; | &#10060; | Central policies that can automatically target the nearest datacenter | WAN-federated deployments where a primary datacenter is configured |
| Sameness groups | &#10060; | &#9989; | Group size changes without edits to existing member configurations | Cluster peering deployments with consistently named services and namespaces |

### Failover configurations for a service mesh with a single datacenter

You can implement a service resolver configuration entry and specify a pool of failover service instances that other services can exchange messages with when the primary service becomes unhealthy or unreachable. We recommend adopting this strategy as a minimum baseline when implementing Consul service mesh and layering additional failover strategies to build resilience into your application network.

Refer to the [`Failover` configuration](/consul/docs/connect/config-entries/service-resolver#failover) for details on how to configure failover services in the service resolver configuration entry on VMs. For Kubernetes-orchestrated networks, refer to [Configure failover service instances]().
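
The following minimal sketch shows what such a service resolver configuration entry might look like; the service name `api` and the failover datacenter `dc2` are illustrative assumptions rather than values from this page:

```hcl
# Hypothetical example: route traffic for any subset of "api" to healthy
# instances in dc2 when no local instances are available.
Kind = "service-resolver"
Name = "api"

Failover = {
  "*" = {
    Datacenters = ["dc2"]
  }
}
```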

### Failover configuration for WAN-federated datacenters

If your network has multiple Consul datacenters that are federated over the WAN, you can configure your applications to look for failover services with prepared queries. [Prepared queries](/consul/api-docs/) are configurations that enable you to define complex service discovery lookups. This strategy hinges on the secondary datacenter containing service instances that have the same name and reside in the same namespace as their counterparts in the primary datacenter.

Refer to the [Automate geo-failover with prepared queries tutorial](/consul/tutorials/developer-discovery/automate-geo-failover) for additional information.
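
As an illustrative sketch, a prepared query that fails over to the nearest datacenters might use a request body like the following, registered through Consul's `/v1/query` HTTP endpoint; the query name, service name, and datacenters are assumptions:

```json
{
  "Name": "api-geo-failover",
  "Service": {
    "Service": "api",
    "Failover": {
      "NearestN": 2,
      "Datacenters": ["dc2", "dc3"]
    }
  }
}
```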

### Failover configuration for peered clusters and partitions

In networks with multiple datacenters or partitions that share a peer connection, each datacenter or partition functions as an independent unit. As a result, Consul does not correlate services that have the same name, even if they are in the same namespace.

You can configure sameness groups for this type of network. Sameness groups allow you to define a group of admin partitions where identical service instances are deployed in identical namespaces. After you configure the sameness group, you can reference the `SamenessGroup` parameter in service resolver, exported service, and service intention configuration entries, enabling you to add or remove cluster peers from the group without updating the configuration of every existing peer each time.

Refer to the [Sameness groups usage page](/consul/docs/connect/cluster-peering/usage/sameness-groups) for more information.
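
As a rough sketch, a sameness group configuration entry might look like the following; the group name, partition, and peer names are hypothetical:

```hcl
# Hypothetical example: treat services in the local "store-east" partition and
# the "dc2-store-west" peer as interchangeable default failover targets.
Kind               = "sameness-group"
Name               = "products"
DefaultForFailover = true

Members = [
  { Partition = "store-east" },
  { Peer      = "dc2-store-west" }
]
```
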
130 changes: 41 additions & 89 deletions website/content/docs/connect/l7-traffic/index.mdx
@@ -1,126 +1,78 @@
---
layout: docs
page_title: Service Mesh Traffic Management - Overview
page_title: Service mesh traffic management overview
description: >-
Consul can route, split, and resolve Layer 7 traffic in a service mesh to support workflows like canary testing and blue/green deployments. Learn about the three configuration entry kinds that define L7 traffic management behavior in Consul.
---

-> **1.6.0+:** This feature is available in Consul versions 1.6.0 and newer.
# Service mesh traffic management overview

# Service Mesh Traffic Management Overview
This topic provides an overview of the application layer, or layer 7 (L7), traffic management capabilities available in Consul service mesh.

Layer 7 traffic management allows operators to divide L7 traffic between
different
[subsets](/consul/docs/connect/config-entries/service-resolver#service-subsets) of
service instances when using service mesh.
## Introduction

There are many ways you may wish to carve up a single datacenter's pool of
services beyond simply returning all healthy instances for load balancing.
Canary testing, A/B tests, blue/green deploys, and soft multi-tenancy
(prod/qa/staging sharing compute resources) all require some mechanism of
carving out portions of the Consul catalog smaller than the level of a single
service and configuring when that subset should receive traffic.
Consul service mesh allows you to divide application layer traffic between different subsets of service instances. You can leverage L7 traffic management capabilities to support complex workflows, such as configuring backup services for failover scenarios, canary and A/B testing, blue/green deployments, and soft multi-tenancy in which production, QA, and staging environments share compute resources. L7 traffic management with Consul service mesh also lets you carve out portions of the Consul catalog smaller than a single service and configure when each subset should receive traffic.

-> **Note:** This feature is not compatible with the
[built-in proxy](/consul/docs/connect/proxies/built-in),
[native proxies](/consul/docs/connect/native),
and some [Envoy proxy escape hatches](/consul/docs/connect/proxies/envoy#escape-hatch-overrides).
You cannot manage L7 traffic with the [built-in proxy](/consul/docs/connect/proxies/built-in),
[native proxies](/consul/docs/connect/native), or some [Envoy proxy escape hatches](/consul/docs/connect/proxies/envoy#escape-hatch-overrides).

## Stages
## Discovery chain

Service mesh proxy upstreams are discovered using a series of stages: routing,
splitting, and resolution. These stages represent different ways of managing L7
traffic.
Consul uses a series of stages to discover service mesh proxy upstreams. Each stage represents a different way of managing L7 traffic. Together, these stages are referred to as the _discovery chain_:

![screenshot of L7 traffic visualization in the UI](/img/l7-routing/full.png)
- routing
- splitting
- resolution

Each stage of this discovery process can be dynamically reconfigured via various
[configuration entries](/consul/docs/agent/config-entries). When a configuration
entry is missing, that stage will fall back on reasonable default behavior.
For information about how Consul discovers service mesh proxy upstreams through the discovery chain, refer to [Discovery Chain for Service Mesh Traffic Management](/consul/docs/connect/l7-traffic/discovery-chain).

### Routing
The Consul UI shows discovery chain stages in the **Routing** tab of the **Services** page:

A [`service-router`](/consul/docs/connect/config-entries/service-router) config
entry kind is the first configurable stage.
![screenshot of L7 traffic visualization in the UI](/img/l7-routing/full.png)

![screenshot of service router in the UI](/img/l7-routing/Router.png)
You can define how Consul manages each stage of the discovery chain in a Consul _configuration entry_. [Configuration entries](/consul/docs/connect/config-entries) modify the default behavior of the Consul service mesh.

### Routing

A router config entry allows for a user to intercept traffic using L7 criteria
such as path prefixes or http headers, and change behavior such as by sending
traffic to a different service or service subset.
The first stage of the discovery chain is the service router. Routers intercept traffic according to a set of L7 attributes, such as path prefixes and HTTP headers, and route the traffic to a different service or service subset.

These config entries may only reference `service-splitter` or
`service-resolver` entries.
Apply a [service router configuration entry](/consul/docs/connect/config-entries/service-router) to implement a router. Service router configuration entries can only reference service splitter or service resolver configuration entries.

[Examples](/consul/docs/connect/config-entries/service-router#sample-config-entries)
can be found in the `service-router` documentation.
![screenshot of service router in the UI](/img/l7-routing/Router.png)
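
As a brief sketch, a router that sends requests with a specific path prefix to a different service might look like the following; the service names and path are assumptions:

```hcl
# Hypothetical example: send requests under /admin to the "admin" service;
# all other traffic continues to the "web" service.
Kind = "service-router"
Name = "web"

Routes = [
  {
    Match {
      HTTP {
        PathPrefix = "/admin"
      }
    }

    Destination {
      Service = "admin"
    }
  }
]
```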

### Splitting

A [`service-splitter`](/consul/docs/connect/config-entries/service-splitter) config
entry kind is the next stage after routing.
The second stage of the discovery chain is the service splitter. Service splitters split incoming requests and route them to different services or service subsets. Splitters enable staged canary rollouts, versioned releases, and similar use cases.

![screenshot of service splitter in the UI](/img/l7-routing/Splitter.png)
Apply a [service splitter configuration entry](/consul/docs/connect/config-entries/service-splitter) to implement a splitter. Service splitter configuration entries can only reference other service splitter or service resolver configuration entries.

A splitter config entry allows for a user to choose to split incoming requests
across different subsets of a single service (like during staged canary
rollouts), or perhaps across different services (like during a v2 rewrite or
other type of codebase migration).

These config entries may only reference `service-splitter` or
`service-resolver` entries.
![screenshot of service splitter in the UI](/img/l7-routing/Splitter.png)

If one splitter references another splitter the overall effects are flattened
into one effective splitter config entry which reflects the multiplicative
union. For instance:
If one service splitter references another, Consul flattens the splits so that they behave as a single effective splitter. In the following example, `splitter[B]` sends traffic to service `A`, whose own `splitter[A]` divides it further:

splitter[A]: A_v1=50%, A_v2=50%
splitter[B]: A=50%, B=50%
---------------------
splitter[effective_B]: A_v1=25%, A_v2=25%, B=50%
```text
splitter[A]: A_v1=50%, A_v2=50%
splitter[B]: A=50%, B=50%
---------------------
splitter[effective_B]: A_v1=25%, A_v2=25%, B=50%
```
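
For reference, a single splitter that performs a staged canary rollout might look like the following sketch; the service name and subset names are assumptions:

```hcl
# Hypothetical example: send 90% of "api" traffic to the v1 subset and 10% to v2.
Kind = "service-splitter"
Name = "api"

Splits = [
  {
    Weight        = 90
    ServiceSubset = "v1"
  },
  {
    Weight        = 10
    ServiceSubset = "v2"
  }
]
```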

[Examples](/consul/docs/connect/config-entries/service-splitter#sample-config-entries)
can be found in the `service-splitter` documentation.

### Resolution

A [`service-resolver`](/consul/docs/connect/config-entries/service-resolver) config
entry kind is the last stage.
The third stage of the discovery chain is the service resolver. Service resolvers specify which instances of a service satisfy discovery requests for the provided service name. Service resolvers enable several use cases, including:

![screenshot of service resolver in the UI](/img/l7-routing/Resolver.png)

A resolver config entry allows for a user to define which instances of a
service should satisfy discovery requests for the provided name.

Examples of things you can do with resolver config entries:

- Control where to send traffic if all instances of `api` in the current
datacenter are unhealthy.

- Configure service subsets based on `Service.Meta.version` values.
- Designate failovers when service instances become unhealthy or unreachable.
- Configure service subsets based on DNS values.
- Route traffic to the latest version of a service.
- Route traffic to specific Consul datacenters.
- Create virtual services that route traffic to instances of the actual service in specific Consul datacenters.

- Send all traffic for `web` that does not specify a service subset to the
`version1` subset.
Apply a [service resolver configuration entry](/consul/docs/connect/config-entries/service-resolver) to implement a resolver. Service resolver configuration entries can only reference other service resolvers.

- Send all traffic for `api` to `new-api`.

- Send all traffic for `api` in all datacenters to instances of `api` in `dc2`.

- Create a "virtual service" `api-dc2` that sends traffic to instances of `api`
in `dc2`. This can be referenced in upstreams or in other config entries.

If no resolver config is defined for a service it is assumed 100% of traffic
flows to the healthy instances of a service with the same name in the current
datacenter/namespace and discovery terminates.

This should feel similar in spirit to various uses of Prepared Queries, but is
not intended to be a drop-in replacement currently.

These config entries may only reference other `service-resolver` entries.
![screenshot of service resolver in the UI](/img/l7-routing/Resolver.png)

[Examples](/consul/docs/connect/config-entries/service-resolver#sample-config-entries)
can be found in the `service-resolver` documentation.
If no resolver is configured for a service, Consul sends all traffic to healthy instances of the service that have the same name in the current datacenter or specified namespace and ends the discovery chain.

-> **Note:** `service-resolver` config entries kinds can function at L4 (unlike
`service-router` and `service-splitter` kinds). These can be created for
services of any protocol such as `tcp`.
Service resolver configuration entries can also process network layer, or layer 4 (L4), traffic. As a result, you can implement service resolvers for services that communicate over `tcp` and other non-HTTP protocols.
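
As a combined sketch of the subset and failover behavior described above, a resolver might look like the following; the service name, metadata filter, and datacenter are assumptions:

```hcl
# Hypothetical example: define version subsets of "api" from service metadata,
# default to v1, and fail over to dc2 when no local instances are healthy.
Kind          = "service-resolver"
Name          = "api"
DefaultSubset = "v1"

Subsets = {
  v1 = {
    Filter = "Service.Meta.version == v1"
  }
  v2 = {
    Filter = "Service.Meta.version == v2"
  }
}

Failover = {
  "*" = {
    Datacenters = ["dc2"]
  }
}
```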