Merge pull request projectcalico#8 from projectcalico/rlb-add-docker-…
…build-rr

Add route reflector clustering, including ability to assign a cluster ID per RR
robbrockbank committed Aug 13, 2015
2 parents b7477fa + be6a006 commit 6a79023
Showing 6 changed files with 109 additions and 44 deletions.
94 changes: 76 additions & 18 deletions build_routereflector/README.md
@@ -15,11 +15,9 @@ Reflector image, then you will not need to build the image locally.

When starting a cluster of route reflectors, the Calico BIRD Route Reflector
takes care of creating a full mesh between all of the route reflectors in the
cluster. The IP addresses for the full set of route reflectors is passed in as
an environment variable parameter on the `docker run` command and therefore
needs to be known in advance. If you need to add a new Route Reflector to the
cluster you will need to restart each Route Reflector that is already running,
updating the environment variable parameter before restarting the instance.
cluster. When adding a new Route Reflector instance, add an entry into etcd.
All Route Reflector instances watch for new Route Reflectors and update their
peerings accordingly.
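
The peer selection can be illustrated with a short Python sketch. It loosely mirrors what the `bird.cfg.template` in this commit does with the `/rr_v4` entries: parse each JSON value, skip the local address, and derive a `Mesh_` protocol name from the IP. The function name `mesh_peers` and the sample addresses are hypothetical and exist only for illustration; the real logic lives in the confd template, not in Python.

```python
import json


def mesh_peers(rr_entries, own_ip):
    """Return (protocol_name, neighbor_ip) pairs for every other Route Reflector.

    rr_entries maps etcd keys under /calico/bgp/v1/rr_v4 to their JSON values.
    """
    peers = []
    for key in sorted(rr_entries):
        ip = json.loads(rr_entries[key])["ip"]
        # Skip empty entries and our own address, as the template does.
        if not ip or ip == own_ip:
            continue
        # The template names each peering Mesh_<IP with dots replaced by underscores>.
        peers.append(("Mesh_" + ip.replace(".", "_"), ip))
    return peers


# Hypothetical sample data: three Route Reflectors registered in etcd.
entries = {
    "/calico/bgp/v1/rr_v4/10.0.0.1": '{"ip": "10.0.0.1", "cluster_id": "1.0.0.1"}',
    "/calico/bgp/v1/rr_v4/10.0.0.2": '{"ip": "10.0.0.2", "cluster_id": "1.0.0.1"}',
    "/calico/bgp/v1/rr_v4/10.0.0.3": '{"ip": "10.0.0.3", "cluster_id": "1.0.0.2"}',
}

for name, ip in mesh_peers(entries, own_ip="10.0.0.1"):
    print(name, ip)
```

Because every instance evaluates the same set of etcd entries and excludes only itself, adding one entry is enough for all instances to pick up the new peer.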

### Route reflector peering with Calico Docker nodes

@@ -44,8 +42,6 @@ docker run -privileged -net=host -d \
-e IP=<IPv4_RR> \
[-e IP6=<IPv6_RR>] \
-e ETCD_AUTHORITY=<ETCD_IP:PORT> \
[-e RR_IP_ADDRS=<RR_IPv4_ADDRS>] \
[-e RR_IP6_ADDRS=<RR_IPv6_ADDRS>] \
calico/routereflector
```

@@ -58,16 +54,38 @@ Where:
binds to the host's IPv6 address)
- `<ETCD_IP:PORT>` is the colon separated IPv4 address and port of an etcd
node in the etcd cluster.
- `<RR_IP_ADDRS>` is a comma delimited set of IPv4 addresses of all of the
Route Reflectors in the cluster. You can include the Route Reflectors own
IPv4 address in this list (but it will be ignored) - this means you can use
the same list for all Route Reflectors in the cluster. This may be omitted
if you only have a single Route Reflector.
- `<RR_IP6_ADDRS>` is a comma delimited set of IPv6 addresses of all of the
Route Reflectors in the cluster. You can include the Route Reflectors own
IPv6 address in this list (but it will be ignored) - this means you can use
the same list for all Route Reflectors in the cluster. This may be omitted
if you are not using IPv6, or you only have a single Route Reflector.

#### Adding the Route Reflector into etcd

Add an entry in etcd for this Route Reflector. This tells the Route Reflector
to participate in peering, and provides enough information to allow the Route
Reflector instances to automatically form a full BGP mesh.

The configuration for the Route Reflector is stored at:

/calico/bgp/v1/rr_v4/<RR IPv4 address>

and

/calico/bgp/v1/rr_v6/<RR IPv6 address>

In all cases, the data is a JSON blob in the form:

{
"ip": "<IP address of BGP Peer>",
"cluster_id": "<Cluster ID for this RR (see notes)>"
}

To add this entry into etcd, you could use the following command:

curl -L http://<ETCD_IP:PORT>/v2/keys/calico/bgp/v1/rr_v4/<IPv4_RR> -XPUT -d value="{\"ip\":\"<IPv4_RR>\",\"cluster_id\":\"<CLUSTER_ID>\"}"

or

curl -L http://<ETCD_IP:PORT>/v2/keys/calico/bgp/v1/rr_v6/<IPv6_RR> -XPUT -d value="{\"ip\":\"<IPv6_RR>\",\"cluster_id\":\"<CLUSTER_ID>\"}"

See [below](#topology-with-multiple-calico-bird-route-reflectors) for details
about large networks and the use and format of the cluster ID.

Repeat the above instructions for every Route Reflector in the cluster.
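
When registering many Route Reflectors, it can help to generate the key and value rather than type them by hand. The following Python sketch builds the etcd key and JSON value in the form described above. The helper name `rr_etcd_entry` and the sample addresses are assumptions for illustration. The sketch only produces strings and does not talk to etcd.

```python
import ipaddress
import json


def rr_etcd_entry(ip, cluster_id):
    """Build the etcd key and JSON value for one Route Reflector."""
    # Pick rr_v4 or rr_v6 from the address family of the RR's IP.
    family = "rr_v6" if ipaddress.ip_address(ip).version == 6 else "rr_v4"
    if cluster_id:
        # The cluster ID is a number in IPv4 address format; this raises
        # ValueError if it is not.
        ipaddress.IPv4Address(cluster_id)
    key = "/calico/bgp/v1/{}/{}".format(family, ip)
    value = json.dumps({"ip": ip, "cluster_id": cluster_id})
    return key, value


# Hypothetical Route Reflector address and cluster ID.
key, value = rr_etcd_entry("10.0.0.1", "1.0.0.1")
print(key)
print(value)
```

The resulting key is the path to append to the etcd `/v2/keys` URL, and the value is what goes in the `value=` field of the PUT request.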

@@ -141,7 +159,7 @@ example, you may have:
- a network of 100,000 Calico Docker nodes
- each Calico Docker node is connected to two or three different Route
Reflectors.

### Configuring a node-specific Route Reflector peering

To configure a Route Reflector as a peer of a specific node, run the following
@@ -163,3 +181,43 @@ node.
[calico-docker]: http://github.com/projectcalico/calico-docker
[calicoctl]: https://github.com/projectcalico/calico-docker#how-does-it-work
[docker]: http://www.docker.com


## Topology with multiple Calico BIRD Route Reflectors

When the topology includes a cluster of Route Reflectors, BGP uses the concept
of a cluster ID to ensure there are no routing loops when distributing routes.

The provided Route Reflector image assumes a single fixed cluster ID for
each Route Reflector, rather than one that is configurable on a per-peer
basis. This simplifies the overall configuration of the network, but places
some limitations on the topology, as described here.

The topology is based on the Top of Rack model where you would have a set of
redundant route reflectors peering with all of the servers in the rack.

- Each rack is assigned its own cluster ID (a unique number in IPv4 address
format).
- Each node (server in the rack) peers with a redundant set of route
reflectors specific to that rack.
- All of the Route Reflectors across all racks form a full BGP mesh (this is
handled automatically by the Calico BIRD Route Reflector image and does not
require additional configuration).

![Example scale topology](mesh-topology.png)

For example, to set up the topology described above, you would:

- Spin up nodes N1 - N9
- Spin up Route Reflectors RR1 - RR6
- Add [node specific peers](#configuring-a-node-specific-route-reflector-peering),
peering:
* N1, N2 and N3 with RR1 and RR2
* N4, N5 and N6 with RR3 and RR4
* N7, N8 and N9 with RR5 and RR6
- Add [etcd config](#adding-the-route-reflector-into-etcd) for the Route
Reflectors:
* RR1 and RR2 both using the cluster ID 1.0.0.1
* RR3 and RR4 both using the cluster ID 1.0.0.2
* RR5 and RR6 both using the cluster ID 1.0.0.3
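
The rack model can also be written down as a small Python sketch that assigns one cluster ID per rack and lists which nodes and Route Reflectors belong together. It assumes, as the rack model describes, that the redundant Route Reflectors of a rack share that rack's cluster ID. The function name `rack_plan` and the `N<n>`/`RR<n>` naming scheme are illustrative only and are not part of the Route Reflector image.

```python
import ipaddress


def rack_plan(num_racks, nodes_per_rack=3, rrs_per_rack=2, first_cluster_id="1.0.0.1"):
    """Assign nodes, Route Reflectors and a unique cluster ID to each rack."""
    base = int(ipaddress.IPv4Address(first_cluster_id))
    plan = []
    for rack in range(num_racks):
        nodes = ["N{}".format(rack * nodes_per_rack + i + 1) for i in range(nodes_per_rack)]
        rrs = ["RR{}".format(rack * rrs_per_rack + i + 1) for i in range(rrs_per_rack)]
        # Cluster IDs are plain numbers in IPv4 address format, so the next
        # one can be derived by incrementing the integer value.
        cluster_id = str(ipaddress.IPv4Address(base + rack))
        plan.append({"nodes": nodes, "route_reflectors": rrs, "cluster_id": cluster_id})
    return plan


plan = rack_plan(3)
for rack in plan:
    print(rack["cluster_id"], rack["route_reflectors"], rack["nodes"])
```

Each entry in the plan corresponds to one rack: its nodes need node-specific peerings with the listed Route Reflectors, and those Route Reflectors need etcd entries carrying the rack's cluster ID.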

Binary file added build_routereflector/mesh-topology.png
5 changes: 1 addition & 4 deletions build_routereflector/node_filesystem/conf.d/bird.toml
@@ -2,8 +2,5 @@
src = "bird.cfg.template"
dest = "/config/bird.cfg"
prefix = "/calico/bgp/v1"
keys = [
"/host",
"/global"
]
keys = ["/"]
reload_cmd = "pkill -HUP bird || true"
5 changes: 1 addition & 4 deletions build_routereflector/node_filesystem/conf.d/bird6.toml
@@ -2,8 +2,5 @@
src = "bird6.cfg.template"
dest = "/config/bird6.cfg"
prefix = "/calico/bgp/v1"
keys = [
"/host",
"/global"
]
keys = ["/"]
reload_cmd = "pkill -HUP bird6 || true"
12 changes: 8 additions & 4 deletions build_routereflector/node_filesystem/templates/bird.cfg.template
@@ -18,19 +18,19 @@ template bgp bgp_template {
graceful restart; # See comment in kernel section about graceful restart.
}

{{$our_rr_key := printf "/rr_v4/%s" (getenv "IP")}}
{{if ls $our_rr_key}}{{$our_rr_data := json (getv $our_rr_key)}}

# ------------- RR-to-RR full mesh -------------
{{ if (getenv "RR_IP_ADDRS") }}
{{range $rr_ip := split (getenv "RR_IP_ADDRS") ","}}
{{if ls "/rr_v4"}}
{{range gets "/rr_v4/*"}}{{$data := json .Value}}{{$rr_ip := $data.ip}}
{{$nums := split $rr_ip "."}}{{$id := join $nums "_"}}
# For RR {{$rr_ip}}
{{if eq $rr_ip (getenv "IP") }}# Skipping ourselves
{{else if ne "" $rr_ip}}protocol bgp Mesh_{{$id}} from bgp_template {
local as {{getv "/global/as_num"}};
neighbor {{$rr_ip}} as {{getv "/global/as_num"}};
}{{end}}{{end}}
{{else}}
# No other route reflectors specified.
{{end}}


@@ -48,6 +48,7 @@ protocol bgp Global_{{$id}} from bgp_template {
local as {{$data.as_num}};
neighbor {{$cnode_ip}} as {{if exists $cnode_as_key}}{{getv $cnode_as_key}}{{else}}{{getv "/global/as_num"}}{{end}};
rr client;
{{if $our_rr_data.cluster_id}}rr cluster id {{$our_rr_data.cluster_id}};{{end}}
}
{{end}}
{{end}}
@@ -69,8 +70,11 @@ protocol bgp Node_{{$id}} from bgp_template {
local as {{$data.as_num}};
neighbor {{$cnode_ip}} as {{if exists $cnode_as_key}}{{getv $cnode_as_key}}{{else}}{{getv "/global/as_num"}}{{end}};
rr client;
{{if $our_rr_data.cluster_id}}rr cluster id {{$our_rr_data.cluster_id}};{{end}}
}
{{end}}
{{end}}
{{end}}
{{end}}

{{end}}
37 changes: 23 additions & 14 deletions build_routereflector/node_filesystem/templates/bird6.cfg.template
@@ -6,6 +6,8 @@ protocol device {
scan time 2; # Scan interfaces every 2 seconds
}

{{if eq "" (getenv "IP6")}}# IPv6 disabled on this node.
{{else}}
# Template for all BGP clients
template bgp bgp_template {
debug all;
@@ -19,35 +21,37 @@ template bgp bgp_template {
}


{{$our_rr_key := printf "/rr_v6/%s" (getenv "IP6")}}
{{if ls $our_rr_key}}{{$our_rr_data := json (getv $our_rr_key)}}

# ------------- RR-to-RR full mesh -------------
{{ if (getenv "RR_IP6_ADDRS") }}
{{range $rr_ip := split (getenv "RR_IP6_ADDRS") ","}}
{{$nums := split $rr_ip "."}}{{$id := join $nums "_"}}
{{if ls "/rr_v6"}}
{{range gets "/rr_v6/*"}}{{$data := json .Value}}{{$rr_ip := $data.ip}}
{{$nums := split $rr_ip ":"}}{{$id := join $nums "_"}}
# For RR {{$rr_ip}}
{{if eq $rr_ip (getenv "IP6") }}# Skipping ourselves
{{else if ne "" $rr_ip}}protocol bgp Mesh_{{$id}} from bgp_template {
local as {{getv "/global/as_num"}};
neighbor {{$rr_ip}} as {{getv "/global/as_num"}};
}{{end}}{{end}}
{{else}}
# No other route reflectors specified.
{{end}}


# ------------- RR as a global peer -------------
{{if ls "/global/bgp_peer_v6"}}
{{range gets "/global/bgp_peer_v6/*"}}{{$data := json .Value}}
{{if ls "/global/peer_v6"}}
{{range gets "/global/peer_v6/*"}}{{$data := json .Value}}
{{if eq $data.ip (getenv "IP6")}}
# This RR is a global peer with *all* calico nodes.
{{range $cnode := lsdir "/host"}}
{{$cnode_as_key := printf "/host/%s/bgp_as" $cnode}}
{{$cnode_ip_key := printf "/host/%s/bird6_ip" $cnode}}{{$cnode_ip := getv $cnode_ip_key}}
{{$cnode_as_key := printf "/host/%s/as_num" $cnode}}
{{$cnode_ip_key := printf "/host/%s/ip_addr_v6" $cnode}}{{$cnode_ip := getv $cnode_ip_key}}
{{$nums := split $cnode_ip ":"}}{{$id := join $nums "_"}}
# Peering with Calico node {{$cnode}}
protocol bgp Global_{{$id}} from bgp_template {
local as {{$data.as_num}};
neighbor {{$cnode_ip}} as {{if exists $cnode_as_key}}{{getv $cnode_as_key}}{{else if exists "/global/bgp_as"}}{{getv "/global/bgp_as"}}{{else}}64511{{end}};
neighbor {{$cnode_ip}} as {{if exists $cnode_as_key}}{{getv $cnode_as_key}}{{else}}{{getv "/global/as_num"}}{{end}};
rr client;
{{if $our_rr_data.cluster_id}}rr cluster id {{$our_rr_data.cluster_id}};{{end}}
}
{{end}}
{{end}}
@@ -57,20 +61,25 @@ protocol bgp Global_{{$id}} from bgp_template {

# ------------- RR as a node-specific peer -------------
{{range $cnode := lsdir "/host"}}
{{$node_peers_key := printf "/host/%s/bgp_peer_v6" $cnode}}
{{$node_peers_key := printf "/host/%s/peer_v6" $cnode}}
{{if ls $node_peers_key}}
{{range $peer := gets (printf "%s/*" $node_peers_key)}}{{$data := json $peer.Value}}
{{if eq $data.ip (getenv "IP6")}}
{{$cnode_as_key := printf "/host/%s/bgp_as" $cnode}}
{{$cnode_ip_key := printf "/host/%s/bird6_ip" $cnode}}{{$cnode_ip := getv $cnode_ip_key}}
{{$cnode_as_key := printf "/host/%s/as_num" $cnode}}
{{$cnode_ip_key := printf "/host/%s/ip_addr_v6" $cnode}}{{$cnode_ip := getv $cnode_ip_key}}
{{$nums := split $cnode_ip ":"}}{{$id := join $nums "_"}}
# RR configured as a specific peer for calico node {{$peer.Key}}
protocol bgp Node_{{$id}} from bgp_template {
local as {{$data.as_num}};
neighbor {{$cnode_ip}} as {{if exists $cnode_as_key}}{{getv $cnode_as_key}}{{else if exists "/global/bgp_as"}}{{getv "/global/bgp_as"}}{{else}}64511{{end}};
neighbor {{$cnode_ip}} as {{if exists $cnode_as_key}}{{getv $cnode_as_key}}{{else}}{{getv "/global/as_num"}}{{end}};
rr client;
{{if $our_rr_data.cluster_id}}rr cluster id {{$our_rr_data.cluster_id}};{{end}}
}
{{end}}
{{end}}
{{end}}
{{end}}

{{end}}

{{end}}
