Commit d89383f

docs: complete fab.yaml file
Also remove mentions of vlab outside of vlab section. Take Pau's suggestion to document password hash generation. Add links to external telemetry. Signed-off-by: Logan Blyth <logan@githedgehog.com>
1 parent d203494 commit d89383f

4 files changed

+110
-175
lines changed

docs/concepts/overview.md

Lines changed: 8 additions & 7 deletions
@@ -51,16 +51,17 @@ Wiring Diagram consists of the following resources:
5151

5252
## Fabricator
5353

54-
Installer builder and VLAB.
55-
56-
* Installer builder based on a preset (currently: `vlab` for virtual and `lab` for physical)
57-
* Main input: Wiring Diagram
58-
* All input artifacts coming from OCI registry
59-
* Always full airgap (everything running from private registry)
54+
Creates installation media.
55+
56+
* Features of fabricator:
57+
* Inputs: [Wiring Diagram](../install-upgrade/build-wiring.md) and
58+
[Config](../install-upgrade/config.md)
59+
* All input artifacts delivered via OCI registry
60+
* Capable of full airgap (everything running from private registry)
61+
installation
6062
* Flatcar Linux for Control Node, generated `ignition.json`
6163
* Automatic K3s installation and private registry setup
6264
* All components and their dependencies running in Kubernetes
63-
* Integrated Virtual Lab (VLAB) management
6465
* Future:
6566
* In-cluster (control) Operator to manage all components
6667
* Upgrades handling for everything starting Control Node OS

docs/install-upgrade/build-wiring.md

Lines changed: 0 additions & 27 deletions
@@ -26,33 +26,6 @@ OPTIONS:
2626
--help, -h show help
2727
```
2828

29-
Or you can generate a wiring diagram for a VLAB environment with flags to customize number of switches, links, servers, etc.:
30-
31-
```console
32-
ubuntu@sl-dev:~$ hhfab vlab gen --help
33-
NAME:
34-
hhfab vlab generate - generate VLAB wiring diagram
35-
36-
USAGE:
37-
hhfab vlab generate [command options]
38-
39-
OPTIONS:
40-
--bundled-servers value number of bundled servers to generate for switches (only for one of the second switch in the redundancy group or orphan switch) (default: 1)
41-
--eslag-leaf-groups value eslag leaf groups (comma separated list of number of ESLAG switches in each group, should be 2-4 per group, e.g. 2,4,2 for 3 groups with 2, 4 and 2 switches)
42-
--eslag-servers value number of ESLAG servers to generate for ESLAG switches (default: 2)
43-
--fabric-links-count value number of fabric links if fabric mode is spine-leaf (default: 0)
44-
--help, -h show help
45-
--mclag-leafs-count value number of mclag leafs (should be even) (default: 0)
46-
--mclag-peer-links value number of mclag peer links for each mclag leaf (default: 0)
47-
--mclag-servers value number of MCLAG servers to generate for MCLAG switches (default: 2)
48-
--mclag-session-links value number of mclag session links for each mclag leaf (default: 0)
49-
--no-switches do not generate any switches (default: false)
50-
--orphan-leafs-count value number of orphan leafs (default: 0)
51-
--spines-count value number of spines if fabric mode is spine-leaf (default: 0)
52-
--unbundled-servers value number of unbundled servers to generate for switches (only for one of the first switch in the redundancy group or orphan switch) (default: 1)
53-
--vpc-loopbacks value number of vpc loopbacks for each switch (default: 0)
54-
```
55-
5629
### Sample Switch Configuration
5730
``` { .yaml .annotate linenums="1" }
5831
apiVersion: wiring.githedgehog.com/v1beta1

docs/install-upgrade/config.md

Lines changed: 101 additions & 140 deletions
@@ -1,149 +1,33 @@
11
# Fabric Configuration
22
## Overview
3-
The `fab.yaml` file is the configuration file for the fabric. It supplies the configuration of the users, their credentials, logging, telemetry, and other non wiring related settings. The `fab.yaml` file is composed of multiple YAML documents inside of a single file. Per the YAML spec 3 hyphens (`---`) on a single line separate the end of one document from the beginning of the next. There are two YAML documents in the `fab.yaml` file. For more information about how to use `hhfab init`, run `hhfab init --help`.
3+
The `fab.yaml` file is the configuration file for the fabric. It supplies
4+
the configuration of the users, their credentials, logging, telemetry, and
5+
other non wiring related settings. The `fab.yaml` file is composed of multiple
6+
YAML documents inside of a single file. Per the YAML spec 3 hyphens (`---`) on
7+
a single line separate the end of one object from the beginning of the next.
8+
There are two YAML objects in the `fab.yaml` file. For more information about
9+
how to use `hhfab init`, run `hhfab init --help`.
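
To illustrate the `---` separator described above, here is a minimal Python sketch that splits a multi-document YAML string into its parts. It is an illustration only: the helper name `split_yaml_documents` is hypothetical, the sample text is trimmed to the `apiVersion` and `kind` fields, and the splitter only handles a bare `---` on its own line (a real YAML parser also accepts content after the marker).

```python
def split_yaml_documents(text: str) -> list[str]:
    """Split a multi-document YAML string on lines that contain only '---'."""
    documents, current = [], []
    for line in text.splitlines():
        if line.rstrip() == "---":
            # A separator closes the document collected so far (if any).
            if any(part.strip() for part in current):
                documents.append("\n".join(current).strip("\n"))
            current = []
        else:
            current.append(line)
    # Whatever follows the last separator is the final document.
    if any(part.strip() for part in current):
        documents.append("\n".join(current).strip("\n"))
    return documents


# Trimmed sample: only the fields needed to tell the two documents apart.
SAMPLE = """\
apiVersion: fabricator.githedgehog.com/v1beta1
kind: Fabricator
---
apiVersion: fabricator.githedgehog.com/v1beta1
kind: ControlNode
"""

docs = split_yaml_documents(SAMPLE)
kinds = [
    line.split(": ", 1)[1]
    for doc in docs
    for line in doc.splitlines()
    if line.startswith("kind:")
]
print(kinds)
```

For real tooling, a YAML library that supports multi-document streams is the safer choice; the sketch only shows where one document ends and the next begins.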
410

11+
## HHFAB workflow
512

6-
## Typical HHFAB workflows
13+
After `hhfab` has been [downloaded](../getting-started/download.md):
714

8-
### HHFAB for VLAB
9-
10-
For a VLAB user, the typical workflow with hhfab is:
11-
12-
1. `hhfab init --dev`
13-
1. `hhfab vlab gen`
14-
1. `hhfab vlab up`
15-
16-
The above workflow will get a user up and running with a spine-leaf VLAB.
17-
18-
### HHFAB for Physical Machines
19-
20-
It's possible to start from scratch:
21-
22-
1. `hhfab init` (see different flags to customize initial configuration)
15+
1. `hhfab init`(see different flags to customize initial configuration)
2316
1. Adjust the `fab.yaml` file to your needs
2417
1. `hhfab validate`
2518
1. `hhfab build`
2619

27-
Or import existing config and wiring files:
20+
Or import existing `fab.yaml` and wiring files:
2821

2922
1. `hhfab init -c fab.yaml -w wiring-file.yaml -w extra-wiring-file.yaml`
3023
1. `hhfab validate`
3124
1. `hhfab build`
3225

3326
After the above workflow a user will have a .img file suitable for installing the control node, then bringing up the switches which comprise the fabric.
3427

35-
## Fab.yaml
36-
37-
### Configure control node and switch users
38-
39-
Configuring control node and switch users is done either passing `--default-password-hash` to `hhfab init` or editing the resulting `fab.yaml` file emitted by `hhfab init`. You can specify users to be configured on the control node(s) and switches in the following format:
40-
41-
``` {.yaml .annotation linenums="1"}
42-
spec:
43-
config:
44-
control:
45-
defaultUser: # user 'core' on all control nodes
46-
password: "hashhashhashhashhash" # password hash
47-
authorizedKeys:
48-
- "ssh-ed25519 SecREKeyJumblE"
49-
50-
fabric:
51-
mode: spine-leaf # "spine-leaf" or "collapsed-core"
52-
53-
defaultSwitchUsers:
54-
admin: # at least one user with name 'admin' and role 'admin'
55-
role: admin
56-
#password: "$5$8nAYPGcl4..." # password hash
57-
#authorizedKeys: # optional SSH authorized keys
58-
# - "ssh-ed25519 AAAAC3Nza..."
59-
op: # optional read-only user
60-
role: operator
61-
#password: "$5$8nAYPGcl4..." # password hash
62-
#authorizedKeys: # optional SSH authorized keys
63-
# - "ssh-ed25519 AAAAC3Nza..."
64-
65-
```
28+
## Complete Fab.yaml Example File
6629

67-
Control node(s) user is always named `core`.
68-
69-
The role of the user,`operator` is read-only access to `sonic-cli` command on the switches. In order to avoid conflicts, do not use the following usernames: `operator`,`hhagent`,`netops`.
70-
71-
### NTP and DHCP
72-
The control node uses public ntp servers from cloudflare and google by default. The control node runs a dhcp server on the management network. See the [example file](#complete-example-file).
73-
74-
## Control Node
75-
The control node is the host that manages all the switches, runs k3s, and serves images. This is the YAML document configure the control node:
76-
``` {.yaml .annotation linenums="1"}
77-
apiVersion: fabricator.githedgehog.com/v1beta1
78-
kind: ControlNode
79-
metadata:
80-
name: control-1
81-
namespace: fab
82-
spec:
83-
bootstrap:
84-
disk: "/dev/sda" # disk to install OS on, e.g. "sda" or "nvme0n1"
85-
external:
86-
interface: enp2s0 # interface for external
87-
ip: dhcp # IP address for external interface
88-
management:
89-
interface: enp2s1 # interface for management
90-
91-
# Currently only one ControlNode is supported
92-
```
93-
The **management** interface is for the control node to manage the fabric switches, *not* end-user management of the control node. For end-user management of the control node specify the **external** interface name.
94-
95-
### Forward switch metrics and logs
96-
97-
There is an option to enable Grafana Alloy on all switches to forward metrics and logs to the configured targets using
98-
Prometheus Remote-Write API and Loki API. If those APIs are available from Control Node(s), but not from the switches,
99-
it's possible to enable HTTP Proxy on Control Node(s) that will be used by Grafana Alloy running on the switches to
100-
access the configured targets. It could be done by passing `--control-proxy=true` to `hhfab init`.
101-
102-
Metrics includes port speeds, counters, errors, operational status, transceivers, fans, power supplies, temperature
103-
sensors, BGP neighbors, LLDP neighbors, and more. Logs include agent logs.
104-
105-
Configuring the exporters and targets is currently only possible by editing the `fab.yaml` configuration file. An example configuration is provided below:
106-
107-
``` {.yaml .annotation linenums="1"}
108-
spec:
109-
config:
110-
...
111-
defaultAlloyConfig:
112-
agentScrapeIntervalSeconds: 120
113-
unixScrapeIntervalSeconds: 120
114-
unixExporterEnabled: true
115-
lokiTargets:
116-
grafana_cloud: # target name, multiple targets can be configured
117-
basicAuth: # optional
118-
password: "<password>"
119-
username: "<username>"
120-
labels: # labels to be added to all logs
121-
env: env-1
122-
url: https://logs-prod-021.grafana.net/loki/api/v1/push
123-
useControlProxy: true # if the Loki API is not available from the switches directly, use the Control Node as a proxy
124-
prometheusTargets:
125-
grafana_cloud: # target name, multiple targets can be configured
126-
basicAuth: # optional
127-
password: "<password>"
128-
username: "<username>"
129-
labels: # labels to be added to all metrics
130-
env: env-1
131-
sendIntervalSeconds: 120
132-
url: https://prometheus-prod-36-prod-us-west-0.grafana.net/api/prom/push
133-
useControlProxy: true # if the Loki API is not available from the switches directly, use the Control Node as a proxy
134-
unixExporterCollectors: # list of node-exporter collectors to enable, https://grafana.com/docs/alloy/latest/reference/components/prometheus.exporter.unix/#collectors-list
135-
- cpu
136-
- filesystem
137-
- loadavg
138-
- meminfo
139-
collectSyslogEnabled: true # collect /var/log/syslog on switches and forward to the lokiTargets
140-
```
141-
142-
For additional options, see the `AlloyConfig` [struct in Fabric repo](https://github.com/githedgehog/fabric/blob/master/api/meta/alloy.go).
143-
144-
## Complete Example File
145-
146-
``` {.yaml .annotation linenums="1" title="fab.yaml"}
30+
``` { .yaml .annotate title="fab.yaml" linenums="1"}
14731
apiVersion: fabricator.githedgehog.com/v1beta1
14832
kind: Fabricator
14933
metadata:
@@ -159,25 +43,25 @@ spec:
15943
- time.cloudflare.com
16044
- time1.google.com
16145
162-
defaultUser: # user 'core' on all control nodes
163-
password: "hash..." # password hash
46+
defaultUser: # username 'core' on all control nodes
47+
password: "hash..." # generate hash with openssl passwd -5
16448
authorizedKeys:
165-
- "ssh-ed25519 hash..."
49+
- "ssh-ed25519 key..." # generate ssh key with ssh-keygen
16650
16751
fabric:
16852
mode: spine-leaf # "spine-leaf" or "collapsed-core"
16953
includeONIE: true
17054
defaultSwitchUsers:
17155
admin: # at least one user with name 'admin' and role 'admin'
17256
role: admin
173-
password: "hash..." # password hash
57+
password: "hash..." # generate hash with openssl passwd -5
17458
authorizedKeys:
175-
- "ssh-ed25519 hash..."
59+
- "ssh-ed25519 key..."
17660
op: # optional read-only user
17761
role: operator
178-
password: "hash..." # password hash
62+
password: "hash..." # generate hash with openssl passwd -5
17963
authorizedKeys:
180-
- "ssh-ed25519 hash..."
64+
- "ssh-ed25519 key..." # generate ssh key with ssh-keygen
18165
18266
defaultAlloyConfig:
18367
agentScrapeIntervalSeconds: 120
@@ -187,13 +71,11 @@ spec:
18771
lokiTargets:
18872
lab:
18973
url: http://url.io:3100/loki/api/v1/push
190-
useControlProxy: true
19174
labels:
19275
descriptive: name
19376
prometheusTargets:
19477
lab:
19578
url: http://url.io:9100/api/v1/push
196-
useControlProxy: true
19779
labels:
19880
descriptive: name
19981
sendIntervalSeconds: 120
@@ -208,10 +90,89 @@ spec:
20890
bootstrap:
20991
disk: "/dev/sda" # disk to install OS on, e.g. "sda" or "nvme0n1"
21092
external:
211-
interface: eno2 # interface for external
93+
interface: eno2 # customer interface to manage control node
21294
ip: dhcp # IP address for external interface
213-
management:
95+
management: # interface that manages switches in private management network
21496
interface: eno1
21597
21698
# Currently only one ControlNode is supported
21799
```
100+
101+
### Configure Control Node and Switch Users
102+
103+
#### Control Node Users
104+
Control node and switch users are configured either by passing
105+
`--default-password-hash` to `hhfab init` or by editing the `fab.yaml`
106+
file emitted by `hhfab init`. The default username on the control node is
107+
`core`.
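
The example file suggests generating password hashes with `openssl passwd -5`, which emits a SHA-256 crypt string beginning with `$5$`. As a hedged illustration, the Python sketch below checks whether a value has that general shape before it is pasted into `fab.yaml`. The function name `looks_like_sha256_crypt` and the sample value are hypothetical, the sample is a synthetic placeholder rather than a real hash, and the check is purely syntactic: it does not verify any password, and it is not something `hhfab` is known to do.

```python
import re

# Shape of a SHA-256 crypt string: $5$[rounds=N$]salt$hash
# The salt is up to 16 characters and the hash is 43 characters,
# both drawn from the crypt alphabet ./0-9A-Za-z.
SHA256_CRYPT = re.compile(
    r"\$5\$(rounds=\d+\$)?[./0-9A-Za-z]{1,16}\$[./0-9A-Za-z]{43}"
)


def looks_like_sha256_crypt(value: str) -> bool:
    """Return True if value has the layout of a SHA-256 crypt hash."""
    return SHA256_CRYPT.fullmatch(value) is not None


# Synthetic placeholder with the right layout, not a real password hash.
placeholder = "$5$" + "saltsalt" + "$" + "A" * 43

print(looks_like_sha256_crypt(placeholder))
print(looks_like_sha256_crypt("hash..."))
```

A check like this can catch a plain-text password or a truncated hash left in the file by mistake.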
108+
109+
#### Switch Users
110+
The example file defines two switch users, `admin` and `op`. The `op` user has the
111+
`operator` role, which gives read-only access to the `sonic-cli` command on the switches. The `admin` user has
112+
broad administrative power on the switch.
113+
In order to avoid conflicts, do not use the following usernames: `operator`, `hhagent`, `netops`.
114+
115+
### NTP and DHCP
116+
The control node uses public NTP servers from Cloudflare and Google by default.
117+
The control node runs a DHCP server on the management network. See the [example file](#complete-example-file).
118+
119+
### Control Node
120+
The control node is the host that manages all the switches, runs k3s, and serves images.
121+
The **management** interface is for the control node to manage the fabric
122+
switches, *not* end-user management of the control node. For end-user
123+
management of the control node specify the **external** interface name.
124+
125+
### Telemetry
126+
127+
There is an option to enable [Grafana
128+
Alloy](https://grafana.com/docs/alloy/latest/) on all switches to forward metrics and logs to the configured targets using
129+
[Prometheus Remote-Write
130+
API](https://prometheus.io/docs/specs/prw/remote_write_spec/) and Loki API. Metrics include port speeds, counters,
131+
errors, operational status, transceivers, fans, power supplies, temperature
132+
sensors, BGP neighbors, LLDP neighbors, and more. Logs include Hedgehog agent logs.
133+
134+
Telemetry can be enabled after installation of the fabric. Open the following
135+
YAML file in an editor on the control node and modify the fields as needed. Logs
136+
can be pushed to a Grafana instance in the customer environment or to Grafana
137+
Cloud. To enable telemetry after install, use:
138+
139+
``` shell
140+
kubectl patch -n fab --type merge fabricator/default --patch-file telemetry.yaml
141+
```
142+
143+
```{ .yaml title="telemetry.yaml" linenums="1" }
144+
spec:
145+
config:
146+
fabric:
147+
defaultAlloyConfig:
148+
agentScrapeIntervalSeconds: 120
149+
unixScrapeIntervalSeconds: 120
150+
unixExporterEnabled: true
151+
lokiTargets:
152+
grafana_cloud: # target name, multiple targets can be configured
153+
basicAuth: # optional
154+
password: "<password>"
155+
username: "<username>"
156+
labels: # labels to be added to all logs
157+
env: env-1
158+
url: https://logs-prod-021.grafana.net/loki/api/v1/push
159+
prometheusTargets:
160+
grafana_cloud: # target name, multiple targets can be configured
161+
basicAuth: # optional
162+
password: "<password>"
163+
username: "<username>"
164+
labels: # labels to be added to all metrics
165+
env: env-1
166+
sendIntervalSeconds: 120
167+
url: https://prometheus-prod-36-prod-us-west-0.grafana.net/api/prom/push
168+
unixExporterCollectors: # list of node-exporter collectors to enable, https://grafana.com/docs/alloy/latest/reference/components/prometheus.exporter.unix/#collectors-list
169+
- cpu
170+
- filesystem
171+
- loadavg
172+
- meminfo
173+
collectSyslogEnabled: true # collect /var/log/syslog on switches and forward to the lokiTargets
174+
175+
```
176+
177+
For additional options, see the `AlloyConfig` [struct in Fabric repo](https://github.com/githedgehog/fabric/blob/master/api/meta/alloy.go).
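
The `kubectl patch --type merge` command above applies a JSON Merge Patch (RFC 7386): fields present in the patch are merged into the existing object, unrelated fields are kept, and lists are replaced as a whole. The Python sketch below illustrates those semantics on plain dictionaries. It is not how `kubectl` is implemented; the `merge_patch` helper is hypothetical, and the `existing` object is a heavily trimmed stand-in that assumes `mode` sits under `spec.config.fabric`.

```python
import copy


def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7386) and return the merged result."""
    if not isinstance(patch, dict):
        # Scalars and lists replace the target value wholesale.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # A null in the patch deletes the key from the target.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Trimmed stand-in for the existing Fabricator object (assumed layout).
existing = {"spec": {"config": {"fabric": {"mode": "spine-leaf"}}}}

# Trimmed stand-in for the contents of telemetry.yaml.
patch = {
    "spec": {
        "config": {
            "fabric": {
                "defaultAlloyConfig": {
                    "agentScrapeIntervalSeconds": 120,
                    "unixExporterCollectors": ["cpu", "filesystem"],
                }
            }
        }
    }
}

merged = merge_patch(existing, patch)
fabric = merged["spec"]["config"]["fabric"]
print(sorted(fabric))
```

Because lists are replaced rather than appended, a later patch that sets `unixExporterCollectors` must list every collector that should remain enabled.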
178+

docs/install-upgrade/install.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ for writing to a USB flash drive or mounting via IPMI virtual media. The first `
4343
run is `hhfab init`. This will generate the main configuration file, `fab.yaml`. `fab.yaml` is
4444
responsible for almost every configuration of the fabric with the exception of the wiring. Each
4545
command and subcommand have usage messages, simply supply the `-h` flag to your command or sub
46-
command to see the available options. For example `hhfab vlab -h` and `hhfab vlab gen -h`.
46+
command to see the available options. For example `hhfab init -h`.
4747

4848
### HHFAB commands to make a bootable image
4949
