
Configuration

Michael Dombrowski edited this page Sep 25, 2023 · 3 revisions

Overview

Configuration in Greengrass is a nested map which serves as the source of truth for everything Greengrass knows on the device. It contains all device configuration, component configuration, and component lifecycle definitions, among other things. On disk, the configuration is stored as a transaction log of newline-separated JSON entries. Notably, configuration keys are case-insensitive.

Configuration is modeled as a single root Topics node where Topics is a Node which has child nodes.

All configuration nodes (either Topic or Topics) are subclasses of Node. Every Node has a name, full path to the node as a string, full path to the node as an array of strings, set of "watchers", nullable parent Topics, modification timestamp, and a reference to the Context.

Topics is a Node that has child Nodes. The children are stored in a ConcurrentHashMap which maps from a CaseInsensitiveString to Node. CaseInsensitiveString is used to keep the Configuration case-insensitive for the keys.
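The effect of keying the children map by a case-insensitive wrapper can be sketched as follows. This is a minimal illustration only: `Key` is a hypothetical stand-in for CaseInsensitiveString, and the real class may be implemented differently.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class CaseInsensitiveKeySketch {
    // Hypothetical stand-in for CaseInsensitiveString (assumption, not the real class).
    static final class Key {
        private final String original; // kept so the key prints as it was written
        private final String folded;   // lower-cased form used for equality and hashing

        Key(String value) {
            this.original = value;
            this.folded = value.toLowerCase(Locale.ROOT);
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Key && folded.equals(((Key) other).folded);
        }

        @Override
        public int hashCode() {
            return folded.hashCode();
        }

        @Override
        public String toString() {
            return original;
        }
    }

    // Stores a child under one spelling and looks it up under another.
    static Object storeThenLookUp(String storedAs, String lookedUpAs) {
        Map<Key, Object> children = new ConcurrentHashMap<>();
        children.put(new Key(storedAs), "child node");
        return children.get(new Key(lookedUpAs));
    }
}
```

Because equality and hashing both use the folded form, `storeThenLookUp("LogLevel", "loglevel")` finds the child, while the original spelling is still available for display.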

Topic represents a leaf node of the configuration: it has no children and instead holds a value. As such, this Node simply adds a value field which can be any Object type. In practice, the value should only be a JSON primitive type (string, number, boolean, or null). A Topic can also store a list of JSON primitives, but lists should be avoided because they are error-prone and are stored inefficiently in the transaction log. Lists are error-prone because changing a value inside a list does not register as a change to the Topic; you need to call Topic.withValue manually after editing the list so that the new value is written to disk and not just held in memory. Lists are inefficient because the transaction log format has no special handling for them: the entire list is serialized every time the Topic changes, even if only one element changed. For these reasons, use maps instead of lists; even a map with integer keys is better than a list.
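The list pitfall can be illustrated with a toy model. `LeafSketch` below is hypothetical and much simpler than the real Topic; its `persistedWrites` counter merely stands in for entries appended to the transaction log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ListPitfallSketch {
    // Hypothetical, simplified leaf node (assumption, not the real Topic class).
    static final class LeafSketch {
        private Object value;
        private int persistedWrites; // stands in for transaction log entries

        LeafSketch withValue(Object newValue) {
            this.value = newValue;
            this.persistedWrites++;
            return this;
        }

        Object getValue() {
            return value;
        }

        int persistedWrites() {
            return persistedWrites;
        }
    }

    // Editing the list in place changes memory, but the leaf is never told.
    static int writesAfterInPlaceEdit() {
        List<String> hosts = new ArrayList<>(Arrays.asList("a", "b"));
        LeafSketch leaf = new LeafSketch().withValue(hosts); // write 1
        hosts.add("c");                                      // no write recorded
        return leaf.persistedWrites();
    }

    // Setting the value again records the change, at the cost of
    // writing the whole list once more.
    static int writesAfterExplicitUpdate() {
        List<String> hosts = new ArrayList<>(Arrays.asList("a", "b"));
        LeafSketch leaf = new LeafSketch().withValue(hosts); // write 1
        hosts.add("c");
        leaf.withValue(new ArrayList<>(hosts));              // write 2
        return leaf.persistedWrites();
    }
}
```

In this model the in-place edit leaves the write count at one even though the in-memory list now has three elements, which is the mismatch between memory and disk described above.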

Timestamps

Every node in the configuration tree carries a modification timestamp. When updating Configuration, the default behavior is to merge based on the timestamp of the incoming data compared to the timestamps of the data already in the Configuration. For example, if the Configuration currently has x=1, timestamp=100 and we merge x=2, timestamp=50, then the value of x will still be 1 because the incoming timestamp was older. When merging nested data, merging {x: {a: 1 } } with {x: {b: 2 } } yields {x: {a: 1, b: 2} } because merging never removes keys; it only adds or updates them. To remove keys, call mergeMap with an UpdateBehaviorTree. The UpdateBehaviorTree chooses, for each key and subtree, whether to apply the standard merge or to replace the existing contents with the incoming values. UpdateBehaviorTree is used to great effect during deployments.
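The default merge rule can be sketched with plain maps. This is an illustration under stated assumptions, not the Greengrass implementation: `Leaf` and `merge` are hypothetical names, a timestamp tie keeps the current value (the real tie-breaking rule may differ), and a leaf-versus-map type mismatch is simply left unchanged.

```java
import java.util.Map;

final class TimestampMergeSketch {
    // Hypothetical leaf: a value plus the timestamp it was written with.
    static final class Leaf {
        final Object value;
        final long timestamp;

        Leaf(Object value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Merges incoming into current. Nested maps are merged recursively,
    // leaves are replaced only by strictly newer leaves (assumption for ties),
    // and keys are never removed.
    @SuppressWarnings("unchecked")
    static void merge(Map<String, Object> current, Map<String, Object> incoming) {
        for (Map.Entry<String, Object> entry : incoming.entrySet()) {
            String key = entry.getKey();
            Object existing = current.get(key);
            Object update = entry.getValue();
            if (existing == null) {
                current.put(key, update); // new key: always added
            } else if (existing instanceof Map && update instanceof Map) {
                merge((Map<String, Object>) existing, (Map<String, Object>) update);
            } else if (existing instanceof Leaf && update instanceof Leaf) {
                if (((Leaf) update).timestamp > ((Leaf) existing).timestamp) {
                    current.put(key, update); // newer write wins
                }
            }
            // Leaf/map mismatches are left unchanged in this sketch.
        }
    }
}
```

Merging `x=2, timestamp=50` into `x=1, timestamp=100` leaves `x` at 1, and merging `{x: {b: 2}}` into `{x: {a: 1}}` leaves both `a` and `b` under `x`. Removing a key requires a different mechanism, which is the role UpdateBehaviorTree plays in the real mergeMap.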

A Watcher is an empty interface which is extended by Validator, Subscriber, and ChildChanged. Watchers are how parts of the Nucleus react to configuration changes automatically: they subscribe to part of the configuration with a callback which executes whenever that part changes. An example of this is how GenericExternalServices react to lifecycle definition changes.

When subscribing to changes, a common and natural mistake is to subscribe only to the exact Topic whose changes you care about. This is fragile because that Topic may be removed from the Configuration, and your subscriber is removed along with it. Even if the Topic you subscribed to is later recreated, your subscriber is lost for good. To make this concrete, suppose you subscribe to d in {a: {b: {c: {d: "important node" } } } } and the configuration is then updated to {a: {} }. At this point the subscriber is gone. If the configuration is later updated to {a: {b: {c: {d: "important change" } } } }, your subscriber will not execute because it was already removed, even though the node you cared about (d) is back and has a different value than before. To avoid this, subscribe to a node which will not be removed (such as the root configuration node of a service), check within the subscription callback what specifically changed (using childOf), and react only if you care about that change. Greengrass could fix this problem by tracking subscribers in a separate data structure, so that subscribers are not removed when the nodes they subscribe to are removed.
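The lost-subscriber scenario and the stable-parent workaround can be sketched with a toy tree. Everything here is hypothetical: `NodeSketch`, `subscribe`, `set`, and `remove` are simplified stand-ins rather than the Greengrass API, removals do not notify anyone in this model, and the path comparison in the callback plays the role that childOf plays in the real code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class StableSubscriptionSketch {
    // Hypothetical node: watchers live on the node, so they vanish with it.
    static final class NodeSketch {
        final Map<String, NodeSketch> children = new HashMap<>();
        final List<Consumer<List<String>>> watchers = new ArrayList<>();
        Object value;
    }

    final NodeSketch root = new NodeSketch();

    private NodeSketch lookup(List<String> path, boolean create) {
        NodeSketch node = root;
        for (String key : path) {
            NodeSketch child = node.children.get(key);
            if (child == null) {
                if (!create) {
                    return null;
                }
                child = new NodeSketch();
                node.children.put(key, child);
            }
            node = child;
        }
        return node;
    }

    void subscribe(List<String> path, Consumer<List<String>> watcher) {
        lookup(path, true).watchers.add(watcher);
    }

    // Sets a value, then notifies the changed node and every ancestor.
    void set(List<String> path, Object value) {
        lookup(path, true).value = value;
        NodeSketch node = root;
        notifyWatchers(node, path);
        for (String key : path) {
            node = node.children.get(key);
            notifyWatchers(node, path);
        }
    }

    // Drops a subtree together with its watchers; path must be non-empty.
    void remove(List<String> path) {
        NodeSketch parent = lookup(path.subList(0, path.size() - 1), false);
        if (parent != null) {
            parent.children.remove(path.get(path.size() - 1));
        }
    }

    private static void notifyWatchers(NodeSketch node, List<String> changedPath) {
        for (Consumer<List<String>> watcher : node.watchers) {
            watcher.accept(changedPath);
        }
    }

    // Returns {calls seen by the subscriber on d, calls seen by the subscriber on a}.
    static int[] demo() {
        StableSubscriptionSketch tree = new StableSubscriptionSketch();
        List<String> d = Arrays.asList("a", "b", "c", "d");
        int[] calls = new int[2];
        tree.set(d, "important node");
        tree.subscribe(d, changed -> calls[0]++);          // fragile: lives on d
        tree.subscribe(Arrays.asList("a"), changed -> {    // stable: lives on a
            if (changed.equals(d)) {
                calls[1]++;                                // react only to d
            }
        });
        tree.remove(Arrays.asList("a", "b"));              // config is now {a: {}}
        tree.set(d, "important change");                   // d is recreated
        return calls;
    }
}
```

In `demo()`, the subscriber attached directly to d never fires after d is recreated, while the subscriber attached to a still sees the change and filters for the path it cares about.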
