Terraform currently allows the declaration of explicit inter-resource dependencies using depends_on:
resource "example" "example1" {
}
resource "example" "example2" {
  depends_on = ["example.example1"]
}
The presence of the depends_on in the above example causes the graph builder to create a dependency edge from example2 to example1, which ensures that example1 is visited first during any graph traversal.
This mechanism does not generalize to other constructs within Terraform. In particular, it doesn't generalize to modules, since a module is not represented as a single node in the graph. Instead, each individual variable and output in a module is its own graph node, which allows us to optimize our parallelism by getting started on some aspects of a module before all of the input variables are ready, and to begin processing resources that depend on a module before all of its outputs are complete. Even though variables and outputs are in the graph, we do not currently support referring to them in depends_on.
The following proposal describes a generalization of the depends_on mechanism to apply to both resources and modules, with the goal of satisfying the use-cases discussed in #10462, allowing explicit dependencies on module variables and outputs, along with a syntax that creates the effect of an entire-module dependency.
New addressing forms for depends_on
We currently allow references to managed and data resources in depends_on. To support dependencies with modules, we must extend this to support the following forms:
- aws_instance.example: managed resource dependency, as today
- aws_instance.another_example[2]: a particular instance of a managed resource with count set
- data.template_file.example: data resource dependency, as today
- var.foo: dependency on an input variable passed by a parent module
- module.example.foo: dependency on an output of a named child module
- module.example: dependency on an entire module
Our improved configuration language parser (which, at the time of writing, is in the process of being integrated into Terraform Core) allows us to improve the depends_on syntax through direct use of expressions, rather than requiring these references to be inside quoted strings:
# DESIGN SKETCH: not yet implemented and may change before release
resource "example" "example2" {
  depends_on = [
    aws_instance.example,
    aws_instance.another_example[2],
    data.template_file.example,
    var.foo,
    module.example.foo,
    module.example,
  ]
}
This syntax will be used for the examples in the remainder of this proposal.
Support depends_on as a module block argument
The above allows modules to be used as explicit dependencies, but we need to additionally support depends_on inside module blocks in order to allow modules to have dependencies:
# DESIGN SKETCH: not yet implemented and may change before release
module "example" {
  depends_on = [
    aws_instance.example,
  ]
}
Depending on a Module Variable
At first glance, an explicit dependency on a var.foo expression feels a little strange: variables don't have externally-visible side-effects, so there seems to be little reason to depend on one without using its value.
However, allowing explicit dependencies on variables creates a mechanism for the author of a more-complex reusable module to create custom depends_on-like attributes that serve to block subsets of the functionality of the module. For example:
# DESIGN SKETCH: not yet implemented and may change before release
### in root module
module "database" {
}
module "app" {
  ami_id = "ami-1234"
  app_server_depends_on = [
    module.database,
  ]
}
### in module "app"
variable "app_server_depends_on" {
  default = []
}
resource "aws_security_group" "foo" {
  # Work on _this_ resource can begin immediately
  # ...
}
resource "aws_instance" "app_server" {
  ami = var.ami_id
  # ...
  # We can't create this resource until the caller tells us that it's
  # prepared some hidden dependencies.
  depends_on = [
    var.app_server_depends_on,
  ]
}
This makes it possible to create a re-usable module for deploying arbitrary applications (parameterized by an AMI to deploy, etc), which can immediately create supporting resources like the security group in this example, but defer creating the actual compute resources until some arbitrary, caller-defined dependencies have been dealt with. The caller knows that ami-1234 expects to have a database available to it on boot, while the re-usable module has no direct knowledge of that database.
The value of app_server_depends_on in the above example is not itself significant. Instead, we effectively pass the dependencies of that expression through to the module by creating a transitive dependency relationship in the graph.
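Because only the references in the expression matter, a caller could presumably pass something narrower than a whole module. The following is a minimal sketch under the proposed syntax, assuming the database module exposes a hypothetical output named endpoint (that name does not appear elsewhere in this proposal):

```hcl
# DESIGN SKETCH: not yet implemented and may change before release
### in root module
module "database" {
}

module "app" {
  ami_id = "ami-1234"

  # "endpoint" is a hypothetical output of module "database".
  # Only that output and whatever it depends on would need to be
  # complete before aws_instance.app_server is created.
  app_server_depends_on = [
    module.database.endpoint,
  ]
}
```

This uses the module.example.foo addressing form listed earlier, so the dependency covers a single output rather than every output of the module.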
Depending on a Whole Module
As noted above, modules are not represented directly by graph nodes today, so whole-module dependencies (either as dependencies or dependents) require some new graph-building functionality.
The most likely user intent for a dependency of the form module.example is to wait until everything in the module has completed before continuing. This behavior would have a severe impact on Terraform's ability to achieve parallelism though, and so this proposal suggests a compromise for when depends_on references a whole module: treat this as an alias for depending on each of the module's outputs, but not on any resources or nested modules.
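A configuration of the kind this compromise applies to can be sketched as follows. The resource names match those used in the discussion of this section; the output name example2_id is an assumption made for illustration only:

```hcl
# DESIGN SKETCH: not yet implemented and may change before release
### root module
module "example" {
}

resource "null_resource" "example" {
  # Proposed meaning: depend on each output of module.example,
  # but not directly on its resources or nested modules.
  depends_on = [
    module.example,
  ]
}

### module "example"
resource "null_resource" "example2" {
}

resource "null_resource" "example3" {
}

output "example2_id" {
  # Hypothetical output: it references example2 but not example3.
  value = null_resource.example2.id
}
```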
The biggest consequence of this compromise is that in the above example null_resource.example will block until module.example.null_resource.example2 is complete, but will not wait for module.example.null_resource.example3 because none of the module's outputs depend on that resource.
This consequence gives a measure of flexibility and control for the module author, however: if the author knows that the module performs a time-consuming operation but that this operation does not block access to the objects that the caller will depend on then this can be expressed by making that operation not be a dependency of the outputs. From the module caller's perspective, the module can still be thought of as a black box, with the module author designing it such that all significant effects of the module are referenced in an output. In effect, the module author uses output blocks to define what it means for the module to be considered "complete".
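Conversely, a module author who does want callers to wait for additional work could reference that work from an output. This is a sketch only, assuming a hypothetical output named ready declared inside module "example" alongside null_resource.example2 and null_resource.example3:

```hcl
# DESIGN SKETCH: not yet implemented and may change before release
### module "example"
output "ready" {
  # Hypothetical output: because it references both resources, a caller
  # that depends on module.example would wait for both to complete.
  value = [
    null_resource.example2.id,
    null_resource.example3.id,
  ]
}
```

Under the proposed rule, adding or removing a reference in an output like this is how the author would widen or narrow what "complete" means for callers.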
The improved configuration language, whose integration is in progress as we write this, allows passing the result of an entire module as a value into another module:
# DESIGN SKETCH: not yet implemented and may change before release
### root module
module "example1" {
}
module "example2" {
  example1 = module.example1
}
### module example1
output "id" {
  value = "placeholder-id"
}
### module example2
variable "example1" {
}
resource "null_resource" "example" {
  triggers = {
    example1_id = var.example1.id
  }
}
This new usage creates an implicit dependency between module.example2.var.example1 and all of the outputs of module.example1, since they must all be complete before the language runtime can construct the value of module.example1 to assign. This implicit usage further reinforces the idea that only the outputs are dependencies in this case, because that is what is necessary to construct the object value returned by module.example1.
Whole-module depends_on
Using depends_on in a module block will also limit parallelism, but the impact is less severe in this case because the effect is under the direct control of the caller module, and so its author can make a tradeoff to decide at what point the limited parallelism hurts enough to warrant more precise dependency handling:
# DESIGN SKETCH: not yet implemented and may change before release
### root module
variable "baz" {
}
resource "null_resource" "example1" {
  triggers = {
    example = "hello"
  }
}
module "example" {
  foo = var.baz
  depends_on = [
    null_resource.example1,
  ]
}
### module "example"
variable "foo" {
}
resource "null_resource" "example2" {
  triggers = {
    foo = var.foo
  }
}
resource "null_resource" "example3" {
}
module "example2" {
}
### module "example2"
resource "null_resource" "example4" {
}
Dependencies away from the module require the creation of a new "begin" graph node for the module that declares depends_on, which must then be a dependency of every resource in the module and of any downstream modules. To reduce the number of graph edges, a "begin" node will be created for each of the downstream modules too, so that only one additional edge needs to be added between the modules (to connect the "begin" nodes).
A "begin" graph node takes no action when visited during a walk and so just serves as an aggregation point to reduce the number of dependency edges. For a module block without depends_on the "begin" graph node can be safely optimized away, along with its incoming dependency edges, during graph construction.
depends_on in other contexts
depends_on can be useful for any Terraform construct that causes externally-visible side-effects, as a means to influence the ordering of those side-effects.
Provider initialization also sometimes has side effects, such as reaching out to an external network service to begin a session or to validate credentials. depends_on could therefore also be useful in provider blocks, as described in #2430. However, providers are special in that they need to be instantiated in all phases of Terraform's operation, and thus it is not always possible to force an ordering for provider initialization relative to resource creation as described in #4149. Implementation of depends_on for modules should not block on the implementation of "partial apply", but we should reserve the depends_on argument for provider blocks as part of implementing this proposal to minimize the risk that a provider in the wild will introduce its own depends_on configuration argument that would then be in conflict.
output, variable and locals blocks do not have any externally-visible side-effects and so depends_on would not serve any useful purpose for these blocks; it is always safe to evaluate the corresponding graph nodes as soon as their implicit dependencies become ready.
provisioner blocks within managed resources are not currently represented as separate graph nodes, and so they are processed as part of a create action for their parent resource node.

