
Databricks Premium Workspace Terraform module

Terraform module for managing Databricks Premium workspace resources.

Usage

Requires a Workspace with the "Premium" SKU

The main idea behind this module is to deploy resources for a Databricks Workspace with the Premium SKU only.

Here are some examples of how to provision it with different options.

The example below covers the following features of the module:

  1. Workspace admins assignment, custom Workspace group creation, group assignments, group entitlements
  2. Clusters (e.g., for Unity Catalog and Shared Autoscaling)
  3. Workspace IP Access list creation
  4. ADLS Gen2 Mount
  5. Secret scope and its secrets
  6. SQL Endpoint creation and configuration (a configuration sketch follows the example below)
  7. Create Cluster policy and assign permissions to custom groups
  8. Create Secret Scope and assign permissions to custom groups
  9. Connect to an already existing Unity Catalog Metastore

# Prerequisite resources

# Databricks Workspace with Premium SKU
data "azurerm_databricks_workspace" "example" {
  name                = "example-workspace"
  resource_group_name = "example-rg"
}

# Databricks Provider configuration
provider "databricks" {
  alias                       = "main"
  host                        = data.azurerm_databricks_workspace.example.workspace_url
  azure_workspace_resource_id = data.azurerm_databricks_workspace.example.id
}

# Key Vault where Service Principal's secrets are stored. Used for mounting the Storage Container
data "azurerm_key_vault" "example" {
  name                = "example-key-vault"
  resource_group_name = "example-rg"
}

# Storage Account mounted by the module (referenced in the mountpoints input below)
data "azurerm_storage_account" "example" {
  name                = "examplestorageaccount" # illustrative name
  resource_group_name = "example-rg"
}

# Example usage of module for Runtime Premium resources.
module "databricks_runtime_premium" {
  source  = "data-platform-hq/databricks-runtime-premium/databricks"

  project  = "datahq"
  env      = "example"
  location = "eastus"

  # Parameters of Service principal used for ADLS mount
  # Imports App ID and Secret of Service Principal from target Key Vault
  key_vault_id             =  data.azurerm_key_vault.example.id
  sp_client_id_secret_name = "sp-client-id" # secret's name that stores Service Principal App ID
  sp_key_secret_name       = "sp-key" # secret's name that stores Service Principal Secret Key
  tenant_id_secret_name    = "infra-arm-tenant-id" # secret's name that stores tenant id value

  # Databricks clusters configuration
  clusters = [{
    cluster_name       = "shared autoscaling"
    data_security_mode = "NONE"
    availability       = "SPOT_AZURE"
    spot_bid_max_price = -1
    permissions = [{group_name = "dev", permission_level = "CAN_MANAGE"}]
  }]

  # Databricks cluster policies
  custom_cluster_policies = [{
    name    = "custom_policy_1"
    can_use = ["DEVELOPERS"] # custom workspace group names that are allowed to use this policy
    definition = {
      "autoscale.max_workers" : {
        "type" : "range",
        "maxValue" : 3,
        "defaultValue" : 2
      },
    }
  }]
  # Workspace can be accessed only from these IP addresses:
  ip_rules = {
    "ip_range_1" = "10.128.0.0/16",
    "ip_range_2" = "10.33.0.0/16",
  }
  
  # ADLS Gen2 Mount (map of mounts; the key is an arbitrary mount name)
  mountpoints = {
    example = {
      storage_account_name = data.azurerm_storage_account.example.name
      container_name       = "example_container"
    }
  }

  # Map of users and their object IDs.
  # This step is optional. When assigning a Service Principal to the workspace,
  # provide its Application ID as the value.
  user_object_ids = {
    "example-service-principal" = "ebfasddf-05sd-4sdc-aasa-ddffgs83c299"
    "user1@example.com"         = "ebfasddf-05sd-4sdc-aasa-ddffgs83c256"
    "user2@example.com"         = "ebfasddf-05sd-4sdc-aasa-ddffgs83c865"
  }
  
  # To connect to an already existing metastore, provide its ID.
  external_metastore_id = "<uuid-of-metastore>"
  
  # Workspace admins
  workspace_admins = {
    user = ["user1@example.com"]
    service_principal = ["example-app-id"]
  }
  
  # Custom Workspace groups with assigned users/service principals.
  # Also provides the ability to set group entitlements and assign permissions
  # to a custom group on the default cluster.
  iam_workspace_groups = {
    DEVELOPERS = {
      user = [
        "user1@example.com",
        "user2@example.com"
      ]
      service_principal = []
      entitlements      = ["allow_instance_pool_create", "allow_cluster_create", "databricks_sql_access"]
    }
  }
  
  # Additional Secret Scope
  secret_scope = [{
    scope_name = "extra-scope"
    acl        = [{ principal = "DEVELOPERS", permission = "READ" }] # Only custom workspace group names are allowed. If left empty, only Workspace admins can access these secrets
    secrets    = [{ key = "secret-name", string_value = "secret-value"}]
  }]

  providers = {
    databricks = databricks.main
  }
}
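
The features list above mentions SQL Endpoint creation and configuration (item 6), which the example omits. Below is a minimal sketch of the sql_endpoint input, assuming the set-of-objects shape documented in the Inputs section; all values are illustrative. Place it inside the module block alongside the other inputs.

# SQL Endpoint configuration (sketch; values are illustrative)
sql_endpoint = [{
  cluster_size              = "2X-Small"
  min_num_clusters          = 1
  max_num_clusters          = 2
  auto_stop_mins            = "30"
  enable_photon             = false
  enable_serverless_compute = false
}]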

Requirements

Name Version
terraform >= 1.0.0
databricks >= 1.14.2
azurerm >= 3.40.0
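
A terraform settings block that satisfies these constraints might look like the following (a minimal sketch; the provider source addresses are the standard registry ones):

terraform {
  required_version = ">= 1.0.0"

  required_providers {
    databricks = {
      source  = "databricks/databricks"
      version = ">= 1.14.2"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.40.0"
    }
  }
}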

Providers

Name Version
databricks 1.14.2
azurerm 3.40.0

Modules

No modules.

Resources

Name Type
databricks_cluster.cluster resource
databricks_group.admin data
databricks_group.this resource
databricks_user.this resource
databricks_service_principal.this resource
databricks_group_member.admin resource
databricks_group_member.this resource
databricks_entitlements.this resource
databricks_cluster_policy.this resource
databricks_permissions.clusters resource
databricks_permissions.sql_endpoint resource
databricks_secret_acl.this resource
azurerm_key_vault_secret.sp_client_id data
azurerm_key_vault_secret.sp_key data
azurerm_key_vault_secret.tenant_id data
databricks_workspace_conf.pat resource
databricks_token.pat resource
databricks_ip_access_list.this resource
databricks_sql_global_config.this resource
databricks_sql_endpoint.this resource
databricks_metastore_assignment.this resource
databricks_mount.adls resource
databricks_secret_scope.main resource
databricks_secret_scope.this resource
databricks_secret.this resource
azurerm_key_vault_access_policy.databricks resource
databricks_secret_scope.external resource
databricks_secret_acl.external resource

Inputs

Name Description Type Default Required
workspace_id Id of Azure Databricks workspace string n/a yes
ip_rules Map of IP addresses (CIDR ranges) permitted to access the Databricks Workspace map(string) {} no
suffix Optional suffix that would be added to the end of resources names. string "" no
user_object_ids Map of AD usernames and corresponding object IDs map(string) {} no
workspace_admins Provide users or service principals to grant them Admin permissions in Workspace.
object({
  user              = list(string)
  service_principal = list(string)
})
{
  user              = null
  service_principal = null
}
no
iam_account_groups List of objects with group name and entitlements for this group
list(object({
  group_name   = optional(string)
  entitlements = optional(list(string))
}))
[] no
iam_workspace_groups Used to create workspace groups. Map of group name and its parameters, such as users and service principals added to the group. Also possible to configure group entitlements.
map(object({
  user              = optional(list(string))
  service_principal = optional(list(string))
  entitlements      = optional(list(string))
}))
{} no
sql_endpoint Set of objects with parameters to configure SQL Endpoints and assign permissions to them for certain custom groups
set(object({
  cluster_size              = string
  min_num_clusters          = optional(number)
  max_num_clusters          = optional(number)
  auto_stop_mins            = optional(string)
  enable_photon             = optional(bool)
  enable_serverless_compute = optional(bool)
}))
[] no
external_metastore_id Unity Catalog Metastore ID located in a separate environment. Provide this value to associate the Databricks Workspace with the target Metastore string " " no
sp_client_id_secret_name The name of Azure Key Vault secret that contains ClientID of Service Principal to access in Azure Key Vault string n/a yes
sp_key_secret_name The name of Azure Key Vault secret that contains client secret of Service Principal to access in Azure Key Vault string n/a yes
secret_scope Provides the ability to create a custom Secret Scope, store secrets in it, and assign ACLs for access management
list(object({
  scope_name = string
  acl = optional(list(object({
    principal  = string
    permission = string
  })))
  secrets = optional(list(object({
    key          = string
    string_value = string
  })))
}))
[{
  scope_name = null
  acl        = null
  secrets    = null
}]
no
key_vault_id ID of the Key Vault instance where the Secret resides string n/a yes
tenant_id_secret_name The name of Azure Key Vault secret that contains tenant ID secret of Service Principal to access in Azure Key Vault string n/a yes
mountpoints Mountpoints for Databricks
map(object({
  storage_account_name = string
  container_name       = string
}))
{} no
assign_unity_catalog_metastore Boolean flag provides an ability to assign Unity Catalog Metastore to this Workspace bool false no
custom_cluster_policies Provides the ability to create custom cluster policies, assign them to clusters, and grant CAN_USE permissions on them to certain custom groups
list(object({
  name       = string
  can_use    = list(string)
  definition = any
}))
[{
  name       = null
  can_use    = null
  definition = null
}]
no
clusters Set of objects with parameters to configure Databricks clusters and assign permissions to them for certain custom groups
set(object({
  cluster_name                 = string
  spark_version                = optional(string, "11.3.x-scala2.12")
  spark_conf                   = optional(map(any), {})
  cluster_conf_passthrought    = optional(bool, false)
  spark_env_vars               = optional(map(any), {})
  data_security_mode           = optional(string, "USER_ISOLATION")
  node_type_id                 = optional(string, "Standard_D3_v2")
  autotermination_minutes      = optional(number, 30)
  min_workers                  = optional(number, 1)
  max_workers                  = optional(number, 2)
  availability                 = optional(string, "ON_DEMAND_AZURE")
  first_on_demand              = optional(number, 0)
  spot_bid_max_price           = optional(number, 1)
  cluster_log_conf_destination = optional(string, null)
  permissions = optional(set(object({
    group_name       = string
    permission_level = string
  })), [])
}))
no
pat_token_lifetime_seconds The lifetime of the token, in seconds. If no lifetime is specified, the token remains valid indefinitely number 315569520 no
mount_adls_passthrough Boolean flag to use mount options for credentials passthrough. Should be used together with mount_cluster_name; the specified cluster must have cluster_conf_passthrought == true bool false no
mount_cluster_name Name of the cluster that will be used during storage mounting. If mount_adls_passthrough == true, cluster should also have option cluster_conf_passthrought == true. When mount_cluster_name is not specified, it will create the smallest possible cluster in the default availability zone with name equal to or starting with terraform-mount for the shortest possible amount of time. string null no
key_vault_secret_scope Object with Azure Key Vault parameters required for creation of an Azure-backed Databricks Secret scope
object({
  name         = optional(string)
  key_vault_id = optional(string)
  dns_name     = optional(string)
})
{}
no
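
To illustrate how the passthrough-related inputs above combine: mount_adls_passthrough expects a cluster created with cluster_conf_passthrought = true and referenced by mount_cluster_name. A hedged sketch of the relevant module inputs (the cluster name is illustrative):

# Cluster with credentials passthrough enabled
clusters = [{
  cluster_name              = "passthrough-cluster" # illustrative name
  cluster_conf_passthrought = true
}]

# Mount ADLS through the passthrough cluster
mount_adls_passthrough = true
mount_cluster_name     = "passthrough-cluster"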

Outputs

Name Description
sql_endpoint_jdbc_url JDBC connection string of SQL Endpoint
sql_endpoint_data_source_id ID of the data source for this endpoint
token Databricks Personal Authorization Token
clusters Provides name and unique identifier for the clusters
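
These outputs can be re-exported from the calling configuration, for example (a minimal sketch):

output "sql_endpoint_jdbc_url" {
  description = "JDBC connection string of SQL Endpoint"
  value       = module.databricks_runtime_premium.sql_endpoint_jdbc_url
}

output "databricks_token" {
  description = "Databricks Personal Authorization Token"
  value       = module.databricks_runtime_premium.token
  sensitive   = true
}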

License

Apache 2 Licensed. For more information please see LICENSE
