This repository contains Terraform code to deploy an Azure Databricks workspace for training purposes.
- Microsoft Entra ID Users and Groups (region-agnostic)
  - Instructors
  - Students
- Azure Storage Account for Databricks Unity Catalog (region-specific)
  - Important! Only one Databricks Unity Catalog can be set up per Azure region. If you want to reuse an existing Databricks Unity Catalog, change the `terraform` code accordingly.
- Azure Databricks Workspace (region-specific)
- Azure Databricks Clusters
  - Instructors' Clusters
    - Data Engineering
    - Machine Learning
  - Students' Clusters
    - Data Engineering
    - Machine Learning
- Azure Databricks Training Materials ((c) Databricks)
- Azure Service Principal with the access granted below.
  - `Domain.Read.All`
  - `Group.ReadWrite.All`
  - `User.ReadWrite.All`
- Azure Subscription with the resource providers registered below.
  - `Microsoft.Compute`
  - `Microsoft.Databricks`
  - `Microsoft.ManagedIdentity`
  - `Microsoft.Storage`
- The Azure Service Principal above has access to manage resources in the Azure Subscription above.
- Databricks account on Azure (accessible from the Databricks account console), already created by following the Azure Databricks documentation.
- Databricks Group `Databricks Unity Catalog Administrators` (created separately from this project).
- The Azure Service Principal has been added to the Databricks account.
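
The four resource providers listed above can be registered on the subscription with the Azure CLI. The following is a minimal sketch, assuming `az` is installed, already logged in, and pointed at the target subscription. The function name `register_providers` is hypothetical and not part of this repository.

```shell
# Sketch only: register the resource providers this project expects.
# Assumes `az` is logged in with an identity allowed to register providers.
register_providers() {
  for ns in Microsoft.Compute Microsoft.Databricks Microsoft.ManagedIdentity Microsoft.Storage; do
    # Registration is asynchronous and may take a few minutes to complete.
    az provider register --namespace "$ns" || return 1
  done
}
```

Call `register_providers` once the correct subscription is selected. `az provider show --namespace <name> --query registrationState` reports the current state of a single provider.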
region = "<Azure region>"
tenant_id = "<Azure tenant ID>"
subscription_id = "<Azure subscription ID that contains all resources>"
client_id = "<Azure client (app) ID>"
client_secret = "<Azure client (app) secret>"
databricks_account_id = "<Azure Databricks account ID>"
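
Before planning, it can help to confirm that the variables file assigns every key shown above. This is a minimal sketch, assuming the values are stored in a file such as `training.tfvars` (a hypothetical name). The helper `check_tfvars` is illustrative and not part of this repository.

```shell
# Sketch only: verify that a .tfvars file assigns all six expected variables.
# Usage: check_tfvars <file>.tfvars
check_tfvars() {
  file=$1
  missing=0
  for key in region tenant_id subscription_id client_id client_secret databricks_account_id; do
    # Match "key = ..." at the start of a line, allowing leading whitespace.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
      echo "missing variable: ${key}" >&2
      missing=1
    fi
  done
  # Returns 0 when every key is present, 1 otherwise.
  return "$missing"
}
```

For example, `check_tfvars training.tfvars && terraform plan -var-file='training.tfvars' -out='training.tfplan'` runs the plan only when all keys are present.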
- Install the Azure CLI (`az`) and `terraform`.
- Log in with the Azure CLI: run `az login --service-principal -u <app-id> -p <password-or-cert> --tenant <tenant-id>`.
- `cd` to the correct sub-folder first, e.g. `cd ./20231101`.
- Install the Terraform providers: run `terraform init`.
- Check whether anything is wrong: run `terraform plan -var-file='<file>.tfvars' -out='<file>.tfplan'`.
- Deploy the infrastructure: run `terraform apply '<file>.tfplan'`.
- To remove the whole deployment, run `terraform plan -destroy -var-file='<file>.tfvars' -out='<file-destroy>.tfplan'` and then `terraform apply '<file-destroy>.tfplan'`.
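
The deployment steps above can be chained in a small wrapper. This is a sketch under stated assumptions: `deploy` and its arguments are hypothetical names, and it presumes that `az login` has already succeeded and that `<name>.tfvars` exists inside the chosen sub-folder.

```shell
# Sketch only: run init, plan and apply for one training sub-folder.
# Usage: deploy <sub-folder> <name>
deploy() {
  dir=$1
  name=$2
  (
    # The subshell keeps the caller's working directory unchanged.
    cd "$dir" || exit 1
    terraform init || exit 1
    terraform plan -var-file="${name}.tfvars" -out="${name}.tfplan" || exit 1
    # Apply the saved plan file, so only the planned changes are executed.
    terraform apply "${name}.tfplan"
  )
}
```

For example, `deploy ./20231101 training` would use `training.tfvars` in that folder. For a first run it is safer to stop after `terraform plan` and review the output before applying.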
In region `eastasia`, there is an issue creating Unity Catalog directly with `terraform`. It must therefore be created manually on the Databricks Account page and then imported with `terraform import -var-file='<file>.tfvars' module.databricks.databricks_metastore.this '<metastore_id>'`.
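
The `eastasia` workaround can be scripted once the metastore exists. This is a minimal sketch, assuming the metastore ID has been copied from the Databricks Account page and that `<name>.tfvars` is in the current folder. The function name `import_metastore` is hypothetical.

```shell
# Sketch only: import a manually created metastore into the Terraform state.
# Usage: import_metastore <name> <metastore_id>
import_metastore() {
  name=$1
  metastore_id=$2
  addr='module.databricks.databricks_metastore.this'
  terraform import -var-file="${name}.tfvars" "$addr" "$metastore_id" || return 1
  # List the address to confirm that it is now tracked in the state.
  terraform state list "$addr"
}
```

After a successful import, a later `terraform plan` should no longer propose creating the metastore.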
The user list can be modified to suit your needs, e.g. the number of users required.
Because this repository is meant for creating training workspaces, the users are divided into 2 groups, `Instructors` and `Students`.
An example of the user name format is:
student01.databricks.<training-date-yyyyMMdd>@<your Azure domain>
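
To illustrate the naming pattern, the sketch below prints a list of user names in that format. It only demonstrates the pattern and does not reflect how this repository builds its user list. The function name `make_usernames` and the domain `example.com` are assumptions.

```shell
# Sketch only: print user names such as student01.databricks.20231101@example.com.
# Usage: make_usernames <prefix> <count> <training-date-yyyyMMdd> <domain>
make_usernames() {
  prefix=$1
  count=$2
  training_date=$3
  domain=$4
  i=1
  while [ "$i" -le "$count" ]; do
    # %02d zero-pads the index to two digits (01, 02, ...).
    printf '%s%02d.databricks.%s@%s\n' "$prefix" "$i" "$training_date" "$domain"
    i=$((i + 1))
  done
}
```

For example, `make_usernames student 20 20231101 example.com` prints twenty student accounts, one per line.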
Documentation for the prerequisite steps is listed in the links below.
- Azure Databricks administration introduction - Azure Databricks | Microsoft Learn
- Provision a service principal for Azure Databricks automation - Terraform - Azure Databricks | Microsoft Learn
- Databricks Terraform provider - Azure Databricks | Microsoft Learn
- Deploy an Azure Databricks workspace using Terraform - Azure Databricks | Microsoft Learn
- Docs overview | databricks/databricks | Terraform | Terraform Registry

Terraform providers used:
- `hashicorp/azuread`
- `hashicorp/azurerm`
- `databricks/databricks`