
Commit c75f355

fabio-padua and Copilot authored
Update Databricks cli version and Databricks deploy process (#73)
* Rebase (#3): environment variable adjustments; schema update; add SP version on job variable; force PowerShell version; reorder tasks to publish Databricks secrets; adjust Databricks secrets deploy, scripts, and parameter variables (dataLakeName, dataComputeRS); change Key Vault set permissions on the SP; reorder Databricks tasks; remove variable output display; add a command to list scope content; add the SP to the Admin role; connect to Azure AD first, then remove the Azure AD connection; documentation adjustments
* chore: fix uppercase extensions on some images (#42); directory changes and renames
* Issue 43 - Templates Changes (#44): first adjustments to Service Principal secrets; save the secret on output; token adjustment; store the secret inside the output hol file; put the SP inside parameters; template, casing, and syntax adjustments; add logs
* Issue #43 - Adjust the pipeline run to use the first-time SP secret (#45): pipeline execution change; documentation adjustments
* feat: issue template creation (Bug Issue template, New Feature request issue template)
* Issue #35 - Change the store of sample-data to the current repo (#46): upload sample-data to the main stream; unzip and upload sample-data; fix the pipeline run
* Issue #47 - Change Databricks version (#48)
* PowerShell version changed when running the pipeline (#50)
* [Fix] - Remove logs (#51): test and log adjustments; documentation change
* fix: change the Databricks CLI installation and version
* Update infrastructure-as-code/scripts/DatabricksSecrets.ps1
* Update infrastructure-as-code/scripts/DatabricksClusters.ps1

Co-authored-by: Fabio Padua <61516935+fabiohaifa@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent cbaed6f commit c75f355

14 files changed: +298 −137 lines

azure-pipelines/databricks/databricks-ci.yml

Lines changed: 3 additions & 3 deletions

@@ -4,18 +4,18 @@ variables:
   NOTEBOOK_WORKING_DIR: $(System.DefaultWorkingDirectory)/data-platform/notebooks

 pool:
-  vmImage: 'Ubuntu-18.04'
+  vmImage: 'ubuntu-latest'

 jobs:
 - job: 'validate_notebooks'
   displayName: 'Validate Databricks Notebooks'
   steps:
   - task: UsePythonVersion@0
     inputs:
-      versionSpec: 3.6
+      versionSpec: '3.9'
       addToPath: true
       architecture: 'x64'
-    displayName: 'Use Python Version: 3.6'
+    displayName: 'Use Python Version: 3.9'

   - script: |
       python -m pip install --upgrade pip

azure-pipelines/databricks/templates/databricks-deploy-library-job-template.yml

Lines changed: 1 addition & 2 deletions

@@ -8,13 +8,12 @@ parameters:

 stages:
 - stage: publish_library_${{ parameters.environment }}
-  #condition: eq(variables['Build.SourceBranch'], '${{ parameters.branch }}')
   displayName: 'Deploy to ${{ parameters.environment }} Databricks'
   jobs:
   - deployment: publish_library_${{ parameters.environment }}
     displayName: 'Deploy to ${{ parameters.environment }} Databricks'
     pool:
-      vmImage: 'Ubuntu-18.04'
+      vmImage: 'ubuntu-latest'
     environment: databricks-${{ parameters.environment }}
     variables:
     - group: dataops-iac-cd-output-${{ parameters.environment }}

azure-pipelines/databricks/templates/databricks-deploy-notebooks-job-template.yml

Lines changed: 1 addition & 1 deletion

@@ -13,7 +13,7 @@ stages:
   - deployment: publish_static_artifacts_${{ parameters.environment }}
     displayName: 'Deploy to ${{ parameters.environment }} Databricks'
     pool:
-      vmImage: 'Ubuntu-18.04'
+      vmImage: 'ubuntu-latest'
     environment: databricks-${{ parameters.environment }}
     variables:
     - group: ${{ parameters.iacCdVariableGroupPrefix }}-${{ parameters.environment }}

azure-pipelines/databricks/templates/databricks-setup-environment-template.yml

Lines changed: 11 additions & 6 deletions

@@ -2,12 +2,17 @@ steps:

 - task: UsePythonVersion@0
   inputs:
-    versionSpec: 3.6
+    versionSpec: '3.9'
     addToPath: true
     architecture: 'x64'
-  displayName: 'Use Python Version: 3.6'
+  displayName: 'Use Python Version: 3.9'

 - script: |
-    python -m pip install --upgrade pip
-    pip install databricks-cli
-  displayName: 'Setup Agent'
+    echo "Downloading Databricks CLI last version..."
+    curl -fsSL https://raw.githubusercontent.com/databricks/setup-cli/main/install.sh | sh
+    chmod +x databricks
+    sudo mv databricks /usr/local/bin/databricks
+
+    echo "Verifying Databricks CLI version..."
+    databricks version
+  displayName: 'Install Databricks CLI last version (Binary)'
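The new install step relies on `curl -fsSL` exiting non-zero on HTTP errors (rather than saving an error page), so a guard can surface a clear pipeline failure. A minimal sketch of that pattern; the `download` function is a placeholder standing in for the real curl call, not part of the committed script:

```shell
#!/bin/sh
# Placeholder standing in for: curl -fsSL <url> --output databricks.tar.gz
# With `curl -f`, an HTTP 404/500 returns a non-zero exit code, which is
# exactly what this guard detects.
download() {
  false   # simulate a failed download
}

if ! download "https://example.invalid/install.sh"; then
  # Azure DevOps logging command that flags the task with an error
  echo "##vso[task.logissue type=error]Failed to download Databricks CLI"
fi
```

In the real pipeline the guard would also `exit 1` so the task fails instead of continuing with a missing binary.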

azure-pipelines/iac/templates/stage.acceptance-test.yml

Lines changed: 2 additions & 2 deletions

@@ -13,7 +13,7 @@ stages:
   - job: test
     displayName: 'Acceptance Test for ${{ parameters.environment }}'
     pool:
-      vmImage: 'Ubuntu-20.04'
+      vmImage: 'ubuntu-latest'
     variables:
       azPowershellVersion: 7.5.0
     steps:

@@ -39,4 +39,4 @@ stages:
       testResultsFormat: 'NUnit'
       testResultsFiles: '**/testResults.xml'
       failTaskOnFailedTests: true
-      testRunTitle: 'Pester Acceptance Tests (${{ parameters.environment }})'
+      testRunTitle: 'Pester Acceptance Tests (${{ parameters.environment }})'

azure-pipelines/iac/templates/stage.deploy.yml

Lines changed: 14 additions & 5 deletions

@@ -16,7 +16,7 @@ stages:
     displayName: 'Deploy ARM to ${{ parameters.environment }}'
     condition: succeeded()
     pool:
-      vmImage: 'Ubuntu-20.04'
+      vmImage: 'ubuntu-latest'
     variables:
       azPowershellVersion: 7.5.0
     environment: ${{ parameters.environment }}

@@ -85,10 +85,11 @@ stages:
     condition: succeeded()
     dependsOn: deploy_arm
     pool:
-      vmImage: 'Ubuntu-20.04'
+      vmImage: 'ubuntu-latest'
     environment: ${{ parameters.environment }}
     variables:
     - group: dataops-iac-cd-output-${{ parameters.environment }}
+    - group: databricks-sp-auth-${{ parameters.environment }}
     strategy:
       runOnce:
         deploy:

@@ -109,8 +110,9 @@ stages:
               -DataLakeName $(dataLakeName)
               -DatabricksName $(databricksName)
               -KeyVaultName $(keyVaultName)
-              -DATABRICKS_TOKEN $(DATABRICKS_TOKEN)
             azurePowerShellVersion: latestVersion
+          env:
+            DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)

         - task: PowerShell@2
           displayName: Deploy DBW Clusters

@@ -121,9 +123,16 @@ stages:
             arguments: >
               -Environment ${{ parameters.environment }}
               -DeploymentOutputFile "dbwOutput.json"
+          # Service Principal authentication (preferred)
           env:
             DATABRICKS_HOST: https://$(databricksWorkspaceUrl)
-            DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)
+            ARM_CLIENT_ID: $(ARM_CLIENT_ID)
+            ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
+            ARM_TENANT_ID: $(ARM_TENANT_ID)
+          # OR Token-based authentication (not both)
+          # env:
+          #   DATABRICKS_HOST: https://$(databricksWorkspaceUrl)
+          #   DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)
         - task: PowerShell@2
           displayName: 'Publish Outputs'
           inputs:

@@ -137,4 +146,4 @@ stages:
           pwsh: true
           showWarnings: true
           env:
-            AzureDevOpsPAT: $(System.AccessToken)
+            AzureDevOpsPAT: $(System.AccessToken)
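One of the hunks above moves `DATABRICKS_TOKEN` out of the task's `arguments:` and into `env:`. A minimal sketch of why that matters: command-line arguments are echoed into pipeline logs (and visible via `ps`), while an environment variable is only seen by the child process. The token value and temp script here are made up for illustration:

```shell
#!/bin/sh
# Hypothetical child script that reads the secret from the environment
# instead of from $1, so the invoking command line never contains it.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/sh
echo "token received, length ${#DATABRICKS_TOKEN}"
EOF
chmod +x "$script"

# The logged command is just `sh $script` -- no secret on the command line.
DATABRICKS_TOKEN="dapi-example-token" sh "$script"
rm -f "$script"
```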

azure-pipelines/iac/templates/stage.plan.yml

Lines changed: 1 addition & 1 deletion

@@ -15,7 +15,7 @@ stages:
   - job: plan
     displayName: 'Plan for ${{ parameters.environment }}'
     pool:
-      vmImage: 'Ubuntu-20.04'
+      vmImage: 'ubuntu-latest'
     variables:
       azPowershellVersion: 7.5.0
     steps:

azure-pipelines/iac/templates/stage.validate.yml

Lines changed: 1 addition & 1 deletion

@@ -5,7 +5,7 @@ stages:
   - job: lint
     displayName: 'Lint'
     pool:
-      vmImage: 'Ubuntu-20.04'
+      vmImage: 'ubuntu-latest'
     steps:
     - template: step.install-arm-template-toolkit.yml
       parameters:

azure-pipelines/iac/templates/step.install-databricks-cli.yml

Lines changed: 57 additions & 22 deletions

@@ -3,30 +3,65 @@ parameters:
   type: string

 steps:
-
 - task: UsePythonVersion@0
   inputs:
-    versionSpec: 3.6
+    versionSpec: 3.9
     addToPath: true
     architecture: 'x64'
-  displayName: 'Use Python Version: 3.6'
-
+  displayName: 'Use Python Version: 3.9'
+
 - script: |
-    python -m pip install --upgrade pip
-    pip install databricks-cli
-  displayName: 'Setup Agent'
-
-- task: AzureCLI@2
-  displayName: Get Databricks token
-  inputs:
-    azureSubscription: ${{ parameters.azureServiceConnection }}
-    scriptType: bash
-    scriptLocation: inlineScript
-    inlineScript: |
-      databricks_resource_id="2ff814a6-3304-4ab8-85cb-cd0e6f879c1d" # This is the official databricks resource id
-      accessToken=$(curl -X GET -H 'Content-Type: application/x-www-form-urlencoded' \
-        -d "grant_type=client_credentials&client_id=$servicePrincipalId&resource=$databricks_resource_id&client_secret=$servicePrincipalKey" \
-        https://login.microsoftonline.com/$tenantId/oauth2/token \
-        | jq -r .access_token)
-      echo "##vso[task.setvariable variable=DATABRICKS_TOKEN;isSecret=true]$accessToken"
-    addSpnToEnvironment: true
+    # Use the latest stable version from GitHub releases
+    CLI_VERSION="0.252.0"
+
+    echo "Downloading Databricks CLI v${CLI_VERSION} (v2.x)..."
+
+    # Download with proper error handling
+    curl -fsSL --output databricks.tar.gz "https://github.com/databricks/cli/releases/download/v${CLI_VERSION}/databricks_cli_${CLI_VERSION}_linux_amd64.tar.gz"
+
+    if [ $? -ne 0 ]; then
+      echo "##vso[task.logissue type=error]Failed to download Databricks CLI"
+      exit 1
+    fi
+
+    # Extract and install
+    tar -xzf databricks.tar.gz
+    chmod +x databricks
+    sudo mv databricks /usr/local/bin/
+
+    # Verify installation
+    echo "Verifying Databricks CLI version..."
+    databricks version
+
+    if [ $? -ne 0 ]; then
+      echo "##vso[task.logissue type=error]Databricks CLI installation failed"
+      exit 1
+    fi
+  displayName: 'Install Databricks CLI v2.x (binary)'
+
+- script: |
+    echo "Configuring Databricks CLI authentication..."
+
+    # Create config directory if it doesn't exist
+    mkdir -p ~/.databricks
+
+    # Create config file with service principal credentials
+    cat > ~/.databrickscfg << EOF
+    [DEFAULT]
+    host = https://$(databricksWorkspaceUrl)
+    azure_client_id = $(ARM_CLIENT_ID)
+    azure_client_secret = $(ARM_CLIENT_SECRET)
+    azure_tenant_id = $(ARM_TENANT_ID)
+    azure_use_msi = false
+    EOF
+
+    # Test authentication
+    databricks workspace list /
+
+    if [ $? -ne 0 ]; then
+      echo "##vso[task.logissue type=error]Databricks CLI authentication failed"
+      exit 1
+    fi
+
+    echo "Authentication successful!"
+  displayName: 'Authenticate with Databricks CLI'
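The `~/.databrickscfg` written above follows the CLI's INI-style profile format. A hedged sketch of the same generation step against a temp file, with placeholder host and credential values and a cheap grep standing in for the `databricks workspace list /` connectivity check (no real workspace is contacted):

```shell
#!/bin/sh
# Write a [DEFAULT] profile the way the pipeline step does, but to a
# temp path; every value below is a placeholder, not a real credential.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
[DEFAULT]
host = https://adb-1234567890123456.7.azuredatabricks.net
azure_client_id = 00000000-0000-0000-0000-000000000000
azure_client_secret = placeholder-secret
azure_tenant_id = 11111111-1111-1111-1111-111111111111
azure_use_msi = false
EOF

# Cheap local sanity check in place of `databricks workspace list /`:
grep -q '^host = https://' "$cfg" && echo "profile written"
rm -f "$cfg"
```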
Lines changed: 10 additions & 16 deletions

@@ -1,27 +1,21 @@
 {
     "cluster_name": "interactive-cluster",
-    "autoscale": {
-        "min_workers": 2,
-        "max_workers": 4
-    },
-    "spark_version": "9.1.x-scala2.12",
-    "spark_conf": {
-        "spark.databricks.delta.preview.enabled": "true"
-    },
+    "spark_version": "15.4.x-scala2.12",
     "azure_attributes": {
         "first_on_demand": 1,
         "availability": "ON_DEMAND_AZURE",
         "spot_bid_max_price": -1
     },
     "node_type_id": "Standard_DS3_v2",
-    "driver_node_type_id": "Standard_DS3_v2",
-    "ssh_public_keys": [],
-    "custom_tags": {},
     "spark_env_vars": {
         "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
     },
-    "autotermination_minutes": 60,
-    "enable_elastic_disk": true,
-    "cluster_source": "API",
-    "init_scripts": []
-}
+    "autotermination_minutes": 120,
+    "single_user_name": "alopezmoreno@mngenvmcap602086.onmicrosoft.com",
+    "data_security_mode": "SINGLE_USER",
+    "runtime_engine": "PHOTON",
+    "autoscale": {
+        "min_workers": 2,
+        "max_workers": 4
+    }
+}
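The new cluster spec drops legacy fields and pins a newer runtime with Photon and single-user access mode. Before handing such a spec to the CLI, a quick round-trip through a JSON parser catches syntax slips early; this sketch uses a trimmed copy of the spec and assumes `python3` is available on the agent:

```shell
#!/bin/sh
# Write a trimmed copy of the cluster spec and verify it parses as JSON.
spec=$(mktemp)
cat > "$spec" <<'EOF'
{
    "cluster_name": "interactive-cluster",
    "spark_version": "15.4.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "data_security_mode": "SINGLE_USER",
    "runtime_engine": "PHOTON",
    "autotermination_minutes": 120,
    "autoscale": { "min_workers": 2, "max_workers": 4 }
}
EOF

# json.tool exits non-zero on malformed JSON, so this gates the deploy.
if python3 -m json.tool "$spec" > /dev/null; then
  echo "cluster spec is valid JSON"
fi
rm -f "$spec"
```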
