Merge branch 'feat/e2e-fabric-dataops-sample-v0-2' into kitsune/notebook_and_pipeline_updates
camaderal committed Dec 17, 2024
2 parents d1b3ab2 + a75c402 commit 077918b
Showing 15 changed files with 1,070 additions and 12 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -465,6 +465,7 @@ MigrationBackup/
plan.tfplan
terraform.tfstate
terraform.tfstate.backup
.terraform.tfstate.lock.info

# metadata files
*/*.DS_Store
12 changes: 8 additions & 4 deletions e2e_samples/fabric_dataops_sample/.envtemplate
@@ -10,13 +10,17 @@ export GIT_ORGANIZATION_NAME="The Azure DevOps organization."
export GIT_PROJECT_NAME="The Azure DevOps project."
export GIT_REPOSITORY_NAME="Your repository under the Azure DevOps project."
export GIT_BRANCH_NAME="The branch where Fabric items will be committed to."
export GIT_DIRECTORY_NAME="The folder where Fabric items will be committed" # Note: Other than the root folder "/", the directory must already exist. Must start with a forward-slash. Example: "/fabric"
## Note: Other than the root folder "/", the directory must already exist. Must start with a forward-slash. Example: "/fabric"
export GIT_DIRECTORY_NAME="The folder where Fabric items will be committed"
# Workspace admin variables
export FABRIC_WORKSPACE_ADMIN_SG_NAME="The name of the Entra security group with admin members."
# Fabric Capacity variables
export EXISTING_FABRIC_CAPACITY_NAME="" # The name of an existing Fabric capacity. If this is empty, then a new capacity will be created.
export FABRIC_CAPACITY_ADMINS="yourusername@yourdomain,sp_mi_object_id" # Comma-separated list. When creating a new Fabric capacity, these users/apps are added as capacity admins. For users, specify the "userPrincipalName". For principals (sp/mi), specify the "Object ID". Don't add spaces after the comma.
## The name of an existing Fabric capacity. If this is empty, then a new capacity will be created.
export EXISTING_FABRIC_CAPACITY_NAME=""
## Comma-separated list. When creating a new Fabric capacity, these users/apps are added as capacity admins. For users, specify the "userPrincipalName". For principals (sp/mi), specify the "Object ID". Don't add spaces after the comma.
export FABRIC_CAPACITY_ADMINS="yourusername@yourdomain,sp_mi_object_id"
# ADLS Gen2 connection variable
export ADLS_GEN2_CONNECTION_ID="" # The connection ID for the ADLS Gen2 Cloud Connection. If not provided, the ADLS Gen2 shortcut creation is skipped.
## The connection ID for the ADLS Gen2 Cloud Connection. If not provided, the ADLS Gen2 shortcut creation is skipped.
export ADLS_GEN2_CONNECTION_ID=""
# REST connection variable
export REST_DATA_SOURCE_CONNECTION_ID="" # The connection ID for the REST Data source Connection.
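
The notes in this template impose two format constraints: `GIT_DIRECTORY_NAME` must start with a forward slash, and `FABRIC_CAPACITY_ADMINS` must not contain a space after a comma. A minimal sketch of how these could be checked before deployment is shown here. The function names are hypothetical and are not part of the sample's scripts, and the check does not verify that the directory actually exists in the repository.

```bash
#!/bin/bash
# Hypothetical helpers (assumption: not part of the sample's scripts).

# Succeeds when the value starts with a forward slash, e.g. "/" or "/fabric".
is_valid_git_directory() {
  case "$1" in
    /*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Succeeds when no comma in the list is followed by whitespace.
is_valid_capacity_admins() {
  case "$1" in
    *,[[:space:]]*) return 1 ;;
    *)              return 0 ;;
  esac
}
```

After sourcing `.env`, a call such as `is_valid_git_directory "$GIT_DIRECTORY_NAME" || echo "GIT_DIRECTORY_NAME must start with /"` would surface a malformed value early.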
51 changes: 44 additions & 7 deletions e2e_samples/fabric_dataops_sample/README.md
@@ -8,11 +8,14 @@ This sample aims to provide customers with a reference end-to-end (E2E) implemen

## Contents <!-- omit in toc -->

- [Architecture](#architecture)
- [Solution Overview](#solution-overview)
- [Architecture](#architecture)
- [Continuous Integration and Continuous Delivery (CI/CD)](#continuous-integration-and-continuous-delivery-cicd)
- [How to use the sample](#how-to-use-the-sample)
- [High-level deployment sequence](#high-level-deployment-sequence)
- [Deployed resources](#deployed-resources)
- [How to use the sample](#how-to-use-the-sample)
- [Pre-requisites](#pre-requisites)
- [Familiarize yourself with known issues, limitations, and workarounds](#familiarize-yourself-with-known-issues-limitations-and-workarounds)
- [Deploying infrastructure](#deploying-infrastructure)
- [Verifying the infrastructure deployment](#verifying-the-infrastructure-deployment)
- [Cleaning up](#cleaning-up)
@@ -24,9 +27,23 @@ This sample aims to provide customers with a reference end-to-end (E2E) implemen
- [What is the significance of `use_cli` and `use_msi` flags?](#what-is-the-significance-of-use_cli-and-use_msi-flags)
- [References](#references)

## Architecture
## Solution Overview

### Architecture

This sample utilizes a [standard medallion architecture](https://learn.microsoft.com/en-us/fabric/onelake/onelake-medallion-lakehouse-architecture). The following diagram shows, at a high level, the overall data pipeline architecture built on Microsoft Fabric, along with the associated Azure components.

![Microsoft Fabric Architecture](./images/fabric-archi.png)

### Continuous Integration and Continuous Delivery (CI/CD)

![Microsoft Fabric Architecture](./images/fabric_archi.png)
Microsoft Fabric has a number of CI/CD workflow options as documented [here](https://learn.microsoft.com/fabric/cicd/manage-deployment). This sample utilizes [Option 1: Git-based deployment](https://learn.microsoft.com/fabric/cicd/manage-deployment#option-1---git--based-deployments).

The diagram below illustrates the complete end-to-end CI/CD process:

![Fabric CI/CD diagram](./images/fabric-cicd-option1.png)

## How to use the sample

### High-level deployment sequence

@@ -67,8 +84,6 @@ Here is a list of resources that are deployed:
- Fabric workspace Git integration
- Azure role assignments to the Entra security group and workspace identity

## How to use the sample

### Pre-requisites

- An Entra user that can access Microsoft Fabric (Free license is enough).
@@ -106,6 +121,10 @@ Here is a list of resources that are deployed:
- Contributor permissions to an Azure Repo in that Azure DevOps environment.
- A branch and a folder in the repository where the Fabric items will be committed. The folder must already exist.

### Familiarize yourself with known issues, limitations, and workarounds

Refer to the [known issues, limitations, and workarounds](docs/issues_limitations_and_workarounds.md) page for details. Reviewing this page is highly recommended to understand the limitations, issues, and challenges you may encounter while building CI/CD pipelines for Fabric. It also provides workarounds and alternative approaches to overcome these challenges. This information will also help you understand why certain approaches are used in the infrastructure deployment scripts and Azure DevOps pipelines.

### Deploying infrastructure

- Clone the repository:
@@ -255,7 +274,25 @@ _**Note: Please note that the Fabric notebook and pipeline deployed are placehol
## Cleaning up
Coming up soon...
Once you have finished with the sample, you can delete the deployed resources by running the cleanup script.
The [cleanup script](./cleanup.sh) performs the following actions:
- Deletes all the deployed Azure and Fabric resources.
- Deletes the Fabric connection to ADLS Gen2 storage.
- Resets the corresponding `ADLS_GEN2_CONNECTION_ID` variable in the `.env` file.
- Ensures that the Azure Key Vault is purged.
- Removes intermediate Terraform files created during the deployment process, including state files.
You will need to authenticate **with user context** and run the cleanup script.
```bash
source .env
az config set core.login_experience_v2=off
az login --tenant $TENANT_ID
az config set core.login_experience_v2=on
./cleanup.sh
```
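Two of the actions listed above, resetting a variable in the `.env` file and removing intermediate Terraform files, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the actual logic lives in `./scripts/cleanup_infrastructure.sh` (not shown here), and the file names are taken from the Terraform entries in this repository's `.gitignore`.
```bash
#!/bin/bash
# Hypothetical helpers (assumption: not the sample's actual implementation).

# Blank out an exported variable in an env file, e.g. ADLS_GEN2_CONNECTION_ID.
reset_env_variable() {
  local env_file="$1"
  local var_name="$2"
  local tmp_file
  tmp_file="$(mktemp)"
  # Rewrite only the matching export line; all other lines pass through unchanged.
  sed "s|^export ${var_name}=.*|export ${var_name}=\"\"|" "$env_file" > "$tmp_file"
  # Copy the content back so the original file keeps its permissions.
  cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
}

# Remove intermediate Terraform files (names as listed in .gitignore).
remove_terraform_files() {
  local tf_dir="$1"
  rm -f "${tf_dir}/plan.tfplan" \
        "${tf_dir}/terraform.tfstate" \
        "${tf_dir}/terraform.tfstate.backup" \
        "${tf_dir}/.terraform.tfstate.lock.info"
}
```
A temporary file is used instead of `sed -i` because the in-place flag behaves differently between GNU and BSD `sed`.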
## Frequently asked questions
35 changes: 35 additions & 0 deletions e2e_samples/fabric_dataops_sample/cleanup.sh
@@ -0,0 +1,35 @@
#!/bin/bash
set -o errexit
set -o pipefail
set -o nounset
# set -o xtrace # For debugging

#. ./scripts/common.sh
#. ./scripts/verify_prerequisites.sh
#. ./scripts/init_environment.sh

#######################
source ./.env

# Log all outputs and errors to a log file
log_file="cleanup_${BASE_NAME}_$(date +"%Y%m%d_%H%M%S").log"
exec > >(tee -a "$log_file")
exec 2>&1

for env_name in dev; do
ENVIRONMENT_NAME=$env_name \
TENANT_ID=$TENANT_ID \
RESOURCE_GROUP_NAME=$RESOURCE_GROUP_NAME \
BASE_NAME=$BASE_NAME \
APP_CLIENT_ID=$APP_CLIENT_ID \
APP_CLIENT_SECRET=$APP_CLIENT_SECRET \
GIT_ORGANIZATION_NAME=$GIT_ORGANIZATION_NAME \
GIT_PROJECT_NAME=$GIT_PROJECT_NAME \
GIT_REPOSITORY_NAME=$GIT_REPOSITORY_NAME \
GIT_BRANCH_NAME=$GIT_BRANCH_NAME \
FABRIC_WORKSPACE_ADMIN_SG_NAME=$FABRIC_WORKSPACE_ADMIN_SG_NAME \
EXISTING_FABRIC_CAPACITY_NAME=$EXISTING_FABRIC_CAPACITY_NAME \
FABRIC_CAPACITY_ADMINS=$FABRIC_CAPACITY_ADMINS \
ADLS_GEN2_CONNECTION_ID=$ADLS_GEN2_CONNECTION_ID \
bash -c "./scripts/cleanup_infrastructure.sh"
done
54 changes: 54 additions & 0 deletions e2e_samples/fabric_dataops_sample/docs/adr_fabric_cicd.md
@@ -0,0 +1,54 @@
# Fabric CICD

## Status

Accepted

## Context

This ADR aims to document the decision(s) around the overall CI/CD architecture of the Fabric E2E sample.

Currently, there are a number of documented options for CI/CD in Fabric, as explained [here](https://learn.microsoft.com/en-us/fabric/cicd/manage-deployment).

Key considerations:

- Technical simplicity - simpler is better.
- Viability of an E2E solution - can we build a full E2E DataOps sample with this option?
- Functionality offered - any key capabilities needed as informed by real-world requirements?
- Long-term relevance - alignment to [Microsoft Fabric roadmap](https://learn.microsoft.com/en-us/fabric/release-plan/) to ensure longer-term relevance of the sample.
- Existing samples - are there existing samples already?

| | Option 1 | Option 2 | Option 3 | Option 4 |
|---------------------------|----------|----------|----------|----------|
| Technical simplicity | Medium | Low | High | Low |
| Viability of E2E solution | High | High | Low | Medium |
| Functionality offered | High | High | Medium | High |
| Long-term relevance | High | Medium | High | Medium |
| Existing samples | No | No | Yes | No |

## Proposal details

Based on the key considerations, both **Option 1** and **Option 2** were extensively analyzed. However, due to the significant technical complexity of Option 2, particularly in managing item dependencies and manually tracking individual item IDs, Option 1 has been selected as the preferred approach for this milestone.

The following are CI/CD flow diagrams built as part of this ADR:

Option 1:

![Option 1: Fabric CI/CD](../images/fabric-cicd-option1.png)

Option 2:

![Option 2: Fabric CI/CD](../images/fabric-cicd-option2.png)

The proposed option is "Option 1 - Git-based deployments".

## Decision

To be agreed

## Next steps

If accepted:

- Proceed with build of the sample accordingly.
- Spikes to validate any assumptions, specifically around automated deployments and ephemeral build/test workspaces.
@@ -9,7 +9,7 @@ This document describes the naming conventions for the Fabric E2E sample. It inc

## Status

Proposed
Accepted

## Context

@@ -0,0 +1,3 @@
# Known issues, limitations, and workarounds <!-- omit in toc -->

This document lists the known issues and limitations specific to the sample, as well as to Fabric in general. These issues and limitations are based on the current state of the Fabric REST APIs and Fabric deployment pipelines. The document also provides recommendations on how to handle these challenges.
362 changes: 362 additions & 0 deletions e2e_samples/fabric_dataops_sample/images/fabric-cicd-option1.drawio

Large diffs are not rendered by default.

328 changes: 328 additions & 0 deletions e2e_samples/fabric_dataops_sample/images/fabric-cicd-option2.drawio

Large diffs are not rendered by default.

@@ -163,6 +163,7 @@ module "key_vault_secret_001" {
key_vault_id = module.keyvault.keyvault_id
content_type = "Application Insights Connection String"
tags = local.tags
depends_on = [module.keyvault_secrets_officer_assignment_001]
}

# Below modules currently do not support service principal/managed identity execution context.