[Anuj] Adding Fabric CI/CD sample #695

Merged: 77 commits, May 23, 2024
Commits
8673f67
feat: add fabric cicd initial single tech sample
devlace Mar 20, 2024
05b593f
feat: add folder structures
devlace Mar 20, 2024
0c6852d
Initial bootstrap changes
promisinganuj Mar 21, 2024
6cfa20b
Adding code to handle pipelines
promisinganuj Mar 21, 2024
c703465
Merge pull request #679 from Azure-Samples/promisinganuj/675-add-boot…
promisinganuj Mar 21, 2024
126fb5d
Bug fixes in bootstrap
promisinganuj Mar 21, 2024
8cbbb8b
Updated the limitations
promisinganuj Mar 21, 2024
96b7168
Adding CD release code and documentation
ydaponte Mar 21, 2024
9e738d5
Starting to update README.md
promisinganuj Mar 21, 2024
eb62eb8
Update single_tech_samples/fabric/fabric_ci_cd/README.md
ydaponte Mar 21, 2024
1035e3c
Update single_tech_samples/fabric/fabric_ci_cd/README.md
ydaponte Mar 21, 2024
f9b7547
Merge pull request #681 from Azure-Samples/ydaponte/678-add-cd-releas…
ydaponte Mar 21, 2024
550d429
Few more fixes
promisinganuj Mar 21, 2024
883991d
Merge branch 'feat/fabric_cicd' into promisinganuj/675-add-bootstrap
promisinganuj Mar 21, 2024
0741a9e
Updating README.md
promisinganuj Mar 22, 2024
fb1cea4
Updating README.md
promisinganuj Mar 22, 2024
f2511f4
Updating README.md
promisinganuj Mar 22, 2024
b969bde
Updating README.md
promisinganuj Mar 22, 2024
930cade
Updating README.md
promisinganuj Mar 22, 2024
2f13590
Updating README.md
promisinganuj Mar 22, 2024
18cf967
Merge pull request #680 from Azure-Samples/promisinganuj/675-add-boot…
promisinganuj Mar 22, 2024
42e8610
Adding architecture diagram
promisinganuj Mar 22, 2024
e3941aa
Adding architecture diagram
promisinganuj Mar 22, 2024
4f11620
Adding architecture diagram
promisinganuj Mar 22, 2024
bad6e9d
Adding architecture diagram
promisinganuj Mar 22, 2024
52aa43f
Merge pull request #682 from Azure-Samples/promisinganuj/675-add-boot…
promisinganuj Mar 22, 2024
8d9d0a1
Adding timeout, final modifications to the ReadMe
ydaponte Mar 22, 2024
202847e
Merge pull request #683 from Azure-Samples/ydaponte/678-adding-timeout
ydaponte Mar 22, 2024
dbd73f4
minor edits and fixed typos
sreedhar-guda Mar 22, 2024
ae67bac
Minor updates/typos.
sreedhar-guda Mar 22, 2024
b8a5c5c
Update README.md
jose-perales Mar 22, 2024
610a6a0
Merge pull request #684 from Azure-Samples/sguda/fabric-cicd-data-art…
sreedhar-guda Mar 22, 2024
8537ce7
Updating architecture diagram
promisinganuj Mar 25, 2024
8ad7859
Updating architecture diagram
promisinganuj Mar 25, 2024
e9bc156
Merge pull request #686 from Azure-Samples/promisinganuj/675-add-boot…
promisinganuj Mar 25, 2024
b8f3f38
Updating Diagram
promisinganuj Mar 25, 2024
b62ec73
Merge pull request #687 from Azure-Samples/promisinganuj/675-add-boot…
promisinganuj Mar 25, 2024
fb4f080
added template file, token verification, json for curl calls for headers
sreedhar-guda Mar 25, 2024
b432fcc
Minor updates
promisinganuj Mar 26, 2024
884f318
Adding a fix
promisinganuj Mar 26, 2024
70ca905
Merge pull request #688 from Azure-Samples/sguda/fabric-ci-cd-run1
promisinganuj Mar 26, 2024
e3e79d7
Correct link to scripts
ydaponte Mar 26, 2024
e389a32
Correct capital case
ydaponte Mar 26, 2024
06f3790
Link correction, scripts folder rename
ydaponte Mar 26, 2024
c5d8f04
Merge pull request #690 from Azure-Samples/ydaponte/689-correct-link-…
ydaponte Mar 26, 2024
5318984
added code to update admin privs on workspaces and deployment pipeline
sreedhar-guda Mar 26, 2024
7c5cd12
Update .envtemplate with admin info and comments
sreedhar-guda Mar 26, 2024
a90df65
Update README.md to change the order az login commands
sreedhar-guda Mar 26, 2024
ad5ed02
Reducing number of environment variables
promisinganuj Mar 28, 2024
a60bc5d
Merge pull request #693 from Azure-Samples/anuj/692-boostrap-script-e…
promisinganuj Apr 2, 2024
a9c1da5
Adding domain creation logic
promisinganuj Apr 3, 2024
d1656af
Adding domain creation logic
promisinganuj Apr 3, 2024
c7f5945
Adding domain creation logic
promisinganuj Apr 3, 2024
7663762
Adding domain creation logic
promisinganuj Apr 3, 2024
9846793
Adding domain creation logic
promisinganuj Apr 3, 2024
4cbc724
Adding domain creation options
promisinganuj Apr 4, 2024
6399c35
Adding domain creation options
promisinganuj Apr 4, 2024
13a6f57
Merge pull request #694 from Azure-Samples/anuj/692-boostrap-script-e…
promisinganuj Apr 4, 2024
b8d78bb
Merge branch 'feat/fabric_cicd' of https://github.com/Azure-Samples/m…
promisinganuj Apr 4, 2024
ccf7abf
Updating README.md
promisinganuj Apr 4, 2024
2a11a98
Updating README.md
promisinganuj Apr 4, 2024
bb45649
Updating README.md
promisinganuj Apr 4, 2024
194d435
Updating README.md
promisinganuj Apr 4, 2024
4202fa4
Updating README.md
promisinganuj Apr 4, 2024
9ad0add
Adding functionality to add admins
promisinganuj Apr 5, 2024
46b9809
Fixing downstream linting issues
promisinganuj Apr 7, 2024
85332b6
Fixing downstream linting issues
promisinganuj Apr 7, 2024
1afcdb4
Merge pull request #691 from Azure-Samples/sguda/add-ownership-code
promisinganuj Apr 8, 2024
9642a2f
Adding new environment vairable for adding existing capacity name
promisinganuj Apr 17, 2024
fbcb5ce
Merge pull request #700 from Azure-Samples/anuj/adding-existing-capac…
promisinganuj Apr 18, 2024
d02c71e
Updating README.md and architecture diagram
promisinganuj May 20, 2024
7e3b1f1
fixing linting error
promisinganuj May 20, 2024
f37f68e
Incorporating review comments + adding utility script to upload file …
promisinganuj May 22, 2024
c4353b2
Updating README.md
promisinganuj May 23, 2024
bd4074a
Updated to README.md
promisinganuj May 23, 2024
62f97f4
Updated to README.md
promisinganuj May 23, 2024
51d2493
Updating pipleline definitions to use correct variable
promisinganuj May 23, 2024
3 changes: 3 additions & 0 deletions .markdownlinkcheck.json
@@ -18,6 +18,9 @@
{
"pattern": "^https://azure.microsoft.com/en-us/products/"
},
{
"pattern": "^https://learn.microsoft.com/azure/"
},
{
"pattern": "^https://dev.azure.com"
}
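
To illustrate what the added ignore pattern does, here is a minimal Python sketch that applies the three patterns from this hunk to two URLs taken from elsewhere in this PR. It assumes the link checker treats each `pattern` value as a regular expression tested against the link. The `is_ignored` helper is hypothetical and is not part of the sample.

```python
import re

# Patterns copied from the hunk above. Treating them as regular
# expressions matched against each link is an assumption.
IGNORE_PATTERNS = [
    r"^https://azure.microsoft.com/en-us/products/",
    r"^https://learn.microsoft.com/azure/",
    r"^https://dev.azure.com",
]


def is_ignored(url: str) -> bool:
    """Hypothetical helper: True if any ignore pattern matches the URL."""
    return any(re.search(pattern, url) for pattern in IGNORE_PATTERNS)


# A learn.microsoft.com/azure/ link is skipped by the new pattern.
print(is_ignored("https://learn.microsoft.com/azure/data-factory/monitor-data-factory"))
# A learn.microsoft.com/fabric/ link is not covered and would still be checked.
print(is_ignored("https://learn.microsoft.com/fabric/get-started/microsoft-fabric-overview"))
```

Under that assumption the first call prints `True` and the second prints `False`: the pattern is anchored to the `/azure/` path, so links under `/fabric/` or with a locale segment such as `/en-us/azure/` would not match.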
1 change: 1 addition & 0 deletions README.md
@@ -30,6 +30,7 @@ The samples are either focused on a single azure service (**Single Tech Samples*
## Single Technology Samples

- [Microsoft Fabric](./single_tech_samples/fabric/README.md)
+- [CI/CD - Microsoft Fabric](./single_tech_samples/fabric/fabric_ci_cd/README.md)
- [Feature engineering on Microsoft Fabric](./single_tech_samples/fabric/feature_engineering_on_fabric/README.md)
- [Azure SQL database](./single_tech_samples/azuresql/README.md)
- [CI/CD - Azure SQL database](./single_tech_samples/azuresql/azuresql_ci_cd/README.md)
2 changes: 1 addition & 1 deletion archive/e2e_samples/mdw_governance/README.md
@@ -21,7 +21,7 @@ This sample demonstrates how to provision end-to-end modern data warehouse solut

## Solution Overview

-This solution sets up an Azure Data Lake storage account, with two containers: Datalake and Dropzone. The folder structure in datalake is structured to enable data tiering (Bronze, Silver, Gold), hold shared data (Reference) and shared libraries (Sys). Azure Data Factory instance with linked services connecting to the Azure Data Lake, Azure Key Vault, Azure Databricks and Azure Purview. Application Insights is used for event logging and Office365 API Connection in combination with Logic Apps is used for sending notification emails. A Virtual Network and Private Endpoints are deployed for Data Lake and Key Vault, however the firewall on the services is left open in the initial deployment. Additionally the Databricks virtual network is not peered to the solution's and the databricks public IP is enabled. To fully secure the solution, the Virtual Network should be peered to an internal corporate network, and firewall closed on those services, and follow the [Databricks single tech sample](https://github.com/Azure-Samples/modern-data-warehouse-dataops/tree/main/single_tech_samples/databricks/sample2_enterprise_azure_databricks_environment) for more information on locking down a databricks environment.
+This solution sets up an Azure Data Lake storage account, with two containers: Datalake and Dropzone. The folder structure in datalake is structured to enable data tiering (Bronze, Silver, Gold), hold shared data (Reference) and shared libraries (Sys). Azure Data Factory instance with linked services connecting to the Azure Data Lake, Azure Key Vault, Azure Databricks and Azure Purview. Application Insights is used for event logging and Office365 API Connection in combination with Logic Apps is used for sending notification emails. A Virtual Network and Private Endpoints are deployed for Data Lake and Key Vault, however the firewall on the services is left open in the initial deployment. Additionally the Databricks virtual network is not peered to the solution's and the databricks public IP is enabled. To fully secure the solution, the Virtual Network should be peered to an internal corporate network, and firewall closed on those services, and follow the [Databricks single tech sample](./../../single_tech_samples/databricks_enterprise_azure_databricks_environment/README.md) for more information on locking down a databricks environment.

The Azure Data Factory contains an ADF Pipeline that is stored in a git repository, that is taking data from the Dropzone and ingesting it into the bronze folder, after anonymizing its content using [Presidio](https://github.com/microsoft/presidio). The data files for running the pipeline can be found in [Data](./data) folder of this repository.

@@ -31,8 +31,6 @@ In this sample we focus on hardening the security around the Azure Databricks en

3.[Azure Private Links](https://docs.microsoft.com/en-us/azure/private-link/private-link-overview) - To secure connectivity with dependent PaaS services.

-This sample is also aligned to the implementation pattern published by Databricks around [Data Exfiltration Protection with Azure Databricks](https://databricks.com/blog/2020/03/27/data-exfiltration-protection-with-azure-databricks.html).

The sample implements automating the provisioning of the required services and configurations using the [Infrastructure as Code](https://docs.microsoft.com/en-us/dotnet/architecture/cloud-native/infrastructure-as-code) pattern.

### 1.1. Scope
8 changes: 4 additions & 4 deletions e2e_samples/parking_sensors/README.md
@@ -134,7 +134,7 @@ The following summarizes key learnings and best practices demonstrated by this s
### 7. Monitor infrastructure, pipelines and data

- A proper monitoring solution should be in-place to ensure failures are identified, diagnosed and addressed in a timely manner. Aside from the base infrastructure and pipeline runs, data quality should also be monitored. A common area that should have data monitoring is the malformed record store.
-- As an example this repository showcases how to use open source framework [Great Expectations](https://docs.greatexpectations.io/docs/) to define, measure and report data quality metrics at different stages of the data pipeline. Captured Data Quality metrics are reported to Azure Monitor for further visualizing and alerting. Take a look at sample [Data Quality report](docs/images/data_quality_report.png) generated with Azure Monitor workbook. Great Expectations can be configured to generate HTML reports and host directly as static site on Azure Blob Storage. Read more on [How to host and share Data Docs on Azure Blob Storage](https://legacy.docs.greatexpectations.io/en/latest/guides/how_to_guides/configuring_data_docs/how_to_host_and_share_data_docs_on_azure_blob_storage.html).
+- As an example this repository showcases how to use open source framework [Great Expectations](https://docs.greatexpectations.io/docs/) to define, measure and report data quality metrics at different stages of the data pipeline. Captured Data Quality metrics are reported to Azure Monitor for further visualizing and alerting. Take a look at sample [Data Quality report](docs/images/data_quality_report.png) generated with Azure Monitor workbook. Great Expectations can be configured to generate HTML reports and host directly as static site on Azure Blob Storage. Read more on [How to host and share Data Docs on Azure Blob Storage](https://docs.greatexpectations.io/docs/oss/guides/setup/configuring_data_docs/host_and_share_data_docs/).

## Key Concepts

@@ -199,13 +199,13 @@ More resources:

#### Databricks

-- [Monitoring Azure Databricks with Azure Monitor](https://docs.microsoft.com/en-us/azure/architecture/databricks-monitoring/)
+- [Monitoring Azure Databricks with Azure Monitor](https://learn.microsoft.com/azure/architecture/databricks-monitoring/)
- [Monitoring Azure Databricks Jobs with Application Insights](https://msdn.microsoft.com/en-us/magazine/mt846727.aspx)

#### Data Factory

-- [Monitor Azure Data Factory with Azure Monitor](https://docs.microsoft.com/en-us/azure/data-factory/monitor-using-azure-monitor)
-- [Alerting in Azure Data Factory](https://azure.microsoft.com/en-us/blog/create-alerts-to-proactively-monitor-your-data-factory-pipelines/)
+- [Monitor Azure Data Factory with Azure Monitor](https://learn.microsoft.com/azure/data-factory/monitor-data-factory)
+- [Alerting in Azure Data Factory](https://azure.microsoft.com/blog/create-alerts-to-proactively-monitor-your-data-factory-pipelines/)

## How to use the sample

4 changes: 3 additions & 1 deletion single_tech_samples/fabric/README.md
@@ -1,7 +1,9 @@
# Microsoft Fabric

-[Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/get-started/microsoft-fabric-overview) is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, Real-Time Analytics, and business intelligence. It offers a comprehensive suite of services, including data lake, data engineering, and data integration, all in one place.
+[Microsoft Fabric](https://learn.microsoft.com/fabric/get-started/microsoft-fabric-overview) is an all-in-one analytics solution for enterprises that covers everything from data movement to data science, Real-Time Analytics, and business intelligence. It offers a comprehensive suite of services, including data lake, data engineering, and data integration, all in one place.

## Samples

- [Feature Engineering on Microsoft Fabric](./feature_engineering_on_fabric/README.md) - This sample demonstrates how to use Azure ML managed feature store and Microsoft Fabric to build a feature engineering system. It also shows how to track and monitor the data lineage of the features and the model training process using Microsoft Purview. The provided sample also encompasses data validation and exploratory data analysis (EDA) within Fabric notebooks.

+- [CI/CD - Microsoft Fabric](./fabric_ci_cd/README.md) - This sample demonstrates how to implement a CI/CD process for Microsoft Fabric using Azure DevOps and Fabric Deployment Pipelines.
30 changes: 30 additions & 0 deletions single_tech_samples/fabric/fabric_ci_cd/.envtemplate
@@ -0,0 +1,30 @@
# The Azure subscription Id.
AZURE_SUBSCRIPTION_ID=""
# The location where the Azure resources will be created.
AZURE_LOCATION=""
# The email address of the Fabric capacity admin. This should be from the same tenant where capacity is being created.
CAPACITY_ADMIN_EMAIL=""
# The name of the existing Fabric capacity. This name is required if you want to use an existing capacity instead of creating a new one.
EXISTING_CAPACITY_NAME=""
# The name of the Fabric project. This name is used for naming the Fabric resources.
FABRIC_PROJECT_NAME=""
# The name of the Fabric domain. It can be an existing domain.
FABRIC_DOMAIN_NAME=""
# The name of the Fabric subdomain. It can be an existing subdomain.
FABRIC_SUBDOMAIN_NAME=""
# The bearer token for calling the Fabric APIs.
FABRIC_BEARER_TOKEN=""
# The Azure DevOps organization name.
ORGANIZATION_NAME=""
# The Azure DevOps project name.
PROJECT_NAME=""
# The Azure DevOps repository name.
REPOSITORY_NAME=""
# Azure DevOps branch name. This branch should already exist in the repository.
BRANCH_NAME=""
# The directory used by Fabric to sync the workspace code. It can be "/" or any other sub-directory. If specifying a sub-directory, it must exist in the repository.
DIRECTORY_NAME=""
# UserPrincipalName (UPN) list of the workspace admins. These users will have admin access to the Fabric workspaces. The values are separated by space.
WORKSPACE_ADMIN_UPNS=("user1@contoso.com" "user2@contoso.com" "...")
# UserPrincipalName (UPN) list of the pipeline admins. These users will have admin access to the Fabric deployment pipeline. The values are separated by space.
PIPELINE_ADMIN_UPNS=("user1@contoso.com" "user2@contoso.com" "...")
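
Since every value in the template ships empty, a quick pre-flight check before running the bootstrap can catch unset variables. The sketch below is one possible approach in Python and is not part of the sample: `parse_env` and `missing` are hypothetical helpers, and the list of variables treated as required is an assumption (for instance, the template says `EXISTING_CAPACITY_NAME` is needed only when reusing an existing capacity).

```python
import shlex


def parse_env(text: str) -> dict:
    """Parse KEY="value" and KEY=("a" "b") lines; skip comments and blanks."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, raw = line.split("=", 1)
        raw = raw.strip()
        if raw.startswith("(") and raw.endswith(")"):
            # Bash-style array, as used for the *_ADMIN_UPNS variables.
            values[key] = shlex.split(raw[1:-1])
        else:
            parts = shlex.split(raw)
            values[key] = parts[0] if parts else ""
    return values


def missing(values: dict, required: list) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


# Illustrative content only; the subscription id is a placeholder.
sample = '''
# The Azure subscription Id.
AZURE_SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
AZURE_LOCATION=""
WORKSPACE_ADMIN_UPNS=("user1@contoso.com" "user2@contoso.com")
'''

env = parse_env(sample)
# Which variables count as required here is an assumption.
print(missing(env, ["AZURE_SUBSCRIPTION_ID", "AZURE_LOCATION", "FABRIC_PROJECT_NAME"]))
```

With the sample content above this prints `['AZURE_LOCATION', 'FABRIC_PROJECT_NAME']`: one variable is present but empty and the other is absent.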