Archiving Old Samples (#664)
* Archiving old samples

* Archiving old samples

* Archiving old samples

* Fixing linting errors

* Fixing linting errors

* Fixing YML error

* making AAD - Microsoft Entra changes
promisinganuj authored Jan 30, 2024
1 parent ba7bf31 commit 4c88a06
Showing 883 changed files with 843 additions and 4,307 deletions.
51 changes: 25 additions & 26 deletions README.md
@@ -7,13 +7,13 @@ languages:
- bicep
products:
- azure
- microsoft-fabric
- azure-sql-database
- azure-data-factory
- azure-databricks
- azure-stream-analytics
- azure-data-lake-gen2
- azure-functions
- azure-data-share
description: "Code samples showcasing how to apply DevOps concepts to the Modern Data Warehouse Architecture leveraging different Azure Data Technologies."
- azure-synapse-analytics
description: "Code samples showcasing how to apply DevOps concepts to the modern data warehouse architecture leveraging different Azure data technologies."
---

# DataOps for the Modern Data Warehouse
@@ -29,33 +29,32 @@ The samples are either focused on a single azure service (**Single Tech Samples*

## Single Technology Samples

- [Azure SQL](single_tech_samples/azuresql/)
- [CI/CD - AzureSQL](single_tech_samples/azuresql/)
- [Data Factory](single_tech_samples/datafactory/)
- [CI/CD - ADF](single_tech_samples/datafactory/sample1_cicd)
- [Azure Synapse Analytics](single_tech_samples/synapseanalytics)
- [Microsoft Fabric](./single_tech_samples/fabric/README.md)
- [Feature engineering on Microsoft Fabric](./single_tech_samples/fabric/feature_engineering_on_fabric/README.md)
- [Azure SQL database](./single_tech_samples/azuresql/README.md)
- [CI/CD - Azure SQL database](./single_tech_samples/azuresql/azuresql_ci_cd/README.md)
- [Azure Databricks](single_tech_samples/databricks/)
- [IaC - Basic Azure Databricks deployment](single_tech_samples/databricks/sample1_basic_azure_databricks_environment/)
- [IaC - Enterprise Security and Data Exfiltration Protection Deployment](single_tech_samples/databricks/sample2_enterprise_azure_databricks_environment/)
- [IaC - Cluster Provisioning and Secure Data Access](single_tech_samples/databricks/sample3_cluster_provisioning_and_data_access/)
- [CI/CD - Databricks](single_tech_samples/databricks/sample4_ci_cd/)
- [Stream Analytics](single_tech_samples/streamanalytics/)
- [Azure Purview](single_tech_samples/purview/)
- [IaC - Azure Purview](single_tech_samples/purview/)
- [Azure Data Share](single_tech_samples/datashare/)
- [IaC - Basic deployment](single_tech_samples/databricks/databricks_ci_cd/README.md)
- [Azure Data Factory](./single_tech_samples/datafactory/README.md)
- [CI/CD - Auto publish](./single_tech_samples/datafactory/adf_cicd_auto_publish/README.md)
- [Data pre-processing using Azure Batch](./single_tech_samples/datafactory/adf_data_pre_processing_with_azure_batch/README.md)
- [Azure Synapse Analytics](./single_tech_samples/synapseanalytics/README.md)
- [Serverless best practices](./single_tech_samples/synapseanalytics/synapse_serverless/README.md)
- [Azure Stream Analytics](./single_tech_samples/streamanalytics/README.md)
- [CI/CD - Azure Stream Analytics](./single_tech_samples/streamanalytics/streamanalytics_ci_cd/README.md)

## End to End samples

- **Parking Sensor Solution** - This demonstrates batch, end-to-end data pipeline following the MDW architecture, along with a corresponding CI/CD process.
### Parking Sensor Solution

This demonstrates batch, end-to-end data pipeline following the MDW architecture, along with a corresponding CI/CD process.

![Architecture](docs/images/CI_CD_process_simplified.png?raw=true "Architecture")
This has two version of the solution:
- [Azure Data Factory and Azure Databricks Version](e2e_samples/parking_sensors/)
- [Azure Synapse Version](e2e_samples/parking_sensors_synapse/)
- [**Temperature Events Solution**](e2e_samples/temperature_events) - This demonstrate a high-scale event-driven data pipeline with a focus on how to implement Observability and Load Testing.
![Architecture](e2e_samples/temperature_events/images/temperature-events-architecture.png?raw=true "Architecture")
- [**Dataset Versioning Solution**](e2e_samples/dataset_versioning) - This demonstrates how to use DataFactory to Orchestrate DataFlow, to do DeltaLoads into DeltaLake On DataLake(DoDDDoD).
- [**MDW Data Governance and PII data detection**](e2e_samples/mdw_governance) - This sample demonstrates how to deploy the Infrastructure of an end-to-end MDW Pipeline using [Azure DevOps pipelines](https://azure.microsoft.com/en-us/products/devops/pipelines/) along with a focus around Data Governance and PII data detection.
- *Technology stack*: Azure DevOps, Azure Data Factory, Azure Databricks, Azure Purview, [Presidio](https://github.com/microsoft/presidio)

This has two version of the solution:

- [Azure Data Factory and Azure Databricks Version](e2e_samples/parking_sensors/)
- [Azure Synapse Version](e2e_samples/parking_sensors_synapse/)

## Contributing

27 changes: 27 additions & 0 deletions archive/README.md
@@ -0,0 +1,27 @@
# Archive

This directory contains the archived samples that are no longer maintained. These are kept here for reference purposes.

## Samples

Here is a list of the archived samples:

### End-to-end samples

- [Deployment Stamps](./../archive/e2e_samples/deployment_stamps/README.md)
- [Dataset versioning](./../archive/e2e_samples/dataset_versioning/README.md)
- [MDW Governance](./../archive/e2e_samples/mdw_governance/README.md)
- [Temperature Events](./../archive/e2e_samples/temperature_events/README.md)

### Single technology samples

- [Azure Databricks - Basic IaC](./../archive/single_tech_samples/databricks_basic_azure_databricks_environment/README.md)
- [Azure Databricks - Data exfiltration](./../archive/single_tech_samples/databricks_enterprise_azure_databricks_environment/README.md)
- [Azure Databricks - IaC cluster provisioning](./../archive/single_tech_samples/databricks_cluster_provisioning_and_data_access/README.md)
- [Azure Data Factory - CI/CD](./../archive/single_tech_samples/adf_cicd/README.md)
- [Azure Data Share](./../archive/single_tech_samples/datashare_automated_data_sharing/README.md)
- [Microsoft Purview IaC](./../archive/single_tech_samples/purview_iac/README.md)
- [Github action to set Microsoft Purview permissions](./../archive/single_tech_samples/purview_managing_data_plane_permissions/README.md)
- [Azure Storage lifecycle management](./../archive/single_tech_samples/storage_lifecycle_management/README.md)
- [Metadata driven module loading in Azure Synapse](./../archive/single_tech_samples/synapse_loading_dynamic_modules/README.md)
- [Azure Synapse integration testing](./../archive/single_tech_samples/synapse_integration_testing/README.md)
File renamed without changes.
@@ -22,7 +22,7 @@ ACLs give you the ability to apply "finer grain" level of access to directories

The [official doc](https://docs.microsoft.com/en-us/azure/databricks/security/data-governance#secure-access-to-azure-data-lake-storage) mentioned:

- Will you be accessing your data in a more interactive, ad-hoc way, perhaps developing an ML model or building an operational dashboard? In that case, we recommend that you use Azure Active Directory (Azure AD) credential passthrough.
- Will you be accessing your data in a more interactive, ad-hoc way, perhaps developing an ML model or building an operational dashboard? In that case, we recommend that you use Microsoft Entra ID credential passthrough.
- Will you be running automated, scheduled workloads that require one-off access to the containers in your data lake? Then using service principals to access Azure Data Lake Storage is preferred.

## Setup and Deployment
@@ -10,15 +10,14 @@ This sample demonstrates how to scale-out a multi-tenant telemetry processing so
- [How to use the sample](#how-to-use-the-sample)
- [Prerequisites](#prerequisites)
- [Setup and Deployment](#setup-and-deployment)
- [Azure AD for Authentication](#azure-ad-for-authentication)
- [Microsoft Entra ID for Authentication](#microsoft-entra-id-for-authentication)
- [Azure DevOps to Run the Pipeline](#azure-devops-to-run-the-pipeline)
- [Create Sample Data](#create-sample-data)
- [Send Telemetries](#send-telemetries)
- [Access through API](#access-through-api)
- [Debug API and Function App Code Locally](#debug-api-and-function-app-code-locally
)
- [For API APP](#for-api-app)
- [For Function APP](#for-function-app)
- [Debug API and Function App Code Locally](#debug-api-and-function-app-code-locally)
- [For API App](#for-api-app)
- [For Function App](#for-function-app)

## Solution Overview

@@ -55,8 +54,8 @@ In the above architecture, Stamp1 serves three Tenants A, B and C. Devices which

When an end-user of any of the three Tenants, (i.e., User1, who belongs to Tenant B), sends a request to the API of Stamp1 to get latest telemetry data, the API will:

- first interact with Azure AD to check whether User1 belongs to Tenant B and whether the user has access to the data;
> in this sample, Azure AD App registration, App roles, Users and Groups, Enterprise App are used to manage the multi-tenant environment. For more details of how to setup the authentication environment, please check [Authentication and Authorization](docs/AUTH.md).
- first interact with Microsoft Entra ID to check whether User1 belongs to Tenant B and whether the user has access to the data;
> in this sample, Microsoft Entra App registration, App roles, Users and Groups, Enterprise App are used to manage the multi-tenant environment. For more details of how to setup the authentication environment, please check [Authentication and Authorization](docs/AUTH.md).
- then query tenant details information from CosmosDB to check whether the specified device id belongs to Tenant B;
- once the access right has been confirmed, the latest telemetry data of the specified device id saved in Table storage will be returned to User1.
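
The three checks in the request flow above can be sketched as follows. This is only an illustration in Python, not the sample's actual API code: the names `User`, `USERS`, `DEVICE_TENANT`, `LATEST_TELEMETRY`, `AccessDenied`, and `get_latest_telemetry` are hypothetical, and plain dictionaries stand in for the identity provider, the tenant details store, and Table storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    tenant: str            # tenant the identity provider reports for this user
    has_data_access: bool  # stands in for an app-role check


# Hypothetical in-memory stand-ins for the three backing services.
USERS = {"User1": User("User1", "TenantB", True)}         # identity provider
DEVICE_TENANT = {"device-42": "TenantB"}                  # tenant details store
LATEST_TELEMETRY = {"device-42": {"temperature": 21.5}}   # telemetry table


class AccessDenied(Exception):
    """Raised when any of the access checks fails."""


def get_latest_telemetry(user_name: str, tenant: str, device_id: str) -> dict:
    # Step 1: confirm the user belongs to the tenant and may read data.
    user = USERS.get(user_name)
    if user is None or user.tenant != tenant or not user.has_data_access:
        raise AccessDenied(f"{user_name} may not read data for {tenant}")

    # Step 2: confirm the device is registered to the same tenant.
    if DEVICE_TENANT.get(device_id) != tenant:
        raise AccessDenied(f"{device_id} does not belong to {tenant}")

    # Step 3: only after both checks pass, return the latest stored telemetry.
    return LATEST_TELEMETRY[device_id]
```

The point of the ordering is that the telemetry store is never read until both the user-to-tenant and device-to-tenant checks have passed.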

@@ -68,21 +67,21 @@ The same follow applies to Stamp2 and Stamp3 too. In this current version of sam

1. [Azure DevOps account](https://azure.microsoft.com/en-us/products/devops/)
2. [Azure Account](https://azure.microsoft.com/en-us/free/)
*Permissions needed*: ability to create and deploy to an azure resource group, a service principal, and grant the collaborator role to the service principal over the resource group; ability to manage Azure AD to create App registration, Users, Groups and Enterprise App Registration.
*Permissions needed*: ability to create and deploy to an azure resource group, a service principal, and grant the collaborator role to the service principal over the resource group; ability to create App registration, Users, Groups and Enterprise App Registration in Microsoft Entra ID.

### Setup and Deployment

#### Azure AD for Authentication
#### Microsoft Entra ID for Authentication

Before creating each of the stamp, follow [Authentication and Authorization](docs/AUTH.md) to create a center Azure AD Tenant first.
Before creating each of the stamp, follow [Authentication and Authorization](docs/AUTH.md) to create a center Microsoft Entra Tenant first.

The center Tenant will be used by the API app to check authentication and authorization.

- Single Tenant Stamps
- With the center tenant setup, the sample will already be able to work as a stamp serving a single tenant (like the "stamp2" in the Architecture). In this case, Tenant D and the center Tenant are the same Tenant. Test user 2 should be added to the center Tenant (which is also Tenant D).

- Multi-Tenants Stamps
- For creating a multi-tenants stamp (like "stamp1" and "stamp3"), Azure AD Tenants besides the center Tenant should also be created and the test users under them should be added to the center Tenant as guest users ([Azure AD - Add guest users](https://docs.microsoft.com/en-us/azure/active-directory/external-identities/b2b-quickstart-add-guest-users-portal)).
- For creating a multi-tenants stamp (like "stamp1" and "stamp3"), Microsoft Entra Tenants besides the center Tenant should also be created and the test users under them should be added to the center Tenant as guest users ([Microsoft Entra ID - Add guest users](https://learn.microsoft.com/en-us/entra/external-id/b2b-quickstart-add-guest-users-portal)).

- For example, if you want to do a tryout of "stamp1" in the Architecture, create a center Tenant first; the center Tenant can be used as any of the three Tenants A, B and C; then create the other two Tenants and add their test users to the center Tenant as guest users.

@@ -170,7 +169,7 @@ Update following values in the `appsettings.json` with resources in the created

- Cosmos DB connection string
- Storage Account connection string
- Azure AD tenant and client ID
- Microsoft Entra tenant and client ID

Change directory to **.\api\WebApi\\** and run command `dotnet run`.

@@ -1,8 +1,8 @@
# Authentication and Authorization

This sample uses Azure AD as identity provider. You need to:
This sample uses Microsoft Entra ID as identity provider. You need to:

- Provision Azure AD (or use existing one)
- Provision Microsoft Entra tenant (or use existing one)
- Add users and groups
- Register an application
- Setup authentication
@@ -11,17 +11,17 @@ This sample uses Azure AD as identity provider. You need to:
- Setup API permission
- Setup Users and groups in Enterprise Application

## Provision Azure AD
## Provision Microsoft Entra tenant

If you don't have any Azure AD yet or need to create new one for testing, follow [Create a new tenant in Azure AD](https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-access-create-new-tenant) to provision one.
If you don't have any Microsoft Entra tenant yet or need to create new one for testing, follow [Create a new tenant in Microsoft Entra ID](https://learn.microsoft.com/en-us/entra/fundamentals/create-new-tenant) to provision one.

## Add users and groups

### Users

To test the application, you need at least one user. You can use your own account or create new one.

To create a new user, follow [Add or delete users using Azure Active Directory](https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/add-users-azure-active-directory).
To create a new user, follow [Add or delete users using Microsoft Entra ID](https://learn.microsoft.com/en-us/entra/fundamentals/add-users).

### Groups

@@ -32,13 +32,13 @@ To test the application, you need two groups.

You can name them as you want.

To create a new Group, follow [Create a basic group and add members using Azure Active Directory](https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-groups-create-azure-portal#create-a-basic-group-and-add-members).
To create a new Group, follow [Create a basic group and add members using Microsoft Entra ID](https://learn.microsoft.com/en-us/entra/fundamentals/how-to-manage-groups#create-a-basic-group-and-add-members).

## Register an application

Follow the steps below to register new application.

1. Go to "App registrations" menu in Azure AD portal.
1. Go to "App registrations" menu in Microsoft Entra admin center.

1. Click "New registration".

@@ -91,7 +91,7 @@ The application permission are used for integration test account. Scope for thes

Once you register an application, you can map app role and groups.

1. Go to Azure AD portal and select "Enterprise Applications".
1. Go to Microsoft Entra admin center and select "Enterprise Applications".
1. Select created application from the list.
1. Select "Users and groups" menu.
1. Click "Add user/group".
@@ -103,7 +103,7 @@ Once you register an application, you can map app role and groups.

Once you setup group/app role mapping, you can now add users into groups.

1. Go to Azure Portal and select groups.
1. Go to Microsoft Entra admin center and select groups.
1. Select admin group and assign any users.
1. Also assign registered application to the group so that an integration test can use the application id as user. (You can create separate application for this purpose if you want.)
1. Repeat the step for user group.
@@ -20,7 +20,7 @@ parameters:
displayName: Tenant Id (id of the center tenant)
type: string
- name: client_id
displayName: Client Id (client/application id of the Azure AD app registration to handle authentication)
displayName: Client Id (client/application id of the Microsoft Entra ID app registration to handle authentication)
type: string
- name: stamp_id
displayName: Stamp Id (tag to filter resources)
@@ -103,7 +103,7 @@ Each environment has an identical set of resources
- **resourcePrefix** - this prefix will be appended to all resource names to make them unique (notice: lower-case letters only, no special characters allowed
- **location** - location where resources should be deployed
- **locationFormatted** - full name of location where resources should be deployed
- **purviewAdmins** - an [objectID](https://docs.microsoft.com/en-us/azure/marketplace/find-tenant-object-id) of the user/user group to be assigned admin rights to Purview (e.g. your Azure AD objectID). To find the object id of the current logged in user in `az cli`, run `az ad signed-in-user show --output json | jq -r '.objectId'`
- **purviewAdmins** - an [objectID](https://docs.microsoft.com/en-us/azure/marketplace/find-tenant-object-id) of the user/user group to be assigned admin rights to Purview (e.g. your Microsoft Entra object ID). To find the object id of the current logged in user in `az cli`, run `az ad signed-in-user show --output json | jq -r '.objectId'`
- **azureResourceManagerConnection** - name of the service connection (by default this is derived from resource group name e.g. "SC-My-Resource-Group")
9. Create a new pipeline from existing yml in Azure DevOps by selecting the repository and importing the `create-infrastructure.yml` YAML file (see [this post](https://stackoverflow.com/a/59067271) to learn how).
10. Run the `create-infrastructure.yml` pipeline.
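
One caveat on the `purviewAdmins` lookup in the list above: to my understanding, Azure CLI releases built on Microsoft Graph return the object ID under `id` rather than `objectId`, so the `jq -r '.objectId'` filter may print `null` on a newer CLI. A minimal sketch of a version-tolerant lookup, assuming the JSON text from `az ad signed-in-user show --output json` is passed in as a string (the helper name `signed_in_object_id` is hypothetical):

```python
import json


def signed_in_object_id(az_output: str) -> str:
    """Return the object ID from `az ad signed-in-user show --output json` text."""
    user = json.loads(az_output)
    # Assumption: Graph-based CLI releases expose `id`, while older
    # releases exposed `objectId`, as used in the command above.
    object_id = user.get("id") or user.get("objectId")
    if not object_id:
        raise ValueError("no object ID found in az output")
    return object_id
```

Checking both keys keeps the setup step working regardless of which CLI version is installed.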
@@ -36,7 +36,7 @@
"type": "string",
"defaultValue": "[subscription().tenantId]",
"metadata": {
"description": "Specifies the Azure Active Directory tenant ID that should be used for authenticating requests to the key vault. Get it by using Get-AzSubscription cmdlet."
"description": "Specifies the Microsoft Entra tenant ID that should be used for authenticating requests to the key vault. Get it by using Get-AzSubscription cmdlet."
}
},
"logAnalyticsName": {
@@ -88,7 +88,7 @@ There are 3 major steps to running the sample. Follow each sub-page in order:

- Accounts
- [Github account](https://github.com/) [Optional]
- [Azure Account](https://azure.microsoft.com/en-au/free/)
- [Azure Account](https://azure.microsoft.com/en-us/free/)
- *Permissions needed*: ability to create and deploy to an azure [resource group](https://docs.microsoft.com/en-us/azure/azure-resource-manager/management/overview), a [service principal](https://docs.microsoft.com/en-us/azure/active-directory/develop/app-objects-and-service-principals), and grant the [collaborator role](https://docs.microsoft.com/en-us/azure/role-based-access-control/overview) to the service principal over the resource group.
- [Azure DevOps Project](https://azure.microsoft.com/en-us/products/devops/)
- *Permissions needed*: ability to create [service connections](https://docs.microsoft.com/en-us/azure/devops/pipelines/library/service-endpoints?view=azure-devops&tabs=yaml), [pipelines](https://docs.microsoft.com/en-us/azure/devops/pipelines/get-started/pipelines-get-started?view=azure-devops&tabs=yaml) and [variable groups](https://docs.microsoft.com/en-us/azure/devops/pipelines/library/variable-groups?view=azure-devops&tabs=yaml).
@@ -29,7 +29,7 @@
"keyvault_owner_object_id": {
"type": "String",
"metadata": {
"description": "Active Directory ObjectId to be granted full rights to KV"
"description": "Microsoft Entra Object ID to be granted full rights to KV"
}
},
"datalake_storage_account_name": {