An end-to-end (E2E) solution built on Azure data resources using the Snapshot Serengeti dataset. It focuses on Azure Synapse Analytics, Power BI and Azure Data Factory.
-
An active Azure subscription. If you do not have one, you can create a free Azure subscription.
-
Appropriate permissions within the Azure subscription that will allow for creating resources, assigning roles, registering providers and deleting resources.
To proceed, the following Azure resource providers need to be registered in your subscription:
- Microsoft.KeyVault
- Microsoft.Synapse
- Microsoft.ContainerRegistry
- Microsoft.Storage
- Microsoft.MachineLearningServices
- Microsoft.Insights
- Microsoft.OperationalInsights
- Microsoft.Sql
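As a rough sketch, the namespaces above can be turned into Azure CLI commands that check and register each provider. This assumes the Azure CLI (`az`) is installed and logged in to the right subscription. The helper names `show_command` and `register_command` are illustrative and not part of this repository. The script only prints the commands, so nothing changes in your subscription until you run them yourself.

```python
# Resource provider namespaces listed in this README.
PROVIDERS = [
    "Microsoft.KeyVault",
    "Microsoft.Synapse",
    "Microsoft.ContainerRegistry",
    "Microsoft.Storage",
    "Microsoft.MachineLearningServices",
    "Microsoft.Insights",
    "Microsoft.OperationalInsights",
    "Microsoft.Sql",
]


def show_command(namespace: str) -> str:
    """Build the az command that reports a provider's registration state."""
    return (
        f"az provider show --namespace {namespace} "
        "--query registrationState --output tsv"
    )


def register_command(namespace: str) -> str:
    """Build the az command that registers a provider (hypothetical helper)."""
    return f"az provider register --namespace {namespace}"


if __name__ == "__main__":
    for ns in PROVIDERS:
        print(show_command(ns))
        print(register_command(ns))
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; paste only the `register` lines for providers whose state is not `Registered`.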
⚠️ If any of these resource providers are not registered, follow the steps in the documentation to register them.
-
Fork this repository to your GitHub account so that you can link it to the Synapse workspace.
-
Right-click or Ctrl + click the button below to open the Azure Portal in a new tab and begin deployment.
-
On the Azure Portal custom deployment page that opens, select the subscription from the drop-down, click Create new and provide a unique name for your resource group, then select a valid location for the resources.
-
Provide the SQL login password, which should contain at least 8 characters, 1 uppercase letter, 1 lowercase letter, 1 number and 1 special character, then click on the Review + create button.
-
Once the validation is done, click on the Create button to start the deployment.
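Before submitting the form, the SQL login password rule stated earlier (at least 8 characters, 1 uppercase, 1 lowercase, 1 number and 1 special character) can be checked locally. The following is a minimal sketch, not Azure's own validation: the function name `password_problems` is hypothetical, "special character" is assumed to mean ASCII punctuation, and Azure may apply additional rules that this check does not cover.

```python
import string


def password_problems(password: str) -> list:
    """Return the README rules that the password fails (empty list if none)."""
    problems = []
    if len(password) < 8:
        problems.append("fewer than 8 characters")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no number")
    # Assumption: "special character" means ASCII punctuation;
    # the set Azure actually accepts may differ.
    if not any(c in string.punctuation for c in password):
        problems.append("no special character")
    return problems


if __name__ == "__main__":
    for candidate in ("serengeti", "Serengeti#2024"):
        issues = password_problems(candidate)
        print(candidate, "->", issues if issues else "looks OK")
```

Returning the list of failed rules, instead of a single boolean, makes it easier to see what to change when the portal rejects a password.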
-
The deployment should take approximately 10 minutes to complete. Once the deployment is completed, you can navigate to the resource group to check the deployed resources.
-
If successful, you should see 10 resources in your resource group.
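The resource count can also be checked from the command line. The sketch below assumes you have saved the output of `az resource list --resource-group <your-resource-group> --output json` and want a per-type summary. The function name `summarize_resources` and the two-item sample are made up for illustration; a real deployment from this README is expected to list 10 resources.

```python
import json
from collections import Counter


def summarize_resources(az_json: str) -> Counter:
    """Count resources per type from `az resource list ... --output json` text."""
    resources = json.loads(az_json)
    return Counter(item["type"] for item in resources)


if __name__ == "__main__":
    # Made-up sample standing in for real CLI output.
    sample = json.dumps([
        {"name": "example-kv", "type": "Microsoft.KeyVault/vaults"},
        {"name": "example-ws", "type": "Microsoft.Synapse/workspaces"},
    ])
    counts = summarize_resources(sample)
    print(sum(counts.values()), "resources")
    for resource_type, n in sorted(counts.items()):
        print(f"  {n} x {resource_type}")
```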
-
Click on the Synapse Workspace resource and then launch Synapse Studio.
-
We'll need to link the Synapse workspace to the repo you forked in the prerequisites so that we can import the necessary notebooks and scripts. Click on Manage > Git Configuration > Configure.
-
In the wizard that opens, set the Repository type to GitHub and the GitHub repository owner to your GitHub username, then proceed to authenticate with GitHub.
-
After successful authentication, select the repository name from the dropdown. For the Collaboration branch, select the default main branch, and similarly for the Publish branch, select the main branch.
-
For the Root folder, input synapse-worspace, then click Apply.
-
When this completes, select your working branch, then save.
ℹ️ To learn more about Git and source control in a Synapse workspace, see the Azure Synapse Analytics documentation on source control.
To save on cloud costs, delete the resource group that was created for this lab after completing the workshop. To do so, navigate to the resource group and click on the Delete button.
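If you prefer the command line for cleanup, the sketch below builds an `az group delete` command, assuming the Azure CLI is installed and logged in. The function name `delete_group_command` and the example group name `my-serengeti-rg` are hypothetical. Deleting a resource group is irreversible, so the helper asks for the name twice and only prints the command instead of running it.

```python
import shlex


def delete_group_command(resource_group: str, confirm: str) -> str:
    """Build the az command that deletes a resource group.

    Raises ValueError unless the name was retyped identically in `confirm`.
    """
    if not resource_group or confirm != resource_group:
        raise ValueError("confirmation does not match the resource group name")
    # --yes skips the CLI prompt; --no-wait returns without waiting
    # for the deletion to finish.
    return f"az group delete --name {shlex.quote(resource_group)} --yes --no-wait"


if __name__ == "__main__":
    # Hypothetical resource group name; replace it with your own.
    print(delete_group_command("my-serengeti-rg", "my-serengeti-rg"))
```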