Feat/single tech/feature engineering on fabric br/setup env notebook (#654)

* feat: Add optional setup environment notebook for simplified Azure resource setup
---------

Co-authored-by: Chenshu Cai <cchenshu5@gmail.com>
Co-authored-by: Anuj Parashar <promisinganuj@gmail.com>
Co-authored-by: Siliang Jiao <sijiao@microsoft.com>
4 people authored Nov 20, 2023
1 parent c45d186 commit 974bde8
Showing 2 changed files with 475 additions and 4 deletions.
@@ -102,6 +102,8 @@ As described above, the sample uses Microsoft Fabric as the data analytics platf

To grant access to Microsoft Purview, assign the [Data curators](https://learn.microsoft.com/en-us/purview/how-to-create-and-manage-collections#roles) role in the Purview root collection to this SP. For details, refer to the [Microsoft documentation](https://learn.microsoft.com/purview/how-to-create-and-manage-collections#add-role-assignments).

+> Note: *An optional (and novel) alternative for deploying the specified Azure resources involves utilizing the Fabric notebook. The [deploy-azure-resources.ipynb](./src/infra/deploy-azure-resources.ipynb) notebook can be used to interactively create the aforementioned resources. It utilizes Azure's Python SDK along with Azure CLI to facilitate the resource creation process.*
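
As a rough illustration of the approach this note describes (driving the Azure CLI from Python in a notebook cell), the sketch below assembles `az` command lines and only prints them unless told to execute. It is not taken from `deploy-azure-resources.ipynb`: the function names, the resource names, the region, and the choice of a resource group plus an ADLS Gen2 storage account are all assumptions made for illustration.

```python
import shlex
import subprocess
from typing import List


def build_az_commands(resource_group: str, location: str, storage_account: str) -> List[List[str]]:
    """Return argv lists for the Azure CLI calls, without running them.

    Hypothetical helper: the resources listed here are assumptions,
    not the set created by the sample's notebook.
    """
    return [
        # Create (or update) the resource group that holds the other resources.
        ["az", "group", "create", "--name", resource_group, "--location", location],
        # Create a storage account with hierarchical namespace enabled (ADLS Gen2).
        ["az", "storage", "account", "create",
         "--name", storage_account,
         "--resource-group", resource_group,
         "--location", location,
         "--sku", "Standard_LRS",
         "--kind", "StorageV2",
         "--hns", "true"],
    ]


def run_commands(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True raises CalledProcessError if the CLI call fails.
            subprocess.run(argv, check=True)


# Placeholder names; replace them with your own before running for real.
commands = build_az_commands("rg-fabric-sample", "australiaeast", "stfabricsample001")
run_commands(commands, dry_run=True)
```

Keeping the command construction separate from execution lets you review the exact CLI calls in the notebook output before setting `dry_run=False`, which requires the Azure CLI to be installed and signed in.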
### Microsoft Fabric setup

1. Create a Microsoft Fabric workspace
@@ -140,11 +142,11 @@ As described above, the sample uses Microsoft Fabric as the data analytics platf
![new](./images/featurestore_5.png)
![env](./images/featurestore_4.png)

-In this environment, you will setup the python environment under 'Public Libraries'. This can be done by adding `azureml-featurestore` package using PyPI, or by clicking 'Add from yml' and selecting `./src/environment/Publiclibrary.yml` from this repo.
+In this environment, you will setup the python environment under 'Public Libraries'. This can be done by adding `azureml-featurestore` package using PyPI, or by clicking 'Add from yml' and selecting [src/environment/Publiclibrary.yml](./src/environment/Publiclibrary.yml) from this repo.

![pip](./images/featurestore_3.png)

-You will also need to edit the Spark properties. The yaml template with the required properties is available at `./src/environment/sparkProperties.yaml`. Replace the value with the resources that you created.
+You will also need to edit the Spark properties. The yaml template with the required properties is available at [src/environment/sparkProperties.yml](./src/environment/sparkProperties.yml). Replace the value with the resources that you created.

```yaml
runtime_version: '1.1'
@@ -387,8 +389,7 @@ Even though these can be included in the same data pipeline, the training and in

The model training notebook is available at [model_training](./src/notebooks/model_training.ipynb). This notebook uses the features registered in the previous step to train a machine learning model. Like previous notebooks, it also registers the lineage in Microsoft Purview. But this time, it's the model training lineage.

-> Note: the model_training will need to retrieve data from the feature store, which requires credential to access the feature store. Make sure the `client_secret` parameter is set in the notebook `feature_set_retrieval`.
->
+> *Note: the model_training will need to retrieve data from the feature store, which requires credential to access the feature store. Make sure the `client_secret` parameter is set in the notebook `feature_set_retrieval`*.
To run the notebook, open it and click `Run all`. The model will be trained and registered as an `ML model` in the Fabric workspace.
