Fix links in documentation (#1426)
* Fix up links in docs

* Fix one more link

* Use relative links
pamelafox authored Mar 15, 2024
1 parent d487104 commit a8b6618
Showing 6 changed files with 27 additions and 20 deletions.
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -37,7 +37,7 @@ check the box below and notify the tutorial author. A Microsoft employee can do

## Code quality checklist

See [CONTRIBUTING.md](../CONTRIBUTING.md#submit-pr) for more details.
See [CONTRIBUTING.md](https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/CONTRIBUTING.md#submit-pr) for more details.

- [ ] The current tests all pass (`python -m pytest`).
- [ ] I added tests that prove my fix is effective or that my feature works
2 changes: 1 addition & 1 deletion docs/azd.md
@@ -1,6 +1,6 @@
# Deploying with the Azure Developer CLI

This guide includes advanced topics that are not necessary for a basic deployment. If you are new to the project, please consult the main [README](../README.md#deploying-from-scratch) for steps on deploying the project.
This guide includes advanced topics that are not necessary for a basic deployment. If you are new to the project, please consult the main [README](../README.md#deploying) for steps on deploying the project.

[📺 Watch: Deployment of your chat app](https://www.youtube.com/watch?v=mDFZdmn7nhk)
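
For reference, the basic flow from the README comes down to two Azure Developer CLI commands; the sketch below assumes you have already installed the prerequisites listed there:

```shell
# Sign in to Azure with the Azure Developer CLI
azd auth login

# Provision the Azure resources and deploy the app in one step
azd up
```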

8 changes: 5 additions & 3 deletions docs/deploy_features.md
@@ -36,13 +36,15 @@ To enable integrated vectorization with this sample:

## Enabling authentication

By default, the deployed Azure web app will have no authentication or access restrictions enabled, meaning anyone with routable network access to the web app can chat with your indexed data. You can require authentication to your Azure Active Directory by following the [Add app authentication](https://learn.microsoft.com/azure/app-service/scenario-secure-app-authentication-app-service) tutorial and set it up against the deployed web app.
By default, the deployed Azure web app will have no authentication or access restrictions enabled, meaning anyone with routable network access to the web app can chat with your indexed data. If you'd like to automatically set up authentication and user login as part of the `azd up` process, see [this guide](./login_and_acl.md).

Alternatively, you can manually require authentication to your Azure Active Directory by following the [Add app authentication](https://learn.microsoft.com/azure/app-service/scenario-secure-app-authentication-app-service) tutorial and setting it up against the deployed web app.

To then limit access to a specific set of users or groups, you can follow the steps from [Restrict your Azure AD app to a set of users](https://learn.microsoft.com/azure/active-directory/develop/howto-restrict-your-app-to-a-set-of-users) by changing the "Assignment Required?" option under the Enterprise Application, and then assigning users/groups access. Users not granted explicit access will receive the error message "AADSTS50105: Your administrator has configured the application <app_name> to block users unless they are specifically granted ('assigned') access to the application."
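
If you prefer the automatic setup mentioned at the start of this section, the flow is roughly the sketch below; the `AZURE_USE_AUTHENTICATION` variable name is an assumption here, so confirm it against the linked login guide:

```shell
# Assumed variable name for enabling the optional login system (see the login guide)
azd env set AZURE_USE_AUTHENTICATION true

# Re-provision and redeploy so the authentication settings take effect
azd up
```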

## Enabling login and document level access control

By default, the deployed Azure web app allows users to chat with all your indexed data. You can enable an optional login system using Azure Active Directory to restrict access to indexed data based on the logged in user. Enable the optional login and document level access control system by following [this guide](docs/login_and_acl.md).
By default, the deployed Azure web app allows users to chat with all your indexed data. You can enable an optional login system using Azure Active Directory to restrict access to indexed data based on the logged in user. Enable the optional login and document level access control system by following [this guide](./login_and_acl.md).

## Enabling CORS for an alternate frontend

@@ -60,7 +62,7 @@ Both these repositories adhere to the same [HTTP protocol for RAG chat apps](htt

## Using local parsers

If you want to decrease the charges by using local parsers instead of Azure Document Intelligence, you can set environment variables before running the [data ingestion script](/docs/data_ingestion.md). Note that local parsers will generally not be as sophisticated.
If you want to decrease the charges by using local parsers instead of Azure Document Intelligence, you can set environment variables before running the [data ingestion script](./data_ingestion.md). Note that local parsers will generally not be as sophisticated.

1. Run `azd env set USE_LOCAL_PDF_PARSER true` to use the local PDF parser.
1. Run `azd env set USE_LOCAL_HTML_PARSER true` to use the local HTML parser.
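
Combined, the two steps above look like the following sketch; re-run the data ingestion script afterwards (shown here as `./scripts/prepdocs.sh`, per the data ingestion guide) so documents are re-processed with the local parsers:

```shell
# Prefer the local parsers over Azure Document Intelligence
azd env set USE_LOCAL_PDF_PARSER true
azd env set USE_LOCAL_HTML_PARSER true

# Re-run ingestion so existing documents are parsed locally
./scripts/prepdocs.sh
```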
2 changes: 1 addition & 1 deletion docs/deploy_lowcost.md
@@ -46,7 +46,7 @@ However, if your goal is to minimize costs while prototyping your application, f

4. Use the free tier of Azure Document Intelligence (used in analyzing files):


```shell
azd env set AZURE_DOCUMENTINTELLIGENCE_SKU F0
```
24 changes: 12 additions & 12 deletions docs/login_and_acl.md
@@ -134,7 +134,7 @@ In both the chat and ask a question modes, under **Developer settings** optional
### Troubleshooting Azure AD Setup

* If any Azure AD apps need to be recreated, you can avoid redeploying the app by [changing the app settings in the portal](https://learn.microsoft.com/azure/app-service/configure-common?tabs=portal#configure-app-settings). Any of the [required environment variables](#environment-variables-reference) can be changed. Once the environment variables have been changed, restart the web app.
* It's possible a consent dialog will not appear when you log into the app for the first time. If this consent dialog doesn't appear, you will be unable to use the security filters because the API server app does not have permission to read your authorization information. A consent dialog can be forced to appear by adding `"prompt": "consent"` to the `loginRequest` property in [`authentication.py`](./app/backend/core/authentication.py)
* It's possible a consent dialog will not appear when you log into the app for the first time. If this consent dialog doesn't appear, you will be unable to use the security filters because the API server app does not have permission to read your authorization information. A consent dialog can be forced to appear by adding `"prompt": "consent"` to the `loginRequest` property in [`authentication.py`](../app/backend/core/authentication.py)
* It's possible that your tenant admin has placed a restriction on consent to apps with [unverified publishers](https://learn.microsoft.com/azure/active-directory/develop/publisher-verification-overview). In this case, only admins may consent to the client and server apps, and normal user accounts are unable to use the login system until the admin consents on behalf of the entire organization.
* It's possible that your tenant admin requires [admin approval of all new apps](https://learn.microsoft.com/azure/active-directory/manage-apps/manage-consent-requests). Regardless of whether you select the delegated or admin permissions, the app will not work without tenant admin consent.

@@ -144,42 +144,42 @@ Two optional scripts are provided that allow easier setup of sample data with do

### Azure Data Lake Storage Gen2 Setup

[Azure Data Lake Storage Gen2](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-introduction) implements an [access control model](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-access-control) that can be used for document level access control. The [adlsgen2setup.ps1](./scripts/adlsgen2setup.ps1) script uploads the sample data included in the [data](./data) folder to a Data Lake Storage Gen2 storage account. The [Storage Blob Data Owner](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-access-control-model#role-based-access-control-azure-rbac) role is required to use the script.
[Azure Data Lake Storage Gen2](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-introduction) implements an [access control model](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-access-control) that can be used for document level access control. The [adlsgen2setup.ps1](../scripts/adlsgen2setup.ps1) script uploads the sample data included in the [data](./data) folder to a Data Lake Storage Gen2 storage account. The [Storage Blob Data Owner](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-access-control-model#role-based-access-control-azure-rbac) role is required to use the script.

In order to use this script, an existing Data Lake Storage Gen2 storage account is required. Run `azd env set AZURE_ADLS_GEN2_STORAGE_ACCOUNT <your-storage-account>` prior to running the script.

To run the script, run the following command: `./scripts/adlsgen2setup.ps1`. The script performs the following steps:
* Creates example [groups](https://learn.microsoft.com/azure/active-directory/fundamentals/how-to-manage-groups) listed in the [sampleacls.json](./scripts/sampleacls.json) file.
To run the script, run the following command: `/scripts/adlsgen2setup.ps1`. The script performs the following steps:
* Creates example [groups](https://learn.microsoft.com/azure/active-directory/fundamentals/how-to-manage-groups) listed in the [sampleacls.json](../scripts/sampleacls.json) file.
* Creates a filesystem / container `gptkbcontainer` in the storage account.
* Creates the directories listed in the [sampleacls.json](./scripts/sampleacls.json) file.
* Uploads the sample PDFs referenced in the [sampleacls.json](./scripts/sampleacls.json) file into the appropriate directories.
* [Recursively sets Access Control Lists (ACLs)](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-acl-cli) using the information from the [sampleacls.json](./scripts/sampleacls.json) file.
* Creates the directories listed in the [sampleacls.json](../scripts/sampleacls.json) file.
* Uploads the sample PDFs referenced in the [sampleacls.json](../scripts/sampleacls.json) file into the appropriate directories.
* [Recursively sets Access Control Lists (ACLs)](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-acl-cli) using the information from the [sampleacls.json](../scripts/sampleacls.json) file.

In order to use the sample access control, you need to join these groups in your Azure AD tenant.

Note that this optional script may not work in Codespaces if your administrator has applied a [Conditional Access policy](https://learn.microsoft.com/azure/active-directory/conditional-access/overview) to your tenant.
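
Putting the steps above together, a minimal sketch of the setup flow, run from the repository root with the Storage Blob Data Owner role:

```shell
# Point the setup script at an existing Data Lake Storage Gen2 account
azd env set AZURE_ADLS_GEN2_STORAGE_ACCOUNT <your-storage-account>

# Create the sample groups, the gptkbcontainer filesystem, directories, and ACLs from sampleacls.json
./scripts/adlsgen2setup.ps1
```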

### Azure Data Lake Storage Gen2 Prep Docs

Once a Data Lake Storage Gen2 storage account has been set up with sample data and access control lists, [prepdocs.py](./scripts/prepdocs.py) can be used to automatically process PDFs in the storage account and store them with their [access control lists in the search index](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-access-control).
Once a Data Lake Storage Gen2 storage account has been set up with sample data and access control lists, [prepdocs.py](../scripts/prepdocs.py) can be used to automatically process PDFs in the storage account and store them with their [access control lists in the search index](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-access-control).

To run this script with a Data Lake Storage Gen2 account, first set the following environment variables:

* `AZURE_ADLS_GEN2_STORAGE_ACCOUNT`: Name of existing [Data Lake Storage Gen2 storage account](https://learn.microsoft.com/azure/storage/blobs/data-lake-storage-introduction).
* (Optional) `AZURE_ADLS_GEN2_FILESYSTEM`: Name of existing Data Lake Storage Gen2 filesystem / container in the storage account. If empty, `gptkbcontainer` is used.
* (Optional) `AZURE_ADLS_GEN2_FILESYSTEM_PATH`: Specific path in the Data Lake Storage Gen2 filesystem / container to process. Only PDFs contained in this path will be processed.

Once the environment variables are set, run the script using the following command: `./scripts/prepdocs.ps1` or `./scripts/prepdocs.sh`.
Once the environment variables are set, run the script using the following command: `/scripts/prepdocs.ps1` or `/scripts/prepdocs.sh`.
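
For example, to process only a specific folder of PDFs, assuming you manage these variables through the azd environment as elsewhere in this guide (the filesystem and path values are placeholders):

```shell
# Required: existing Data Lake Storage Gen2 account
azd env set AZURE_ADLS_GEN2_STORAGE_ACCOUNT <your-storage-account>

# Optional: limit processing to a specific filesystem/container and path (placeholders)
azd env set AZURE_ADLS_GEN2_FILESYSTEM gptkbcontainer
azd env set AZURE_ADLS_GEN2_FILESYSTEM_PATH <path-to-pdfs>

# Process the PDFs and store them with their ACLs in the search index
./scripts/prepdocs.sh
```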

### Manually managing Document Level Access Control

Manually enable document level access control on a search index and manually set access control values using the [manageacl.ps1](./scripts/manageacl.ps1) script.
Manually enable document level access control on a search index and manually set access control values using the [manageacl.ps1](../scripts/manageacl.ps1) script.

Run `azd up` or use `azd env set` to manually set `AZURE_SEARCH_SERVICE` and `AZURE_SEARCH_INDEX` environment variables prior to running the script.

The script supports the following commands. Note that the syntax is the same regardless of whether [manageacl.ps1](./scripts/manageacl.ps1) or [manageacl.sh](./scripts/manageacl.sh) is used.
The script supports the following commands. Note that the syntax is the same regardless of whether [manageacl.ps1](../scripts/manageacl.ps1) or [manageacl.sh](../scripts/manageacl.sh) is used.
* `./scripts/manageacl.ps1 --acl-action enable_acls`: Creates the required `oids` (User ID) and `groups` (Group IDs) [security filter](https://learn.microsoft.com/azure/search/search-security-trimming-for-azure-search) fields for document level access control on your index. Does nothing if these fields already exist.
* Example usage: `../scripts/manageacl.ps1 --acl-action enable_acls`
* Example usage: `./scripts/manageacl.ps1 --acl-action enable_acls`
* `./scripts/manageacl.ps1 --document [name-of-pdf.pdf] --acl-type [oids or groups] --acl-action view`: Prints access control values associated with either User IDs or Group IDs for a specific document.
* Example to view all User IDs (oids) from the Benefit_Options PDF: `./scripts/manageacl.ps1 --document Benefit_Options.pdf --acl-type oids --acl-action view`.
* `./scripts/manageacl.ps1 --document [name-of-pdf.pdf] --acl-type [oids or groups] --acl-action add --acl [ID of group or user]`: Adds an access control value associated with either User IDs or Group IDs for a specific document.
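
A typical end-to-end flow, sketched with placeholder values:

```shell
# Point the script at your search service and index (placeholder values)
azd env set AZURE_SEARCH_SERVICE <your-search-service>
azd env set AZURE_SEARCH_INDEX <your-search-index>

# Add the oids and groups security filter fields to the index
./scripts/manageacl.ps1 --acl-action enable_acls

# Grant a group access to a specific document (the group ID is a placeholder)
./scripts/manageacl.ps1 --document Benefit_Options.pdf --acl-type groups --acl-action add --acl <group-object-id>
```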
9 changes: 7 additions & 2 deletions docs/productionizing.md
@@ -7,6 +7,7 @@ to production. Here are some things to consider:
* [Azure resource configuration](#azure-resource-configuration)
* [Additional security measures](#additional-security-measures)
* [Load testing](#load-testing)
* [Evaluation](#evaluation)

## Azure resource configuration

@@ -23,7 +24,7 @@ If the maximum TPM isn't enough for your expected load, you have a few options:

* Use a backoff mechanism to retry the request. This is helpful if you're running into a short-term quota due to bursts of activity but aren't over long-term quota. The [tenacity](https://tenacity.readthedocs.io/en/latest/) library is a good option for this, and this [pull request](https://github.com/Azure-Samples/azure-search-openai-demo/pull/500) shows how to apply it to this app.

* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management following [this blog post](https://www.raffertyuy.com/raztype/azure-openai-load-balancing/) or [this repository](https://github.com/andredewes/apim-aoai-smart-loadbalancing). Another approach is to use [LiteLLM's load balancer](https://docs.litellm.ai/docs/providers/azure#azure-api-load-balancing) with Azure Cache for Redis.
* If you are consistently going over the TPM, then consider implementing a load balancer between OpenAI instances. Most developers implement that using Azure API Management using [the openai-apim-lb repo](https://github.com/Azure-Samples/openai-apim-lb) or with Azure Container Apps using [the openai-aca-lb repo](https://github.com/Azure-Samples/openai-aca-lb). Another approach is to use [LiteLLM's load balancer](https://docs.litellm.ai/docs/providers/azure#azure-api-load-balancing) with Azure Cache for Redis.

### Azure Storage

@@ -55,7 +56,7 @@ and scale up the maximum/minimum based on load.

* **Authentication**: By default, the deployed app is publicly accessible.
We recommend restricting access to authenticated users.
See [Enabling authentication](../README.md#enabling-authentication) to learn how to enable authentication.
See [Enabling authentication](./deploy_features.md#enabling-authentication) to learn how to enable authentication.
* **Networking**: We recommend deploying inside a Virtual Network. If the app is only for
internal enterprise use, use a private DNS zone. Also consider using Azure API Management (APIM)
for firewalls and other forms of protection.
@@ -99,3 +100,7 @@ Here's an example loadtest for 50 users and a spawn rate of 1 per second:
![Screenshot of Locust charts showing 5 requests per second](images/screenshot_locust.png)
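
To drive a similar scenario from the command line, a headless sketch (flag names are from recent Locust releases; the target URL is a placeholder):

```shell
# 50 users, spawn rate of 1 per second, for 5 minutes, against the deployed app
locust --headless --users 50 --spawn-rate 1 --run-time 5m --host https://<your-app>.azurewebsites.net
```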

After each test, check the local or App Service logs to see if there are any errors.

## Evaluation

Before you make your chat app available to users, you'll want to rigorously evaluate the answer quality. You can use tools in [the AI RAG Chat evaluator](https://github.com/Azure-Samples/ai-rag-chat-evaluator) repository to run evaluations, review results, and compare answers across runs.
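
A minimal way to get started is to clone that repository and follow its README; only the clone step is sketched here:

```shell
# Fetch the evaluator tooling; see its README for configuring and running evaluations
git clone https://github.com/Azure-Samples/ai-rag-chat-evaluator
```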
