
Doctor Notes Text Analytics - Code Repository

Doctor Notes - Text Analytics Search App

Overview

Source code repository for the Doctor Notes Text Analytics Search App, built on Azure Cognitive Search.

If you simply want to see this code in a running instance, feel free to use https://doctornotessearchpoc.azurewebsites.net/. Otherwise, you can follow the setup instructions below to recreate your own instance in your Azure subscription.

Purpose

Give doctors the ability to extract and find meaningful patient data in their notes, whether to get a fuller view of a patient, to find patterns, or for research. How can we use AI to achieve this goal? In this code, we take a sample set of fake doctor notes and apply several machine learning techniques (named entity recognition of medical terms, finding semantically similar words, and knowledge graphs) to give medical professionals a better way to find and make sense of the information they need.

Assets

This repository contains the following assets and code:

  • InvokeHealthEntityExtraction: an Azure Function that calls the Text Analytics for Health container and is invoked as a custom skill in the Azure Cognitive Search skillset (a minimal sketch of the custom-skill contract follows this list)
  • Azure SQL Database
  • AzureCognitiveSearchService: Jupyter notebook that creates the data source, index, skillset, and indexer used by Azure Cognitive Search
  • Web Application
  • GitHub Actions configuration to deploy the web application
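
For context, Azure Cognitive Search calls a custom skill over HTTP using a simple JSON contract: a list of records, each with a recordId and a data payload, and it expects the same shape back. The sketch below is a hypothetical, stripped-down Python Azure Function handler showing only that contract; the real InvokeHealthEntityExtraction function additionally calls the Text Analytics for Health container and the UMLS lookup before building its response.

    # Minimal sketch of the custom-skill HTTP contract used by Azure Cognitive Search.
    # Hypothetical simplification: the real InvokeHealthEntityExtraction function calls
    # the Text Analytics for Health container here instead of returning empty entities.
    import json
    import azure.functions as func

    def main(req: func.HttpRequest) -> func.HttpResponse:
        body = req.get_json()
        results = {"values": []}

        for record in body.get("values", []):
            text = record.get("data", {}).get("text", "")
            # Real skill: send `text` to the Text Analytics for Health container and
            # map the returned entities into the output payload.
            results["values"].append({
                "recordId": record["recordId"],  # must echo the incoming recordId
                "data": {"healthEntities": [], "originalText": text},
                "errors": None,
                "warnings": None,
            })

        return func.HttpResponse(json.dumps(results), mimetype="application/json")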

What you will learn

If you are new or new-ish to Azure, at the end of this project you will have a better understanding of the following concepts:

  • Azure Storage Accounts
  • Azure Cognitive Services
  • Azure SQL Server
  • Azure Functions
  • Azure App Services
  • Advanced Azure Cognitive Search
  • Azure Container Instances
  • Jupyter Notebooks
  • GitHub Actions

Architecture

Data is pulled from an Azure SQL Database. The main indexer runs the data, in JSON format, through a skillset that reshapes it and extracts medical entities, then puts the enriched data into the search index. It also saves the Text Analytics for Health JSON output back to the database so the web application can render marked-up text.

Doctor Notes/Text Analytics Search Architecture

Services used for this solution

Listed below are the services needed for this solution. If you don't have an Azure subscription, you can create a free one. If you already have a subscription, please make sure that your administrator has granted you access to the services below:

Programming Tools needed:

  • VS Code to edit Azure Functions
  • Visual Studio to edit web-app (this is only if you want to customize the application)

Expected time to completion

This project should take about 4 hours to complete

Setup Steps

Before you begin, fork this repository to your own GitHub account, then clone it to your local machine.

  1. Azure account - login or create one
  2. Create a resource group
  3. Import database package
  4. Create a Storage Account
  5. Implement Text Analytics For Health
  6. Deploy InvokeHealthEntityExtraction Azure function
  7. Create Azure search service
  8. Run Notebook to configure Indexes and Data for Azure Search
  9. Deploy Website

Task 1 - Azure Account

First, you will need an Azure account. If you don't already have one, you can start a free trial of Azure here.

Log into the Azure Portal using your credentials


Task 2 - Create a resource group

If you are new to Azure, a resource group is a container that holds related resources for an Azure solution. The resource group can include all the resources for the solution, or only those resources that you want to manage as a group. Click here to learn how to create a resource group.

Write the name of your resource group in a text file; you will need it later.


Task 3 - Import database package

Upload the file doctor-note-poc-bacpac, located under the data-files folder, to a storage account in your subscription. Then import the database package into a serverless database; for more information on how to do this, click here.

If you have never done this, expand this section for detailed steps.

Click "Create a resource", search for SQL Server (logical server), and select that option.

Create a SQL Server

Click the create button

Create a SQL Database resource

Select the resource group you previously created

Enter a name for the server and a location that matches the location of your resource group. Select "Use both SQL and Azure AD authentication" and add yourself as the Azure AD admin. Enter a hard-to-guess user name and password for the server. Click Networking.

Create a SQL Database resource

Under firewall rules, select "Allow Azure services and resources to access this server". Click Review + create.

Create a SQL Database resource

Verify all information is correct, then click "Create".

Create a SQL Database resource

Once your server is created, navigate to your new SQL Server and click Import Database.

Create a SQL Database resource

On the Import database screen, select the backup.

Create a SQL Database resource

Select the storage account where you uploaded the database file and navigate to the file. Click Select

Create a SQL Database resource

Next, click Configure database.

Create a SQL Database resource

Under compute tier, select Serverless, then click OK.

Create a SQL Database resource

Enter a database name, select SQL Server authentication, enter the user name and password you defined for the SQL Server, then click OK.

Create a SQL Database resource

Navigate to your SQL server and select Import/Export history to see the progress of your import. Once it completes, navigate to Databases to look at your newly imported database.

Create a SQL Database resource

On your imported database, select Query editor and enter your user credentials. Login will fail at first because you need to grant access to your IP address; click the link to allow your client IP on the server, then log in again.

Create a SQL Database resource

On the query screen, copy and paste the following SQL statement and click Run to verify the data was imported:

SELECT * FROM DoctorNotes
 

Create a SQL Database resource

Write the name of your SQL server, database, username, and password in a text file; you will need them later.
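
If you prefer to verify the import from your own machine rather than the portal's query editor, a short check like the sketch below works. This is only an optional sanity check; the placeholder server, database, user, and password are the values you just noted down, and it assumes pyodbc plus the "ODBC Driver 17 for SQL Server" driver are installed and your client IP has been allowed through the server firewall.

    # Optional sanity check of the imported database from a local machine.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=YOUR_SQL_SERVER.database.windows.net;"
        "DATABASE=YOUR_DATABASE_NAME;"
        "UID=YOUR_SQL_USERNAME;"
        "PWD=YOUR_SQL_PASSWORD"
    )
    cursor = conn.cursor()
    cursor.execute("SELECT COUNT(*) FROM DoctorNotes")
    print("DoctorNotes rows:", cursor.fetchone()[0])
    conn.close()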


Task 4 - Create a Storage Account

Create a storage account and get its connection string; you will need this connection string in the next steps. If you have never done that, here is the documentation to do it.

Once your storage account is created, navigate to the storage account and create a container named doctor-notes-search
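
If you would rather create the container from code than from the portal, a minimal sketch with the azure-storage-blob Python package looks like the following; the connection string placeholder is the one you just copied from the portal.

    # Sketch: create the doctor-notes-search container with the azure-storage-blob SDK.
    from azure.storage.blob import BlobServiceClient

    connection_string = "YOUR_STORAGE_ACCOUNT_CONNECTION_STRING"
    service = BlobServiceClient.from_connection_string(connection_string)
    service.create_container("doctor-notes-search")  # raises an error if the container already exists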

Write the name of your storage account, its connection string, and its access key in the same text file; you will need them later.


Task 5 - Implement Text Analytics For Health Container

Our implementation uses the Text Analytics for Health container for medical entity extraction. Once you have received access, you will need to set up the container as instructed in their README.

Write your container name in your text file; you will need it later.


Task 6 - Deploy InvokeHealthEntityExtraction Azure function

Then, you will need to update the InvokeHealthEntityExtraction Azure function with the location of your running container. You will also need to download a file umls_concept_dict.pickle that is too big to host on GitHub, which will allow lookup of UMLS entities.

Specifically, in the InvokeHealthEntityExtraction\InvokeHealthEntityExtraction folder:

  • Download the umls_concept_dict.pickle file and save it to the InvokeHealthEntityExtraction\InvokeHealthEntityExtraction directory (the same directory as __init__.py) so it will deploy with the Azure function.

After this action is complete, you can deploy the InvokeHealthEntityExtraction Azure function. One easy way to deploy an Azure function is using Visual Studio Code. You can install VS Code and then follow some of the instructions at this link:

  1. Install the Azure Functions extension for Visual Studio Code

  2. Sign in to Azure

  3. Publish the function to Azure

After the function is deployed, you need to update the function's configuration parameters and get the function URL. Follow these steps:

To update the function's configuration parameters, navigate to your Azure Function App in the Azure portal; under Settings click "Configuration", then under "Application settings" click "New application setting" (see image below).

Azure function Configuration

Add the following parameters and their corresponding values:

    text_analytics_container_url: YOUR_CONTAINER_URL

    cognitive_services_enpoint: YOUR_ALL_IN_ONE_COGNITIVE_SERVICES_END_POINT

    cognitive_services_key: YOUR_ALL_IN_ONE_COGNITIVE_SERVICES_END_API_KEY    
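
Azure Functions surfaces application settings to the function code as environment variables, so inside the function they can be read with os.environ. The sketch below assumes the setting names shown above (including the cognitive_services_enpoint spelling); whatever names you configure must match exactly what the deployed function code reads.

    # Application settings are exposed to the function as environment variables.
    # The keys below assume the setting names shown above.
    import os

    text_analytics_container_url = os.environ["text_analytics_container_url"]
    cognitive_services_endpoint = os.environ["cognitive_services_enpoint"]
    cognitive_services_key = os.environ["cognitive_services_key"]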

Next Click "Functions" in the left-hand sidebar. Then click on the function name, click "Get Function Url" at the top of the page.

Copy the function URL to your text file; you will need it later.
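
If you want to confirm the deployed function responds before wiring it into the skillset, you can post a single sample record to the function URL using the custom-skill payload shape. This is a hypothetical smoke test; the exact fields expected under "data" depend on how the function is implemented.

    # Hypothetical smoke test: post one sample record to the deployed function URL.
    import requests

    function_url = "YOUR_FUNCTION_URL"  # from "Get Function Url", includes the function key
    payload = {
        "values": [
            {"recordId": "1", "data": {"text": "Patient reports chest pain and shortness of breath."}}
        ]
    }
    response = requests.post(function_url, json=payload, timeout=60)
    print(response.status_code)
    print(response.json())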


Task 7 - Create Azure search service

Create a new Azure search service using the Azure portal at https://portal.azure.com/#create/Microsoft.Search. Select your Azure subscription. Use the previously created resource group. You will need a globally-unique URL as the name of your search service (try something like "doctornotes-search-" plus your name, organization, or numbers). Finally, choose a nearby location to host your search service - please remember the location that you chose, as your Cognitive Services instance will need to be based in the same location. Click "Review + create" and then (after validation) click "Create" to instantiate and deploy the service.

Copy the search service URL, service name, and service key to your text file; you will need them later.


Task 8 - Run Notebooks to create indexes on Azure Search

After deployment of Azure Search service is complete, click "Go to resource" to navigate to your new search service. We will need some information about your search service to fill in the "Azure Search variables" section in the SetupAzureCognitiveSearchService.ipynb notebook, which is in the AzureCognitiveSearchService directory. Open the notebook for details on how to do this and copy those values into the first code cell, but don't run the notebook yet (you will need to update skillset.json first).

Before running the notebook, you will also need to change the TODOs in the skillset.json (which is also located in the AzureCognitiveSearchService folder). Open skillset.json, search for "TODO", and replace each instance with the following:

  1. Invoke TA Health Extraction custom skill URI: this value should be "https://" plus the value from the "Get Function Url" for the InvokeHealthEntityExtraction function that you noted down earlier
  2. Cognitive Services key: create a new Cognitive Services key in the Azure portal using the same subscription, location, and resource group that you did for your Azure search service. Click "Create" and after the resource is ready, click it. Click "Keys and Endpoint" in the left-hand sidebar. Copy the Key 1 value into this TODO.
  3. Knowledge Store connection string: use the Azure blob storage connection string that you noted down earlier for the knowledge store container. It should be of the format "DefaultEndpointsProtocol=https;AccountName=YourValueHere;AccountKey=YourValueHere;EndpointSuffix=core.windows.net".

Finally, you are all set to go into the SetupAzureCognitiveSearchService.ipynb notebook and run it. This notebook will call REST endpoints on the search service that you have deployed in Azure to set up the search data sources, index, indexers, and skillset.
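
Under the hood, the notebook simply issues REST calls against the Azure Cognitive Search management endpoints. The sketch below shows the general shape of one such call (creating or updating the skillset from skillset.json); the service name, admin key, and api-version are placeholders and assumptions, so use the values defined in your own notebook variables.

    # Sketch of the kind of REST call the notebook makes: PUT the skillset definition.
    import json
    import requests

    service_name = "YOUR_SEARCH_SERVICE_NAME"
    admin_key = "YOUR_SEARCH_ADMIN_KEY"
    api_version = "2020-06-30"  # assumed; match the notebook's api-version

    with open("skillset.json") as f:
        skillset = json.load(f)

    url = (f"https://{service_name}.search.windows.net/skillsets/"
           f"{skillset['name']}?api-version={api_version}")
    headers = {"Content-Type": "application/json", "api-key": admin_key}

    response = requests.put(url, headers=headers, json=skillset)
    print(response.status_code, response.reason)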


Task 9 - Deploy Web Application

To deploy the web application, you will need to complete the following steps:

  1. Create an Azure App Service
  2. Update Web App Settings file
  3. Create Github Secret and Update Github Actions File
  4. Commit changes to your repository

Step 1 - Create an App Service

This repository includes a workflow to publish the web application. But first you need to create an App Service with the following configuration:

  • Unique name for your application like DoctorNotesApp
  • Publish: Code
  • Runtime stack: .NET 6 (LTS)
  • Operating System: Windows
  • Region: the same region you selected for your resource group
  • Create a new Windows plan if you don't have one

You can change the default size of your App Service plan to a development tier if you want to, but performance will be slower.

Application Configuration

Once the App service is provisioned, navigate to the App and download the publish profile

Get App Profile

Open the file and copy the content to a text file

Step 2 - Update Web App Settings file

Navigate to the web-app/Cognitive.UI folder, open the appsettings.json file, and change the following parameters:

  "SearchServiceName": "YOUR_COGNITIVE_SEARCH_SERVICE_NAME",
  "SearchApiKey": "YOUR_COGNITIVE_SEARCH_SERVICE_NAME",
  "SearchIndexName": "azuresql-index",
  "SearchIndexerName": "azure-sql-indexer",
  "StorageAccountName": "YOUR_STORAGE_ACCOUNT_NAME",
  "StorageAccountKey": "YOUR_STORAGE_ACCOUNT_NAME",
  "StorageContainerAddress": "https://YOUR_STORAGE_ACCOUNT_NAME.blob.core.windows.net/doctor-notes-search" 

Please make sure the index and indexer names match those created on your Cognitive Search Service
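
If you want to double-check those names, the search service's REST API can list the existing indexes and indexers. A short hedged sketch (service name, key, and api-version are placeholders):

    # Quick check that the index and indexer names in appsettings.json exist on the service.
    import requests

    service = "YOUR_COGNITIVE_SEARCH_SERVICE_NAME"
    key = "YOUR_COGNITIVE_SEARCH_API_KEY"
    api_version = "2020-06-30"
    headers = {"api-key": key}

    for resource in ("indexes", "indexers"):
        url = f"https://{service}.search.windows.net/{resource}?api-version={api_version}"
        names = [item["name"] for item in requests.get(url, headers=headers).json()["value"]]
        print(resource, names)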

Step 3 - Create Github Secret and update Github Actions File

Next, navigate to your GitHub repository secrets and change the value of the secret named DoctorNotesSearchPoc_A28D: copy and paste the content of the publish profile you just downloaded into the value box.

If the secret does not exist, please create it.

Next, navigate to the workflow file located at .github/workflows/DoctorNotesSearchPoc.yml and replace the value of the AZURE_WEBAPP_NAME variable so that it matches the name of the Azure App Service you just created.

Step 4 - Enable the workflow and Commit changes to Github

In your GitHub repository, navigate to Actions, select the workflow "Build and Deploy .Net app....", and click the "Enable workflow" option on the right.

enable-workflow

Commit your changes to the main branch of the forked GitHub repository, then navigate to Actions to confirm the application has been published.

Credits

This project was enhanced and changed from the Covid-19 Search repository by Liam Cavanagh.

The markup-text code for healthcare analytics was provided by Oren Barnea.

