All Notebooks updated to v3.0 of Accelerator using LangGraph
pablomarin committed Nov 6, 2024
1 parent d731cc9 commit 9a7df82
Showing 25 changed files with 3,743 additions and 5,647 deletions.
16 changes: 8 additions & 8 deletions 01-Load-Data-ACogSearch.ipynb
@@ -6,7 +6,7 @@
"source": [
"# Introduction\n",
"\n",
"Welcome to this repository. We will be walking you to a series of notebooks in which you will understand how RAG works (Retrieval Augmented Generation, a technique that combines the power of search and generation of AI to answer user queries). We will work with different sources (Azure AI Search, Files, SQL Server, Websites, APIs, etc) and at the end of the notebooks you will understand why the magic happens with the combination of:\n",
"Welcome to this repository. We will be walking you to a series of notebooks in which you will understand how RAG works (Retrieval Augmented Generation, a technique that combines the power of search and generative AI to answer user queries). We will work with different sources (Azure AI Search, Files, SQL Server, Websites, APIs, etc) and at the end of the notebooks you will understand why the magic happens with the combination of:\n",
"\n",
"1) Multi-Agents: Agents talking to each other\n",
"2) Azure OpenAI models\n",
@@ -24,7 +24,7 @@
"In this Jupyter Notebook, we create and run enrichment steps to unlock searchable content in the specified Azure blob. It performs operations over mixed content in Azure Storage, such as images and application files, using a skillset that analyzes and extracts text information that becomes searchable in Azure Cognitive Search. \n",
"The reference sample can be found at [Tutorial: Use Python and AI to generate searchable content from Azure blobs](https://docs.microsoft.com/azure/search/cognitive-search-tutorial-blob-python).\n",
"\n",
"In this demo we are going to be using a private (so we can mimic a private data lake scenario) Blob Storage container that has all the dialogues of each episode of the TV Series show: FRIENDS. 3.1k text files.\n",
"In this demo we are going to be using a private (so we can mimic a private data lake scenario) Blob Storage container that has all the dialogues of each episode of the TV Series show: FRIENDS. (3.1k text files).\n",
"\n",
"Although only TXT files are used here, this can be done at a much larger scale and Azure Cognitive Search supports a range of other file formats including: Microsoft Office (DOCX/DOC, XSLX/XLS, PPTX/PPT, MSG), HTML, XML, ZIP, and plain text files (including JSON).\n",
"Azure Search support the following sources: [Data Sources Gallery](https://learn.microsoft.com/EN-US/AZURE/search/search-data-sources-gallery)\n",
@@ -122,16 +122,16 @@
"name": "stderr",
"output_type": "stream",
"text": [
"Uploading Files: 100%|██████████████████████████████████████████| 3107/3107 [08:47<00:00, 5.89it/s]\n"
"Uploading Files: 100%|██████████████████████████████████████████| 3107/3107 [08:57<00:00, 5.78it/s]\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Temp Folder: ./data/temp_extract removed\n",
"CPU times: user 34.9 s, sys: 5.76 s, total: 40.6 s\n",
"Wall time: 11min 21s\n"
"CPU times: user 34 s, sys: 5.15 s, total: 39.2 s\n",
"Wall time: 11min 48s\n"
]
}
],
@@ -662,7 +662,7 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 14,
"metadata": {
"tags": []
},
@@ -672,8 +672,8 @@
"output_type": "stream",
"text": [
"200\n",
"Status: inProgress\n",
"Items Processed: 2180\n",
"Status: success\n",
"Items Processed: 3107\n",
"True\n"
]
}
108 changes: 54 additions & 54 deletions 02-LoadCSVOneToMany-ACogSearch.ipynb
@@ -98,16 +98,16 @@
"name": "stderr",
"output_type": "stream",
"text": [
"Uploading Files: 100%|████████████████████████████████████████████████| 1/1 [00:03<00:00, 3.95s/it]"
"Uploading Files: 100%|████████████████████████████████████████████████| 1/1 [00:05<00:00, 5.20s/it]"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Temp Folder: ./data/temp_extract removed\n",
"CPU times: user 776 ms, sys: 311 ms, total: 1.09 s\n",
"Wall time: 5.29 s\n"
"CPU times: user 779 ms, sys: 338 ms, total: 1.12 s\n",
"Wall time: 6.77 s\n"
]
},
{
@@ -158,7 +158,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"204\n",
"201\n",
"True\n"
]
}
@@ -220,69 +220,69 @@
"text/html": [
"<style type=\"text/css\">\n",
"</style>\n",
"<table id=\"T_b8e73\">\n",
"<table id=\"T_baaab\">\n",
" <thead>\n",
" <tr>\n",
" <th class=\"blank level0\" >&nbsp;</th>\n",
" <th id=\"T_b8e73_level0_col0\" class=\"col_heading level0 col0\" >cord_uid</th>\n",
" <th id=\"T_b8e73_level0_col1\" class=\"col_heading level0 col1\" >source_x</th>\n",
" <th id=\"T_b8e73_level0_col2\" class=\"col_heading level0 col2\" >title</th>\n",
" <th id=\"T_b8e73_level0_col3\" class=\"col_heading level0 col3\" >abstract</th>\n",
" <th id=\"T_b8e73_level0_col4\" class=\"col_heading level0 col4\" >authors</th>\n",
" <th id=\"T_b8e73_level0_col5\" class=\"col_heading level0 col5\" >url</th>\n",
" <th id=\"T_baaab_level0_col0\" class=\"col_heading level0 col0\" >cord_uid</th>\n",
" <th id=\"T_baaab_level0_col1\" class=\"col_heading level0 col1\" >source_x</th>\n",
" <th id=\"T_baaab_level0_col2\" class=\"col_heading level0 col2\" >title</th>\n",
" <th id=\"T_baaab_level0_col3\" class=\"col_heading level0 col3\" >abstract</th>\n",
" <th id=\"T_baaab_level0_col4\" class=\"col_heading level0 col4\" >authors</th>\n",
" <th id=\"T_baaab_level0_col5\" class=\"col_heading level0 col5\" >url</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th id=\"T_b8e73_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
" <td id=\"T_b8e73_row0_col0\" class=\"data row0 col0\" >ug7v899j</td>\n",
" <td id=\"T_b8e73_row0_col1\" class=\"data row0 col1\" >PMC</td>\n",
" <td id=\"T_b8e73_row0_col2\" class=\"data row0 col2\" >Clinical features of culture-p...</td>\n",
" <td id=\"T_b8e73_row0_col3\" class=\"data row0 col3\" >OBJECTIVE: This retrospective ...</td>\n",
" <td id=\"T_b8e73_row0_col4\" class=\"data row0 col4\" >Madani, Tariq A; Al-Ghamdi, Ai...</td>\n",
" <td id=\"T_b8e73_row0_col5\" class=\"data row0 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC35282/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC35282/</a></td>\n",
" <th id=\"T_baaab_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
" <td id=\"T_baaab_row0_col0\" class=\"data row0 col0\" >ug7v899j</td>\n",
" <td id=\"T_baaab_row0_col1\" class=\"data row0 col1\" >PMC</td>\n",
" <td id=\"T_baaab_row0_col2\" class=\"data row0 col2\" >Clinical features of culture-p...</td>\n",
" <td id=\"T_baaab_row0_col3\" class=\"data row0 col3\" >OBJECTIVE: This retrospective ...</td>\n",
" <td id=\"T_baaab_row0_col4\" class=\"data row0 col4\" >Madani, Tariq A; Al-Ghamdi, Ai...</td>\n",
" <td id=\"T_baaab_row0_col5\" class=\"data row0 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC35282/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC35282/</a></td>\n",
" </tr>\n",
" <tr>\n",
" <th id=\"T_b8e73_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
" <td id=\"T_b8e73_row1_col0\" class=\"data row1 col0\" >02tnwd4m</td>\n",
" <td id=\"T_b8e73_row1_col1\" class=\"data row1 col1\" >PMC</td>\n",
" <td id=\"T_b8e73_row1_col2\" class=\"data row1 col2\" >Nitric oxide: a pro-inflammato...</td>\n",
" <td id=\"T_b8e73_row1_col3\" class=\"data row1 col3\" >Inflammatory diseases of the r...</td>\n",
" <td id=\"T_b8e73_row1_col4\" class=\"data row1 col4\" >Vliet, Albert van der; Eiseric...</td>\n",
" <td id=\"T_b8e73_row1_col5\" class=\"data row1 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59543/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59543/</a></td>\n",
" <th id=\"T_baaab_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
" <td id=\"T_baaab_row1_col0\" class=\"data row1 col0\" >02tnwd4m</td>\n",
" <td id=\"T_baaab_row1_col1\" class=\"data row1 col1\" >PMC</td>\n",
" <td id=\"T_baaab_row1_col2\" class=\"data row1 col2\" >Nitric oxide: a pro-inflammato...</td>\n",
" <td id=\"T_baaab_row1_col3\" class=\"data row1 col3\" >Inflammatory diseases of the r...</td>\n",
" <td id=\"T_baaab_row1_col4\" class=\"data row1 col4\" >Vliet, Albert van der; Eiseric...</td>\n",
" <td id=\"T_baaab_row1_col5\" class=\"data row1 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59543/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59543/</a></td>\n",
" </tr>\n",
" <tr>\n",
" <th id=\"T_b8e73_level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
" <td id=\"T_b8e73_row2_col0\" class=\"data row2 col0\" >ejv2xln0</td>\n",
" <td id=\"T_b8e73_row2_col1\" class=\"data row2 col1\" >PMC</td>\n",
" <td id=\"T_b8e73_row2_col2\" class=\"data row2 col2\" >Surfactant protein-D and pulmo...</td>\n",
" <td id=\"T_b8e73_row2_col3\" class=\"data row2 col3\" >Surfactant protein-D (SP-D) pa...</td>\n",
" <td id=\"T_b8e73_row2_col4\" class=\"data row2 col4\" >Crouch, Erika C...</td>\n",
" <td id=\"T_b8e73_row2_col5\" class=\"data row2 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59549/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59549/</a></td>\n",
" <th id=\"T_baaab_level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
" <td id=\"T_baaab_row2_col0\" class=\"data row2 col0\" >ejv2xln0</td>\n",
" <td id=\"T_baaab_row2_col1\" class=\"data row2 col1\" >PMC</td>\n",
" <td id=\"T_baaab_row2_col2\" class=\"data row2 col2\" >Surfactant protein-D and pulmo...</td>\n",
" <td id=\"T_baaab_row2_col3\" class=\"data row2 col3\" >Surfactant protein-D (SP-D) pa...</td>\n",
" <td id=\"T_baaab_row2_col4\" class=\"data row2 col4\" >Crouch, Erika C...</td>\n",
" <td id=\"T_baaab_row2_col5\" class=\"data row2 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59549/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59549/</a></td>\n",
" </tr>\n",
" <tr>\n",
" <th id=\"T_b8e73_level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
" <td id=\"T_b8e73_row3_col0\" class=\"data row3 col0\" >2b73a28n</td>\n",
" <td id=\"T_b8e73_row3_col1\" class=\"data row3 col1\" >PMC</td>\n",
" <td id=\"T_b8e73_row3_col2\" class=\"data row3 col2\" >Role of endothelin-1 in lung d...</td>\n",
" <td id=\"T_b8e73_row3_col3\" class=\"data row3 col3\" >Endothelin-1 (ET-1) is a 21 am...</td>\n",
" <td id=\"T_b8e73_row3_col4\" class=\"data row3 col4\" >Fagan, Karen A; McMurtry, Ivan...</td>\n",
" <td id=\"T_b8e73_row3_col5\" class=\"data row3 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59574/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59574/</a></td>\n",
" <th id=\"T_baaab_level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
" <td id=\"T_baaab_row3_col0\" class=\"data row3 col0\" >2b73a28n</td>\n",
" <td id=\"T_baaab_row3_col1\" class=\"data row3 col1\" >PMC</td>\n",
" <td id=\"T_baaab_row3_col2\" class=\"data row3 col2\" >Role of endothelin-1 in lung d...</td>\n",
" <td id=\"T_baaab_row3_col3\" class=\"data row3 col3\" >Endothelin-1 (ET-1) is a 21 am...</td>\n",
" <td id=\"T_baaab_row3_col4\" class=\"data row3 col4\" >Fagan, Karen A; McMurtry, Ivan...</td>\n",
" <td id=\"T_baaab_row3_col5\" class=\"data row3 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59574/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59574/</a></td>\n",
" </tr>\n",
" <tr>\n",
" <th id=\"T_b8e73_level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
" <td id=\"T_b8e73_row4_col0\" class=\"data row4 col0\" >9785vg6d</td>\n",
" <td id=\"T_b8e73_row4_col1\" class=\"data row4 col1\" >PMC</td>\n",
" <td id=\"T_b8e73_row4_col2\" class=\"data row4 col2\" >Gene expression in epithelial ...</td>\n",
" <td id=\"T_b8e73_row4_col3\" class=\"data row4 col3\" >Respiratory syncytial virus (R...</td>\n",
" <td id=\"T_b8e73_row4_col4\" class=\"data row4 col4\" >Domachowske, Joseph B; Bonvill...</td>\n",
" <td id=\"T_b8e73_row4_col5\" class=\"data row4 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59580/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59580/</a></td>\n",
" <th id=\"T_baaab_level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
" <td id=\"T_baaab_row4_col0\" class=\"data row4 col0\" >9785vg6d</td>\n",
" <td id=\"T_baaab_row4_col1\" class=\"data row4 col1\" >PMC</td>\n",
" <td id=\"T_baaab_row4_col2\" class=\"data row4 col2\" >Gene expression in epithelial ...</td>\n",
" <td id=\"T_baaab_row4_col3\" class=\"data row4 col3\" >Respiratory syncytial virus (R...</td>\n",
" <td id=\"T_baaab_row4_col4\" class=\"data row4 col4\" >Domachowske, Joseph B; Bonvill...</td>\n",
" <td id=\"T_baaab_row4_col5\" class=\"data row4 col5\" ><a href=\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59580/\">https://www.ncbi.nlm.nih.gov/pmc/articles/PMC59580/</a></td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n"
],
"text/plain": [
"<pandas.io.formats.style.Styler at 0x7ff3db96f880>"
"<pandas.io.formats.style.Styler at 0x7fcb84e30850>"
]
},
"execution_count": 6,
@@ -325,7 +325,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"id": "74913764-9dfb-4646-aac8-d389cd4533e6",
"metadata": {
"tags": []
@@ -429,7 +429,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"id": "b46cfa90-28b4-4602-b6ff-743a3407fd72",
"metadata": {
"tags": []
@@ -550,7 +550,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 9,
"id": "b87b8ebd-8091-43b6-9124-cc17021cfb78",
"metadata": {
"tags": []
@@ -601,7 +601,7 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 16,
"id": "6132c041-7213-410e-a206-1a8c7385128e",
"metadata": {
"tags": []
@@ -612,8 +612,8 @@
"output_type": "stream",
"text": [
"200\n",
"Status: success\n",
"Items Processed: 0\n",
"Status: inProgress\n",
"Items Processed: 14322\n",
"True\n"
]
}
@@ -638,7 +638,7 @@
"id": "2152806f-245c-45db-93c6-c19c0569d73a",
"metadata": {},
"source": [
"**When the indexer finishes running we will have all 90,000 rows indexed properly as separate documents in our Search Engine!.**"
"**When the indexer finishes running (this might take some time, depending on how much TPM capacity your model has) we will have all 90,000 rows indexed properly as separate documents in our Search Engine!**"
]
},
{
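The cell outputs in both notebooks above print an HTTP status code, a "Status:" line, and an "Items Processed:" count for the indexer. A minimal sketch of how such a check might look is shown here, assuming the Azure AI Search REST endpoint `GET {endpoint}/indexers/{name}/status` and the `lastResult.status` / `lastResult.itemsProcessed` fields of its response. The function names, the `api_version` default, and the sample payload are illustrative assumptions, not code taken from this commit.

```python
import json
import urllib.request


def summarize_indexer_status(payload: dict) -> dict:
    """Extract the fields the notebook prints from an indexer status payload.

    Assumes the payload has an optional "lastResult" object with
    "status" and "itemsProcessed" keys.
    """
    last = payload.get("lastResult") or {}  # lastResult can be missing or null
    return {
        "status": last.get("status", "unknown"),
        "items_processed": last.get("itemsProcessed", 0),
    }


def fetch_indexer_status(endpoint: str, indexer_name: str, api_key: str,
                         api_version: str = "2023-11-01") -> dict:
    """Call the indexer status endpoint and return the parsed JSON body.

    endpoint, indexer_name, api_key, and api_version are placeholders
    (assumptions) that the caller must supply for a real search service.
    """
    url = f"{endpoint}/indexers/{indexer_name}/status?api-version={api_version}"
    request = urllib.request.Request(url, headers={"api-key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Offline usage with a hand-written payload shaped like the output above.
sample = {"lastResult": {"status": "inProgress", "itemsProcessed": 14322}}
summary = summarize_indexer_status(sample)
print("Status:", summary["status"])
print("Items Processed:", summary["items_processed"])
```

Re-running the check until the status reads `success` is one way to confirm that all rows have been indexed before querying the index.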