diff --git a/dotnet/README.md b/dotnet/README.md index 6483a7d035aa..b208a8f65335 100644 --- a/dotnet/README.md +++ b/dotnet/README.md @@ -77,11 +77,11 @@ requirements and setup instructions. 3. [Running AI prompts from file](./notebooks/02-running-prompts-from-file.ipynb) 4. [Creating Semantic Functions at runtime (i.e. inline functions)](./notebooks/03-semantic-function-inline.ipynb) 5. [Using Kernel Arguments to Build a Chat Experience](./notebooks/04-kernel-arguments-chat.ipynb) -6. [Introduction to the Planning/Function Calling](./notebooks/05-using-function-calling.ipynb) -7. [Building Memory with Embeddings](./notebooks/06-memory-and-embeddings.ipynb) +6. [Introduction to the Function Calling](./notebooks/05-using-function-calling.ipynb) +7. [Vector Stores and Embeddings](./notebooks/06-vector-stores-and-embeddings.ipynb) 8. [Creating images with DALL-E 3](./notebooks/07-DALL-E-3.ipynb) 9. [Chatting with ChatGPT and Images](./notebooks/08-chatGPT-with-DALL-E-3.ipynb) -10. [BingSearch using Kernel](./notebooks/10-RAG-with-BingSearch.ipynb) +10. [BingSearch using Kernel](./notebooks/09-RAG-with-BingSearch.ipynb) # Semantic Kernel Samples diff --git a/dotnet/notebooks/06-memory-and-embeddings.ipynb b/dotnet/notebooks/06-memory-and-embeddings.ipynb deleted file mode 100644 index 77309ca2015d..000000000000 --- a/dotnet/notebooks/06-memory-and-embeddings.ipynb +++ /dev/null @@ -1,566 +0,0 @@ -{ - "cells": [ - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Building Semantic Memory with Embeddings\n", - "\n", - "So far, we've mostly been treating the kernel as a stateless orchestration engine.\n", - "We send text into a model API and receive text out. \n", - "\n", - "In a [previous notebook](04-kernel-arguments-chat.ipynb), we used `kernel arguments` to pass in additional\n", - "text into prompts to enrich them with more data. This allowed us to create a basic chat experience. \n", - "\n", - "However, if you solely relied on kernel arguments, you would quickly realize that eventually your prompt\n", - "would grow so large that you would run into the model's token limit. What we need is a way to persist state\n", - "and build both short-term and long-term memory to empower even more intelligent applications. \n", - "\n", - "To do this, we dive into the key concept of `Semantic Memory` in the Semantic Kernel. 
" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "#r \"nuget: Microsoft.SemanticKernel, 1.23.0\"\n", - "#r \"nuget: Microsoft.SemanticKernel.Plugins.Memory, 1.23.0-alpha\"\n", - "#r \"nuget: System.Linq.Async, 6.0.1\"\n", - "\n", - "#!import config/Settings.cs\n", - "\n", - "using Microsoft.SemanticKernel;\n", - "using Kernel = Microsoft.SemanticKernel.Kernel;\n", - "\n", - "var builder = Kernel.CreateBuilder();\n", - "\n", - "// Configure AI service credentials used by the kernel\n", - "var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();\n", - "\n", - "if (useAzureOpenAI)\n", - " builder.AddAzureOpenAIChatCompletion(model, azureEndpoint, apiKey);\n", - "else\n", - " builder.AddOpenAIChatCompletion(model, apiKey, orgId);\n", - "\n", - "var kernel = builder.Build();" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In order to use memory, we need to instantiate the Memory Plugin with a Memory Storage\n", - "and an Embedding backend. In this example, we make use of the `VolatileMemoryStore`\n", - "which can be thought of as a temporary in-memory storage (not to be confused with Semantic Memory).\n", - "\n", - "This memory is not written to disk and is only available during the app session.\n", - "\n", - "When developing your app you will have the option to plug in persistent storage\n", - "like Azure Cosmos Db, PostgreSQL, SQLite, etc. Semantic Memory allows also to index\n", - "external data sources, without duplicating all the information, more on that later." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "using Microsoft.SemanticKernel.Memory;\n", - "using Microsoft.SemanticKernel.Embeddings;\n", - "using Microsoft.SemanticKernel.Connectors.AzureOpenAI;\n", - "using Microsoft.SemanticKernel.Connectors.OpenAI;\n", - "\n", - "// Memory functionality is experimental\n", - "#pragma warning disable SKEXP0001, SKEXP0010, SKEXP0050\n", - "\n", - "var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();\n", - "\n", - "var modelId = \"text-embedding-ada-002\";\n", - "ITextEmbeddingGenerationService textEmbeddingService = useAzureOpenAI\n", - " ? 
new AzureOpenAITextEmbeddingGenerationService(deploymentName: modelId, endpoint: azureEndpoint, apiKey: apiKey)\n", - " : new OpenAITextEmbeddingGenerationService(modelId: modelId, apiKey: apiKey);\n", - "\n", - "var memory = new SemanticTextMemory(new VolatileMemoryStore(), textEmbeddingService);" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "At its core, Semantic Memory is a set of data structures that allow you to store\n", - "the meaning of text that come from different data sources, and optionally to store\n", - "the source text too.\n", - "\n", - "These texts can be from the web, e-mail providers, chats, a database, or from your\n", - "local directory, and are hooked up to the Semantic Kernel through data source connectors.\n", - "\n", - "The texts are embedded or compressed into a vector of floats representing mathematically\n", - "the texts' contents and meaning.\n", - "\n", - "You can read more about embeddings [here](https://aka.ms/sk/embeddings)." - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Manually adding memories\n", - "Let's create some initial memories \"About Me\". We can add memories to our `VolatileMemoryStore` by using `SaveInformationAsync`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "const string MemoryCollectionName = \"aboutMe\";\n", - "\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info1\", text: \"My name is Andrea\");\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info2\", text: \"I currently work as a tourist operator\");\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info3\", text: \"I currently live in Seattle and have been living there since 2005\");\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info4\", text: \"I visited France and Italy five times since 2015\");\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info5\", text: \"My family is from New York\");" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's try searching the memory:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "var questions = new[]\n", - "{\n", - " \"what is my name?\",\n", - " \"where do I live?\",\n", - " \"where is my family from?\",\n", - " \"where have I travelled?\",\n", - " \"what do I do for work?\",\n", - "};\n", - "\n", - "foreach (var q in questions)\n", - "{\n", - " var response = await memory.SearchAsync(MemoryCollectionName, q).FirstOrDefaultAsync();\n", - " Console.WriteLine(\"Q: \" + q);\n", - " Console.WriteLine(\"A: \" + response?.Relevance.ToString() + \"\\t\" + response?.Metadata.Text);\n", - "}" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's now revisit our chat sample from the [previous notebook](04-kernel-arguments-chat.ipynb).\n", - "If you remember, we used kernel arguments to fill the prompt with a `history` that continuously got populated as we chatted with the bot. Let's add also memory to it!" 
- ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This is done by using the `TextMemoryPlugin` which exposes the `recall` native function.\n", - "\n", - "`recall` takes an input ask and performs a similarity search on the contents that have\n", - "been embedded in the Memory Store. By default, `recall` returns the most relevant memory." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "using Microsoft.SemanticKernel.Plugins.Memory;\n", - "\n", - "#pragma warning disable SKEXP0001, SKEXP0050\n", - "\n", - "// TextMemoryPlugin provides the \"recall\" function\n", - "kernel.ImportPluginFromObject(new TextMemoryPlugin(memory));" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "const string skPrompt = @\"\n", - "ChatBot can have a conversation with you about any topic.\n", - "It can give explicit instructions or say 'I don't know' if it does not have an answer.\n", - "\n", - "Information about me, from previous conversations:\n", - "- {{$fact1}} {{recall $fact1}}\n", - "- {{$fact2}} {{recall $fact2}}\n", - "- {{$fact3}} {{recall $fact3}}\n", - "- {{$fact4}} {{recall $fact4}}\n", - "- {{$fact5}} {{recall $fact5}}\n", - "\n", - "Chat:\n", - "{{$history}}\n", - "User: {{$userInput}}\n", - "ChatBot: \";\n", - "\n", - "var chatFunction = kernel.CreateFunctionFromPrompt(skPrompt, new OpenAIPromptExecutionSettings { MaxTokens = 200, Temperature = 0.8 });" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The `RelevanceParam` is used in memory search and is a measure of the relevance score from 0.0 to 1.0, where 1.0 means a perfect match. We encourage users to experiment with different values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "#pragma warning disable SKEXP0001, SKEXP0050\n", - "\n", - "var arguments = new KernelArguments();\n", - "\n", - "arguments[\"fact1\"] = \"what is my name?\";\n", - "arguments[\"fact2\"] = \"where do I live?\";\n", - "arguments[\"fact3\"] = \"where is my family from?\";\n", - "arguments[\"fact4\"] = \"where have I travelled?\";\n", - "arguments[\"fact5\"] = \"what do I do for work?\";\n", - "\n", - "arguments[TextMemoryPlugin.CollectionParam] = MemoryCollectionName;\n", - "arguments[TextMemoryPlugin.LimitParam] = \"2\";\n", - "arguments[TextMemoryPlugin.RelevanceParam] = \"0.8\";" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that we've included our memories, let's chat!" 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "var history = \"\";\n", - "arguments[\"history\"] = history;\n", - "Func Chat = async (string input) => {\n", - " // Save new message in the kernel arguments\n", - " arguments[\"userInput\"] = input;\n", - "\n", - " // Process the user message and get an answer\n", - " var answer = await chatFunction.InvokeAsync(kernel, arguments);\n", - "\n", - " // Append the new interaction to the chat history\n", - " var result = $\"\\nUser: {input}\\nChatBot: {answer}\\n\";\n", - "\n", - " history += result;\n", - " arguments[\"history\"] = history;\n", - " \n", - " // Show the bot response\n", - " Console.WriteLine(result);\n", - "};" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "await Chat(\"Hello, I think we've met before, remember? my name is...\");" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "await Chat(\"I want to plan a trip and visit my family. Do you know where that is?\");" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "await Chat(\"Great! What are some fun things to do there?\");" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Adding documents to your memory\n", - "\n", - "Many times in your applications you'll want to bring in external documents into your memory. Let's see how we can do this using our VolatileMemoryStore.\n", - "\n", - "Let's first get some data using some of the links in the Semantic Kernel repo." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "const string memoryCollectionName = \"SKGitHub\";\n", - "\n", - "var githubFiles = new Dictionary()\n", - "{\n", - " [\"https://github.com/microsoft/semantic-kernel/blob/main/README.md\"]\n", - " = \"README: Installation, getting started, and how to contribute\",\n", - " [\"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/02-running-prompts-from-file.ipynb\"]\n", - " = \"Jupyter notebook describing how to pass prompts from a file to a semantic plugin or function\",\n", - " [\"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/00-getting-started.ipynb\"]\n", - " = \"Jupyter notebook describing how to get started with the Semantic Kernel\",\n", - " [\"https://github.com/microsoft/semantic-kernel/tree/main/samples/plugins/ChatPlugin/ChatGPT\"]\n", - " = \"Sample demonstrating how to create a chat plugin interfacing with ChatGPT\",\n", - " [\"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/Plugins/Plugins.Memory/VolatileMemoryStore.cs\"]\n", - " = \"C# class that defines a volatile embedding store\",\n", - "};" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's build a new Memory." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "// Memory functionality is experimental\n", - "#pragma warning disable SKEXP0001, SKEXP0010, SKEXP0050\n", - "\n", - "var memoryBuilder = new MemoryBuilder();\n", - "\n", - "var modelId = \"text-embedding-ada-002\";\n", - "ITextEmbeddingGenerationService textEmbeddingService = useAzureOpenAI\n", - " ? new AzureOpenAITextEmbeddingGenerationService(deploymentName: modelId, endpoint: azureEndpoint, apiKey: apiKey)\n", - " : new OpenAITextEmbeddingGenerationService(modelId: modelId, apiKey: apiKey);\n", - "\n", - "var memory = new SemanticTextMemory(new VolatileMemoryStore(), textEmbeddingService);" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now let's add these files to our VolatileMemoryStore using `SaveReferenceAsync`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "Console.WriteLine(\"Adding some GitHub file URLs and their descriptions to a volatile Semantic Memory.\");\n", - "var i = 0;\n", - "foreach (var entry in githubFiles)\n", - "{\n", - " await memory.SaveReferenceAsync(\n", - " collection: memoryCollectionName,\n", - " description: entry.Value,\n", - " text: entry.Value,\n", - " externalId: entry.Key,\n", - " externalSourceName: \"GitHub\"\n", - " );\n", - " Console.WriteLine($\" URL {++i} saved\");\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "string ask = \"I love Jupyter notebooks, how should I get started?\";\n", - "Console.WriteLine(\"===========================\\n\" +\n", - " \"Query: \" + ask + \"\\n\");\n", - "\n", - "var memories = memory.SearchAsync(memoryCollectionName, ask, limit: 5, minRelevanceScore: 0.77);\n", - "\n", - "i = 0;\n", - "await foreach (var memory in memories)\n", - "{\n", - " Console.WriteLine($\"Result {++i}:\");\n", - " Console.WriteLine(\" URL: : \" + memory.Metadata.Id);\n", - " Console.WriteLine(\" Title : \" + memory.Metadata.Description);\n", - " Console.WriteLine(\" Relevance: \" + memory.Relevance);\n", - " Console.WriteLine();\n", - "}" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now you might be wondering what happens if you have so much data that it doesn't fit into your RAM? That's where you want to make use of an external Vector Database made specifically for storing and retrieving embeddings.\n", - "\n", - "Stay tuned for that!" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".NET (C#)", - "language": "C#", - "name": ".net-csharp" - }, - "language_info": { - "name": "polyglot-notebook" - }, - "polyglot_notebook": { - "kernelInfo": { - "defaultKernelName": "csharp", - "items": [ - { - "aliases": [], - "name": "csharp" - } - ] - } - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/dotnet/notebooks/06-vector-stores-and-embeddings.ipynb b/dotnet/notebooks/06-vector-stores-and-embeddings.ipynb new file mode 100644 index 000000000000..7cd2df81322b --- /dev/null +++ b/dotnet/notebooks/06-vector-stores-and-embeddings.ipynb @@ -0,0 +1,494 @@ +{ + "cells": [ + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Vector Stores and Embeddings\n", + "\n", + "So far, we've mostly been treating the kernel as a stateless orchestration engine.\n", + "We send text into a model API and receive text out. \n", + "\n", + "In a [previous notebook](04-kernel-arguments-chat.ipynb), we used `kernel arguments` to pass in additional\n", + "text into prompts to enrich them with more data. This allowed us to create a basic chat experience. \n", + "\n", + "However, if you solely relied on kernel arguments, you would quickly realize that eventually your prompt\n", + "would grow so large that you would run into the model's token limit. What we need is a way to persist state\n", + "and build both short-term and long-term memory to empower even more intelligent applications. 
\n", + "\n", + "To do this, we dive into the key concept of `Vector Stores` in the Semantic Kernel.\n", + "\n", + "More information can be found [here](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "#r \"nuget: Microsoft.SemanticKernel, 1.24.1\"\n", + "#r \"nuget: Microsoft.SemanticKernel.Connectors.InMemory, 1.24.1-preview\"\n", + "#r \"nuget: Microsoft.Extensions.VectorData.Abstractions, 9.0.0-preview.1.24518.1\"\n", + "#r \"nuget: System.Linq.Async, 6.0.1\"\n", + "\n", + "#!import config/Settings.cs\n", + "\n", + "using Microsoft.SemanticKernel;\n", + "using Kernel = Microsoft.SemanticKernel.Kernel;\n", + "\n", + "#pragma warning disable SKEXP0010\n", + "\n", + "var builder = Kernel.CreateBuilder();\n", + "\n", + "// Configure AI service credentials used by the kernel\n", + "var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();\n", + "\n", + "if (useAzureOpenAI)\n", + "{\n", + " builder.AddAzureOpenAITextEmbeddingGeneration(\"text-embedding-ada-002\", azureEndpoint, apiKey);\n", + "}\n", + "else\n", + "{\n", + " builder.AddOpenAITextEmbeddingGeneration(\"text-embedding-ada-002\", apiKey, orgId);\n", + "}\n", + "\n", + "var kernel = builder.Build();" + ] + }, + { + "attachments": {}, + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Package `Microsoft.Extensions.VectorData.Abstractions`, which we downloaded in a previous code snippet, contains all necessary abstractions to work with vector stores. \n", + "\n", + "Together with abstractions, we also need to use an implementation of a concrete database connector, such as Azure AI Search, Azure CosmosDB, Qdrant, Redis and so on. A list of supported connectors can be found [here](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/).\n", + "\n", + "In this example, we are going to use the in-memory connector for demonstration purposes - `Microsoft.SemanticKernel.Connectors.InMemory`." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Define your model\n", + "\n", + "It all starts from defining your data model. In abstractions, there are three main data model property types:\n", + "\n", + "1. Key\n", + "2. Data\n", + "3. Vector\n", + "\n", + "In most cases, a data model contains one key property, multiple data and vector properties, but some connectors may have restrictions, for example when only one vector property is supported. \n", + "\n", + "Also, each connector supports a different set of property types. 
For more information about supported property types in each connector, visit the connector's page, which can be found [here](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/out-of-the-box-connectors/).\n", + "\n", + "There are two ways to define your data model - using attributes (the declarative way) or a record definition (the imperative way).\n", + "\n", + "Here is what a data model could look like with attributes:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "using Microsoft.Extensions.VectorData;\n", + "\n", + "public sealed class Glossary\n", + "{\n", + " [VectorStoreRecordKey]\n", + " public ulong Key { get; set; }\n", + "\n", + " [VectorStoreRecordData]\n", + " public string Term { get; set; }\n", + "\n", + " [VectorStoreRecordData]\n", + " public string Definition { get; set; }\n", + "\n", + " [VectorStoreRecordVector(Dimensions: 1536)]\n", + " public ReadOnlyMemory<float> DefinitionEmbedding { get; set; }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "More information about each attribute and its properties can be found [here](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/defining-your-data-model#attributes)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There could be a case when you can't modify the existing class to add attributes. In this case, you can define a separate record definition with all the information about your properties. Note that the defined data model class is still required in this case:" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "public sealed class GlossaryWithoutAttributes\n", + "{\n", + " public ulong Key { get; set; }\n", + "\n", + " public string Term { get; set; }\n", + "\n", + " public string Definition { get; set; }\n", + "\n", + " public ReadOnlyMemory<float> DefinitionEmbedding { get; set; }\n", + "}\n", + "\n", + "var recordDefinition = new VectorStoreRecordDefinition()\n", + "{\n", + " Properties = new List<VectorStoreRecordProperty>()\n", + " {\n", + " new VectorStoreRecordKeyProperty(\"Key\", typeof(ulong)),\n", + " new VectorStoreRecordDataProperty(\"Term\", typeof(string)),\n", + " new VectorStoreRecordDataProperty(\"Definition\", typeof(string)),\n", + " new VectorStoreRecordVectorProperty(\"DefinitionEmbedding\", typeof(ReadOnlyMemory<float>)) { Dimensions = 1536 }\n", + " }\n", + "};" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Define main components\n", + "\n", + "As soon as you define your data model with either attributes or the record definition approach, you can start using it with your database of choice. \n", + "\n", + "There are a couple of abstractions that allow you to work with your database and collections:\n", + "\n", + "1. `IVectorStoreRecordCollection` - represents a collection. This collection may or may not exist, and the interface provides methods to check if the collection exists, create it or delete it. The interface also provides methods to upsert, get and delete records. Finally, the interface inherits from `IVectorizedSearch` providing vector search capabilities.\n", + "2. 
`IVectorStore` - contains operations that span across all collections in the vector store, e.g. `ListCollectionNames`. It also provides the ability to get `IVectorStoreRecordCollection` instances.\n", + "\n", + "Each connector has extension methods to register your vector store and collection using DI - `services.AddInMemoryVectorStore()` or `services.AddInMemoryVectorStoreRecordCollection(\"collection-name\")`. \n", + "\n", + "It's also possible to initialize these instances directly, which we are going to do in this notebook for simplicity:" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "using Microsoft.SemanticKernel.Connectors.InMemory;\n", + "\n", + "#pragma warning disable SKEXP0020\n", + "\n", + "// Define vector store\n", + "var vectorStore = new InMemoryVectorStore();\n", + "\n", + "// Get a collection instance using vector store\n", + "var collection = vectorStore.GetCollection<ulong, Glossary>(\"skglossary\");\n", + "\n", + "// Get a collection instance by initializing it directly\n", + "var collection2 = new InMemoryVectorStoreRecordCollection<ulong, Glossary>(\"skglossary\");" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Initializing a collection instance will allow you to work with your collection and data, but it doesn't mean that this collection already exists in a database. To ensure you are working with an existing collection, you can create it if it doesn't exist:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "await collection.CreateCollectionIfNotExistsAsync();" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, since we just created a new collection, it is empty, so we want to insert some records using the data model we defined above:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "var glossaryEntries = new List<Glossary>()\n", + "{\n", + " new Glossary() \n", + " {\n", + " Key = 1,\n", + " Term = \"API\",\n", + " Definition = \"Application Programming Interface. A set of rules and specifications that allow software components to communicate and exchange data.\"\n", + " },\n", + " new Glossary() \n", + " {\n", + " Key = 2,\n", + " Term = \"Connectors\",\n", + " Definition = \"Connectors allow you to integrate with various services that provide AI capabilities, including LLM, AudioToText, TextToAudio, Embedding generation, etc.\"\n", + " },\n", + " new Glossary() \n", + " {\n", + " Key = 3,\n", + " Term = \"RAG\",\n", + " Definition = \"Retrieval Augmented Generation - a term that refers to the process of retrieving additional data to provide as context to an LLM to use when generating a response (completion) to a user's question (prompt).\"\n", + " }\n", + "};" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If we want to perform a vector search on our records in the database, initializing just the key and data properties is not enough; we also need to generate and initialize vector properties. 
For that, we can use `ITextEmbeddingGenerationService`, which we already registered above.\n", + "\n", + "The line `#pragma warning disable SKEXP0001` is required because the `ITextEmbeddingGenerationService` interface is experimental and may change in the future." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "using Microsoft.SemanticKernel.Embeddings;\n", + "\n", + "#pragma warning disable SKEXP0001\n", + "\n", + "var textEmbeddingGenerationService = kernel.GetRequiredService<ITextEmbeddingGenerationService>();\n", + "\n", + "var tasks = glossaryEntries.Select(entry => Task.Run(async () =>\n", + "{\n", + " entry.DefinitionEmbedding = await textEmbeddingGenerationService.GenerateEmbeddingAsync(entry.Definition);\n", + "}));\n", + "\n", + "await Task.WhenAll(tasks);" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Upsert records\n", + "\n", + "Now our glossary records are ready to be inserted into the database. For that, we can use the `collection.UpsertAsync` or `collection.UpsertBatchAsync` methods. Note that this operation is idempotent - if a record with a specific key doesn't exist, it will be inserted. If it already exists, it will be updated. As a result, we should receive the keys of the upserted records:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "await foreach (var key in collection.UpsertBatchAsync(glossaryEntries))\n", + "{\n", + " Console.WriteLine(key);\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Get records by key\n", + "\n", + "In order to ensure our records were upserted correctly, we can get these records by key with the `collection.GetAsync` or `collection.GetBatchAsync` methods. \n", + "\n", + "Both methods accept the `GetRecordOptions` class as a parameter, where you can specify whether you want to include vector properties in your response or not. Taking into account that the vector dimension value can be high, if you don't need to work with vectors in your code, it's recommended to not fetch them from the database. That's why the `GetRecordOptions.IncludeVectors` property is `false` by default. \n", + "\n", + "In this example, we want to include vectors in the result to ensure that our data was upserted correctly:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "using System.Text.Json;\n", + "\n", + "var options = new GetRecordOptions() { IncludeVectors = true };\n", + "\n", + "await foreach (var record in collection.GetBatchAsync(keys: [1, 2, 3], options))\n", + "{\n", + " Console.WriteLine($\"Key: {record.Key}\");\n", + " Console.WriteLine($\"Term: {record.Term}\");\n", + " Console.WriteLine($\"Definition: {record.Definition}\");\n", + " Console.WriteLine($\"Definition Embedding: {JsonSerializer.Serialize(record.DefinitionEmbedding)}\");\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Perform a search\n", + "\n", + "Since we ensured that our records are already in the database, we can perform a vector search with the `collection.VectorizedSearchAsync` method. 
\n", + "\n", + "This method accepts the `VectorSearchOptions` class as a parameter, which allows configuration of the vector search operation - specify the maximum number of records to return, the number of results to skip before returning results, a search filter to use before doing the vector search and so on. More information about it can be found [here](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/vector-search#vector-search-options).\n", + "\n", + "To perform a vector search, we need a vector generated from our query string:" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "#pragma warning disable SKEXP0001\n", + "\n", + "var searchString = \"I want to learn more about Connectors\";\n", + "var searchVector = await textEmbeddingGenerationService.GenerateEmbeddingAsync(searchString);" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "As soon as we have our search vector, we can perform a search operation. The result of the `collection.VectorizedSearchAsync` method will be a collection of records from the database with their search scores:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "dotnet_interactive": { + "language": "csharp" + }, + "polyglot_notebook": { + "kernelName": "csharp" + } + }, + "outputs": [], + "source": [ + "var searchResult = await collection.VectorizedSearchAsync(searchVector);\n", + "\n", + "await foreach (var result in searchResult.Results)\n", + "{\n", + " Console.WriteLine($\"Search score: {result.Score}\");\n", + " Console.WriteLine($\"Key: {result.Record.Key}\");\n", + " Console.WriteLine($\"Term: {result.Record.Term}\");\n", + " Console.WriteLine($\"Definition: {result.Record.Definition}\");\n", + " Console.WriteLine(\"=========\");\n", + "}" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Additional information\n", + "\n", + "There are more concepts related to the vector stores that will allow you to extend the capabilities. Each of them is described in more detail on the Microsoft Learn portal:\n", + "\n", + "1. [Generic data model](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/generic-data-model) - allows to store and search data without a concrete data model type, using the generic data model instead.\n", + "2. [Custom mapper](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/how-to/vector-store-custom-mapper) - define a custom mapper for a specific connector, when the default mapping logic is not enough to work with a database.\n", + "3. [Code samples](https://learn.microsoft.com/en-us/semantic-kernel/concepts/vector-store-connectors/code-samples) - end-to-end RAG sample, supporting multiple vectors in the same record, vector search with paging, interoperability with Langchain and more." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": ".NET (C#)", + "language": "C#", + "name": ".net-csharp" + }, + "language_info": { + "name": "polyglot-notebook" + }, + "polyglot_notebook": { + "kernelInfo": { + "defaultKernelName": "csharp", + "items": [ + { + "aliases": [], + "name": "csharp" + } + ] + } + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/dotnet/notebooks/10-RAG-with-BingSearch.ipynb b/dotnet/notebooks/09-RAG-with-BingSearch.ipynb similarity index 100% rename from dotnet/notebooks/10-RAG-with-BingSearch.ipynb rename to dotnet/notebooks/09-RAG-with-BingSearch.ipynb diff --git a/dotnet/notebooks/09-memory-with-chroma.ipynb b/dotnet/notebooks/09-memory-with-chroma.ipynb deleted file mode 100644 index a34e0f1666d6..000000000000 --- a/dotnet/notebooks/09-memory-with-chroma.ipynb +++ /dev/null @@ -1,565 +0,0 @@ -{ - "cells": [ - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Building Semantic Memory with Embeddings\n", - "\n", - "In this notebook, we show how to use [Chroma](https://www.trychroma.com/) with Semantic Kernel to create even more\n", - "intelligent applications. We assume that you are already familiar with the concepts of Semantic Kernel\n", - "and memory. [Previously](04-kernel-arguments-chat.ipynb), we have used `kernel arguments` to pass\n", - "additional text into prompts, enriching them with more context for a basic chat experience.\n", - "\n", - "However, relying solely on kernel arguments has its limitations, such as the model's token limit.\n", - "To overcome these limitations, we will use **SK Semantic Memory**, leveraging Chroma as a persistent\n", - "Semantic Memory Storage.\n", - "\n", - "**Chroma** is an open-source embedding database designed to make it easy to build Language Model\n", - "applications by making knowledge, facts, and plugins pluggable for LLMs. It allows us to store and\n", - "retrieve information in a way that can be easily utilized by the models, enabling both short-term\n", - "and long-term memory for more advanced applications. In this notebook, we will showcase how to\n", - "effectively use Chroma with the Semantic Kernel for a powerful application experience.\n", - "\n", - "**Note:** This example is verified using Chroma version **0.4.10**. Any higher versions may introduce incompatibility." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "#r \"nuget: Microsoft.SemanticKernel, 1.23.0\"\n", - "#r \"nuget: Microsoft.SemanticKernel.Connectors.Chroma, 1.23.0-alpha\"\n", - "#r \"nuget: Microsoft.SemanticKernel.Plugins.Memory, 1.23.0-alpha\"\n", - "#r \"nuget: System.Linq.Async, 6.0.1\"\n", - "\n", - "#!import config/Settings.cs\n", - "\n", - "using System;\n", - "using System.Collections.Generic;\n", - "using System.Linq;\n", - "using System.Threading.Tasks;\n", - "using Microsoft.SemanticKernel;\n", - "using Microsoft.SemanticKernel.Connectors.Chroma;\n", - "using Microsoft.SemanticKernel.Memory;\n", - "using Microsoft.SemanticKernel.Plugins.Memory;\n", - "using Kernel = Microsoft.SemanticKernel.Kernel;\n", - "\n", - "var builder = Kernel.CreateBuilder();\n", - "\n", - "// Configure AI backend used by the kernel\n", - "var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();\n", - "\n", - "if (useAzureOpenAI)\n", - " builder.AddAzureOpenAIChatCompletion(model, azureEndpoint, apiKey);\n", - "else\n", - " builder.AddOpenAIChatCompletion(model, apiKey, orgId);\n", - "\n", - "var kernel = builder.Build();" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In order to use memory, we need to instantiate the Memory Plugin with a Memory Storage\n", - "and an Embedding backend. In this example, we make use of the `ChromaMemoryStore`,\n", - "leveraging [Chroma](https://www.trychroma.com/), an open source embedding database\n", - "you can run locally and in the cloud.\n", - "\n", - "To run Chroma locally, here's a quick script to download Chroma source and run it using Docker:\n", - "\n", - "```shell\n", - "git clone https://github.com/chroma-core/chroma.git\n", - "cd chroma\n", - "docker-compose up --build\n", - "```" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "#pragma warning disable SKEXP0001, SKEXP0010, SKEXP0020, SKEXP0050\n", - "\n", - "using Microsoft.SemanticKernel.Connectors.AzureOpenAI;\n", - "using Microsoft.SemanticKernel.Connectors.OpenAI;\n", - "using Microsoft.SemanticKernel.Embeddings;\n", - "\n", - "var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId) = Settings.LoadFromFile();\n", - "\n", - "var modelId = \"text-embedding-ada-002\";\n", - "ITextEmbeddingGenerationService textEmbeddingService = useAzureOpenAI\n", - " ? 
new AzureOpenAITextEmbeddingGenerationService(deploymentName: modelId, endpoint: azureEndpoint, apiKey: apiKey)\n", - " : new OpenAITextEmbeddingGenerationService(modelId: modelId, apiKey: apiKey);\n", - "\n", - "var memory = new SemanticTextMemory(new VolatileMemoryStore(), textEmbeddingService);" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "At its core, Semantic Memory is a set of data structures that allows to store\n", - "the meaning of text that come from different data sources, and optionally to\n", - "store the source text and other metadata.\n", - "\n", - "The text can be from the web, e-mail providers, chats, a database, or from your\n", - "local directory, and are hooked up to the Semantic Kernel through memory connectors.\n", - "\n", - "The texts are embedded, sort of \"compressed\", into a vector of floats that representing\n", - "mathematically the text content and meaning.\n", - "\n", - "You can read more about embeddings [here](https://aka.ms/sk/embeddings)." - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Manually adding memories\n", - "\n", - "Let's create some initial memories \"About Me\". We can add memories to `ChromaMemoryStore` by using `SaveInformationAsync`" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "const string MemoryCollectionName = \"aboutMe\";\n", - "\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info1\", text: \"My name is Andrea\");\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info2\", text: \"I currently work as a tourist operator\");\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info3\", text: \"I currently live in Seattle and have been living there since 2005\");\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info4\", text: \"I visited France and Italy five times since 2015\");\n", - "await memory.SaveInformationAsync(MemoryCollectionName, id: \"info5\", text: \"My family is from New York\");" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's try searching the memory:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "var questions = new[]\n", - "{\n", - " \"what is my name?\",\n", - " \"where do I live?\",\n", - " \"where is my family from?\",\n", - " \"where have I travelled?\",\n", - " \"what do I do for work?\",\n", - "};\n", - "\n", - "foreach (var q in questions)\n", - "{\n", - " var response = await memory.SearchAsync(MemoryCollectionName, q, limit: 1, minRelevanceScore: 0.5).FirstOrDefaultAsync();\n", - " Console.WriteLine(q + \" \" + response?.Metadata.Text);\n", - "}" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's now revisit our chat sample from the [previous notebook](04-kernel-arguments-chat.ipynb).\n", - "If you remember, we used kernel arguments to fill the prompt with a `history` that continuously got populated as we chatted with the bot. Let's add also memory to it!" 
- ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "This is done by using the `TextMemoryPlugin` which exposes the `recall` native function.\n", - "\n", - "`recall` takes an input ask and performs a similarity search on the contents that have\n", - "been embedded in the Memory Store. By default, `recall` returns the most relevant memory." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "#pragma warning disable SKEXP0001, SKEXP0050\n", - "\n", - "// TextMemoryPlugin provides the \"recall\" function\n", - "kernel.ImportPluginFromObject(new TextMemoryPlugin(memory));" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "const string skPrompt = @\"\n", - "ChatBot can have a conversation with you about any topic.\n", - "It can give explicit instructions or say 'I don't know' if it does not have an answer.\n", - "\n", - "Information about me, from previous conversations:\n", - "- {{$fact1}} {{recall $fact1}}\n", - "- {{$fact2}} {{recall $fact2}}\n", - "- {{$fact3}} {{recall $fact3}}\n", - "- {{$fact4}} {{recall $fact4}}\n", - "- {{$fact5}} {{recall $fact5}}\n", - "\n", - "Chat:\n", - "{{$history}}\n", - "User: {{$userInput}}\n", - "ChatBot: \";\n", - "\n", - "var chatFunction = kernel.CreateFunctionFromPrompt(skPrompt, new OpenAIPromptExecutionSettings { MaxTokens = 200, Temperature = 0.8 });" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The `RelevanceParam` is used in memory search and is a measure of the relevance score from 0.0 to 1.0, where 1.0 means a perfect match. We encourage users to experiment with different values." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "#pragma warning disable SKEXP0001, SKEXP0050\n", - "\n", - "var arguments = new KernelArguments();\n", - "\n", - "arguments[\"fact1\"] = \"what is my name?\";\n", - "arguments[\"fact2\"] = \"where do I live?\";\n", - "arguments[\"fact3\"] = \"where is my family from?\";\n", - "arguments[\"fact4\"] = \"where have I travelled?\";\n", - "arguments[\"fact5\"] = \"what do I do for work?\";\n", - "\n", - "arguments[TextMemoryPlugin.CollectionParam] = MemoryCollectionName;\n", - "arguments[TextMemoryPlugin.LimitParam] = \"2\";\n", - "arguments[TextMemoryPlugin.RelevanceParam] = \"0.8\";" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now that we've included our memories, let's chat!" 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "var history = \"\";\n", - "arguments[\"history\"] = history;\n", - "Func Chat = async (string input) => {\n", - " // Save new message in the kernel arguments\n", - " arguments[\"userInput\"] = input;\n", - "\n", - " // Process the user message and get an answer\n", - " var answer = await chatFunction.InvokeAsync(kernel, arguments);\n", - "\n", - " // Append the new interaction to the chat history\n", - " var result = $\"\\nUser: {input}\\nChatBot: {answer}\\n\";\n", - "\n", - " history += result;\n", - " arguments[\"history\"] = history;\n", - " \n", - " // Show the bot response\n", - " Console.WriteLine(result);\n", - "};" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "await Chat(\"Hello, I think we've met before, remember? my name is...\");" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "await Chat(\"I want to plan a trip and visit my family. Do you know where that is?\");" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "await Chat(\"Great! What are some fun things to do there?\");" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Adding documents to your memory\n", - "\n", - "Many times in your applications you'll want to bring in external documents into your memory. Let's see how we can do this using ChromaMemoryStore.\n", - "\n", - "Let's first get some data using some of the links in the Semantic Kernel repo." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "const string memoryCollectionName = \"SKGitHub\";\n", - "\n", - "var githubFiles = new Dictionary()\n", - "{\n", - " [\"https://github.com/microsoft/semantic-kernel/blob/main/README.md\"]\n", - " = \"README: Installation, getting started, and how to contribute\",\n", - " [\"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/02-running-prompts-from-file.ipynb\"]\n", - " = \"Jupyter notebook describing how to pass prompts from a file to a semantic plugin or function\",\n", - " [\"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/notebooks/00-getting-started.ipynb\"]\n", - " = \"Jupyter notebook describing how to get started with the Semantic Kernel\",\n", - " [\"https://github.com/microsoft/semantic-kernel/tree/main/prompt_template_samples/ChatPlugin/ChatGPT\"]\n", - " = \"Sample demonstrating how to create a chat plugin interfacing with ChatGPT\",\n", - " [\"https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/Plugins/Plugins.Memory/VolatileMemoryStore.cs\"]\n", - " = \"C# class that defines a volatile embedding store\",\n", - "};" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's build a new Memory." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "#pragma warning disable SKEXP0001, SKEXP0010, SKEXP0020, SKEXP0050\n", - "\n", - "var modelId = \"text-embedding-ada-002\";\n", - "ITextEmbeddingGenerationService textEmbeddingService = useAzureOpenAI\n", - " ? new AzureOpenAITextEmbeddingGenerationService(deploymentName: modelId, endpoint: azureEndpoint, apiKey: apiKey)\n", - " : new OpenAITextEmbeddingGenerationService(modelId: modelId, apiKey: apiKey);\n", - "\n", - "var memory = new SemanticTextMemory(new VolatileMemoryStore(), textEmbeddingService);" - ] - }, - { - "attachments": {}, - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now let's add these files to ChromaMemoryStore using `SaveReferenceAsync`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "Console.WriteLine(\"Adding some GitHub file URLs and their descriptions to Chroma Semantic Memory.\");\n", - "var i = 0;\n", - "foreach (var entry in githubFiles)\n", - "{\n", - " await memory.SaveReferenceAsync(\n", - " collection: memoryCollectionName,\n", - " description: entry.Value,\n", - " text: entry.Value,\n", - " externalId: entry.Key,\n", - " externalSourceName: \"GitHub\"\n", - " );\n", - " Console.WriteLine($\" URL {++i} saved\");\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "dotnet_interactive": { - "language": "csharp" - }, - "polyglot_notebook": { - "kernelName": "csharp" - } - }, - "outputs": [], - "source": [ - "string ask = \"I love Jupyter notebooks, how should I get started?\";\n", - "Console.WriteLine(\"===========================\\n\" +\n", - " \"Query: \" + ask + \"\\n\");\n", - "\n", - "var memories = memory.SearchAsync(memoryCollectionName, ask, limit: 5, minRelevanceScore: 0.6);\n", - "\n", - "i = 0;\n", - "await foreach (var memory in memories)\n", - "{\n", - " Console.WriteLine($\"Result {++i}:\");\n", - " Console.WriteLine(\" URL: : \" + memory.Metadata.Id);\n", - " Console.WriteLine(\" Title : \" + memory.Metadata.Description);\n", - " Console.WriteLine(\" Relevance: \" + memory.Relevance);\n", - " Console.WriteLine();\n", - "}" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": ".NET (C#)", - "language": "C#", - "name": ".net-csharp" - }, - "language_info": { - "name": "polyglot-notebook" - }, - "polyglot_notebook": { - "kernelInfo": { - "defaultKernelName": "csharp", - "items": [ - { - "aliases": [], - "name": "csharp" - } - ] - } - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/dotnet/notebooks/README.md b/dotnet/notebooks/README.md index 83c8d880ebfd..2a63eaf89630 100644 --- a/dotnet/notebooks/README.md +++ b/dotnet/notebooks/README.md @@ -58,12 +58,11 @@ For a quick dive, look at the [getting started notebook](00-getting-started.ipyn 2. [Running AI prompts from file](02-running-prompts-from-file.ipynb) 3. [Creating Semantic Functions at runtime (i.e. inline functions)](03-semantic-function-inline.ipynb) 4. [Using Kernel Arguments to Build a Chat Experience](04-kernel-arguments-chat.ipynb) -5. [Introduction to the Planning/Function Calling](05-using-function-calling.ipynb) -6. [Building Memory with Embeddings](06-memory-and-embeddings.ipynb) +5. [Introduction to the Function Calling](05-using-function-calling.ipynb) +6. [Vector Stores and Embeddings](06-vector-stores-and-embeddings.ipynb) 7. [Creating images with DALL-E 3](07-DALL-E-3.ipynb) 8. [Chatting with ChatGPT and Images](08-chatGPT-with-DALL-E-3.ipynb) -9. [Building Semantic Memory with Chroma](09-memory-with-chroma.ipynb) -10. [BingSearch using Kernel](10-RAG-with-BingSearch.ipynb) +9. [BingSearch using Kernel](09-RAG-with-BingSearch.ipynb) # Run notebooks in the browser with JupyterLab