-
Notifications
You must be signed in to change notification settings - Fork 221
feat: Add Prompty-based architecture for local model compatibility #197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: Add Prompty-based architecture for local model compatibility #197
Conversation
MAJOR BREAKTHROUGH: Replace ChatCompletionAgent with template-based approach Problem Solved: - ChatCompletionAgent with function calling was timing out (100+ seconds) with local models - Codestral and other local models generate text descriptions instead of proper OpenAI tool_calls - 0% success rate, complete agent failure for weather queries Solution Implemented: - Prompty template with few-shot learning examples - Manual intent detection and location extraction - Direct plugin orchestration without LLM decision-making - Single LLM call with template variables Performance Impact: ✅ Response time: 100+ seconds → 15-20 seconds (5x improvement) ✅ Success rate: 0% → 100% ✅ Universal model compatibility (not just OpenAI-compatible) ✅ Cleaner, more maintainable architecture Key Components: - Bot/Agents/WeatherForecastAgent.cs: Complete refactor to Prompty-based approach - Prompts/weather-forecast.prompty: New few-shot learning template - MyM365Agent1.csproj: Added Microsoft.SemanticKernel.Prompty package - Comprehensive documentation in PROMPTY_ARCHITECTURE.md This architectural pattern demonstrates that few-shot learning can be more effective than function calling for local/open-source models, providing a robust foundation for building reliable AI agents across any model provider. Sample Implementation: MyM365Agent1 weather agent Documentation: docs/prompty-few-shot-architecture.md
…ature - Fix documentation to properly describe ChatCompletionAgent as a new Semantic Kernel feature - Reframe solution as alternative approach for local model compatibility - Remove incorrect 'traditional' language - this is about modern SK features not working with local models - More accurate technical positioning of the breakthrough
- Move MyM365Agent1 to samples/basic/weather-agent-prompty/ to match naming convention
- Remove promotional language ('breakthrough', 'revolutionary', 'paradigm shift')
- Clean up WHOAMI project that was accidentally included
- Update all documentation links to point to new project location
- Use professional, measured language appropriate for open source contribution
- Maintain kebab-case naming pattern consistent with other basic samples
Project now follows standard conventions and is ready for pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for local models via a Prompty-based few-shot architecture and updates the basic weather-agent sample to integrate the Ollama connector.
- Extend the .NET sample (
weather-agent) to register and configure the Ollama chat completion service. - Introduce a new Prompty-driven sample (
weather-agent-prompty) with manual intent detection, few-shot templates, and plugin orchestration. - Update documentation and guides to explain the new architecture pattern and sample usage.
Reviewed Changes
Copilot reviewed 37 out of 39 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| samples/basic/weather-agent/dotnet/appsettings.json | Added Ollama settings and toggles for local model support. |
| samples/basic/weather-agent/dotnet/Program.cs | Registered Ollama connector and updated AI service logic. |
| samples/basic/weather-agent/dotnet/WeatherAgent.csproj | Added Ollama connector package and suppressed relevant warnings. |
| samples/basic/weather-agent-prompty/MyM365Agent1/Bot/Plugins/WeatherForecastPlugin.cs | Created plugin for weather forecasts (syntax issues). |
| samples/basic/weather-agent-prompty/MyM365Agent1/Bot/WeatherAgentBot.cs | Configured dependency injection and streaming responses. |
Comments suppressed due to low confidence (2)
samples/basic/weather-agent-prompty/MyM365Agent1/Bot/Plugins/WeatherForecastPlugin.cs:8
- Class declarations cannot include constructor parameters in their signature. Remove the parameter from the class declaration and define a proper constructor instead.
public class WeatherForecastPlugin(ITurnContext turnContext)
samples/basic/weather-agent-prompty/MyM365Agent1/Bot/WeatherAgentBot.cs:29
- The initializer syntax
[ ... ]is invalid for ServiceCollection. Usenew ServiceCollection { ... }to register service descriptors.
ServiceCollection serviceCollection = [
| <PackageReference Include="Microsoft.SemanticKernel.Agents.Core" Version="1.45.0" /> | ||
| <PackageReference Include="Microsoft.SemanticKernel.Connectors.AzureOpenAI" Version="1.45.0" /> | ||
| <PackageReference Include="Microsoft.SemanticKernel.Connectors.OpenAI" Version="1.45.0" /> | ||
| <PackageReference Include="Microsoft.SemanticKernel.Connectors.Ollama" Version="1.46.0-alpha" /> |
Copilot
AI
Jun 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The Ollama connector is pinned to version 1.46.0-alpha while other Semantic Kernel packages remain at 1.45.0; aligning versions can prevent compatibility issues.
| <PackageReference Include="Microsoft.SemanticKernel.Connectors.Ollama" Version="1.46.0-alpha" /> | |
| <PackageReference Include="Microsoft.SemanticKernel.Connectors.Ollama" Version="1.45.0" /> |
| public Task<WeatherForecast> GetForecastForDate(string date, string location) | ||
| { | ||
| string searchingForDate = date; | ||
| if (DateTime.TryParse(date, out DateTime searchingDate)) | ||
| { | ||
| searchingForDate = searchingDate.ToLongDateString(); | ||
| } | ||
| turnContext.StreamingResponse.QueueInformativeUpdateAsync($"Looking up the Weather in {location} for {searchingForDate}"); | ||
|
|
||
| return Task.FromResult(new WeatherForecast | ||
| { | ||
| Date = date, | ||
| TemperatureC = Random.Shared.Next(-20, 55) | ||
| }); |
Copilot
AI
Jun 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The asynchronous call to QueueInformativeUpdateAsync is not awaited, which may lead to unexpected ordering of messages. Consider awaiting the Task to ensure the update is sent before continuing.
| public Task<WeatherForecast> GetForecastForDate(string date, string location) | |
| { | |
| string searchingForDate = date; | |
| if (DateTime.TryParse(date, out DateTime searchingDate)) | |
| { | |
| searchingForDate = searchingDate.ToLongDateString(); | |
| } | |
| turnContext.StreamingResponse.QueueInformativeUpdateAsync($"Looking up the Weather in {location} for {searchingForDate}"); | |
| return Task.FromResult(new WeatherForecast | |
| { | |
| Date = date, | |
| TemperatureC = Random.Shared.Next(-20, 55) | |
| }); | |
| public async Task<WeatherForecast> GetForecastForDate(string date, string location) | |
| { | |
| string searchingForDate = date; | |
| if (DateTime.TryParse(date, out DateTime searchingDate)) | |
| { | |
| searchingForDate = searchingDate.ToLongDateString(); | |
| } | |
| await turnContext.StreamingResponse.QueueInformativeUpdateAsync($"Looking up the Weather in {location} for {searchingForDate}"); | |
| return new WeatherForecast | |
| { | |
| Date = date, | |
| TemperatureC = Random.Shared.Next(-20, 55) | |
| }; |
| "ModelId": "mistral", // This is the Model ID of the Ollama model (e.g., llama3.2, mistral, codellama) | ||
| "Endpoint": "http://localhost:11434" // This is the Endpoint of the Ollama server |
Copilot
AI
Jun 28, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Storing the Ollama endpoint and model ID in appsettings.json may expose configuration details. It's safer to inject these values via user secrets or environment variables.
| "ModelId": "mistral", // This is the Model ID of the Ollama model (e.g., llama3.2, mistral, codellama) | |
| "Endpoint": "http://localhost:11434" // This is the Endpoint of the Ollama server | |
| "ModelId": "{{OllamaModelId}}", // Use environment variables or user secrets to store the Model ID of the Ollama model | |
| "Endpoint": "{{OllamaEndpoint}}" // Use environment variables or user secrets to store the Endpoint of the Ollama server |
|
@microsoft-github-policy-service agree |
- Fix async/await pattern in WeatherForecastPlugin.cs - Align Ollama package version to 1.45.0 for consistency - Note: OpenAI fallback already exists in weather-agent Program.cs - Address code review suggestions for better async handling
|
Thanks @copilot for the review! I've addressed the feedback: ✅ Fixed async/await pattern in WeatherForecastPlugin.cs - now properly awaits QueueInformativeUpdateAsync Configuration Security Note: The settings in �ppsettings.json are development defaults. In production, these would typically be:
The sample follows the pattern of other Microsoft samples which include default localhost configurations for ease of getting started. Ready for review! �� |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShawnDelaineBellazanJr , thanks for your contribution suggestion here.
Given your update sample was taken from the ATK Weather app template, which was actually an earlier version of the weather sample in this repo, and given your demonstrating Promptly as a local LLM but still using SK. I feel like this should be approached as an extension of the existing weather sample vs a totally new one.
I feel like it makes much more sense to augment the existing sample.
thoughts?
| @@ -0,0 +1,157 @@ | |||
| # 🚀 Prompty + Few-Shot Learning Architecture for Local Models | |||
|
|
|||
| ## Achievement Summary | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShawnDelaineBellazanJr This is relevant to your proposed sample changes for the weather agent using Sematic kernel. Please move this document to the correct folder for the relevant sample.
|
|
||
| ## Achievement Summary | ||
|
|
||
| We successfully solved a critical compatibility issue with local LLMs and achieved a **5x performance improvement** with **100% reliability** for AI agent responses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShawnDelaineBellazanJr This is relevant to your proposed sample changes for the weather agent using Sematic kernel. Please move this document to the correct folder for the relevant sample.
| - [Using Activities](./docs/usingactivities.md) | ||
| - [Creating Messages](./docs/creatingmessages.md) | ||
|
|
||
| ## Advanced Patterns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShawnDelaineBellazanJr This is relevant to your proposed sample changes for the weather agent using Sematic kernel. Please move this document to the correct folder for the relevant sample, there is also a sample MD that describes most of the samples, you can add it there.
|
|
||
| ## Overview | ||
|
|
||
| This document describes an alternative architecture pattern for building reliable AI agents that work with local/open-source language models. The approach replaces traditional function calling with Prompty templates and few-shot learning, achieving significant improvements in reliability, performance, and model compatibility. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShawnDelaineBellazanJr This is relevant to your proposed sample changes for the weather agent using Sematic kernel. Please move this document to the correct folder for the relevant sample, there is also a sample MD that describes most of the samples, you can add it there.
Problem Solved
The Microsoft 365 Agents SDK's ChatCompletionAgent relies on OpenAI-style function calling, which causes compatibility issues with local models like Codestral, Llama, and Mistral. These models generate text descriptions instead of proper \ ool_calls\ format, resulting in:
Solution: Prompty + Few-Shot Learning
This PR introduces an alternative architecture pattern that provides universal compatibility with local models:
Key Innovation
Instead of relying on function calling, we teach models through explicit examples what we want them to do using Prompty templates with few-shot learning.
Performance Improvements
Architecture Pattern
\
User Input → Intent Detection → Plugin Calls → Prompty Template → Response
\\
Implementation Highlights
1. Prompty Template (\weather-forecast.prompty)
2. Manual Intent Detection (C#)
3. Direct Plugin Orchestration
Files Added
Testing
✅ Tested with Codestral via LM Studio
✅ 100% success rate for weather queries
✅ Fast, reliable responses (15-20 seconds)
✅ Clean error handling and fallbacks
Benefits for the Ecosystem
This enables the Microsoft 365 Agents SDK to work reliably with:
Impact
This alternative approach advances the SDK's goal of being unopinionated about AI providers by enabling reliable agent experiences regardless of model choice, supporting both cost-effective local deployments and cloud-based solutions.
Sample: weather-agent-prompty
Documentation: Prompty Architecture Guide