Add Microsoft.ML.GenAI.Phi, test package and sample project. #7184

Merged: 43 commits from u/xiaoyun/phi into main on Jul 30, 2024
Changes shown are from 30 of the 43 commits.

Commits:
c5ffc73
add genai.phi and tests
LittleLittleCloud Jun 27, 2024
1493eba
formatter
LittleLittleCloud Jun 27, 2024
972c7c9
refactor Phi3Tokenizer
LittleLittleCloud Jun 27, 2024
349a426
update
LittleLittleCloud Jun 27, 2024
17b689a
add configuration for phi-series
LittleLittleCloud Jun 27, 2024
1faddbc
add semantic kernel and autogen integration
LittleLittleCloud Jun 27, 2024
2802ed3
update
LittleLittleCloud Jun 28, 2024
c8ab578
add Microsoft.ML.GenAI.Sample
LittleLittleCloud Jun 28, 2024
3db5c61
use tokenizer model from testTokenizer package
LittleLittleCloud Jun 28, 2024
d72fbce
use defaults
LittleLittleCloud Jun 28, 2024
9cc9a0c
add quantize linear
LittleLittleCloud Jun 28, 2024
59e5da8
use version string
LittleLittleCloud Jun 28, 2024
f9539b8
remove special token from CreatePhi2 API
LittleLittleCloud Jun 28, 2024
dbe8187
set up quantize sample
LittleLittleCloud Jun 28, 2024
ccaddfe
initialize linear with zeros
LittleLittleCloud Jun 28, 2024
151fa86
update sample
LittleLittleCloud Jul 1, 2024
b4e5d84
add 6.0 to targetframework
LittleLittleCloud Jul 1, 2024
b9604a0
fix tests
LittleLittleCloud Jul 1, 2024
73c0d31
update
LittleLittleCloud Jul 1, 2024
9fd352b
Merge branch 'main' into u/xiaoyun/phi
LittleLittleCloud Jul 15, 2024
745f40b
remove Phi3Tokenizer and use LlamaTokenizer instead
LittleLittleCloud Jul 15, 2024
57444cc
revert change in tokenizer package
LittleLittleCloud Jul 15, 2024
a5058a4
Merge branch 'main' into u/xiaoyun/phi
LittleLittleCloud Jul 17, 2024
4745683
run test on x64
LittleLittleCloud Jul 17, 2024
b933ce4
fix tests
LittleLittleCloud Jul 17, 2024
43dd37f
check in approved file
LittleLittleCloud Jul 17, 2024
1a77f8d
run test in net6.0
LittleLittleCloud Jul 17, 2024
e3e09e4
use meta device
LittleLittleCloud Jul 17, 2024
b316f19
copy approval tests to output folder
LittleLittleCloud Jul 18, 2024
d6f0e61
set up approval test file location
LittleLittleCloud Jul 19, 2024
0bb6b98
fix comment
LittleLittleCloud Jul 23, 2024
405d162
rename to AddGenAITextGeneration and AddGenAIChatCompletion
LittleLittleCloud Jul 23, 2024
a1b0369
Update job-template.yml
LittleLittleCloud Jul 26, 2024
6b3b46f
add mit license
LittleLittleCloud Jul 29, 2024
c165b5f
add reference
LittleLittleCloud Jul 29, 2024
b837269
bump code coverage version
LittleLittleCloud Jul 29, 2024
3020ebd
add <PreserveCompilationContext>true</PreserveCompilationContext>
LittleLittleCloud Jul 30, 2024
077d366
add runtime package
LittleLittleCloud Jul 30, 2024
cd3b20e
remove flag
LittleLittleCloud Jul 30, 2024
62576c4
add flag
LittleLittleCloud Jul 30, 2024
d91ec0a
fix build error
LittleLittleCloud Jul 30, 2024
e7d0fde
update
LittleLittleCloud Jul 30, 2024
a97afdf
update
LittleLittleCloud Jul 30, 2024
46 changes: 45 additions & 1 deletion Microsoft.ML.sln
@@ -176,7 +176,15 @@ Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.TorchSharp.Tes
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.TensorFlow.Tests", "test\Microsoft.ML.TensorFlow.Tests\Microsoft.ML.TensorFlow.Tests.csproj", "{763FF013-8309-4680-A769-B54E7BB99612}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Microsoft.ML.GenAI.Core", "src\Microsoft.ML.GenAI.Core\Microsoft.ML.GenAI.Core.csproj", "{DB2CA055-8ABD-4E3E-8089-5B64C3415E85}"
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.GenAI.Core", "src\Microsoft.ML.GenAI.Core\Microsoft.ML.GenAI.Core.csproj", "{DB2CA055-8ABD-4E3E-8089-5B64C3415E85}"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.GenAI.Phi", "src\Microsoft.ML.GenAI.Phi\Microsoft.ML.GenAI.Phi.csproj", "{694BF884-B2E4-4E1C-9342-0564BAAC4575}"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.GenAI.Phi.Tests", "test\Microsoft.ML.GenAI.Phi.Tests\Microsoft.ML.GenAI.Phi.Tests.csproj", "{867FFC34-DFA7-400F-B9BB-85158326CE08}"
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.GenAI.Samples", "docs\samples\Microsoft.ML.GenAI.Samples\Microsoft.ML.GenAI.Samples.csproj", "{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}"
EndProject
Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "Microsoft.ML.GenAI.Core.Tests", "test\Microsoft.ML.GenAI.Core.Tests\Microsoft.ML.GenAI.Core.Tests.csproj", "{14AB0804-D4CE-4634-B544-5A8587620783}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
@@ -838,6 +846,38 @@ Global
{DB2CA055-8ABD-4E3E-8089-5B64C3415E85}.Release|Any CPU.Build.0 = Release|Any CPU
{DB2CA055-8ABD-4E3E-8089-5B64C3415E85}.Release|x64.ActiveCfg = Release|Any CPU
{DB2CA055-8ABD-4E3E-8089-5B64C3415E85}.Release|x64.Build.0 = Release|Any CPU
{694BF884-B2E4-4E1C-9342-0564BAAC4575}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{694BF884-B2E4-4E1C-9342-0564BAAC4575}.Debug|Any CPU.Build.0 = Debug|Any CPU
{694BF884-B2E4-4E1C-9342-0564BAAC4575}.Debug|x64.ActiveCfg = Debug|Any CPU
{694BF884-B2E4-4E1C-9342-0564BAAC4575}.Debug|x64.Build.0 = Debug|Any CPU
{694BF884-B2E4-4E1C-9342-0564BAAC4575}.Release|Any CPU.ActiveCfg = Release|Any CPU
{694BF884-B2E4-4E1C-9342-0564BAAC4575}.Release|Any CPU.Build.0 = Release|Any CPU
{694BF884-B2E4-4E1C-9342-0564BAAC4575}.Release|x64.ActiveCfg = Release|Any CPU
{694BF884-B2E4-4E1C-9342-0564BAAC4575}.Release|x64.Build.0 = Release|Any CPU
{867FFC34-DFA7-400F-B9BB-85158326CE08}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{867FFC34-DFA7-400F-B9BB-85158326CE08}.Debug|Any CPU.Build.0 = Debug|Any CPU
{867FFC34-DFA7-400F-B9BB-85158326CE08}.Debug|x64.ActiveCfg = Debug|Any CPU
{867FFC34-DFA7-400F-B9BB-85158326CE08}.Debug|x64.Build.0 = Debug|Any CPU
{867FFC34-DFA7-400F-B9BB-85158326CE08}.Release|Any CPU.ActiveCfg = Release|Any CPU
{867FFC34-DFA7-400F-B9BB-85158326CE08}.Release|Any CPU.Build.0 = Release|Any CPU
{867FFC34-DFA7-400F-B9BB-85158326CE08}.Release|x64.ActiveCfg = Release|Any CPU
{867FFC34-DFA7-400F-B9BB-85158326CE08}.Release|x64.Build.0 = Release|Any CPU
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}.Debug|Any CPU.Build.0 = Debug|Any CPU
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}.Debug|x64.ActiveCfg = Debug|Any CPU
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}.Debug|x64.Build.0 = Debug|Any CPU
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}.Release|Any CPU.ActiveCfg = Release|Any CPU
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}.Release|Any CPU.Build.0 = Release|Any CPU
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}.Release|x64.ActiveCfg = Release|Any CPU
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47}.Release|x64.Build.0 = Release|Any CPU
{14AB0804-D4CE-4634-B544-5A8587620783}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{14AB0804-D4CE-4634-B544-5A8587620783}.Debug|Any CPU.Build.0 = Debug|Any CPU
{14AB0804-D4CE-4634-B544-5A8587620783}.Debug|x64.ActiveCfg = Debug|Any CPU
{14AB0804-D4CE-4634-B544-5A8587620783}.Debug|x64.Build.0 = Debug|Any CPU
{14AB0804-D4CE-4634-B544-5A8587620783}.Release|Any CPU.ActiveCfg = Release|Any CPU
{14AB0804-D4CE-4634-B544-5A8587620783}.Release|Any CPU.Build.0 = Release|Any CPU
{14AB0804-D4CE-4634-B544-5A8587620783}.Release|x64.ActiveCfg = Release|Any CPU
{14AB0804-D4CE-4634-B544-5A8587620783}.Release|x64.Build.0 = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
@@ -925,6 +965,10 @@ Global
{AB8D68F1-6C3E-41FD-B0EC-A093E009341D} = {AED9C836-31E3-4F3F-8ABC-929555D3F3C4}
{763FF013-8309-4680-A769-B54E7BB99612} = {AED9C836-31E3-4F3F-8ABC-929555D3F3C4}
{DB2CA055-8ABD-4E3E-8089-5B64C3415E85} = {09EADF06-BE25-4228-AB53-95AE3E15B530}
{694BF884-B2E4-4E1C-9342-0564BAAC4575} = {09EADF06-BE25-4228-AB53-95AE3E15B530}
{867FFC34-DFA7-400F-B9BB-85158326CE08} = {AED9C836-31E3-4F3F-8ABC-929555D3F3C4}
{1D4AD9A3-19AF-432B-889D-A63FE6D7BD47} = {DA452A53-2E94-4433-B08C-041EDEC729E6}
{14AB0804-D4CE-4634-B544-5A8587620783} = {AED9C836-31E3-4F3F-8ABC-929555D3F3C4}
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {41165AF1-35BB-4832-A189-73060F82B01D}
20 changes: 20 additions & 0 deletions docs/samples/Microsoft.ML.GenAI.Samples/Microsoft.ML.GenAI.Samples.csproj
@@ -0,0 +1,20 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net8.0</TargetFramework>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<ProjectReference Include="..\..\..\src\Microsoft.ML.GenAI.Core\Microsoft.ML.GenAI.Core.csproj" />
<ProjectReference Include="..\..\..\src\Microsoft.ML.GenAI.Phi\Microsoft.ML.GenAI.Phi.csproj" />
</ItemGroup>

<ItemGroup>
<PackageReference Include="TorchSharp-cuda-windows" Version="0.102.5" Condition="$([MSBuild]::IsOSPlatform('Windows'))" />
<PackageReference Include="Microsoft.SemanticKernel" Version="$(SemanticKernelVersion)" />
</ItemGroup>

</Project>
39 changes: 39 additions & 0 deletions docs/samples/Microsoft.ML.GenAI.Samples/Phi3Mini/AutoGenSample.cs
@@ -0,0 +1,39 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using AutoGen.Core;
using Microsoft.ML.GenAI.Phi;
using static TorchSharp.torch;
using TorchSharp;
using Microsoft.ML.GenAI.Core;
using Microsoft.ML.GenAI.Core.Extension;

namespace Microsoft.ML.GenAI.Samples.Phi3Mini;

public class AutoGenSample
{
public static async Task RunAsync()
{
var device = "cuda";
if (device == "cuda")
{
torch.InitializeDeviceType(DeviceType.CUDA);
}

var defaultType = ScalarType.Float16;
torch.manual_seed(1);
torch.set_default_dtype(defaultType);
var weightFolder = @"C:\Users\xiaoyuz\source\repos\Phi-3-mini-4k-instruct";
var pipeline = Utils.LoadPhi3Mini4KFromFolder(weightFolder, device: device);

// agent
var agent = new Phi3Agent(pipeline, "assistant")
.RegisterPrintMessage();
var question = @"write a C# program to calculate the factorial of a number";

// chat with the assistant
await agent.SendAsync(question);
}
}
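
For reference, SendAsync returns the agent's reply message, so a variant ending for this sample could capture the text instead of only printing it (a sketch; it assumes AutoGen.Core's GetContent() message extension):

// Sketch, not part of this PR: capture the reply rather than relying on RegisterPrintMessage.
var reply = await agent.SendAsync(question);
Console.WriteLine(reply.GetContent());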
62 changes: 62 additions & 0 deletions docs/samples/Microsoft.ML.GenAI.Samples/Phi3Mini/SemanticKernelSample.cs
@@ -0,0 +1,62 @@
using Microsoft.ML.GenAI.Phi.Extension;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using TorchSharp;
using static TorchSharp.torch;

namespace Microsoft.ML.GenAI.Samples.Phi3Mini;

public class SemanticKernelSample
{
public static async Task RunChatCompletionSample()
{
var device = "cuda";
if (device == "cuda")
{
torch.InitializeDeviceType(DeviceType.CUDA);
}

var defaultType = ScalarType.Float16;
torch.manual_seed(1);
torch.set_default_dtype(defaultType);
var weightFolder = @"C:\Users\xiaoyuz\source\repos\Phi-3-mini-4k-instruct";
var pipeline = Utils.LoadPhi3Mini4KFromFolder(weightFolder, device: device);


var kernel = Kernel.CreateBuilder()
.AddPhi3AsChatCompletion(pipeline)
.Build();
var chatService = kernel.GetRequiredService<IChatCompletionService>();
var chatHistory = new ChatHistory();
chatHistory.AddSystemMessage("you are a helpful assistant");
chatHistory.AddUserMessage("write a C# program to calculate the factorial of a number");

await foreach (var response in chatService.GetStreamingChatMessageContentsAsync(chatHistory))
{
Console.Write(response);
}
}

public static async Task RunTextGenerationSample()
{
var device = "cuda";
if (device == "cuda")
{
torch.InitializeDeviceType(DeviceType.CUDA);
}

var defaultType = ScalarType.Float16;
torch.manual_seed(1);
torch.set_default_dtype(defaultType);
var weightFolder = @"C:\Users\xiaoyuz\source\repos\Phi-3-mini-4k-instruct";
var pipeline = Utils.LoadPhi3Mini4KFromFolder(weightFolder, device);


var kernel = Kernel.CreateBuilder()
.AddPhi3AsTextGeneration(pipeline)
.Build();

var response = await kernel.InvokePromptAsync("Tell a joke");
Console.WriteLine(response);
}
}
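
If streaming output isn't needed, the same service supports a non-streaming call; a minimal sketch against the standard Semantic Kernel IChatCompletionService API:

// Sketch, not part of this PR: non-streaming variant of the chat completion call.
var reply = await chatService.GetChatMessageContentsAsync(chatHistory);
Console.WriteLine(reply[0].Content);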
103 changes: 103 additions & 0 deletions docs/samples/Microsoft.ML.GenAI.Samples/Phi3Mini/Utils.cs
@@ -0,0 +1,103 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.ML.GenAI.Core;
using Microsoft.ML.GenAI.Phi;
using Tensorboard;
using static TorchSharp.torch;
using TorchSharp;
using Microsoft.ML.GenAI.Core.Extension;
using System.Text.Json;
using Microsoft.ML.Tokenizers;

namespace Microsoft.ML.GenAI.Samples.Phi3Mini;

internal static class Utils
{
public static ICausalLMPipeline<Tokenizer, Phi3ForCasualLM> LoadPhi3Mini4KFromFolder(
string weightFolder,
string configName = "config.json",
string device = "cuda",
int modelSizeOnCudaInGB = 16,
int modelSizeOnMemoryInGB = 64,
int modelSizeOnDiskInGB = 200,
bool quantizeToInt8 = false,
bool quantizeToInt4 = false)
{
Console.WriteLine("Loading Phi3 from huggingface model weight folder");
torch.set_default_device("meta");
var configPath = System.IO.Path.Combine(weightFolder, configName);
var config = JsonSerializer.Deserialize<Phi3Config>(System.IO.File.ReadAllText(configPath)) ?? throw new ArgumentNullException(nameof(configPath));
var timer = System.Diagnostics.Stopwatch.StartNew();
var model = new Phi3ForCasualLM(config);
var tokenzierPath = System.IO.Path.Combine(weightFolder, "tokenizer.model");
var tokenizer = Phi3TokenizerHelper.FromPretrained(tokenzierPath);

if (quantizeToInt8)
{
model.ToInt8QuantizeModule();
}
else if (quantizeToInt4)
{
model.ToInt4QuantizeModule();
}

var deviceSizeMap = new Dictionary<string, long>
{
["cuda"] = modelSizeOnCudaInGB * 1L * 1024 * 1024 * 1024,
["cpu"] = modelSizeOnMemoryInGB * 1L * 1024 * 1024 * 1024,
["disk"] = modelSizeOnDiskInGB * 1L * 1024 * 1024 * 1024,
};

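// Infer a per-layer placement that fits each device's byte budget; layers spill from "cuda" to "cpu" to "disk" as each budget fills.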
var deviceMap = model.InferDeviceMapForEachLayer(
devices: ["cuda", "cpu", "disk"],
deviceSizeMapInByte: deviceSizeMap);

var deviceMapJson = JsonSerializer.Serialize(deviceMap, new JsonSerializerOptions { WriteIndented = true });
Console.WriteLine($"Device map:");
Console.WriteLine(deviceMapJson);

// Second pass: recreate the model on the CPU and load the real weights before dispatching layers to their devices.
torch.set_default_device("cpu");

Console.WriteLine("Start loading");
timer = System.Diagnostics.Stopwatch.StartNew();
model = new Phi3ForCasualLM(config);
timer.Stop();
Console.WriteLine($"Phi3 model created in {timer.ElapsedMilliseconds / 1000} s");

timer = System.Diagnostics.Stopwatch.StartNew();
model.LoadSafeTensors(weightFolder);
timer.Stop();
Console.WriteLine($"Phi3 weight loaded in {timer.ElapsedMilliseconds / 1000} s");

if (quantizeToInt8 || quantizeToInt4)
{
timer = System.Diagnostics.Stopwatch.StartNew();
Console.WriteLine("Start quantizing if needed");
if (quantizeToInt8)
{
model.ToInt8QuantizeModule();
}
else if (quantizeToInt4)
{
model.ToInt4QuantizeModule();
}
Console.WriteLine("Quantizing done");
timer.Stop();
Console.WriteLine($"Quantizing done in {timer.ElapsedMilliseconds / 1000} s");
}

timer = System.Diagnostics.Stopwatch.StartNew();
Console.WriteLine($"Start loading to device: {device}");
model = model.ToDynamicLoadingModel(deviceMap, "cuda");
timer.Stop();
Console.WriteLine($"Phi3 loaded to device: {device} in {timer.ElapsedMilliseconds / 1000} s");
var pipeline = new CausalLMPipeline<Tokenizer, Phi3ForCasualLM>(tokenizer, model, device);
torch.set_default_device(device);

return pipeline;
}
}
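
A usage sketch for the helper above, enabling int8 quantization to shrink the CUDA memory footprint (the weight folder path is a placeholder, not a path from this PR):

// Sketch: the same loader with int8 quantization enabled.
var pipeline = Utils.LoadPhi3Mini4KFromFolder(
    @"C:\models\Phi-3-mini-4k-instruct",
    device: "cuda",
    quantizeToInt8: true);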
4 changes: 4 additions & 0 deletions docs/samples/Microsoft.ML.GenAI.Samples/Program.cs
@@ -0,0 +1,4 @@
// See https://aka.ms/new-console-template for more information
using Microsoft.ML.GenAI.Samples.Phi3Mini;

await SemanticKernelSample.RunChatCompletionSample();
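
The other samples in this PR can be run by swapping the awaited call, for example:

// await AutoGenSample.RunAsync();
// await SemanticKernelSample.RunTextGenerationSample();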
4 changes: 4 additions & 0 deletions eng/Versions.props
@@ -63,6 +63,9 @@
<TensorflowDotNETVersion>0.20.1</TensorflowDotNETVersion>
<TensorFlowMajorVersion>2</TensorFlowMajorVersion>
<TensorFlowVersion>2.3.1</TensorFlowVersion>
<TorchSharpPyBridgeVersion>1.4.1</TorchSharpPyBridgeVersion>
<AutoGenVersion>0.0.15</AutoGenVersion>
<SemanticKernelVersion>1.15.0</SemanticKernelVersion>
<TorchSharpVersion>0.102.7</TorchSharpVersion>
<LibTorchVersion>2.2.1.1</LibTorchVersion>
<!-- Build/infrastructure Dependencies -->
@@ -75,6 +78,7 @@
<SystemCompositionVersion>1.2.0</SystemCompositionVersion>
<!-- Test-only Dependencies -->
<ApprovalTestsVersion>5.4.7</ApprovalTestsVersion>
<MoqVersion>4.20.70</MoqVersion>
<BenchmarkDotNetVersion>0.13.1</BenchmarkDotNetVersion>
<DotNetRuntime60Version>6.0.26</DotNetRuntime60Version>
<DotNetRuntime80Version>8.0.1</DotNetRuntime80Version>
50 changes: 0 additions & 50 deletions src/Microsoft.ML.GenAI.Core/Extension/CausalLMPipelineExtension.cs

This file was deleted.
