Implement multimodal request support for Gemini API (#2) #3
This pull request introduces multimodal support for the Google Gemini API within the `ChatAIze.GenerativeCS` library, addressing issue #2. Users can now send requests combining text with various file types (PDF, DOC, TXT, images, audio, video).

Key Changes Implemented:
Gemini File Service (`FileService.cs`, `IFileService.cs`):
- Type names (`FileService`, `IFileService`) are aligned with existing provider naming conventions (e.g., `ChatCompletion.cs`).

Enhanced Chat Message Structure (`ChatMessage.cs`, `ChatContentPart.cs`):
- `ChatMessage.cs` now uses an `ICollection<IChatContentPart> Parts` property to hold different content types within a single message.
- Added the `IChatContentPart` interface and concrete `TextPart` and `FileDataPart` classes.
- `FileDataPart` encapsulates `FileDataSource` (MIME type and file URI) for referencing uploaded files.
- The `ChatMessage.Content` property has been marked `[Obsolete]` and now acts as a getter/setter for the first `TextPart` in the `Parts` collection to maintain backward compatibility.

Updated Gemini Chat Provider (`ChatCompletion.cs`):
- The `CreateChatCompletionRequest` method now iterates through `message.Parts`.
- It serializes `TextPart` and `FileDataPart` (including `mime_type` and `file_uri`) into the JSON payload for the Gemini API's `generateContent` endpoint.
- Obsolete warnings for `ChatMessage.Content` usage (for backward compatibility fallback) have been suppressed with `#pragma`.

Client and DI Integration (`GeminiClient.cs`, `GeminiClientExtension.cs`):
- `GeminiClient.cs` now instantiates and exposes an `IFileService` through a public `Files` property.
- `GeminiClientExtension.cs` has been updated to register `IFileService` as a singleton, resolving its instance from the `GeminiClient.Files` property. This ensures a consistent `IFileService` instance is used.

Model Updates (`Models/Gemini/`):
- Added `GeminiFile.cs`, `GeminiFileUploadRequest.cs`, `GeminiListFilesResponse.cs` to represent data structures for the Gemini Files API.
- Used the `required` modifier for non-nullable properties expected from the API and initialized collections.

Documentation & Packaging:
- Updated `README.md` with a new section explaining how to use the multimodal features, including accessing `IFileService`, uploading files, and sending chat messages with file references.
- Bumped the version in `ChatAIze.GenerativeCS.csproj` to `0.15.0`.
- Updated the `.csproj` file to reflect the new multimodal capabilities.

How to Test:
1. Instantiate `GeminiClient`.
2. Access the file service via `geminiClient.Files`.
3. Upload a file using `fileService.UploadFileAsync(...)`.
4. Create a `Chat` object and add a `ChatMessage`.
5. In the `ChatMessage.Parts` collection, add a `TextPart` and a `FileDataPart` using the `MimeType` and `Uri` from the uploaded file.
6. Call `geminiClient.CompleteAsync(chat)` and observe the model's response, which should consider the content of the uploaded file.

Future Considerations (Not in this PR):
- Add convenience methods to `GeminiClient.cs` to simplify the process of sending a message with a local file (e.g., a method that handles both upload and message creation).

This implementation adheres to the existing coding patterns and architectural style of the library.
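
For reviewers, here is a rough, self-contained sketch of how the message structure and payload mapping described under "Key Changes Implemented" might fit together. It is not the code from this PR. The types are stand-ins reconstructed from the description, and the property names (`FileData`, `MimeType`, `FileUri`, `Role`), the `GeminiPayload` helper, and the placeholder file URI are all assumptions made for illustration. The JSON keys (`contents`, `parts`, `text`, `file_data`, `mime_type`, `file_uri`) follow the snake_case form of the `generateContent` request body.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Stand-in types sketched from the PR description; the library's real
// definitions may differ in naming and shape.
public interface IChatContentPart { }

public sealed record TextPart(string Text) : IChatContentPart;

public sealed record FileDataSource(string MimeType, string FileUri);

public sealed record FileDataPart(FileDataSource FileData) : IChatContentPart;

public sealed class ChatMessage
{
    private readonly List<IChatContentPart> _parts = new();

    public string Role { get; set; } = "user";

    public ICollection<IChatContentPart> Parts => _parts;

    // Legacy accessor: reads and writes the first TextPart in Parts.
    [Obsolete("Use Parts instead.")]
    public string? Content
    {
        get => _parts.OfType<TextPart>().FirstOrDefault()?.Text;
        set
        {
            int index = _parts.FindIndex(part => part is TextPart);
            if (value is null)
            {
                if (index >= 0) _parts.RemoveAt(index);
            }
            else if (index >= 0)
            {
                _parts[index] = new TextPart(value);
            }
            else
            {
                _parts.Insert(0, new TextPart(value));
            }
        }
    }
}

// Hypothetical helper that maps messages to a generateContent-style body.
public static class GeminiPayload
{
    public static string Build(IEnumerable<ChatMessage> messages)
    {
        var contents = messages.Select(message => new Dictionary<string, object>
        {
            ["role"] = message.Role,
            ["parts"] = message.Parts.Select(ToJsonPart).ToList(),
        }).ToList();

        return JsonSerializer.Serialize(
            new Dictionary<string, object> { ["contents"] = contents });
    }

    private static Dictionary<string, object> ToJsonPart(IChatContentPart part) => part switch
    {
        TextPart text => new Dictionary<string, object> { ["text"] = text.Text },
        FileDataPart file => new Dictionary<string, object>
        {
            ["file_data"] = new Dictionary<string, object>
            {
                ["mime_type"] = file.FileData.MimeType,
                ["file_uri"] = file.FileData.FileUri,
            },
        },
        _ => throw new NotSupportedException($"Unsupported part type: {part.GetType().Name}"),
    };
}

public static class Program
{
    public static void Main()
    {
        var message = new ChatMessage();
        message.Parts.Add(new TextPart("Summarize the attached document."));
        // Placeholder MIME type and URI; real values would come from the upload response.
        message.Parts.Add(new FileDataPart(
            new FileDataSource("application/pdf", "https://example.invalid/files/placeholder")));

        Console.WriteLine(GeminiPayload.Build(new[] { message }));
    }
}
```

In this sketch `Parts` is backed by a `List` so that the obsolete `Content` setter can replace the first `TextPart` in place instead of reordering the collection. Whether the PR preserves ordering the same way is not stated in the description.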
Fixes #2