Draft: Add Firebase Imagen API for Unity #1270

Draft: wants to merge 1 commit into base branch `main`
105 changes: 103 additions & 2 deletions docs/firebaseai/FirebaseAIReadme.md
Original file line number Diff line number Diff line change
@@ -3,9 +3,9 @@ Get Started with Firebase AI

Thank you for installing the Firebase AI Unity SDK.

- The Firebase AI Gemini API gives you access to the latest generative AI models from Google: the Gemini models. This SDK is built specifically for use with Unity and mobile developers, offering security options against unauthorized clients as well as integrations with other Firebase services.
+ The Firebase AI SDK for Unity gives you access to Google's state-of-the-art generative AI models. The SDK is built specifically for Unity and mobile developers, offering security options against unauthorized clients as well as integrations with other Firebase services.

- With this, you can add AI personalization to your app, build an AI chat experience, create AI-powered optimizations and automation, and much more!
+ With this, you can add AI personalization to your app, build an AI chat experience, create AI-powered optimizations and automation, generate images, and much more!

### Links

@@ -19,3 +19,104 @@ With this, you can add AI personalization to your app, build an AI chat experien
* [Stack Overflow](https://stackoverflow.com/questions/tagged/firebase)
* [Slack community](https://firebase-community.slack.com/)
* [Google Groups](https://groups.google.com/forum/#!forum/firebase-talk)

## Available Models

The Firebase AI SDK for Unity currently supports the following model families:

### Gemini API

The Firebase AI Gemini API gives you access to the latest generative AI models from Google: the Gemini models. These models are excellent for text generation, summarization, chat applications, and more.

_(Refer to the [Firebase documentation](https://firebase.google.com/docs/vertex-ai/gemini-models) for more detailed examples on using the Gemini API.)_

### Imagen API

The Firebase AI Imagen API allows you to generate and manipulate images using Google's advanced image generation models. You can create novel images from text prompts, edit existing images, and more.

#### Initializing ImagenModel

First, initialize `FirebaseAI` and then get an `ImagenModel` instance. You can optionally provide generation configuration and safety settings at this stage.

```csharp
using Firebase;
using Firebase.AI;
using UnityEngine; // Required for MonoBehaviour, Debug.Log, and Texture2D

public class ImagenExample : MonoBehaviour
{
  async void Start()
  {
    FirebaseApp app = FirebaseApp.DefaultInstance; // Or your specific app

    // Initialize the Vertex AI backend service (recommended for Imagen)
    var ai = FirebaseAI.GetInstance(app, FirebaseAI.Backend.VertexAI());

    // Create an `ImagenModel` instance with a model that supports your use case.
    // Consult the Imagen documentation for the latest model names.
    var model = ai.GetImagenModel(
      modelName: "imagen-3.0-generate-002", // Example model name, replace with a valid one
      generationConfig: new ImagenGenerationConfig(numberOfImages: 1)); // Request 1 image

    // Provide an image generation prompt
    var prompt = "A photo of a futuristic car driving on Mars at sunset.";

    // Generate an image and receive it as inline data (byte array)
    var response = await model.GenerateImagesAsync(prompt: prompt);

    // If fewer images were generated than were requested,
    // `FilteredReason` describes why the others were filtered out
    if (!string.IsNullOrEmpty(response.FilteredReason)) {
      Debug.Log($"Image generation partially filtered: {response.FilteredReason}");
    }

    if (response.Images != null && response.Images.Count > 0)
    {
      foreach (var image in response.Images) {
        // Each image is an ImagenInlineImage; AsTexture2D() returns null if decoding fails
        Texture2D tex = image.AsTexture2D();
        if (tex != null)
        {
          Debug.Log($"Image generated with MIME type: {image.MimeType}, Size: {tex.width}x{tex.height}");
          // Process the image (e.g., display it on a UI RawImage)
          // Example: rawImageComponent.texture = tex;
        }
      }
    }
    else
    {
      Debug.Log("No images were generated. Check FilteredReason or logs for more details.");
    }
  }
}
```

#### Generating Images to Google Cloud Storage (GCS)

Imagen can also output generated images directly to a Google Cloud Storage bucket. This is useful for workflows where images don't need to be immediately processed on the client.

```csharp
// (Inside an async method, assuming 'model' is an initialized ImagenModel)
var gcsUri = new System.Uri("gs://your-gcs-bucket-name/path/to/output_image.png");
var gcsResponse = await model.GenerateImagesAsync(prompt: "A fantasy castle in the clouds", gcsUri: gcsUri);

if (gcsResponse.Images != null && gcsResponse.Images.Count > 0) {
  foreach (var imageRef in gcsResponse.Images) {
    // imageRef will be an ImagenGcsImage instance
    UnityEngine.Debug.Log($"Image generation requested to GCS. Output URI: {imageRef.GcsUri}, MIME Type: {imageRef.MimeType}");
    // Further processing might involve triggering a cloud function or another backend process
    // that reads from this GCS URI.
  }
}
```
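
A backend process that reads the output usually needs the bucket name and the object path separately. The following is a minimal sketch of splitting a `gs://` URI with `System.Uri`; the `GcsUriParts` class and its `TrySplit` method are hypothetical helpers for illustration and are not part of the SDK.

```csharp
using System;

public static class GcsUriParts
{
  // Hypothetical helper (not part of the SDK): splits "gs://bucket/object/path"
  // into the bucket name and the object path.
  public static bool TrySplit(Uri gcsUri, out string bucket, out string objectPath)
  {
    bucket = null;
    objectPath = null;
    if (gcsUri == null || !gcsUri.IsAbsoluteUri || gcsUri.Scheme != "gs")
    {
      return false;
    }
    bucket = gcsUri.Host;
    // AbsolutePath is percent-encoded and starts with '/'.
    objectPath = Uri.UnescapeDataString(gcsUri.AbsolutePath).TrimStart('/');
    return bucket.Length > 0 && objectPath.Length > 0;
  }
}
```

With the loop above, `GcsUriParts.TrySplit(imageRef.GcsUri, out var bucket, out var path)` would yield the two parts to hand to whatever storage client the backend uses.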

#### Configuration Options

When working with Imagen, you can customize the generation process using several configuration types:

* **`ImagenGenerationConfig`**: Controls the number of images to generate (`NumberOfImages`), the aspect ratio (`AspectRatio`), the output image format (`ImageFormat`), and whether to add a watermark (`AddWatermark`). You can also specify a `NegativePrompt`.
* **`ImagenSafetySettings`**: Allows you to configure safety filters for generated content, such as `SafetyFilterLevel` (e.g., `BlockMediumAndAbove`) and `PersonFilterLevel` (e.g., `BlockAll`).
* **`ImagenImageFormat`**: Defines the output image format. Create one with the static methods `ImagenImageFormat.Png()` or `ImagenImageFormat.Jpeg(int? compressionQuality = null)`; `compressionQuality` must be between 0 and 100.
* **`ImagenAspectRatio`**: An enum of common aspect ratios: `Square1x1`, `Portrait9x16`, `Landscape16x9`, `Portrait3x4`, and `Landscape4x3`.

These configuration types are available in the `Firebase.AI` namespace. Refer to the API documentation or inline comments in the SDK for more details on their usage.
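
As a sketch of how these options combine, the snippet below builds an `ImagenGenerationConfig` with every constructor parameter set and passes it to `GetImagenModel`. It uses only the signatures shown in this PR; the `ImagenConfigExample` class and `CreateLandscapeModel` method are illustrative names, the prompt text and model name are placeholders, and `ImagenSafetySettings` is left out because its constructor is not shown here.

```csharp
using Firebase;
using Firebase.AI;

public static class ImagenConfigExample
{
  // Builds an ImagenModel with explicit generation settings.
  // The model name is a placeholder; check the Imagen documentation for current names.
  public static ImagenModel CreateLandscapeModel()
  {
    var config = new ImagenGenerationConfig(
      negativePrompt: "blurry, low quality",
      numberOfImages: 2,
      aspectRatio: ImagenAspectRatio.Landscape16x9,
      imageFormat: ImagenImageFormat.Jpeg(compressionQuality: 80), // Must be 0-100
      addWatermark: false);

    var ai = FirebaseAI.GetInstance(FirebaseApp.DefaultInstance, FirebaseAI.Backend.VertexAI());
    return ai.GetImagenModel(
      modelName: "imagen-3.0-generate-002",
      generationConfig: config);
  }
}
```

Parameters left as `null` are omitted from the request, so the backend defaults apply to anything you do not set.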
32 changes: 32 additions & 0 deletions firebaseai/src/FirebaseAI.cs
@@ -191,6 +191,38 @@ public LiveGenerativeModel GetLiveModel(
        liveGenerationConfig, tools,
        systemInstruction, requestOptions);
  }

  /// <summary>
  /// Initializes an Imagen model for image generation with the given parameters.
  ///
  /// - Note: Refer to Imagen documentation for appropriate model names and capabilities.
  /// </summary>
  /// <param name="modelName">The name of the Imagen model to use.</param>
  /// <param name="generationConfig">The image generation parameters your model should use.</param>
  /// <param name="safetySettings">Safety settings for content filtering.</param>
  /// <param name="requestOptions">Configuration parameters for sending requests to the backend.</param>
  /// <returns>The initialized `ImagenModel` instance.</returns>
  public ImagenModel GetImagenModel(
      string modelName,
      ImagenGenerationConfig? generationConfig = null,
      ImagenSafetySettings? safetySettings = null,
      RequestOptions? requestOptions = null
  ) {
    // Potentially add validation for modelName or other parameters if needed.
    // Ensure backend compatibility if Imagen is only available on certain backends.
    // For example, if Imagen is VertexAI only:
    // if (_backend.Provider != Backend.InternalProvider.VertexAI) {
    //   throw new NotSupportedException("ImagenModel is currently only supported with the VertexAI backend.");
    // }
    return new ImagenModel(
        _firebaseApp,
        _backend,
        modelName,
        generationConfig,
        safetySettings,
        requestOptions
    );
  }
}

}
5 changes: 5 additions & 0 deletions firebaseai/src/IImagenImage.cs
@@ -0,0 +1,5 @@
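namespace Firebase.AI {
// Common interface for images returned by Imagen (inline bytes or a GCS reference).
public interface IImagenImage {
  // Interface members are implicitly public, so no access modifier is needed.
  string MimeType { get; }
}
}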
13 changes: 13 additions & 0 deletions firebaseai/src/ImagenGcsImage.cs
@@ -0,0 +1,13 @@
using System;

namespace Firebase.AI {
public readonly struct ImagenGcsImage : IImagenImage {
  public string MimeType { get; }
  public System.Uri GcsUri { get; }

  public ImagenGcsImage(string mimeType, System.Uri gcsUri) {
    MimeType = mimeType;
    GcsUri = gcsUri;
  }
}
}
52 changes: 52 additions & 0 deletions firebaseai/src/ImagenGenerationConfig.cs
@@ -0,0 +1,52 @@
namespace Firebase.AI {
public enum ImagenAspectRatio {
  Square1x1,
  Portrait9x16,
  Landscape16x9,
  Portrait3x4,
  Landscape4x3
}

public readonly struct ImagenGenerationConfig {
  public string NegativePrompt { get; }
  public int? NumberOfImages { get; }
  public ImagenAspectRatio? AspectRatio { get; }
  public ImagenImageFormat? ImageFormat { get; }
  public bool? AddWatermark { get; }

  public ImagenGenerationConfig(
      string negativePrompt = null,
      int? numberOfImages = null,
      ImagenAspectRatio? aspectRatio = null,
      ImagenImageFormat? imageFormat = null,
      bool? addWatermark = null
  ) {
    NegativePrompt = negativePrompt;
    NumberOfImages = numberOfImages;
    AspectRatio = aspectRatio;
    ImageFormat = imageFormat;
    AddWatermark = addWatermark;
  }

  // Helper method to convert to a JSON dictionary for requests.
  // Only values that were explicitly set are included.
  internal System.Collections.Generic.Dictionary<string, object> ToJson() {
    var jsonDict = new System.Collections.Generic.Dictionary<string, object>();
    if (!string.IsNullOrEmpty(NegativePrompt)) {
      jsonDict["negativePrompt"] = NegativePrompt;
    }
    if (NumberOfImages.HasValue) {
      jsonDict["numberOfImages"] = NumberOfImages.Value;
    }
    if (AspectRatio.HasValue) {
      // NOTE: ToString() emits the enum member name (e.g. "Square1x1").
      // Confirm the backend accepts that form rather than a ratio string such as "1:1".
      jsonDict["aspectRatio"] = AspectRatio.Value.ToString();
    }
    if (ImageFormat.HasValue) {
      jsonDict["imageFormat"] = ImageFormat.Value.ToJson();
    }
    if (AddWatermark.HasValue) {
      jsonDict["addWatermark"] = AddWatermark.Value;
    }
    return jsonDict;
  }
}
}
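
Review note: `ToJson()` serializes `AspectRatio` with `ToString()`, which produces the enum member name (for example `Landscape16x9`). If the backend expects a `W:H` ratio string instead, which is an assumption to verify against the Imagen REST reference, an explicit mapping avoids coupling the wire format to C# identifiers. The sketch below is self-contained: it redeclares the enum locally, and `ImagenAspectRatioWire.ToRatioString` is a hypothetical helper name.

```csharp
using System;

// Local copy of the enum from this file so the sketch compiles on its own.
public enum ImagenAspectRatio { Square1x1, Portrait9x16, Landscape16x9, Portrait3x4, Landscape4x3 }

public static class ImagenAspectRatioWire
{
  // Hypothetical helper: maps each enum member to a "W:H" ratio string.
  public static string ToRatioString(ImagenAspectRatio ratio)
  {
    switch (ratio)
    {
      case ImagenAspectRatio.Square1x1: return "1:1";
      case ImagenAspectRatio.Portrait9x16: return "9:16";
      case ImagenAspectRatio.Landscape16x9: return "16:9";
      case ImagenAspectRatio.Portrait3x4: return "3:4";
      case ImagenAspectRatio.Landscape4x3: return "4:3";
      default:
        throw new ArgumentOutOfRangeException(nameof(ratio), ratio, "Unknown aspect ratio.");
    }
  }
}
```

An explicit `switch` also means that renaming an enum member later cannot silently change what is sent to the server.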
60 changes: 60 additions & 0 deletions firebaseai/src/ImagenGenerationResponse.cs
@@ -0,0 +1,60 @@
using System.Collections.Generic;
using Google.MiniJSON; // Assuming MiniJSON is available and used elsewhere in the SDK

namespace Firebase.AI {
public readonly struct ImagenGenerationResponse<T> where T : IImagenImage {
  public IReadOnlyList<T> Images { get; }
  public string FilteredReason { get; }

  // Internal constructor for creating from parsed data
  internal ImagenGenerationResponse(IReadOnlyList<T> images, string filteredReason) {
    Images = images;
    FilteredReason = filteredReason;
  }

  // Static factory method to parse JSON.
  // Note: This is a simplified parser. Error handling and robustness should match SDK standards.
  // In particular, Convert.FromBase64String throws FormatException on malformed input.
  internal static ImagenGenerationResponse<T> FromJson(string jsonString) {
    if (string.IsNullOrEmpty(jsonString)) {
      return new ImagenGenerationResponse<T>(System.Array.Empty<T>(), "Empty or null JSON response");
    }

    object jsonData = Json.Deserialize(jsonString);
    if (!(jsonData is Dictionary<string, object> responseMap)) {
      return new ImagenGenerationResponse<T>(System.Array.Empty<T>(), "Invalid JSON format: Expected a dictionary at the root.");
    }

    List<T> images = new List<T>();
    string filteredReason = responseMap.ContainsKey("filteredReason") ? responseMap["filteredReason"] as string : null;

    if (responseMap.ContainsKey("images") && responseMap["images"] is List<object> imagesList) {
      foreach (var imgObj in imagesList) {
        if (imgObj is Dictionary<string, object> imgMap) {
          string mimeType = imgMap.ContainsKey("mimeType") ? imgMap["mimeType"] as string : "application/octet-stream";

          // Build the concrete image type requested by the caller; entries that
          // lack the expected field are skipped.
          if (typeof(T) == typeof(ImagenInlineImage)) {
            if (imgMap.ContainsKey("imageBytes") && imgMap["imageBytes"] is string base64Data) {
              byte[] data = System.Convert.FromBase64String(base64Data);
              images.Add((T)(IImagenImage)new ImagenInlineImage(mimeType, data));
            }
          } else if (typeof(T) == typeof(ImagenGcsImage)) {
            if (imgMap.ContainsKey("gcsUri") && imgMap["gcsUri"] is string uriString) {
              if (System.Uri.TryCreate(uriString, System.UriKind.Absolute, out System.Uri gcsUri)) {
                images.Add((T)(IImagenImage)new ImagenGcsImage(mimeType, gcsUri));
              }
            }
          }
        }
      }
    }

    // Only the `images` list is parsed. A top-level single `image` field is not
    // handled here; this may need adjustment based on the actual API response
    // for single vs. multiple images.

    return new ImagenGenerationResponse<T>(images.AsReadOnly(), filteredReason);
  }
}
}
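
Review note: `System.Convert.FromBase64String` throws `FormatException` when the payload is not valid base64, so one malformed image entry would abort parsing of the whole response. The sketch below shows one way to guard the decode so a bad entry can be skipped instead; `Base64Guard.TryDecode` is a hypothetical helper, not existing SDK code.

```csharp
using System;

public static class Base64Guard
{
  // Hypothetical helper: decodes base64 without letting FormatException escape.
  // Returns false (and sets data to null) for null, empty, or malformed input.
  public static bool TryDecode(string base64, out byte[] data)
  {
    data = null;
    if (string.IsNullOrEmpty(base64))
    {
      return false;
    }
    try
    {
      data = Convert.FromBase64String(base64);
      return true;
    }
    catch (FormatException)
    {
      return false;
    }
  }
}
```

Inside `FromJson`, the inline branch could then add an image only when `Base64Guard.TryDecode(base64Data, out byte[] data)` returns true.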
34 changes: 34 additions & 0 deletions firebaseai/src/ImagenImageFormat.cs
@@ -0,0 +1,34 @@
namespace Firebase.AI {
public readonly struct ImagenImageFormat {
  public enum FormatType { Png, Jpeg }

  public FormatType Type { get; }
  public int? CompressionQuality { get; } // Always null for PNG

  private ImagenImageFormat(FormatType type, int? compressionQuality = null) {
    Type = type;
    CompressionQuality = compressionQuality;
  }

  public static ImagenImageFormat Png() {
    return new ImagenImageFormat(FormatType.Png);
  }

  public static ImagenImageFormat Jpeg(int? compressionQuality = null) {
    if (compressionQuality.HasValue && (compressionQuality < 0 || compressionQuality > 100)) {
      throw new System.ArgumentOutOfRangeException(nameof(compressionQuality), "Compression quality must be between 0 and 100.");
    }
    return new ImagenImageFormat(FormatType.Jpeg, compressionQuality);
  }

  // Helper method to convert to JSON dictionary for requests
  internal System.Collections.Generic.Dictionary<string, object> ToJson() {
    var jsonDict = new System.Collections.Generic.Dictionary<string, object>();
    jsonDict["type"] = Type.ToString().ToLowerInvariant();
    if (Type == FormatType.Jpeg && CompressionQuality.HasValue) {
      jsonDict["compressionQuality"] = CompressionQuality.Value;
    }
    return jsonDict;
  }
}
}
27 changes: 27 additions & 0 deletions firebaseai/src/ImagenInlineImage.cs
@@ -0,0 +1,27 @@
using UnityEngine;

namespace Firebase.AI {
public readonly struct ImagenInlineImage : IImagenImage {
  public string MimeType { get; }
  public byte[] Data { get; }

  public ImagenInlineImage(string mimeType, byte[] data) {
    MimeType = mimeType;
    Data = data;
  }

  // Decodes Data into a Texture2D. Returns null if Data is empty or cannot be decoded.
  public UnityEngine.Texture2D AsTexture2D() {
    if (Data == null || Data.Length == 0) {
      return null;
    }
    // The initial 2x2 size is a placeholder; ImageConversion.LoadImage resizes
    // the texture to match the decoded image.
    Texture2D tex = new Texture2D(2, 2);
    if (ImageConversion.LoadImage(tex, Data)) {
      return tex;
    }
    // Decoding failed: release the placeholder texture so it does not leak.
    UnityEngine.Object.Destroy(tex);
    return null;
  }
}
}
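
Review note: because `ImagenInlineImage` exposes both `Data` and `MimeType`, callers can persist the raw bytes without decoding them into a texture first. The sketch below picks a file extension for the two formats this PR can request (PNG and JPEG); `ImagenFileNames.ExtensionFor` is a hypothetical helper, and falling back to `.bin` for anything else is an arbitrary choice for illustration.

```csharp
using System;

public static class ImagenFileNames
{
  // Hypothetical helper: returns a file extension for the given image MIME type.
  // Unknown or missing MIME types fall back to ".bin".
  public static string ExtensionFor(string mimeType)
  {
    switch ((mimeType ?? string.Empty).ToLowerInvariant())
    {
      case "image/png": return ".png";
      case "image/jpeg": return ".jpg";
      default: return ".bin";
    }
  }
}
```

In a Unity project, a caller could combine this with `System.IO.File.WriteAllBytes` and `Application.persistentDataPath` to save `image.Data` under a name ending in `ImagenFileNames.ExtensionFor(image.MimeType)`.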