
Commit d7ca6cf

feat(component,ai,gemini): add text embeddings task support (#1129)

Because

- The Gemini component needed support for text embeddings functionality to enable use cases like semantic search, classification, and clustering
- Users required the ability to generate embeddings using the `gemini-embedding-001` model with various task types for optimal performance

This commit

- Adds `TASK_TEXT_EMBEDDINGS` task with comprehensive input/output schema supporting multiple task types
- Implements `TaskTextEmbeddingsInput` and `TaskTextEmbeddingsOutput` structs with proper validation and embedding generation
- Adds comprehensive test coverage for the text embeddings functionality

1 parent f767b8a commit d7ca6cf

9 files changed

+654
-78
lines changed


pkg/component/ai/gemini/v0/.compogen/bottom.mdx

Lines changed: 54 additions & 0 deletions
@@ -1,5 +1,7 @@
 ## Example Recipes
 
+### Chat with multimodality inputs
+
 ```yaml
 version: v1beta
 component:
@@ -58,3 +60,55 @@ output:
     title: response-id
     value: ${gemini.output.response-id}
 ```
+
+### Cache a document
+
+```yaml
+version: v1beta
+component:
+  gemini:
+    type: gemini
+    task: TASK_CACHE
+    input:
+      model: gemini-2.5-flash
+      operation: create
+      ttl: 60s
+      documents:
+        - ${variable.document}
+      system-message: You are a helpful assistant.
+variable:
+  document:
+    title: Document
+    description: Document to convert to Markdown
+    type: document
+  stream:
+    title: Enable Stream
+    description: whether to enable streaming
+    type: boolean
+output:
+  cached-content:
+    title: cached content
+    value: ${gemini.output.cached-content}
+```
+
+### Embed a text input
+
+```yaml
+version: v1beta
+component:
+  gemini:
+    type: gemini
+    task: TASK_TEXT_EMBEDDINGS
+    input:
+      model: gemini-embedding-001
+      text: ${variable.text}
+variable:
+  text:
+    title: Text
+    description: Text to embed
+    type: string
+output:
+  embedding:
+    title: Embedding result
+    value: ${gemini.output.embedding}
+```

pkg/component/ai/gemini/v0/README.mdx

Lines changed: 147 additions & 67 deletions
Large diffs are not rendered by default.

pkg/component/ai/gemini/v0/config/definition.yaml

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 availableTasks:
   - TASK_CHAT
   - TASK_CACHE
+  - TASK_TEXT_EMBEDDINGS
 custom: false
 icon: assets/gemini.svg
 iconUrl: ""

pkg/component/ai/gemini/v0/config/tasks.yaml

Lines changed: 93 additions & 0 deletions
@@ -1968,3 +1968,96 @@ TASK_CACHE:
         description: >-
           [**LIST**] Token for retrieving the next page of results for list operations.
         type: string
+TASK_TEXT_EMBEDDINGS:
+  shortDescription: Turn text into numbers, unlocking use cases like search.
+  input:
+    uiOrder: 0
+    title: Input
+    description: Input schema of the text embeddings task.
+    type: object
+    required:
+      - text
+      - model
+      - task-type
+    properties:
+      model:
+        uiOrder: 0
+        title: Model
+        shortDescription: ID of the model to use
+        description: >
+          The Gemini embedding model, gemini-embedding-001, is trained using the Matryoshka Representation Learning (MRL) technique which teaches a model
+          to learn high-dimensional embeddings that have initial segments (or prefixes) which are also useful, simpler versions of the same data.
+        type: string
+        enum:
+          - gemini-embedding-001
+        instillCredentialMap:
+          values:
+            - gemini-embedding-001
+          targets:
+            - setup.api-key
+      text:
+        uiOrder: 1
+        title: Text
+        description: The text to generate embeddings for.
+        type: string
+      task-type:
+        uiOrder: 2
+        title: Task Type
+        description: >-
+          The type of task to perform for optimal embedding generation.
+          The value is one of the following:
+          `SEMANTIC_SIMILARITY`: (Default) Embeddings optimized to assess text similarity. Examples: Recommendation systems, duplicate detection.
+          `CLASSIFICATION`: Embeddings optimized to classify texts according to preset labels. Examples: Sentiment analysis, spam detection.
+          `CLUSTERING`: Embeddings optimized to group similar texts together. Examples: Document organization, market research, anomaly detection.
+          `RETRIEVAL_DOCUMENT`: Embeddings optimized for document search. Examples: Indexing articles, books, or web pages for search.
+          `RETRIEVAL_QUERY`: Embeddings optimized for general search queries. Use `RETRIEVAL_QUERY` for queries; `RETRIEVAL_DOCUMENT` for documents to
+          be retrieved. Examples: General search queries for custom search applications.
+          `CODE_RETRIEVAL_QUERY`: Embeddings optimized for retrieval of code blocks based on natural language queries. Use `CODE_RETRIEVAL_QUERY` for
+          queries; `RETRIEVAL_DOCUMENT` for code blocks to be retrieved. Examples: Natural language queries about code for code suggestions and search.
+          `QUESTION_ANSWERING`: Embeddings for questions in a question-answering system, optimized for finding documents that answer the question. Use
+          `QUESTION_ANSWERING` for questions; `RETRIEVAL_DOCUMENT` for documents to be retrieved. Examples: Questions in Q&A systems, chatbots, knowledge
+          bases.
+          `FACT_VERIFICATION`: Embeddings for statements that need to be verified, optimized for retrieving documents that contain evidence supporting
+          or refuting the statement. Use `FACT_VERIFICATION` for the target text; `RETRIEVAL_DOCUMENT` for documents to be retrieved. Examples: Statements
+          that need verification for automated fact-checking systems.
+        type: string
+        enum:
+          - SEMANTIC_SIMILARITY
+          - CLASSIFICATION
+          - CLUSTERING
+          - RETRIEVAL_DOCUMENT
+          - RETRIEVAL_QUERY
+          - CODE_RETRIEVAL_QUERY
+          - QUESTION_ANSWERING
+          - FACT_VERIFICATION
+        default: SEMANTIC_SIMILARITY
+      title:
+        uiOrder: 3
+        title: Title
+        description: >-
+          Title for the text. Only applicable when TaskType is `RETRIEVAL_DOCUMENT`.
+        type: string
+      output-dimensionality:
+        uiOrder: 4
+        title: Output Dimensionality
+        description: >-
+          Use this parameter to control the size of the output embedding vector. Selecting a smaller output dimensionality can save storage space and increase
+          computational efficiency for downstream applications, while sacrificing little in terms of quality. By default, it outputs a 3072-dimensional
+          embedding, but you can truncate it to a smaller size without losing quality to save storage space. We recommend using 768, 1536, or 3072 output
+          dimensions.
+        type: integer
+        minimum: 1
+        maximum: 3072
+  output:
+    uiOrder: 0
+    title: Output
+    description: Output schema of the text embeddings task.
+    type: object
+    required:
+      - embedding
+    properties:
+      embedding:
+        uiOrder: 0
+        title: Embedding
+        description: The embedding of the text.
+        $ref: schema.yaml#/$defs/instill-types/embedding

pkg/component/ai/gemini/v0/io.go

Lines changed: 14 additions & 0 deletions
@@ -127,3 +127,17 @@ type TaskCacheOutput struct {
 	CachedContents []*genai.CachedContent `instill:"cached-contents"`
 	NextPageToken *string `instill:"next-page-token"`
 }
+
+// TaskTextEmbeddingsInput is the input for the TASK_TEXT_EMBEDDINGS task.
+type TaskTextEmbeddingsInput struct {
+	Model                string `instill:"model"`
+	Text                 string `instill:"text"`
+	TaskType             string `instill:"task-type"`
+	Title                string `instill:"title"`
+	OutputDimensionality *int32 `instill:"output-dimensionality"`
+}
+
+// TaskTextEmbeddingsOutput is the output for the TASK_TEXT_EMBEDDINGS task.
+type TaskTextEmbeddingsOutput struct {
+	Embedding []float64 `instill:"embedding"`
+}

pkg/component/ai/gemini/v0/main.go

Lines changed: 10 additions & 3 deletions
@@ -13,11 +13,13 @@ import (
 	"google.golang.org/protobuf/types/known/structpb"
 
 	"github.com/instill-ai/pipeline-backend/pkg/component/base"
+	"github.com/instill-ai/pipeline-backend/pkg/component/resources/schemas"
 )
 
 const (
-	ChatTask  = "TASK_CHAT"
-	CacheTask = "TASK_CACHE"
+	ChatTask           = "TASK_CHAT"
+	CacheTask          = "TASK_CACHE"
+	TextEmbeddingsTask = "TASK_TEXT_EMBEDDINGS"
 
 	cfgAPIKey = "api-key"
 )
@@ -45,7 +47,10 @@ type component struct {
 func Init(bc base.Component) *component {
 	once.Do(func() {
 		comp = &component{Component: bc}
-		err := comp.LoadDefinition(definitionYAML, setupYAML, tasksYAML, nil, nil)
+		additionalYAMLBytes := map[string][]byte{
+			"schema.yaml": schemas.SchemaYAML,
+		}
+		err := comp.LoadDefinition(definitionYAML, setupYAML, tasksYAML, nil, additionalYAMLBytes)
 		if err != nil {
 			panic(err)
 		}
@@ -82,6 +87,8 @@ func (c *component) CreateExecution(x base.ComponentExecution) (base.IExecution,
 		e.execute = e.chat
 	case CacheTask:
 		e.execute = e.cache
+	case TextEmbeddingsTask:
+		e.execute = e.textEmbeddings
 	default:
 		return nil, fmt.Errorf("unsupported task")
 	}
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
+package gemini
+
+import (
+	"context"
+
+	"google.golang.org/genai"
+
+	"github.com/instill-ai/pipeline-backend/pkg/component/base"
+)
+
+func (e *execution) textEmbeddings(ctx context.Context, job *base.Job) error {
+	// Read input
+	in := TaskTextEmbeddingsInput{}
+	if err := job.Input.ReadData(ctx, &in); err != nil {
+		job.Error.Error(ctx, err)
+		return nil
+	}
+
+	// Create Gemini client
+	client, err := e.createGeminiClient(ctx)
+	if err != nil {
+		job.Error.Error(ctx, err)
+		return nil
+	}
+
+	// Create content from input text
+	contents := []*genai.Content{
+		genai.NewContentFromText(in.Text, genai.RoleUser),
+	}
+
+	// Use the task type from input, defaulting to SEMANTIC_SIMILARITY if empty
+	taskType := in.TaskType
+	if taskType == "" {
+		taskType = "SEMANTIC_SIMILARITY"
+	}
+
+	// Generate embeddings using the Gemini API
+	result, err := client.Models.EmbedContent(ctx, in.Model, contents, &genai.EmbedContentConfig{
+		TaskType:             taskType,
+		OutputDimensionality: in.OutputDimensionality,
+		Title:                in.Title,
+	})
+	if err != nil {
+		job.Error.Error(ctx, err)
+		return nil
+	}
+
+	// Extract embeddings from the result
+	if len(result.Embeddings) == 0 {
+		job.Error.Error(ctx, err)
+		return nil
+	}
+
+	embedding := result.Embeddings[0]
+	if len(embedding.Values) == 0 {
+		job.Error.Error(ctx, err)
+		return nil
+	}
+
+	// Convert from []float32 to []float64 for consistency with other components
+	embeddingFloat64 := make([]float64, len(embedding.Values))
+	for i, v := range embedding.Values {
+		embeddingFloat64[i] = float64(v)
+	}
+
+	// Prepare output
+	output := TaskTextEmbeddingsOutput{
+		Embedding: embeddingFloat64,
+	}
+
+	// Write output
+	if err := job.Output.WriteData(ctx, output); err != nil {
+		job.Error.Error(ctx, err)
+		return nil
+	}
+
+	return nil
+}
