feat(component,data,artifact): update instillartifact component to latest protobuf (#1139)
### Because
- The artifact-backend protobuf definitions now use "knowledge-base"
terminology instead of "catalog" and "chunk-type" instead of
"content-type", and the API structures changed (CreateFileRequest now
takes a File object instead of individual fields)
- API endpoints were renamed (GetObjectURL → GetObjectURLAdmin,
GetObject → GetObjectAdmin)
- NextPageToken field changed from string to *string (optional pointer)
- Various enum values and field names were updated in the latest
protobuf generation
### This commit
- Updated all instillartifact component tasks to use new field names:
KnowledgeBaseId, Id, ChunkType, File object structure
- Changed API calls in external.go from GetObjectURL/GetObject to
GetObjectURLAdmin/GetObjectAdmin
- Fixed pointer type handling for NextPageToken in pagination logic
- Updated enum values (File_FILE_TYPE_PDF → File_TYPE_PDF,
File_PROCESS_STATUS_* → FileProcessStatus_FILE_PROCESS_STATUS_*)
- Changed from FileIds array to Filter string in ListFilesRequest
- Regenerated component mocks to match updated protobuf definitions
- Updated all test expectations to match new enum string representations
(TYPE_PDF, UNIT_PAGE)
- Fixed task definitions to use TASK_SEARCH instead of TASK_RETRIEVE,
TASK_CREATE_FILE instead of TASK_UPLOAD_FILE, etc.
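
The NextPageToken change listed above (string to *string) means pagination loops need a nil check before dereferencing the token. The following is a minimal sketch of that pattern, not the component's actual code: `listFilesResponse`, `listFn`, `collectAll`, and `fakeList` are hypothetical stand-ins for the generated protobuf types and the artifact client, and only the optional `NextPageToken` field mirrors what this commit describes.

```go
package main

import "fmt"

// listFilesResponse is a hypothetical stand-in for the generated protobuf
// response. Only the optional NextPageToken field mirrors the commit.
type listFilesResponse struct {
	FileIDs       []string
	NextPageToken *string // nil (or empty) means there are no further pages
}

// listFn is a hypothetical page fetcher keyed by page token.
type listFn func(pageToken string) (*listFilesResponse, error)

// collectAll follows NextPageToken until it is nil or empty.
func collectAll(list listFn) ([]string, error) {
	var all []string
	token := ""
	for {
		resp, err := list(token)
		if err != nil {
			return nil, err
		}
		all = append(all, resp.FileIDs...)
		// Guard against nil before dereferencing the optional token.
		if resp.NextPageToken == nil || *resp.NextPageToken == "" {
			return all, nil
		}
		token = *resp.NextPageToken
	}
}

// fakeList serves two pages from memory so the sketch runs without a backend.
func fakeList(pageToken string) (*listFilesResponse, error) {
	switch pageToken {
	case "":
		next := "page-2"
		return &listFilesResponse{FileIDs: []string{"a", "b"}, NextPageToken: &next}, nil
	case "page-2":
		return &listFilesResponse{FileIDs: []string{"c"}}, nil
	default:
		return nil, fmt.Errorf("unknown page token %q", pageToken)
	}
}

func main() {
	ids, err := collectAll(fakeList)
	if err != nil {
		panic(err)
	}
	fmt.Println(ids)
}
```

Treating both a nil and an empty token as "last page" is a defensive assumption here; whether the backend ever returns an empty non-nil token is not stated in this commit.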
@@ -7,14 +7,13 @@ description: "Learn about how to set up a Instill Artifact component https://git

 The Instill Artifact component is a data component that allows users to access files and perform RAG-based search and retrieval through catalogs in the Instill Core platform.
 It can carry out the following tasks:
-- [Upload File](#upload-file)
-- [Upload Files](#upload-files)
+- [Create File](#create-file)
+- [Create Files](#create-files)
 - [Get Files Metadata](#get-files-metadata)
 - [Get Chunks Metadata](#get-chunks-metadata)
 - [Get File in Markdown](#get-file-in-markdown)
 - [Match File Status](#match-file-status)
-- [Retrieve](#retrieve)
-- [Ask](#ask)
+- [Search](#search)

 To use Artifact Component, you will need to set up the OpenAI API key for self-hosted deployment of Instill Core.
 You can do this by setting the `OPENAI_API_KEY` environment variable.
@@ -32,27 +31,27 @@ The component definition and tasks are defined in the [definition.yaml](https://

 ## Supported Tasks

-### Upload File
+### Create File

 Upload and process the files into chunks into Catalog.
-| Task ID (required) | `task` | string | `TASK_RETRIEVE` |
-| Catalog ID (required) | `catalog-id` | string | Catalog ID that you input to search files in the Catalog. |
+| Task ID (required) | `task` | string | `TASK_SEARCH` |
+| Knowledge Base ID (required) | `knowledge-base-id` | string | Knowledge Base ID that you input to search files in the Knowledge Base. |
 | Namespace (required) | `namespace` | string | Fill in your namespace, you can get namespace through the tab of switching namespace. |
 | Text Prompt (required) | `text-prompt` | string | The prompt string to search the chunks. |
 | Top K | `top-k` | integer | The number of top chunks to return. The range is from 1~20, and default is 5. |
 | File UIDs | `file-uids` | array[string] | Optional list of file UIDs to filter the results by. The elements of the list must be UUID-formatted strings. |
 | File Media Type | `file-media-type` | string | The media type to filter, empty for all. <br/><details><summary><strong>Enum values</strong></summary><ul><li>`document`</li><li>`image`</li><li>`audio`</li><li>`video`</li></ul></details> |
-| Content Type | `content-type` | string | The content type to filter, empty for all. <br/><details><summary><strong>Enum values</strong></summary><ul><li>`chunk`</li><li>`summary`</li><li>`augmented`</li></ul></details> |
-| [File Position](#retrieve-file-position) | `end` | object | Position within a file as coordinates in a specific unit. The number of dimensions of the coordinates depends on the unit type. |
-| [File Position](#retrieve-file-position) | `start` | object | Position within a file as coordinates in a specific unit. The number of dimensions of the coordinates depends on the unit type. |
-| Coordinates | `coordinates` | array | Position value. |
-| Unit | `unit` | string | Unit of measurement for a position within a file. <br/><details><summary><strong>Enum values</strong></summary><ul><li>`UNIT_CHARACTER`</li><li>`UNIT_PAGE`</li><li>`UNIT_TIME_MS`</li><li>`UNIT_PIXEL`</li></ul></details> |
-| Coordinates | `coordinates` | array | Position value. |
-| Unit | `unit` | string | Unit of measurement for a position within a file. <br/><details><summary><strong>Enum values</strong></summary><ul><li>`UNIT_CHARACTER`</li><li>`UNIT_PAGE`</li><li>`UNIT_TIME_MS`</li><li>`UNIT_PIXEL`</li></ul></details> |
-
-</div>
-</details>
-
-
-### Ask
-
-Reply the questions based on the files in the catalog.
-| Catalog ID (required) | `catalog-id` | string | Catalog ID that you input to search files in the Catalog. |
-| Namespace (required) | `namespace` | string | Fill in your namespace, you can get namespace through the tab of switching namespace. |
-| Question (required) | `question` | string | The question to reply. |
-| Top K | `top-k` | integer | The number of top answers to return. The range is from 1~20, and default is 5. |
-| File UIDs | `file-uids` | array[string] | Optional list of file UIDs to filter the results by. The elements of the list must be UUID-formatted strings. |
+| Chunk Type | `chunk-type` | string | The chunk type to filter, empty for all. <br/><details><summary><strong>Enum values</strong></summary><ul><li>`content`</li><li>`summary`</li><li>`augmented`</li></ul></details> |

 </div>

@@ -440,42 +359,41 @@ Reply the questions based on the files in the catalog.

 | Output | Field ID | Type | Description |
 | :--- | :--- | :--- | :--- |
-| Answer | `answer` | string | Answers data from smart search. |
-| [Chunks](#ask-chunks) (optional) | `chunks` | array[object] | Chunks data to answer question. |
+| [Chunks](#search-chunks) | `chunks` | array[object] | Chunks data from smart search. |
-| [File Position](#ask-file-position) | `end` | object | Position within a file as coordinates in a specific unit. The number of dimensions of the coordinates depends on the unit type. |
-| [File Position](#ask-file-position) | `start` | object | Position within a file as coordinates in a specific unit. The number of dimensions of the coordinates depends on the unit type. |
+| [File Position](#search-file-position) | `end` | object | Position within a file as coordinates in a specific unit. The number of dimensions of the coordinates depends on the unit type. |
+| [File Position](#search-file-position) | `start` | object | Position within a file as coordinates in a specific unit. The number of dimensions of the coordinates depends on the unit type. |