Migrate assistant feature integrations from text processing to task processing #114
Description
There have been a few changes in the assistant frontend implementation and in the server AI-related APIs. Here are some pointers if you want to adjust your app or client.
Everything described here appeared in Nextcloud 30.
Text processing, transcription, image generation, translation and anything else are now handled by a single API: the task processing API.
The concepts of task type and providers are still there. The task types now include the "shape" of their input and output. The input and output shapes define a list of typed fields.
The assistant now only submits "task processing" tasks.
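To make the notion of shapes more concrete, here is a minimal sketch of how a client could check a set of input values against a task type's input shape. The object layout (an inputShape map from field name to a descriptor with a type) and the helper name findMissingFields are assumptions made for illustration; check the actual response of the task types endpoint before relying on them.

```javascript
// Hypothetical task type object. The field layout is an assumption,
// not copied from a real server response.
const exampleTaskType = {
  id: 'core:text2text:summary',
  inputShape: {
    input: { name: 'Input text', type: 'Text' },
  },
  outputShape: {
    output: { name: 'Summary', type: 'Text' },
  },
}

// Return the names of the input shape fields that have no value in `inputs`.
function findMissingFields(taskType, inputs) {
  return Object.keys(taskType.inputShape).filter(
    (field) => inputs[field] === undefined || inputs[field] === null
  )
}
```

A client could call findMissingFields(exampleTaskType, { input: 'some text' }) before submitting a task and refuse to submit if the returned list is not empty.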
Use/open the Assistant in the frontend of your app
- The openAssistantForm frontend function is now exposed as window.OCA.Assistant.openAssistantForm, but the OCA.TPAssistant namespace is still there for backward compatibility.
- The identifier parameter of openAssistantForm is deprecated (but is still supported) and replaced by customId.
- The input parameter of openAssistantForm is deprecated (but still works for core:text2text* task types). It can be replaced by inputs, an object that contains the values for each field. If you only support core:text2text* task types, setting the inputs.input field value is enough (more on that below, in the "Task types support in clients" section).
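The parameter changes listed above can be sketched as a small migration helper. The helper name migrateAssistantParams is hypothetical, and the appId and taskType parameters used in the call are assumed names that are not described in this issue; only identifier, customId, input and inputs come from the list above.

```javascript
// Convert deprecated openAssistantForm parameters to their replacements:
// `identifier` becomes `customId`, a plain `input` string becomes `inputs.input`.
// Values that are already set in the new form are left untouched.
function migrateAssistantParams(params) {
  const { identifier, input, ...rest } = params
  const migrated = { ...rest }
  if (migrated.customId === undefined && identifier !== undefined) {
    migrated.customId = identifier
  }
  if (migrated.inputs === undefined && input !== undefined) {
    migrated.inputs = { input }
  }
  return migrated
}

// Only runs in a browser page where the Assistant app is loaded.
// `appId` and `taskType` are assumed parameter names.
if (typeof window !== 'undefined' && window.OCA?.Assistant?.openAssistantForm) {
  window.OCA.Assistant.openAssistantForm(migrateAssistantParams({
    appId: 'my_app',
    identifier: 'my-custom-id',
    taskType: 'core:text2text',
    input: 'count to 3',
  }))
}
```

Wrapping legacy calls this way keeps the old call sites working while the deprecated parameters are phased out.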
Use the Task processing OCP API in the backend of your app
The scheduling logic is the same as before. With the manager, it is possible to:
- run a task synchronously
- schedule a task and listen to the task processing events
https://docs.nextcloud.com/server/latest/developer_manual/digging_deeper/task_processing.html
Task processing OCS API
- You can get the list of available task types with /ocs/v2.php/taskprocessing/tasktypes
- You can get a user's task list by task type with /ocs/v2.php/taskprocessing/tasks?taskType=TASK_TYPE_ID&customId=CUSTOM_ID. The GET parameters are optional.
- You can get a user's task list by scheduling app with /ocs/v2.php/taskprocessing/tasks/app/APP_ID?customId=CUSTOM_ID. The GET parameter is optional.
The list of available endpoints can be found in https://github.com/nextcloud/server/blob/master/core/Controller/TaskProcessingApiController.php or can be browsed with the ocs_api_viewer Nextcloud app (core -> task_processing_api).
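As an illustration of the optional GET parameters, here is a minimal sketch that builds the two task list URLs. The function names are hypothetical, and the sketch assumes the Nextcloud instance is served from the root of its domain (an absolute path passed to URL replaces any sub-path of the base URL).

```javascript
// Build the URL of a user's task list, optionally filtered by task type
// and/or custom ID. Filters that are not provided are left out of the query.
function buildTaskListUrl(baseUrl, { taskType, customId } = {}) {
  const url = new URL('/ocs/v2.php/taskprocessing/tasks', baseUrl)
  if (taskType !== undefined) url.searchParams.set('taskType', taskType)
  if (customId !== undefined) url.searchParams.set('customId', customId)
  return url.toString()
}

// Build the URL of a user's task list for a given scheduling app,
// optionally filtered by custom ID.
function buildAppTaskListUrl(baseUrl, appId, customId) {
  const path = '/ocs/v2.php/taskprocessing/tasks/app/' + encodeURIComponent(appId)
  const url = new URL(path, baseUrl)
  if (customId !== undefined) url.searchParams.set('customId', customId)
  return url.toString()
}
```

For example, buildAppTaskListUrl('https://nc.org', 'mail') yields the unfiltered list of tasks scheduled by the mail app.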
Task representation
The task objects returned by the OCS API are a bit different.
The input and output attributes are now objects which contain the values for each field.
The status is now a string: https://github.com/nextcloud/server/blob/master/lib/public/TaskProcessing/Task.php#L366-L370
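A client that previously read a plain text output can adapt along these lines. This is only a sketch: the status value 'STATUS_SUCCESSFUL' is assumed from the server's Task class and should be verified against the link above, and getTextOutput is a hypothetical helper name.

```javascript
// Assumed status string for a successfully finished task.
const STATUS_SUCCESSFUL = 'STATUS_SUCCESSFUL'

// Return the text output of a finished core:text2text* task,
// or null if the task is not successful or has no "output" field.
function getTextOutput(task) {
  if (task.status !== STATUS_SUCCESSFUL || !task.output) {
    return null
  }
  return task.output.output ?? null
}
```

The same pattern applies to the input side: the original text is found in task.input.input rather than in task.input.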
Task types support in clients
As different task types expect different input fields and produce different output fields, the previously existing text processing support implementations cannot directly support all task processing task types.
For an easy migration, one could support a static list of task processing task types, starting with the ones that are equivalent to the text processing ones:
- core:text2text (previously called FreePrompt)
- core:text2text:headline (previously called Headline)
- core:text2text:summary (previously called Summary)
- core:text2text:topics (previously called Topics)
And the new ones:
- core:text2text:formalization
- core:text2text:reformulation
- core:text2text:simplification
All those task types have the same input and output shapes: a single text field, named "input" in the input shape and "output" in the output shape.
More details: https://docs.nextcloud.com/server/latest/developer_manual/digging_deeper/task_processing.html#tasks-types
Also, here is the list of task types defined in the server: https://github.com/nextcloud/server/tree/master/lib/public/TaskProcessing/TaskTypes . We can discuss how to support more task types later (for example, dynamically render the input/output form like it is done in the Assistant NC app).
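The static list approach can be sketched as follows. The constant and function names are hypothetical; the task type IDs are the ones listed above, and availableTypeIds stands for the IDs a client obtained from the task types endpoint.

```javascript
// Task types that only need one "input" text field and return one "output" text field.
const SUPPORTED_TASK_TYPES = [
  'core:text2text',
  'core:text2text:headline',
  'core:text2text:summary',
  'core:text2text:topics',
  'core:text2text:formalization',
  'core:text2text:reformulation',
  'core:text2text:simplification',
]

// Keep only the task types that are both available on the server
// and supported by this client.
function selectSupportedTaskTypes(availableTypeIds) {
  return availableTypeIds.filter((id) => SUPPORTED_TASK_TYPES.includes(id))
}
```

With this filter, a task type such as core:audio2text is hidden in the client until its input form is properly supported.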
Summarize
curl https://nc.org/ocs/v2.php/taskprocessing/schedule -X POST \
-H "ocs-apirequest: true" \
-H "Content-Type: application/json" \
-d '{"input":{"input":"the text to summarize"},"type":"core:text2text:summary","appId":"mail"}'
Or from the backend side:
use OCP\TaskProcessing\Task;
// Schedule the task and keep its ID to retrieve the result later
$task = new Task(\OCP\TaskProcessing\TaskTypes\TextToTextSummary::ID, ['input' => 'the text to summarize'], 'mail', $this->userId);
$this->taskProcessingManager->scheduleTask($task);
$taskId = $task->getId();
or
// Run the task synchronously and read the output from the returned task
$task = new Task(\OCP\TaskProcessing\TaskTypes\TextToTextSummary::ID, ['input' => 'the text to summarize'], 'mail', $this->userId);
$resultTask = $this->taskProcessingManager->runTask($task);
$summary = $resultTask->getOutput()['output'];
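For a JavaScript client, the same scheduling request as the curl example could be sketched like this. The helper names are hypothetical, authentication is left out as in the curl example, and the sketch assumes the instance is served from the root of its domain. The fetchImpl parameter only exists so the function can be exercised without a network.

```javascript
// Build the JSON body expected by the schedule endpoint.
// `input` is an object with one value per input field of the task type.
function buildSchedulePayload(type, input, appId, customId) {
  const payload = { input, type, appId }
  if (customId !== undefined) payload.customId = customId
  return payload
}

// POST the payload to the schedule endpoint and return the fetch promise.
function scheduleTask(baseUrl, payload, fetchImpl = fetch) {
  const url = new URL('/ocs/v2.php/taskprocessing/schedule', baseUrl).toString()
  return fetchImpl(url, {
    method: 'POST',
    headers: { 'OCS-APIRequest': 'true', 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  })
}
```

For the summary case, buildSchedulePayload('core:text2text:summary', { input: 'the text to summarize' }, 'mail') produces the same body as the curl request.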
Translate
Translations can now be done via the task processing API. There is a core:text2text:translate task type.
If this task type is in the list of available ones, it means at least one provider for this task type is installed.
You can get the list of supported origin languages with taskTypeObject.inputShapeEnumValues.origin_language.
Same for the target languages: taskTypeObject.inputShapeEnumValues.target_language.
Both are a list of objects like:
{ "name": "English (US)", "value": "en" }
Example request to submit a translation task:
curl https://nc.org/ocs/v2.php/taskprocessing/schedule -X POST \
-H "ocs-apirequest: true" \
-H "Content-Type: application/json" \
-d '{"input":{"origin_language":"en","input":"hello","target_language":"de"},"type":"core:text2text:translate","appId":"text","customId":"document-123"}'
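Before submitting a translation task, a client may want to check that the chosen languages are offered by the provider. Here is a minimal sketch based on the inputShapeEnumValues lists described above. The sample task type object is hypothetical (including the German entry) and isLanguagePairSupported is an illustrative helper name.

```javascript
// Hypothetical excerpt of a core:text2text:translate task type object.
const translateTaskType = {
  inputShapeEnumValues: {
    origin_language: [
      { name: 'English (US)', value: 'en' },
      { name: 'German', value: 'de' },
    ],
    target_language: [
      { name: 'English (US)', value: 'en' },
      { name: 'German', value: 'de' },
    ],
  },
}

// Check that both language codes appear in the respective enum value lists.
function isLanguagePairSupported(taskTypeObject, origin, target) {
  const values = taskTypeObject.inputShapeEnumValues ?? {}
  const has = (list, code) => (list ?? []).some((entry) => entry.value === code)
  return has(values.origin_language, origin) && has(values.target_language, target)
}
```

The name property of each entry is the human-readable label, so the same lists can be used to fill the language selectors of a translation form.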
Transcribe
The input must be the file ID of the audio input file.
Example request to transcribe an audio file:
curl https://nc.org/ocs/v2.php/taskprocessing/schedule -X POST \
-H "ocs-apirequest: true" \
-H "Content-Type: application/json" \
-d '{"input":{"input":1450},"type":"core:audio2text","appId":"spreed"}'
Or from the backend side:
use OCP\TaskProcessing\Task;
// Schedule the task and keep its ID to retrieve the result later
$task = new Task(\OCP\TaskProcessing\TaskTypes\AudioToText::ID, ['input' => 1450], 'spreed', $this->userId);
$this->taskProcessingManager->scheduleTask($task);
$taskId = $task->getId();
or
// Run the task synchronously and read the output from the returned task
$task = new Task(\OCP\TaskProcessing\TaskTypes\AudioToText::ID, ['input' => 1450], 'spreed', $this->userId);
$resultTask = $this->taskProcessingManager->runTask($task);
$transcription = $resultTask->getOutput()['output'];