Replies: 1 comment 2 replies
-
|
Yes — those variable fields tell the LLM node where to pull images from so they can be sent to the multimodal model. When you enable the vision switch, you configure a variable selector that points to the source of image files in your workflow. The most common options are:
Once the variable is resolved at runtime, the system converts those files into image prompt messages and appends them to the prompt that is sent to the model [3]. You can also choose a detail level (high/low) which maps to the API's image quality parameter [4]. So in short: the variable value you input is simply a pointer to the image source within your workflow, and the images obtained from that source are indeed fed to the model along with the text prompt. To reply, just mention @dosu. Share context across your team and agents. Try Dosu. |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Self Checks
Content
When I select a multimodal model, after turning on the visual switch, what is the purpose of the variable values that need to be input? Are the images obtained from here then fed to the model?

Beta Was this translation helpful? Give feedback.
All reactions