
3. Model configuration


Coze Studio is an AI app development platform built on large language models (LLMs). Before running the open-source version of Coze Studio for the first time, you need to clone the project to your local machine and configure the required models. Once the project is running, you can add new model services or remove unneeded ones at any time.

Model list

The model services supported by Coze Studio are as follows:

  • Volcengine Ark | BytePlus ModelArk
  • OpenAI
  • DeepSeek
  • Claude
  • Ollama
  • Qwen
  • Gemini

Model configuration instructions

In the open-source version of Coze Studio, model configurations live in the backend/conf/model directory, which contains multiple YAML files, each corresponding to one accessible model. To help developers get set up quickly, Coze Studio ships template files in the backend/conf/model/template directory covering common model types such as Volcengine Ark and OpenAI. Find the template for your vendor, copy it to backend/conf/model, and set the parameters following the comments in the template.
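For reference, the layout looks like this (only some of the template files shown):

```
backend/conf/model/                # live model configs: one YAML file per model
└── template/                      # templates to copy and adapt
    ├── model_template_basic.yaml
    ├── model_template_ark.yaml
    ├── model_template_openai.yaml
    ├── model_template_deepseek.yaml
    ├── model_template_qwen.yaml
    └── ...
```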

Important notes

Before filling out a model configuration file, make sure you understand the following notes:

  • Ensure that the ID in each model file is unique, and do not change it after the configuration goes live. The ID is how model configurations are referenced throughout the system; changing it can break existing agents.
  • Before deleting a model, ensure that it no longer receives online traffic.
  • Agents and workflows call models by model ID. Do not change the ID of a model that is already live; otherwise, model calls may fail.

Configure models for Coze Studio

Before deploying and launching the open-source version of Coze Studio for the first time, you must configure a model service in the project; otherwise, you will not be able to select a model when creating agents or workflows. If you are deploying and setting up the Ark model service for the first time, simply follow the quick start guide to complete the configuration. To add more models or switch to another model service, follow the steps below.

Step 1: Modify the model configuration file

  1. Copy the template file.
    1. In the backend/conf/model/template directory, find the template YAML file for the model you want to add; for example, the template for OpenAI models is model_template_openai.yaml.
    2. Copy the template file to the backend/conf/model directory.
  2. Modify the model configuration.
    1. Go to the backend/conf/model directory and open the file copied in Step 1.

    2. Modify the fields in the file: id, meta.conn_config.api_key, meta.conn_config.model, and save the file.

      Users in China can use Volcengine Ark; users outside China can use BytePlus ModelArk instead.

      Other parameters can keep their default values, or you can adjust them as needed. For Ark models, refer to the Volcengine Ark model list for configuration details. A minimal sketch of the edited fields follows this list.
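For example, in a config copied from an OpenAI-compatible template, the three fields to edit look roughly like this (the id and the key below are placeholder values, not working credentials):

```yaml
id: 3001                        # hypothetical; must be unique across backend/conf/model
meta:
    conn_config:
        api_key: sk-xxxx        # placeholder; use the key issued by your provider
        model: gpt-4o           # the model name expected by the provider's API
```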

Step 2: Restart the service

After modifying the configuration file, run the following command to restart the service and apply the new configuration.

```bash
docker compose --profile "*" restart coze-server
```

After the service starts successfully, open the agent editing page and select the configured model from the model dropdown list.


Configuration Reference

Coze Studio supports a range of common model services and model protocols. Configure the corresponding template file, protocol, and base_url according to the type of model service. For third-party model services such as Volcengine Ark, refer to the [Third-Party Model Service] section; for official model services such as OpenAI, refer to the [Official Model Service] section. All model service template files are located in backend/conf/model/template; copy them to backend/conf/model and modify them there. Changes take effect after the service is restarted.

In addition, the same base_url values apply to the [Embedding Configuration] section of [Basic Component Configuration].

Third-Party Model Service

| Platform | Base Template File | Protocol | Base_url | Special Instructions |
| --- | --- | --- | --- | --- |
| Volcengine Ark | model_template_ark.yaml | ark | Volcengine Ark: https://ark.cn-beijing.volces.com/api/v3/<br>BytePlus ModelArk (overseas): https://ark.ap-southeast.bytepluses.com/api/v3/ | None |
| Alibaba Bailian | model_template_openai.yaml or model_template_qwen.yaml | openai or qwen | https://dashscope.aliyuncs.com/compatible-mode/v1 | The qwen3 series does not support thinking in non-streaming calls; if you use it, set enable_thinking: false in conn_config (see the sketch below). Coze Studio will adapt this capability in a future version. |
| SiliconFlow | model_template_openai.yaml | openai | https://api.siliconflow.cn/v1 | None |
| Other third-party API relays | model_template_openai.yaml | openai | The address given in the provider's API documentation; the path usually ends in /v1 and must not include the /chat/completions suffix | If the platform only relays or proxies model services and the underlying model is not an OpenAI model, configure the protocol according to the [Official Model Service] section |
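As an illustration of the qwen3 note above, the relevant fragment of a config based on model_template_qwen.yaml might look like this (a sketch; other required fields omitted):

```yaml
meta:
    protocol: qwen
    conn_config:
        base_url: https://dashscope.aliyuncs.com/compatible-mode/v1
        enable_thinking: false   # required for qwen3 in non-streaming calls
```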

Open-Source Framework

| Framework | Base Template File | Protocol | Base_url | Special Instructions |
| --- | --- | --- | --- | --- |
| Ollama | model_template_ollama.yaml | ollama | http://${ip}:11434 | 1. When the container network mode is bridge, localhost inside the coze-server container is not the host's localhost; use the IP of the machine running Ollama, or http://host.docker.internal:11434 (see the sketch below).<br>2. Check the api_key: if Ollama has no API key set, leave this parameter empty.<br>3. Confirm that the firewall on the Ollama host allows port 11434.<br>4. Confirm that Ollama is configured to accept external connections. |
| vllm | model_template_openai.yaml | openai | http://${ip}:8000/v1 (port is specified at startup) | None |
| xinference | model_template_openai.yaml | openai | http://${ip}:9997/v1 (port is specified at startup) | None |
| sglang | model_template_openai.yaml | openai | http://${ip}:35140/v1 (port is specified at startup) | None |
| LMStudio | model_template_openai.yaml | openai | http://${ip}:${port}/v1 | None |
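To illustrate the Ollama networking note, here is a sketch of the connection fragment when coze-server runs as a bridge-network container and Ollama runs on the Docker host:

```yaml
meta:
    protocol: ollama
    conn_config:
        # not "localhost": inside the coze-server container, localhost refers
        # to the container itself, not to the host where Ollama is listening
        base_url: http://host.docker.internal:11434
        api_key: ""              # leave empty if Ollama has no API key configured
```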

Official Model Service

| Model | Base Template File | Protocol | Base_url | Special Instructions |
| --- | --- | --- | --- | --- |
| Doubao | model_template_ark.yaml | ark | https://ark.cn-beijing.volces.com/api/v3/ | None |
| OpenAI | model_template_openai.yaml | openai | https://api.openai.com/v1 | Check the by_azure field: if the model service is provided by Microsoft Azure, set it to true (see the sketch below) |
| Deepseek | model_template_deepseek.yaml | deepseek | https://api.deepseek.com/ | None |
| Qwen | model_template_qwen.yaml | qwen | https://dashscope.aliyuncs.com/compatible-mode/v1 | The qwen3 series does not support thinking in non-streaming calls; if you use it, set enable_thinking: false in conn_config. Coze Studio will adapt this capability in a future version |
| Gemini | model_template_gemini.yaml | gemini | https://generativelanguage.googleapis.com/ | None |
| Claude | model_template_claude.yaml | claude | https://api.anthropic.com/v1/ | None |
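For an OpenAI model served through Microsoft Azure, the by_azure switch applies; a sketch with a placeholder endpoint (your actual Azure resource URL will differ):

```yaml
meta:
    protocol: openai
    conn_config:
        base_url: https://my-resource.openai.azure.com   # placeholder endpoint
        openai:
            by_azure: true            # mark the service as Azure-hosted
            api_version: 2024-10-21   # example API version from the table above
```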

Field Description

Model information

The model meta information file describes the foundational capabilities and connection details of the model.

Below is the complete list of model configuration fields, which includes descriptions of each field:

| Field name | Required | Example | Description |
| --- | --- | --- | --- |
| id | Required | 0 | Model ID |
| name | Required | test_model | Model display name on the platform |
| icon_uri | Optional | test_icon_uri | URI of the model display icon |
| icon_url | Optional | test_icon_url | URL of the model display icon |
| description | Optional | - | Default model description |
| description.zh | Optional | This is the model description information | Chinese model description, shown on the platform |
| description.en | Optional | This is model description | English model description, shown on the platform |
| default_parameters | Optional | - | List of model parameters; for the full set of list elements, refer to the template files |
| default_parameters.name | Required | temperature | Parameter name. Enum: temperature, top_p, top_k, max_tokens, response_format, frequency_penalty, presence_penalty |
| default_parameters.label | Required | - | Parameter display name on the platform |
| default_parameters.label.zh | Required | Generate randomness | Parameter display name (Chinese) |
| default_parameters.label.en | Required | Temperature | Parameter display name (English) |
| default_parameters.desc | Required | - | Parameter description shown on the platform |
| default_parameters.desc.zh | Required | temperature: Increasing the temperature makes the model's output more diverse and innovative | Parameter description (Chinese) |
| default_parameters.desc.en | Required | Temperature: When you increase this value, the model outputs more diverse and innovative content | Parameter description (English) |
| default_parameters.type | Required | int | Value type. Enum: int, float, boolean, string |
| default_parameters.min | Optional | '0' | Minimum value (numeric types) |
| default_parameters.max | Optional | '1' | Maximum value (numeric types) |
| default_parameters.default_val | Required | - | Default values for the Precise / Balanced / Creative / Custom modes |
| default_parameters.default_val.default_val | Required | '1.0' | Default value in custom mode |
| default_parameters.default_val.creative | Optional | '1.0' | Default value in creative mode |
| default_parameters.default_val.balance | Optional | '0.8' | Default value in balanced mode |
| default_parameters.default_val.precise | Optional | '0.3' | Default value in precise mode |
| default_parameters.precision | Optional | 2 | Precision (when type is float) |
| default_parameters.style | Required | - | Display style |
| default_parameters.style.widget | Required | slider | Widget type. Enum: slider (slider), radio_buttons (radio buttons) |
| default_parameters.style.label | Required | - | Parameter category |
| default_parameters.style.label.zh | Required | Generation diversity | Category label (Chinese) |
| default_parameters.style.label.en | Required | Generation diversity | Category label (English) |
| meta | Required | - | Model metadata |
| meta.name | Required | test_model_name | Model name, kept for record keeping; not displayed |
| meta.protocol | Required | test_protocol | Model connection protocol |
| meta.capability | Required | - | Model capabilities |
| meta.capability.function_call | Optional | true | Whether the model supports function calling |
| meta.capability.input_modal | Optional | ["text", "image", "audio", "video"] | Supported input modalities |
| meta.capability.input_tokens | Optional | 1024 | Input token limit |
| meta.capability.output_modal | Optional | ["text", "image", "audio", "video"] | Supported output modalities |
| meta.capability.output_tokens | Optional | 1024 | Output token limit |
| meta.capability.max_tokens | Optional | 2048 | Maximum total token count |
| meta.capability.json_mode | Optional | true | Whether JSON mode is supported |
| meta.capability.prefix_caching | Optional | false | Whether prefix caching is supported |
| meta.capability.reasoning | Optional | false | Whether reasoning is supported |
| meta.conn_config | Required | - | Model connection parameters |
| meta.conn_config.base_url | Required | https://localhost:1234/chat/completion | Base URL of the model service |
| meta.conn_config.api_key | Required | qweasdzxc | API key |
| meta.conn_config.timeout | Optional | 100 | Timeout (nanoseconds) |
| meta.conn_config.model | Required | model_name | Model name |
| meta.conn_config.temperature | Optional | 0.7 | Default temperature |
| meta.conn_config.frequency_penalty | Optional | 0 | Default frequency_penalty |
| meta.conn_config.presence_penalty | Optional | 0 | Default presence_penalty |
| meta.conn_config.max_tokens | Optional | 2048 | Default max_tokens |
| meta.conn_config.top_p | Optional | 0 | Default top_p |
| meta.conn_config.top_k | Optional | 0 | Default top_k |
| meta.conn_config.enable_thinking | Optional | false | Whether to enable the thinking process |
| meta.conn_config.stop | Optional | ["bye"] | Stop word list |
| meta.conn_config.openai | Optional | - | OpenAI-specific configuration |
| meta.conn_config.openai.by_azure | Optional | true | Whether the service is provided through Azure |
| meta.conn_config.openai.api_version | Optional | 2024-10-21 | API version |
| meta.conn_config.openai.response_format.type | Optional | text | Response format type |
| meta.conn_config.claude | Optional | - | Claude-specific configuration |
| meta.conn_config.claude.by_bedrock | Optional | true | Whether the service is provided through Bedrock |
| meta.conn_config.claude.access_key | Optional | bedrock_ak | Bedrock access key |
| meta.conn_config.claude.secret_access_key | Optional | bedrock_secret_ak | Bedrock secret access key |
| meta.conn_config.claude.session_token | Optional | bedrock_session_token | Bedrock session token |
| meta.conn_config.claude.region | Optional | bedrock_region | Bedrock region |
| meta.conn_config.ark | Optional | - | Ark-specific configuration |
| meta.conn_config.ark.region | Optional | region | Region |
| meta.conn_config.ark.access_key | Optional | ak | Access key |
| meta.conn_config.ark.secret_key | Optional | sk | Secret key |
| meta.conn_config.ark.retry_times | Optional | 123 | Retry count |
| meta.conn_config.ark.custom_header | Optional | {"key": "val"} | Custom request headers |
| meta.conn_config.deepseek | Optional | - | Deepseek-specific configuration |
| meta.conn_config.deepseek.response_format_type | Optional | text | Response format type |
| meta.conn_config.qwen | Optional | - | Qwen-specific configuration |
| meta.conn_config.qwen.response_format | Optional | - | Response format |
| meta.conn_config.gemini | Optional | - | Gemini-specific configuration |
| meta.conn_config.gemini.backend | Optional | 0 | Gemini backend. 0: default; 1: GeminiAPI; 2: VertexAI |
| meta.conn_config.gemini.project | Optional | test_project | GCP project ID for Vertex AI; required when backend=2 |
| meta.conn_config.gemini.location | Optional | test_loc | GCP location/region for Vertex AI; required when backend=2 |
| meta.conn_config.gemini.api_version | Optional | v1beta | API version |
| meta.conn_config.headers | Optional | - | HTTP headers |
| meta.conn_config.timeout_ms | Optional | - | HTTP timeout (milliseconds) |
| meta.conn_config.include_thoughts | Optional | true | Whether the response includes the thinking content |
| meta.conn_config.thinking_budget | Optional | 123 | Token budget for thinking |
| meta.status | Optional | 1 | Model status. 0: default when not set, equivalent to 1; 1: available, can be used and selected for new agents; 5: pending offline, usable but cannot be selected for new agents; 10: offline, neither usable nor selectable |
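As an illustration of the status values, a model being retired can first be marked as pending offline so that existing agents keep working while the model disappears from the selection list:

```yaml
meta:
    status: 5   # pending offline: still usable by existing agents,
                # but no longer selectable for new agents or workflows
```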

Example

Complete configuration examples and field descriptions for each model can be found in backend/conf/model/template/model_template_basic.yaml. You can also start from the minimal configurations below. Most fields are similar across models; the main differences are in protocol and conn_config.

Volcengine Ark

```yaml
id: 2002
name: Doubao Model
icon_uri: doubao_v2.png
icon_url: ""
description:
    zh: 豆包模型简介
    en: doubao model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性,反之,降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: top_p
      label:
        zh: Top P
        en: Top P
      desc:
        zh: '- **Top p 为累计概率**: 模型在生成输出时会从概率最高的词汇开始选择,直到这些词汇的总概率累积达到Top p 值。这样可以限制模型只选择这些高概率的词汇,从而控制输出内容的多样性。建议不要与“生成随机性”同时调整。'
        en: '**Top P**:\n\n- An alternative to sampling with temperature, where only tokens within the top p probability mass are considered. For example, 0.1 means only the top 10% probability mass tokens are considered.\n- We recommend altering this or temperature, but not both.'
      type: float
      min: "0"
      max: "1"
      default_val:
        default_val: "0.7"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: response_format
      label:
        zh: 输出格式
        en: Response format
      desc:
        zh: '- **文本**: 使用普通文本格式回复\n- **Markdown**: 将引导模型使用Markdown格式输出回复\n- **JSON**: 将引导模型使用JSON格式输出'
        en: '**Response Format**:\n\n- **Text**: Replies in plain text format\n- **Markdown**: Uses Markdown format for replies\n- **JSON**: Uses JSON format for replies'
      type: int
      min: ""
      max: ""
      default_val:
        default_val: "0"
      options:
        - label: Text
          value: "0"
        - label: Markdown
          value: "1"
        - label: JSON
          value: "2"
      style:
        widget: radio_buttons
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: Doubao
    protocol: ark
    capability:
        function_call: true
        input_modal:
            - text
            - image
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.1
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 0.7
        top_k: 0
        stop: []
        openai: null
        claude: null
        ark:
            region: ""
            access_key: ""
            secret_key: ""
            retry_times: null
            custom_header: {}
        deepseek: null
        qwen: null
        gemini: null
        custom: {}
    status: 0
```

Claude

```yaml
id: 2006
name: Claude-3.5-Sonnet
icon_uri: claude_v2.png
icon_url: ""
description:
    zh: claude 模型简介
    en: claude model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性,反之,降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: Claude-3.5-Sonnet
    protocol: claude
    capability:
        function_call: true
        input_modal:
            - text
            - image
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai: null
        claude:
            by_bedrock: false
            access_key: ""
            secret_access_key: ""
            session_token: ""
            region: ""
        ark: null
        deepseek: null
        qwen: null
        gemini: null
        custom: {}
    status: 0
```

Deepseek

```yaml
id: 2004
name: DeepSeek-V3
icon_uri: deepseek_v2.png
icon_url: ""
description:
    zh: deepseek 模型简介
    en: deepseek model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性,反之,降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成随机性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: response_format
      label:
        zh: 输出格式
        en: Response format
      desc:
        zh: '- **文本**: 使用普通文本格式回复\n- **JSON**: 将引导模型使用JSON格式输出'
        en: '**Response Format**:\n\n- **Text**: Replies in plain text format\n- **Markdown**: Uses Markdown format for replies\n- **JSON**: Uses JSON format for replies'
      type: int
      min: ""
      max: ""
      default_val:
        default_val: "0"
      options:
        - label: Text
          value: "0"
        - label: JSON Object
          value: "1"
      style:
        widget: radio_buttons
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: DeepSeek-V3
    protocol: deepseek
    capability:
        function_call: false
        input_modal:
            - text
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai: null
        claude: null
        ark: null
        deepseek:
            response_format_type: text
        qwen: null
        gemini: null
        custom: {}
    status: 0
```

Ollama

```yaml
id: 2003
name: Gemma-3
icon_uri: ollama.png
icon_url: ""
description:
    zh: ollama 模型简介
    en: ollama model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性,反之,降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: Gemma-3
    protocol: ollama
    capability:
        function_call: true
        input_modal:
            - text
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.6
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 0.95
        top_k: 20
        stop: []
        openai: null
        claude: null
        ark: null
        deepseek: null
        qwen: null
        gemini: null
        custom: {}
    status: 0
```

OpenAI

```yaml
id: 2001
name: GPT-4o
icon_uri: openai_v2.png
icon_url: ""
description:
    zh: gpt 模型简介
    en: Multi-modal, 320ms, 88.7% MMLU, excels in education, customer support, health, and entertainment.
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性,反之,降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: top_p
      label:
        zh: Top P
        en: Top P
      desc:
        zh: '- **Top p 为累计概率**: 模型在生成输出时会从概率最高的词汇开始选择,直到这些词汇的总概率累积达到Top p 值。这样可以限制模型只选择这些高概率的词汇,从而控制输出内容的多样性。建议不要与“生成随机性”同时调整。'
        en: '**Top P**:\n\n- An alternative to sampling with temperature, where only tokens within the top p probability mass are considered. For example, 0.1 means only the top 10% probability mass tokens are considered.\n- We recommend altering this or temperature, but not both.'
      type: float
      min: "0"
      max: "1"
      default_val:
        default_val: "0.7"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: frequency_penalty
      label:
        zh: 重复语句惩罚
        en: Frequency penalty
      desc:
        zh: '- **frequency penalty**: 当该值为正时,会阻止模型频繁使用相同的词汇和短语,从而增加输出内容的多样性。'
        en: '**Frequency Penalty**: When positive, it discourages the model from repeating the same words and phrases, thereby increasing the diversity of the output.'
      type: float
      min: "-2"
      max: "2"
      default_val:
        default_val: "0"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: presence_penalty
      label:
        zh: 重复主题惩罚
        en: Presence penalty
      desc:
        zh: '- **presence penalty**: 当该值为正时,会阻止模型频繁讨论相同的主题,从而增加输出内容的多样性'
        en: '**Presence Penalty**: When positive, it prevents the model from discussing the same topics repeatedly, thereby increasing the diversity of the output.'
      type: float
      min: "-2"
      max: "2"
      default_val:
        default_val: "0"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: response_format
      label:
        zh: 输出格式
        en: Response format
      desc:
        zh: '- **文本**: 使用普通文本格式回复\n- **Markdown**: 将引导模型使用Markdown格式输出回复\n- **JSON**: 将引导模型使用JSON格式输出'
        en: '**Response Format**:\n\n- **Text**: Replies in plain text format\n- **Markdown**: Uses Markdown format for replies\n- **JSON**: Uses JSON format for replies'
      type: int
      min: ""
      max: ""
      default_val:
        default_val: "0"
      options:
        - label: Text
          value: "0"
        - label: Markdown
          value: "1"
        - label: JSON
          value: "2"
      style:
        widget: radio_buttons
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: GPT-4o
    protocol: openai
    capability:
        function_call: true
        input_modal:
            - text
            - image
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai:
            by_azure: true
            api_version: ""
            response_format:
                type: text
                jsonschema: null
        claude: null
        ark: null
        deepseek: null
        qwen: null
        gemini: null
        custom: {}
    status: 0
```

Qwen

```yaml
id: 2005
name: Qwen3-32B
icon_uri: qwen_v2.png
icon_url: ""
description:
    zh: 通义千问模型
    en: qwen model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性,反之,降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: top_p
      label:
        zh: Top P
        en: Top P
      desc:
        zh: '- **Top p 为累计概率**: 模型在生成输出时会从概率最高的词汇开始选择,直到这些词汇的总概率累积达到Top p 值。这样可以限制模型只选择这些高概率的词汇,从而控制输出内容的多样性。建议不要与“生成随机性”同时调整。'
        en: '**Top P**:\n\n- An alternative to sampling with temperature, where only tokens within the top p probability mass are considered. For example, 0.1 means only the top 10% probability mass tokens are considered.\n- We recommend altering this or temperature, but not both.'
      type: float
      min: "0"
      max: "1"
      default_val:
        default_val: "0.95"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
meta:
    name: Qwen3-32B
    protocol: qwen
    capability:
        function_call: true
        input_modal:
            - text
        input_tokens: 128000
        json_mode: false
        max_tokens: 128000
        output_modal:
            - text
        output_tokens: 16384
        prefix_caching: false
        reasoning: false
        prefill_response: false
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: ""
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai: null
        claude: null
        ark: null
        deepseek: null
        qwen:
            response_format:
                type: text
                jsonschema: null
        gemini: null
        custom: {}
    status: 0
```

Gemini

```yaml
id: 2007
name: Gemini-2.5-Flash
icon_uri: gemini_v2.png
icon_url: ""
description:
    zh: gemini 模型简介
    en: gemini model description
default_parameters:
    - name: temperature
      label:
        zh: 生成随机性
        en: Temperature
      desc:
        zh: '- **temperature**: 调高温度会使得模型的输出更多样性和创新性,反之,降低温度会使输出内容更加遵循指令要求但减少多样性。建议不要与“Top p”同时调整。'
        en: '**Temperature**:\n\n- When you increase this value, the model outputs more diverse and innovative content; when you decrease it, the model outputs less diverse content that strictly follows the given instructions.\n- It is recommended not to adjust this value with \"Top p\" at the same time.'
      type: float
      min: "0"
      max: "1"
      default_val:
        balance: "0.8"
        creative: "1"
        default_val: "1.0"
        precise: "0.3"
      precision: 1
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: max_tokens
      label:
        zh: 最大回复长度
        en: Response max length
      desc:
        zh: 控制模型输出的Tokens 长度上限。通常 100 Tokens 约等于 150 个中文汉字。
        en: You can specify the maximum length of the tokens output through this value. Typically, 100 tokens are approximately equal to 150 Chinese characters.
      type: int
      min: "1"
      max: "4096"
      default_val:
        default_val: "4096"
      options: []
      style:
        widget: slider
        label:
            zh: 输入及输出设置
            en: Input and output settings
    - name: top_p
      label:
        zh: Top P
        en: Top P
      desc:
        zh: '- **Top p 为累计概率**: 模型在生成输出时会从概率最高的词汇开始选择,直到这些词汇的总概率累积达到Top p 值。这样可以限制模型只选择这些高概率的词汇,从而控制输出内容的多样性。建议不要与“生成随机性”同时调整。'
        en: '**Top P**:\n\n- An alternative to sampling with temperature, where only tokens within the top p probability mass are considered. For example, 0.1 means only the top 10% probability mass tokens are considered.\n- We recommend altering this or temperature, but not both.'
      type: float
      min: "0"
      max: "1"
      default_val:
        default_val: "0.7"
      precision: 2
      options: []
      style:
        widget: slider
        label:
            zh: 生成多样性
            en: Generation diversity
    - name: response_format
      label:
        zh: 输出格式
        en: Response format
      desc:
        zh: '- **文本**: 使用普通文本格式回复\n- **JSON**: 将引导模型使用JSON格式输出'
        en: '**Response Format**:\n\n- **JSON**: Uses JSON format for replies'
      type: int
      min: ""
      max: ""
      default_val:
        default_val: "0"
      options:
        - label: Text
          value: "0"
        - label: JSON
          value: "2"
      style:
        widget: radio_buttons
        label:
            zh: 输入及输出设置
            en: Input and output settings
meta:
    name: Gemini-2.5-Flash
    protocol: gemini
    capability:
        function_call: true
        input_modal:
            - text
            - image
            - audio
            - video
        input_tokens: 1048576
        json_mode: true
        max_tokens: 1114112
        output_modal:
            - text
        output_tokens: 65536
        prefix_caching: true
        reasoning: true
        prefill_response: true
    conn_config:
        base_url: ""
        api_key: ""
        timeout: 0s
        model: gemini-2.5-flash
        temperature: 0.7
        frequency_penalty: 0
        presence_penalty: 0
        max_tokens: 4096
        top_p: 1
        top_k: 0
        stop: []
        openai: null
        claude: null
        ark: null
        deepseek: null
        qwen: null
        gemini:
            backend: 0
            project: ""
            location: ""
            api_version: ""
            headers:
                key_1:
                    - val_1
                    - val_2
            timeout_ms: 0
            include_thoughts: true
            thinking_budget: null
        custom: {}
    status: 0
```