Commit 7b170c8

Merge pull request #73 from togethercomputer/Vprov/add_dpo
Add support for the training_method and dpo_beta parameters
2 parents ac319b9 + 5be7c01 commit 7b170c8

File tree

1 file changed: +22 −0 lines

openapi.yaml

Lines changed: 22 additions & 0 deletions
@@ -484,6 +484,18 @@ paths:
           type: boolean
           default: auto
           description: Whether to mask the user messages in conversational data or prompts in instruction data.
+        training_method:
+          type: string
+          enum:
+            - sft
+            - dpo
+          default: sft
+          description: The training method to use. 'sft' for Supervised Fine-Tuning or 'dpo' for Direct Preference Optimization.
+        dpo_beta:
+          type: number
+          format: float
+          default: 0.1
+          description: The beta parameter for DPO training. Only applicable when training_method is 'dpo'.
         training_type:
           type: object
           oneOf:
@@ -2337,6 +2349,16 @@ components:
         enum:
           - auto
         default: auto
+        training_method:
+          type: string
+          enum:
+            - sft
+            - dpo
+          default: sft
+        dpo_beta:
+          type: number
+          format: float
+          default: 0.1
         training_type:
           type: object
           oneOf:
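The two hunks above add the same pair of request-body fields: `training_method` (enum `sft`/`dpo`, default `sft`) and `dpo_beta` (float, default `0.1`, only meaningful for DPO). A minimal sketch of how a client might assemble a request body using them; the helper name and the `model`/`training_file` fields are assumptions for illustration and are not part of this diff:

```python
import json

def build_finetune_payload(model, training_file,
                           training_method="sft", dpo_beta=0.1):
    """Build a fine-tune request body.

    Per the schema in this diff, training_method is one of 'sft' or 'dpo'
    (default 'sft'), and dpo_beta (default 0.1) only applies to 'dpo'.
    """
    if training_method not in ("sft", "dpo"):
        raise ValueError("training_method must be 'sft' or 'dpo'")
    payload = {
        "model": model,               # assumed field, not from this diff
        "training_file": training_file,  # assumed field, not from this diff
        "training_method": training_method,
    }
    if training_method == "dpo":
        # Include dpo_beta only when it is applicable.
        payload["dpo_beta"] = dpo_beta
    return payload

print(json.dumps(build_finetune_payload("my-model", "file-123",
                                        training_method="dpo")))
```

Omitting `dpo_beta` for SFT runs keeps the body consistent with the schema's note that the parameter is only applicable when `training_method` is `'dpo'`.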

0 commit comments
