
Visual DPO #1647

Merged: 44 commits, Jun 26, 2024
Changes from 1 commit
8768fe6 — Remove extra whitespaces (qgallouedec, May 17, 2024)
5d43f2b — idefics (qgallouedec, May 17, 2024)
f5a3237 — vdpo (qgallouedec, May 27, 2024)
682c034 — sft idefics (qgallouedec, May 27, 2024)
bf01bf3 — pad with test (qgallouedec, May 30, 2024)
aed1aeb — use prompt instead of tokenizer (qgallouedec, May 30, 2024)
e814f88 — rm name main (qgallouedec, May 30, 2024)
fd5d71b — support vlm in tokenize row (qgallouedec, May 30, 2024)
e1b8755 — temp fix for regex in lora_target_module (qgallouedec, May 30, 2024)
8075419 — format (qgallouedec, May 31, 2024)
1b815c2 — vdpo (qgallouedec, May 31, 2024)
6d6a194 — tmp float16 hard code (qgallouedec, Jun 3, 2024)
1935d3d — concatenated_forward support for vision (qgallouedec, Jun 3, 2024)
bdc2b95 — style and new command line (qgallouedec, Jun 17, 2024)
24b08f5 — all-linear (qgallouedec, Jun 17, 2024)
c5ff8d7 — format (qgallouedec, Jun 18, 2024)
a7d1732 — delete old examples (qgallouedec, Jun 18, 2024)
2303c40 — get image (qgallouedec, Jun 18, 2024)
b606190 — upcast (qgallouedec, Jun 18, 2024)
4f78ee5 — new test (qgallouedec, Jun 18, 2024)
c4433c0 — modified test (qgallouedec, Jun 18, 2024)
7a8a94f — new strat for tokenizer (qgallouedec, Jun 18, 2024)
a9a4607 — Merge branch 'main' into fix-vsft-example (qgallouedec, Jun 25, 2024)
9955710 — rm token transfer (qgallouedec, Jun 25, 2024)
f6ee370 — integrate vision in dpo example (qgallouedec, Jun 25, 2024)
56fb036 — format (qgallouedec, Jun 25, 2024)
c3249e5 — add FDivergenceType back (qgallouedec, Jun 25, 2024)
f69bb1c — precommit (qgallouedec, Jun 25, 2024)
6d859cf — pillow test dep (qgallouedec, Jun 25, 2024)
48db3e1 — optional prompt (qgallouedec, Jun 25, 2024)
dea765b — `evaluation_strategy` to `eval_strategy` (qgallouedec, Jun 25, 2024)
d6dc3ba — revert vsft change (oos) (qgallouedec, Jun 25, 2024)
3a1f5b8 — update test (qgallouedec, Jun 25, 2024)
5545825 — test (qgallouedec, Jun 25, 2024)
5197d6d — comment and support more in process (qgallouedec, Jun 26, 2024)
45fda7e — update process (qgallouedec, Jun 26, 2024)
5a1dfa7 — update doc for vdpo (qgallouedec, Jun 26, 2024)
2c10ca8 — caution about limited support (qgallouedec, Jun 26, 2024)
2e47633 — Update docs/source/dpo_trainer.mdx (qgallouedec, Jun 26, 2024)
f960a2a — revert DPO example changes (qgallouedec, Jun 26, 2024)
e4c7436 — cleaner way to check if a model is vision (qgallouedec, Jun 26, 2024)
bfb35d3 — comment (qgallouedec, Jun 26, 2024)
7b22153 — update vdpo example (qgallouedec, Jun 26, 2024)
5155194 — rename (qgallouedec, Jun 26, 2024)
evaluation_strategy to eval_strategy
Commit dea765b6e012d602dcea252484a9e98ff2704ce9 — qgallouedec committed Jun 25, 2024
tests/my_new_test.py (2 changes: 1 addition, 1 deletion)

@@ -21,7 +21,7 @@
     remove_unused_columns=False,
     gradient_accumulation_steps=1,
     learning_rate=9e-1,
-    evaluation_strategy="steps",
+    eval_strategy="steps",
     beta=0.1,
     loss_type="sigmoid",
     precompute_ref_log_probs=True,
tests/test_dpo_trainer.py (6 changes: 3 additions, 3 deletions)

@@ -228,7 +228,7 @@ def test_vdpo_trainer(self, loss_type, pre_compute):
     remove_unused_columns=False,
     gradient_accumulation_steps=1,
     learning_rate=9e-1,
-    evaluation_strategy="steps",
+    eval_strategy="steps",
     beta=0.1,
     loss_type=loss_type,
     precompute_ref_log_probs=pre_compute,
@@ -855,7 +855,7 @@ def test_dpo_loss_alpha_div_f(self):
     remove_unused_columns=False,
     gradient_accumulation_steps=4,
     learning_rate=9e-1,
-    evaluation_strategy="steps",
+    eval_strategy="steps",
     f_divergence_type=FDivergenceType.ALPHA_DIVERGENCE.value,
     f_alpha_divergence_coef=0.5,
 )
@@ -897,7 +897,7 @@ def test_dpo_loss_js_div_f(self):
     remove_unused_columns=False,
     gradient_accumulation_steps=4,
     learning_rate=9e-1,
-    evaluation_strategy="steps",
+    eval_strategy="steps",
     f_divergence_type=FDivergenceType.JS_DIVERGENCE.value,
     f_alpha_divergence_coef=0.5,
 )
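The commit above is a mechanical keyword rename: `transformers` deprecated the `evaluation_strategy` training argument in favor of `eval_strategy`, so every config in the tests passes the new name. For callers carrying old-style keyword dicts, a small migration shim can apply the same rename before constructing a config. This is a minimal sketch for illustration only; `migrate_training_kwargs` is a hypothetical helper, not part of this PR or of TRL.

```python
def migrate_training_kwargs(kwargs: dict) -> dict:
    """Return a copy of ``kwargs`` with the deprecated
    ``evaluation_strategy`` key renamed to ``eval_strategy``.

    Hypothetical helper mirroring the rename in commit dea765b.
    """
    migrated = dict(kwargs)
    if "evaluation_strategy" in migrated:
        value = migrated.pop("evaluation_strategy")
        # Keep an explicitly set new-style key if both are present.
        migrated.setdefault("eval_strategy", value)
    return migrated


# Old-style kwargs, as they appeared in the tests before this commit:
old = {"learning_rate": 9e-1, "evaluation_strategy": "steps"}
print(migrate_training_kwargs(old))
# → {'learning_rate': 0.9, 'eval_strategy': 'steps'}
```

The migrated dict can then be splatted into the config constructor (e.g. `DPOConfig(**migrate_training_kwargs(old))`), which is exactly the shape of call sites touched in this diff.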