ClipQKV #197
Conversation
Gonna leave this to others to review + approve, but could I suggest adding some unit tests to check equality between the torch/causal/triton attention blocks?
Co-authored-by: Abhi Venigalla <77638579+abhi-mosaic@users.noreply.github.com>
created issue: https://mosaicml.atlassian.net/browse/RESEARCH-468
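The suggestion above is to add unit tests checking equality between the attention implementations. A minimal sketch of such a test, using two toy implementations of scaled dot-product attention as stand-ins (in the actual repo this would compare the torch/causal/triton attention blocks; all names here are illustrative):

```python
import torch


def attn_ref(q, k, v):
    # Straightforward scaled dot-product attention as the reference.
    scale = q.size(-1) ** -0.5
    scores = torch.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)
    return scores @ v


def attn_einsum(q, k, v):
    # The same math written with einsum, standing in for a second implementation.
    scale = q.size(-1) ** -0.5
    w = torch.softmax(torch.einsum("bqd,bkd->bqk", q, k) * scale, dim=-1)
    return torch.einsum("bqk,bkd->bqd", w, v)


def test_attn_impls_match():
    # Feed both implementations the same random inputs and compare outputs.
    torch.manual_seed(0)
    q, k, v = (torch.randn(2, 4, 8) for _ in range(3))
    assert torch.allclose(attn_ref(q, k, v), attn_einsum(q, k, v), atol=1e-6)
```

A real version of this test would parametrize over `attn_impl` and, per the discussion below, be gated to GPU (e.g. skipped or xfailed on CPU) for the triton variant.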
LGTM
nvm, I misread the output. Looks like your added tests are failing.
Yeah, they're GPU tests. I need to xfail them on CPU. Forgot to do that...
LGTM after gating the tests to be on GPU only
Force-pushed from 1b02a56 to 7b6f3ae
Force-pushed from 49392da to 62d9749
Looks good!
This PR implements ClipQKV and adds `qk_ln` support in the triton attn variant.

Findings:
- On its own, Alibi does not fully solve the stability issue; when clipping is added to QKV, training becomes more stable.
- Without Alibi, clipping QKV is still very stable (it just doesn't perform as well).
- I tried clipping values of {0.1, 1, 2, 10}; {0.1, 1} were too aggressive; {2, 10} are shown. We can see that with the higher clipping value (10) the network is slightly less stable and gets pseudo-loss spikes, from which it recovers gracefully.
- ClipQKV+Alibi outperforms QKLN.
- ClipQKV+Alibi has exactly the same performance as QKLN+Alibi.
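The clipping described above can be sketched as an elementwise clamp on the fused QKV activations. This is a hedged sketch, not the PR's actual code; `clip_qkv` and `clip_val` are illustrative names:

```python
import torch


def clip_qkv(qkv: torch.Tensor, clip_val: float = 2.0) -> torch.Tensor:
    """Clamp query/key/value activations to [-clip_val, clip_val] elementwise.

    Illustrative sketch: bounds the QKV projection outputs before attention,
    which is the mechanism the experiments above vary over {0.1, 1, 2, 10}.
    """
    return qkv.clamp(min=-clip_val, max=clip_val)


# Example: outliers beyond +/-2 are squashed to the clip boundary,
# while in-range values pass through unchanged.
qkv = torch.tensor([[-10.0, 0.5, 3.0]])
clipped = clip_qkv(qkv, clip_val=2.0)  # -> [[-2.0, 0.5, 2.0]]
```

A tighter `clip_val` bounds the attention logits more aggressively, which matches the observation above that very small values (0.1, 1) over-constrain the model while a loose value (10) permits pseudo-loss spikes.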