
Adding flash attention to GPT2 #27479

Closed
wants to merge 19 commits into from

Conversation


@canberk17 canberk17 commented Nov 14, 2023

What does this PR do?

This PR adds Flash Attention 2 to GPT2. Here are my test results:

[screenshot: Flash Attention 2 test results for GPT2]

Contributing to: #26350

Who can review?

Hey @younesbelkada @ArthurZucker, could you please review this when you get a chance?

I was trying to debug why I was getting these test failures; some of them point to the Falcon model, even though I have not touched that file.

I also ran the Flash Attention tests on another model that has already been merged, and these are the results I got:
[screenshot: Flash Attention 2 test results for the merged model]

Am I on the right path here? I couldn't determine why some of these test failures are happening.

@canberk17 canberk17 changed the title Added flash attention to GPT2 Adding flash attention to GPT2 Nov 14, 2023
@canberk17 canberk17 marked this pull request as draft November 14, 2023 04:24
@canberk17 canberk17 marked this pull request as ready for review November 14, 2023 05:24
@canberk17 canberk17 marked this pull request as draft November 14, 2023 06:26
@canberk17 canberk17 marked this pull request as ready for review November 14, 2023 09:05
Contributor

@younesbelkada younesbelkada left a comment


Looking good, thanks for your hard work!
I left a few comments, please have a look. For the failing CI, can you try merging with upstream main?
Can you also add a few lines to the docs for GPT2 (e.g. similar to #27400)?
Thanks!
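The "merge with upstream main" step the reviewer suggests can be done as follows. This is a sketch: it assumes the huggingface/transformers remote is named `upstream` and is not yet configured in the contributor's fork.

```shell
# Register the upstream repository once (skip if already configured)
git remote add upstream https://github.com/huggingface/transformers.git
# Bring the feature branch up to date with the latest main
git fetch upstream
git merge upstream/main
# Push the merge commit to the fork so CI re-runs
git push origin HEAD
```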

Review threads on src/transformers/models/gpt2/modeling_gpt2.py: three comments (resolved; one outdated)
Author

canberk17 commented Nov 15, 2023

@younesbelkada, thanks for the tips! Most of the tests are passing now. However, I'm facing a challenge addressing the issue in the following test:

a tests_torch failure referring to Speech2TextModelTest, which is puzzling since I haven't made changes to that part of the code.

For tests_torch, I don't know why the tests passed on Google Colab but fail here. I used the following command to run the tests for the files in this branch:

```sh
!RUN_SLOW=1 pytest -sv --disable-warnings -k flash_attn_2 tests/models/gpt2/test_modeling_gpt2.py
```

Similarly, I used the same command for the recently merged GPT variant:

```sh
!RUN_SLOW=1 pytest -sv --disable-warnings -k flash_attn_2 tests/models/gpt_bigcode/test_modeling_gpt_bigcode.py
```

and I got the results mentioned in my comment. I am wondering whether I am executing these tests on Colab properly, as I don't see any error messages when I run them. When you get a chance, could you give me some insight into my approach, please?

### Using Flash Attention 2
Flash Attention 2 is an optimized attention implementation that significantly reduces memory usage and increases inference speed. It is particularly effective for large-scale generation tasks. To use Flash Attention 2, make sure your hardware is compatible and install the necessary package.

Contributor review comment on the install snippet's fence: Should be ```sh here, not ```python, since the install command is a shell command, not a Python script.
Use the model with Flash Attention 2 as follows:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
```

Contributor review comment on this snippet: The leading space should be removed, otherwise it causes a syntax error.
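The memory saving that makes Flash Attention attractive comes from never materializing the full T×T attention score matrix: keys and values are processed in blocks with a running ("online") softmax. The sketch below is a NumPy illustration of that core idea only; it is not this PR's code, the transformers API, or the fused CUDA kernel, and `naive_attention` / `tiled_attention` are hypothetical helper names for the comparison.

```python
import numpy as np

def naive_attention(q, k, v):
    # Standard attention: materializes the full (T, T) score matrix -> O(T^2) memory.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def tiled_attention(q, k, v, block=4):
    # Flash-Attention-style pass: walk over K/V in blocks, keeping a running
    # row-wise max and softmax denominator so the softmax is computed "online".
    T, d = q.shape
    out = np.zeros((T, v.shape[-1]))
    row_max = np.full(T, -np.inf)   # running max of scores per query row
    row_sum = np.zeros(T)           # running softmax denominator per row
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = q @ kb.T / np.sqrt(d)                  # (T, block) partial scores
        new_max = np.maximum(row_max, s.max(axis=-1))
        scale = np.exp(row_max - new_max)          # rescale previously accumulated state
        p = np.exp(s - new_max[:, None])
        out = out * scale[:, None] + p @ vb
        row_sum = row_sum * scale + p.sum(axis=-1)
        row_max = new_max
    return out / row_sum[:, None]

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((8, 16)) for _ in range(3))
print(np.allclose(naive_attention(q, k, v), tiled_attention(q, k, v)))  # True
```

The tiled version touches only one (T, block) score tile at a time, which is why peak attention memory drops from quadratic to linear in sequence length while the output matches standard attention.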

@ArthurZucker
Collaborator

cc @younesbelkada

@huggingface huggingface deleted a comment from github-actions bot Jan 10, 2024

github-actions bot commented Feb 4, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this Feb 12, 2024
4 participants