
Conversation


@linsj20 linsj20 commented Jan 26, 2024

No description provided.

@KKZ20 KKZ20 merged commit a727ba9 into KKZ20:feature/sequence_parallel_optimization Jan 26, 2024
KKZ20 pushed a commit that referenced this pull request Feb 7, 2024
* [shardformer] add megatron sp to llama

* support llama7B 128k with distributed attention

* [shardformer] robustness enhancement

* add block attn

* sp mode 1: keep input as a complete sequence

* fix sp compatibility

* refactor ring implementation

* support mode 2 sp in gpt2
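
The commit message above mentions Megatron-style sequence parallelism (SP) and distributed attention over very long sequences. As illustration only, and not the actual shardformer code merged here, the sketch below shows an all-gather variant of sequence-parallel attention: each rank holds a shard of the token dimension and gathers the other ranks' K/V before attending. The function name `sp_attention`, the tensor layout, and the choice of the all-gather variant (rather than the PR's specific mode 1 / mode 2 or ring implementation) are assumptions made for the example.

```python
# Illustrative sketch of sequence-parallel attention via K/V all-gather.
# NOT the shardformer implementation from this PR; names and layout are assumed.
import torch
import torch.distributed as dist
import torch.nn.functional as F

def sp_attention(q_local, k_local, v_local, sp_group):
    """q/k/v_local: [batch, local_seq, heads, dim] shards of the full sequence."""
    world_size = dist.get_world_size(sp_group)

    # Each rank keeps only its own query shard, but needs every rank's K/V
    # to attend over the complete sequence.
    k_list = [torch.empty_like(k_local) for _ in range(world_size)]
    v_list = [torch.empty_like(v_local) for _ in range(world_size)]
    dist.all_gather(k_list, k_local, group=sp_group)
    dist.all_gather(v_list, v_local, group=sp_group)
    k_full = torch.cat(k_list, dim=1)
    v_full = torch.cat(v_list, dim=1)

    # Move to [batch, heads, seq, dim], the layout scaled_dot_product_attention expects.
    q, k, v = (t.transpose(1, 2) for t in (q_local, k_full, v_full))
    out = F.scaled_dot_product_attention(q, k, v)
    return out.transpose(1, 2)  # back to [batch, local_seq, heads, dim]
```

A ring-style implementation, as referenced by "refactor ring implementation", avoids materializing the full K/V on every rank by passing K/V shards around the process group and accumulating partial attention results instead.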
KKZ20 pushed a commit that referenced this pull request Feb 14, 2024
KKZ20 pushed a commit that referenced this pull request Mar 27, 2024

Both commits carry the same message as the Feb 7, 2024 commit above.