Paged GPTBigCode Support #30

JRosenkranz · 2024-05-09T19:48:32Z

…attention

sahilsuneja1 · 2024-05-09T22:13:39Z

@JRosenkranz Please merge #31 and #32 into the paged_gpt_bigcode branch first!

Update paged_gpt_bigcode.py

Update paged_speculative_inference.py

nairbv · 2024-05-10T17:01:31Z

fms_extras/models/paged_gpt_bigcode.py

+            output = layer(
+                x=x,
+                mask=mask,
+                cache_data_layer=None


The value on this line should be assigned to a variable first or something; otherwise, at a glance, this reads as if it says cache_data_layer=None.
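One way to read this suggestion, as a minimal sketch: compute the per-layer cache value on its own line and pass the named variable into the call. The diff excerpt above is truncated, so the condition used here (`None if cache_data is None else cache_data[i]`) is an assumption, and `layer`, `run_layers`, `cache_data`, and `num_layers` are hypothetical stand-ins rather than the actual fms_extras code.

```python
# Hypothetical stand-ins for illustration only; not the fms_extras API.
def layer(x, mask=None, cache_data_layer=None):
    """Toy layer that returns its input and the cache argument it received."""
    return x, cache_data_layer


def run_layers(x, mask=None, cache_data=None, num_layers=2):
    """Call each toy layer, naming the per-layer cache value before the call."""
    received = []
    for i in range(num_layers):
        # Assumed condition: no cache was supplied, so pass None;
        # otherwise pass this layer's entry. Doing this on its own line
        # keeps the call from starting with a bare `cache_data_layer=None`.
        cache_data_layer = None if cache_data is None else cache_data[i]
        x, seen = layer(x=x, mask=mask, cache_data_layer=cache_data_layer)
        received.append(seen)
    return x, received
```

With the value named up front, the keyword argument in the call is a single token, and the condition that decides it sits in one place instead of trailing onto the next line of the argument list.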

sahilsuneja1 · 2024-06-06T15:07:12Z

@JRosenkranz please merge #34 into paged_gpt_bigcode first :)

Update paged_llama.py for granite-3b-code

added paged_gpt_bigcode and moved PagedMultiHeadAttention to modules.…

9b8210c

…attention

JRosenkranz requested review from nairbv and ani300 May 9, 2024 19:48

JRosenkranz self-assigned this May 9, 2024

sahilsuneja1 added 2 commits May 9, 2024 18:09

Update paged_gpt_bigcode.py

904b8a7

Update paged_speculative_inference.py

c189789

JRosenkranz added 2 commits May 9, 2024 19:59

Merge pull request #32 from sahilsuneja1/patch-2

7a94bcb

Update paged_gpt_bigcode.py

Merge pull request #31 from sahilsuneja1/patch-3

3b1ca6a

Update paged_speculative_inference.py

nairbv reviewed May 10, 2024

added calico implementation for paged_llama

f56a1e2

nairbv approved these changes May 24, 2024

sahilsuneja1 added 2 commits June 6, 2024 11:02

Update paged_llama.py for granite-3b-code

839c2d4

Update paged_llama.py

51595c9

Merge pull request #34 from sahilsuneja1/patch-4

c2285e4

Update paged_llama.py for granite-3b-code

JRosenkranz merged commit 22f5132 into main Jun 25, 2024
3 checks passed
