posts/gemma/ #4
```python
input_ids = tokenizer("I want to move").input_ids
hiddens = p.embedding[input_ids]
# p.embedding.shape = (256000, 2048)
# input_ids.shape = (5,)
# hiddens.shape = (5, 2048)
```

This raises `NameError: name 'p' is not defined`. Could you please add the code that defines `p`?
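Not from the post itself, but here is a minimal self-contained sketch of what `p` could look like: a parameter container whose `embedding` attribute is a token-embedding table with Gemma's vocabulary size (256000) and hidden dimension (2048). The class name `Params`, the random weights, and the hard-coded token ids are all stand-ins for illustration, not the post's actual loading code.

```python
import torch

class Params:
    """Hypothetical stand-in for the post's parameter container `p`."""
    def __init__(self, vocab_size=256_000, hidden_dim=2048):
        # Random weights as a placeholder; the post presumably loads
        # real Gemma checkpoint weights here.
        self.embedding = torch.randn(vocab_size, hidden_dim)

p = Params()

# Stand-in for tokenizer("I want to move").input_ids: five arbitrary token ids.
input_ids = torch.tensor([2, 235285, 1938, 577, 3124])

# Indexing the table with a 1-D id tensor selects one row per token.
hiddens = p.embedding[input_ids]
print(p.embedding.shape)  # torch.Size([256000, 2048])
print(hiddens.shape)      # torch.Size([5, 2048])
```

In the real post, `p.embedding` would come from the downloaded Gemma checkpoint rather than `torch.randn`, but the lookup step `p.embedding[input_ids]` behaves the same either way.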
posts/gemma/
Transformer-based LLMs seem mysterious, but they don't need to be. In this post, we'll walk through a modern transformer LLM, Google's Gemma, providing bare-bones PyTorch code and some intuition for why each step is there. If you're a programmer and casual ML enthusiast, this is written for you.
https://graphcore-research.github.io/posts/gemma/?utm_source=substack&utm_medium=email