Add support for positions array in `keras_nlp.layers.RotaryEmbedding` layer #1571

tirthasheshpatel · 2024-04-10T21:16:56Z

Adds support for a positions array in our RotaryEmbedding layer. This is useful when non-arange position arrays are required, like the one used in this paper.
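For readers unfamiliar with the idea, here is a minimal pure-Python sketch of what an explicit positions array changes. It is not the layer's code: `rotary_angles`, `apply_rotary`, the `max_wavelength` default, and the split-halves pairing of features are assumptions made for illustration only.

```python
import math


def rotary_angles(positions, dim, max_wavelength=10000.0):
    """Return angles[p][i] = positions[p] * inverse_freq[i] for dim // 2 frequencies."""
    inverse_freq = [1.0 / (max_wavelength ** (2.0 * i / dim)) for i in range(dim // 2)]
    return [[float(p) * f for f in inverse_freq] for p in positions]


def apply_rotary(x, positions=None, start_index=0):
    """Rotate each row of x (seq_len x dim, dim even) by its position-dependent angles."""
    seq_len, dim = len(x), len(x[0])
    if positions is None:
        # Default path: consecutive positions, like arange(seq_len) + start_index.
        positions = [i + start_index for i in range(seq_len)]
    half = dim // 2
    out = []
    for row, row_angles in zip(x, rotary_angles(positions, dim)):
        first, second = row[:half], row[half:]
        # Assumed pairing: feature i in the first half rotates with feature i in the second half.
        rot_first = [a * math.cos(t) - b * math.sin(t) for a, b, t in zip(first, second, row_angles)]
        rot_second = [b * math.cos(t) + a * math.sin(t) for a, b, t in zip(first, second, row_angles)]
        out.append(rot_first + rot_second)
    return out


x = [[1.0, 0.0, 0.0, 1.0] for _ in range(3)]
default_out = apply_rotary(x)                      # implicit positions 0, 1, 2
custom_out = apply_rotary(x, positions=[0, 5, 5])  # repeated, non-consecutive positions
```

With explicit positions, two tokens that share a position receive the same rotation, which the default arange path cannot express.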

mattdangerw

thanks! couple tiny comments

mattdangerw · 2024-04-10T23:35:18Z

keras_nlp/layers/modeling/rotary_embedding.py

+            tensor = ops.cast(positions, dtype="float32")
+        else:
+            seq_len = ops.shape(inputs)[sequence_axis]
+            tensor = ops.arange(seq_len, dtype="float32") + start_index


Let's move this branch into its own sub-function?

That might make it easier to try overriding.

    if positions is None:
        positions = self._compute_positions(inputs, start_index=0)
    else:
        positions = ops.cast(positions, dtype="float32")

Since we are renaming the `tensor` var to `positions`, we can simply remove the `else` statement. Did that in the latest commit, let me know if it looks good!

I would actually keep the `self._compute_positions` overridable method for a somewhat hacky reason. Since we don't have great "layer wrapping" or generation mutation functionality right now, this allows for a patch on the layer itself that would control this computation.
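A rough sketch of the override pattern being described, in plain Python rather than Keras. Only the method name `_compute_positions` comes from this thread; `SketchRotaryPositions`, `resolve_positions`, `StridedPositions`, and the `seq_len` argument are hypothetical stand-ins, and the real method takes `inputs` instead.

```python
class SketchRotaryPositions:
    """Hypothetical stand-in for the layer; only the position logic is modeled."""

    def _compute_positions(self, seq_len, start_index=0):
        # Default behaviour: consecutive float positions offset by start_index.
        return [float(i + start_index) for i in range(seq_len)]

    def resolve_positions(self, seq_len, start_index=0, positions=None):
        if positions is None:
            # No explicit array given, so defer to the overridable hook.
            return self._compute_positions(seq_len, start_index)
        # Explicit positions win; cast to float, mirroring the ops.cast in the diff.
        return [float(p) for p in positions]


class StridedPositions(SketchRotaryPositions):
    """Hypothetical patch: space the default positions two apart."""

    def _compute_positions(self, seq_len, start_index=0):
        return [float(2 * i + start_index) for i in range(seq_len)]


base = SketchRotaryPositions()
patched = StridedPositions()
base_default = base.resolve_positions(4, start_index=1)
patched_default = patched.resolve_positions(4, start_index=1)
explicit = patched.resolve_positions(3, positions=[7, 7, 9])
```

The subclass changes only the default path; callers that pass `positions` explicitly bypass the hook entirely.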


Add support for positions array in our RotaryEmbedding layer

7c368d7

tirthasheshpatel requested a review from mattdangerw April 10, 2024 21:16

Fix formatting issue and GPU bug

532e81a

mattdangerw reviewed Apr 10, 2024


tirthasheshpatel added 2 commits April 11, 2024 01:51

Rename tensor to positions, simplify branching

bac2bb2

Move default positions computation to '_compute_positions'

e73a146

mattdangerw merged commit ab8d951 into keras-team:master Apr 11, 2024
