Add offline dataset generation #39
base: main
Conversation
Signed-off-by: Varun Khare <varun.khare@nimbleedgehq.ai>
for i, layer in enumerate(model.model.layers):

    # Capture hidden states before MLP
    handle = layer.register_forward_hook(
Won't this capture the states pre-attention?
Yup, that is what Deja Vu recommends for better parallelisation. I can move it to post layer normalization too.
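For context, a minimal sketch of the two placements being discussed, assuming a LLaMA-style layer layout (the `post_attention_layernorm` name and the checkpoint are assumptions for illustration, not necessarily what this PR uses):

```python
from transformers import AutoModelForCausalLM

# Any LLaMA-family checkpoint works for illustration.
model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
captured = {}

def save_layer_input(key):
    # A forward pre-hook sees the layer *input*, i.e. the hidden state
    # before self-attention (the pre-attention capture discussed above).
    def hook(module, args):
        captured[key] = args[0].detach().cpu()
    return hook

def save_pre_mlp(key):
    # Hooking post_attention_layernorm instead captures the normalized
    # hidden state right before the MLP (the alternative placement).
    def hook(module, args, output):
        captured[key] = output.detach().cpu()
    return hook

for i, layer in enumerate(model.model.layers):
    layer.register_forward_pre_hook(save_layer_input(f"layer_{i}_pre_attn"))
    layer.post_attention_layernorm.register_forward_hook(save_pre_mlp(f"layer_{i}_pre_mlp"))
```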
# Capture MLP gate activations (after activation function)
if hasattr(layer.mlp, 'gate_proj'):
    handle = layer.mlp.gate_proj.register_forward_hook(
Won't this capture the states before the activation function rather than after?
This will return the output of the gate_proj and up_proj linear layers. We compute the activations separately before storing them.
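Roughly, the flow described here looks like the sketch below: hook the linear projections, then apply the activation when combining afterwards. The dictionary keys and the checkpoint are placeholders, not the PR's actual naming.

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
mlp_activations = {}

def save_projection(key):
    # Forward hook on an nn.Linear: `output` is the raw projection output,
    # i.e. gate_proj / up_proj *before* any activation is applied.
    def hook(module, args, output):
        mlp_activations[key] = output.detach().cpu()
    return hook

for i, layer in enumerate(model.model.layers):
    layer.mlp.gate_proj.register_forward_hook(save_projection(f"layer_{i}_gate"))
    layer.mlp.up_proj.register_forward_hook(save_projection(f"layer_{i}_up"))

# After a forward pass, the activation is applied only when combining, e.g.:
# gated = F.silu(mlp_activations["layer_0_gate"]) * mlp_activations["layer_0_up"]
```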
up_act = self.mlp_activations[up_key]

# Apply SwiGLU activation: silu(gate) * up
gated_act = F.silu(gate_act) * up_act
Should have support for non-silu activations here - OPT uses ReLU at the very least
++ Will take the activation function as an input.
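Something along these lines, sketched with a hypothetical `act_fn_name` argument (the names are illustrative, not the final API):

```python
import torch.nn.functional as F

# Hypothetical mapping from a config string to the activation function.
ACT_FNS = {
    "silu": F.silu,   # LLaMA-style gated MLP
    "relu": F.relu,   # OPT
    "gelu": F.gelu,
}

def combine_mlp_activations(gate_act, up_act=None, act_fn_name="silu"):
    """Apply the configured activation instead of hard-coding SiLU."""
    act_fn = ACT_FNS[act_fn_name]
    if up_act is not None:
        return act_fn(gate_act) * up_act   # gated MLP: act(gate) * up
    return act_fn(gate_act)                # plain MLP (e.g. OPT fc1 output)
```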
self.remove_hooks()

# Hook into each transformer layer
for i, layer in enumerate(model.model.layers):
model.model.layers will cause problems with OPT due to it having an inconvenient wrapper around the decoder, but there are a few ways of fixing this - I can either refactor the OPT code or we can have some kind of model.get_layers() method that we call here.
model.get_layers() looks like the more idiomatic approach.
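For illustration, a hypothetical helper along these lines could hide the wrapper difference (the attribute paths follow the HuggingFace LLaMA and OPT layouts):

```python
def get_layers(model):
    """Return the decoder layer list regardless of wrapper layout (hypothetical helper)."""
    if hasattr(model, "model"):
        if hasattr(model.model, "layers"):       # LLaMA-style: model.model.layers
            return model.model.layers
        if hasattr(model.model, "decoder"):      # OPT-style: model.model.decoder.layers
            return model.model.decoder.layers
    raise ValueError(f"Don't know how to find decoder layers for {type(model).__name__}")

# Hook registration would then become:
# for i, layer in enumerate(get_layers(model)):
#     ...
```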
Description
Adds offline dataset generation for faster predictor training cycles.
generate_dataset.py generates HuggingFace-compatible sparsity datasets for the predictors.
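For a sense of the output format, here is a rough sketch of packaging captured activations into a HuggingFace dataset; the field names and shapes are assumptions, not the actual schema produced by generate_dataset.py.

```python
import torch
from datasets import Dataset

def to_records(hidden_states, mlp_activations):
    """Flatten {key: tensor} dicts of captured states into dataset rows."""
    rows = []
    for key, hidden in hidden_states.items():
        rows.append({
            "layer": key,
            "hidden_state": hidden.float().flatten().tolist(),
            "mlp_activation": mlp_activations[key].float().flatten().tolist(),
        })
    return rows

# Dummy tensors standing in for the values captured by the forward hooks.
hidden_states = {"layer_0": torch.randn(1, 8, 16)}
mlp_activations = {"layer_0": torch.randn(1, 8, 64)}

ds = Dataset.from_list(to_records(hidden_states, mlp_activations))
ds.save_to_disk("sparsity_dataset")  # predictors can load it back with datasets.load_from_disk(...)
```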