
Conversation


Copilot AI commented Oct 13, 2025

Problem

RR-ClArC's fine-tuning procedure was incorrectly optimizing all model parameters when it should only update the layers from the CAV layer onwards. According to the original paper, and as spotted by @kasia284, the method should freeze all layers before the one where the gradient reprojection is applied during fine-tuning.

Solution

This PR modifies the apply_model_correction() method in RRCLARC to:

  1. Identify layer ordering - Determine which layers come before the specified cav_layer in the model's execution order
  2. Freeze early layers - Set requires_grad=False for all parameters belonging to modules before cav_layer
  3. Selective optimization - Update the optimizer to only include trainable parameters (from cav_layer onwards)
  4. State restoration - Restore the original requires_grad state after fine-tuning completes
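
A minimal sketch of this freeze/restore pattern, assuming a plain PyTorch `nn.Module` whose child registration order matches execution order (`freeze_layers_before` and `restore_requires_grad` are illustrative helpers, not the actual RRCLARC code):

```python
import torch.nn as nn


def freeze_layers_before(model: nn.Module, cav_layer: str) -> dict:
    """Freeze all parameters of top-level modules that precede `cav_layer`.

    Returns the original requires_grad flags so they can be restored later.
    """
    original_state = {name: p.requires_grad for name, p in model.named_parameters()}

    # Child modules registered before the CAV layer (registration order is
    # assumed to match execution order here).
    frozen_prefixes = []
    for name, _ in model.named_children():
        if name == cav_layer:
            break
        frozen_prefixes.append(name)

    # Freeze every parameter under those modules; the prefix match also
    # covers nested parameters such as "layer0.conv.weight".
    for name, param in model.named_parameters():
        if any(name == pre or name.startswith(pre + ".") for pre in frozen_prefixes):
            param.requires_grad = False

    return original_state


def restore_requires_grad(model: nn.Module, original_state: dict) -> None:
    """Restore the requires_grad flags captured before fine-tuning."""
    for name, param in model.named_parameters():
        param.requires_grad = original_state[name]
```

The fine-tuning optimizer is then built only from parameters that are still trainable, so the frozen layers never receive updates.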

Example

For a model with layers [layer0, layer1, layer2, layer3] and cav_layer='layer1':

  • Before: All layers were updated during fine-tuning
  • After: Only layer1, layer2, and layer3 are updated; layer0 is frozen
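
As a toy illustration of that behaviour, re-using the hypothetical `freeze_layers_before` helper from the sketch above:

```python
import torch.nn as nn


class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer0 = nn.Linear(8, 8)
        self.layer1 = nn.Linear(8, 8)
        self.layer2 = nn.Linear(8, 8)
        self.layer3 = nn.Linear(8, 2)

    def forward(self, x):
        return self.layer3(self.layer2(self.layer1(self.layer0(x))))


model = ToyModel()
state = freeze_layers_before(model, cav_layer="layer1")

trainable = sorted(n for n, p in model.named_parameters() if p.requires_grad)
# -> ['layer1.bias', 'layer1.weight', 'layer2.bias', 'layer2.weight',
#     'layer3.bias', 'layer3.weight']; layer0.* is frozen.
```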

Implementation Details

The freezing logic properly handles:

  • Nested modules - Parameters like layer0.conv.weight are correctly frozen when layer0 should be frozen
  • Edge cases - Top-level parameters are handled appropriately
  • State management - Original requires_grad values are restored after training to avoid side effects
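
For the selective-optimization and state-management steps, the pattern is roughly the following (SGD and the learning rate are placeholders; the library's actual optimizer setup may differ):

```python
import torch

# Only parameters that remain trainable (CAV layer onwards) go to the optimizer.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

try:
    # ... RR-ClArC fine-tuning loop using `optimizer` goes here ...
    pass
finally:
    # Leave the model's trainable state exactly as it was before the correction.
    restore_requires_grad(model, state)
```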

Testing

Comprehensive tests verify:

  • ✅ Frozen layers are not updated during fine-tuning
  • ✅ Trainable layers are correctly updated
  • ✅ Works with different CAV layer positions (first, middle, last layer)
  • ✅ Handles nested module structures correctly
  • ✅ Original parameter states are properly restored
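
A simplified test along these lines (a sketch building on the toy model and helpers above, not the repository's actual test suite) checks that frozen layers stay put, trainable layers move, and the requires_grad flags are restored:

```python
import copy

import torch
import torch.nn.functional as F


def test_layers_before_cav_layer_stay_frozen():
    torch.manual_seed(0)
    model = ToyModel()
    before = copy.deepcopy(model.state_dict())

    state = freeze_layers_before(model, cav_layer="layer1")
    optimizer = torch.optim.SGD(
        (p for p in model.parameters() if p.requires_grad), lr=0.1
    )

    # One fine-tuning step on random data.
    x, y = torch.randn(4, 8), torch.randn(4, 2)
    loss = F.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    restore_requires_grad(model, state)

    # Frozen layer is untouched, later layers moved, flags are restored.
    assert torch.equal(model.layer0.weight, before["layer0.weight"])
    assert not torch.equal(model.layer1.weight, before["layer1.weight"])
    assert all(p.requires_grad for p in model.parameters())
```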

Impact

  • This fix is specific to RR-ClArC and does not affect the other ClArC variants (A-ClArC, P-ClArC)
  • The change aligns the implementation with the original paper's specification
  • Expected to have only a minor practical impact, since the default configuration uses a single fine-tuning epoch

Fixes the issue reported by @kasia284 regarding incorrect layer freezing in the RR-ClArC fine-tuning procedure.

Original prompt

This section details the original issue you should resolve

<issue_title>RR-ClArC should freeze layers up to l-1 while finetuning</issue_title>
<issue_description>Full credit for spotting this to @kasia284.

In RR-ClArC the fine-tuning procedure should freeze all layers before the layer on which we apply the gradient reprojection. It's probably a minor error, because there's only one fine-tuning epoch, but still definitely worth fixing.</issue_description>

<agent_instructions>Check out the original paper, double check that the code you change doesn't interfere with other important execution traces!</agent_instructions>

Comments on the Issue (you are @copilot in this section)

Fixes #23


