Add support LoRA layer keys from adapter config by petrukha-ivan · Pull Request #424 · ml-explore/mlx-swift-examples

petrukha-ivan · 2025-10-22T18:01:19Z

What

Added support for the keys property in the adapter config. Previously, these keys were ignored, and hardcoded values were used instead. With this change, LoRAConfiguration now correctly applies adapters based on the config-defined keys, and hardcoded keys have been removed from model implementations.

By default, models now target all Linear layers when keys are not specified. This aligns with recent changes in the Python package ml-explore/mlx-lm#515. For example, running the example LoRA training code (without specifying keys) using the latest Python package will create LoRA adapters for all self_attn.*_proj and mlp.*_proj layers. In older versions, only self_attn.q_proj and self_attn.v_proj were targeted.

Resolves

#405

davidkoski · 2025-10-23T16:02:54Z

-    func loraLinearLayers() -> LoRALinearLayers
+    /// Typically, this includes all transformer layers.
+    /// Must be defined explicitly since we can't unify it across all models.
+    var loraLayers: [Module] { get }


Overall I like what you built! I wonder if we could make this a non-breaking change?

For example, the old function returned [(Module, [String])] which is pretty close to [Module] + [String]. We could have default implementations of these new properties use that and split it up. I am trying to think how that would go -- do we need a new protocol for the new methods and a conditional implementation for the old protocol? Deprecate the old method/protocol. Make extension LanguageModel where Self: LoRAModel conform to the new protocol (not even sure that is possible).

Anyway, the change for each model is trivial but it would be nice to make it non-breaking if we could.

Thanks for your feedback! After thinking about it, I don’t see a way to implement this without introducing breaking changes. Could you clarify which part you’d like to keep from breaking? I’ll explain what currently breaks and why it can’t work without changing the core logic.

1. Adapters Compatibility

When keys are not specified the Python package behaves differently before and after 0.28.3:

mlx-lm Default LoRA layers

<0.28.3 self_attn.q/v_proj

≥0.28.3 self_attn.q/v/k/o_proj, mlp.up/gate/down_proj

The current Swift implementation mirrors the older Python behavior:

public func loraLinearLayers() -> LoRALinearLayers { model.layers.map { ($0.attention, ["q_proj", "v_proj"]) } }

If we keep the Swift code as is, it won’t work with adapters trained using the latest mlx-lm Python package. Even though both versions treat keys as null, their behavior differs. Since the Python version already introduced a breaking change, the Swift version has to do the same to stay aligned. Unfortunately, that means breaking compatibility with adapters trained using mlx-lm versions earlier than 0.28.3.

2. Swift Code Compatibility

The old code returns an array of (Module, [String]) pairs, which describes a module and its child keys. This works with self_attn and we can even somehow extend it to support all self_attn.q/v/k/o_proj layers. But if we want to adapt the mlp part, this structure becomes a problem.

public func loraLinearLayers() -> LoRALinearLayers { model.layers.map { ($0.attention, ["q_proj", "k_proj", "v_proj", "o_proj"]) } // How to include `mlp`? }

We could try to aggregate it like in the issue workaround:

public func loraLinearLayers() -> LoRALinearLayers { let attention = model.layers.map { ($0.attention, ["q_proj", "k_proj", "v_proj", "o_proj"]) } let mlp = model.layers.map { ($0.mlp, ["up_proj", "gate_proj", "down_proj"]) } return attention + mlp // This looks fine but it won't work }

But this approach is actually buggy because when we take the last num_layers specified in the adapter config, we only get part of the required modules:

let layersToAdapt = model // Assume we have a model with 36 transformer blocks .loraLinearLayers() // 72 items total (36 attention + 36 mlp) .suffix(config.numLayers) // Suffixing 36 items only gives a subset of the modules

So while [(Module, [String])] looks similar to [Module] + [String] they actually differ. In the updated code, we need the LoRAModel protocol only to unify layers property access across different models, since it varies between model.layers, transformer.layers, languageModel.model.layers, etc.

We could provide default implementations that split this up.

The old implementation can’t be split into the new structure because it’s conceptually incompatible, and for the same reason, we can’t provide default values based on the old code. Old code returns very specific parts instead of complete transformer blocks. Given these issues, I don’t see a way to achieve correct behavior without updating each model implementation.

OK, I buy that, but I am still worried about breakage. We can certainly ship it as-is and just document the changes, but ideally people could upgrade into it as needed.

What if we left the existing protocol and made a new protocol and made both work? The downside here is that it might be confusing which one to use. I guess we could deprecate the old one to suggest to avoid it.

Another problem might be that the structure of the LoRA layers isn't full encapsulated, this is in the calling code:

LoRATrain.convert(model: context.model, layers: lora.loraLinearLayers(loraLayers))

and with your change:

// Load LoRA adapter from directory or create a new one let modelAdapter: ModelAdapter do { modelAdapter = try LoRAContainer.from(directory: adapter) } catch { modelAdapter = try await modelContainer.perform { context in return try LoRAContainer.from( model: context.model, configuration: LoRAConfiguration(numLayers: loraLayers))

it is entirely inside the LoRAContainer.from method -- if we update this in the future it will be contained in there.

So I think I am in agreement with what you have done -- we will just document the breaking change and how to adopt it.

davidkoski · 2025-10-28T16:09:38Z

FastVLM has been merged (#423) since this started -- can you rebase with main and convert that one? Thanks!

davidkoski

Changes look great -- just waiting on FastVLM (has been added since this started) to be added or build will break on main.

petrukha-ivan · 2025-10-28T17:44:48Z

@davidkoski done 👍

davidkoski

Thank you for this huge effort!

* Add LoRA layer keys from config Note: this is a source breaking change to adapt to a change on the mlx-lm side. Previously you would write: ```swift public func loraLinearLayers() -> LoRALinearLayers { model.layers.map { ($0.selfAttn, ["q_proj", "v_proj"]) } } ``` now this should read: ```swift public var loraLayers: [Module] { model.layers } ```

davidkoski reviewed Oct 23, 2025

View reviewed changes

davidkoski requested changes Oct 28, 2025

View reviewed changes

petrukha-ivan added 5 commits October 28, 2025 18:39

Add LoRA layer keys from config

9b6c031

Fix tests and examples

7cca208

Apply swift-format

833247b

Fix adapter fusion

ff3f5d5

Add FastVLM LoRA layers

4c548ec

petrukha-ivan force-pushed the lora-layer-keys-config branch from 74315b7 to 4c548ec Compare October 28, 2025 17:43

davidkoski approved these changes Oct 29, 2025

View reviewed changes

davidkoski merged commit 881ad5a into ml-explore:main Oct 29, 2025
4 checks passed

davidkoski mentioned this pull request Dec 16, 2025

Support for loading LoRA layers from adapter config in Swift #405

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support LoRA layer keys from adapter config#424

Add support LoRA layer keys from adapter config#424
davidkoski merged 5 commits into
ml-explore:mainfrom
petrukha-ivan:lora-layer-keys-config

petrukha-ivan commented Oct 22, 2025

Uh oh!

davidkoski Oct 23, 2025

Uh oh!

petrukha-ivan Oct 27, 2025 •

edited

Loading

Uh oh!

davidkoski Oct 27, 2025

Uh oh!

davidkoski Oct 27, 2025

Uh oh!

davidkoski Oct 27, 2025

Uh oh!

davidkoski commented Oct 28, 2025

Uh oh!

davidkoski left a comment

Uh oh!

petrukha-ivan commented Oct 28, 2025

Uh oh!

davidkoski left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

mlx-lm	Default LoRA layers
<0.28.3	`self_attn.q/v_proj`
≥0.28.3	`self_attn.q/v/k/o_proj`, `mlp.up/gate/down_proj`

Conversation

petrukha-ivan commented Oct 22, 2025

What

Resolves

Uh oh!

davidkoski Oct 23, 2025

Choose a reason for hiding this comment

Uh oh!

petrukha-ivan Oct 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

1. Adapters Compatibility

2. Swift Code Compatibility

Uh oh!

davidkoski Oct 27, 2025

Choose a reason for hiding this comment

Uh oh!

davidkoski Oct 27, 2025

Choose a reason for hiding this comment

Uh oh!

davidkoski Oct 27, 2025

Choose a reason for hiding this comment

Uh oh!

davidkoski commented Oct 28, 2025

Uh oh!

davidkoski left a comment

Choose a reason for hiding this comment

Uh oh!

petrukha-ivan commented Oct 28, 2025

Uh oh!

davidkoski left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

petrukha-ivan Oct 27, 2025 •

edited

Loading