A robust, Flux-aware merge node for ComfyUI with deterministic DARE, extra merge modes, a block-weight preset library, regex and model-based masks, safety skip/lock toggles, and an analysis-only dry run. Designed for FLUX-style UNets (and generally compatible with SD-like networks that follow similar block naming).
This node prioritizes predictability, reproducibility, and VRAM sanity. You choose the base model, control per-block weights (or use one-click presets), and get a clean JSON summary with fingerprints so you can re-run the exact merge later.
1. Install
   - Copy `tenos_merge.py` into `ComfyUI/custom_nodes/`
   - Fully restart ComfyUI (kill the process; don’t just “Reload Nodes”)
2. Add the node
   - Right-click → Add Node → Tenos.ai → Tenosai Merge (Flux+ Deterministic)
3. Wire models
   - `model1` = primary, `model2` = secondary
4. Pick base & mode
   - `base_model_choice` = the weights you start from
   - `merge_mode` = algorithm (see “Merge Modes”)
5. Choose weights
   - Pick a preset (`block_preset` + toggle `apply_preset`), or
   - Dial block sliders manually (0.0 = keep base, 1.0 = take secondary)
6. (Optional) Safety & Masks
   - Use `skip_*` / `lock_*` toggles to protect sensitive parts
   - Use a `mask_model` or `mask_regex` to target specific params
7. Run
   - `MODEL` → merged model (send to KSampler or SaveModel)
   - `STRING` → JSON summary (audit + reproducibility)

In short: put `tenos_merge.py` in `ComfyUI/custom_nodes/`, restart ComfyUI, and the node appears as “Tenosai Merge (Flux+ Deterministic)”.
- Optimized for Flux-style UNets; generally works with SD-like models using `input_blocks.*`, `output_blocks.*`, and `middle_block` (or `mid_block`)
- `model1`, `model2` (MODEL) – the two models to be merged
- `base_model_choice` (`model1` | `model2`) – which model’s weights you start from
- `merge_mode`:
  - `simple` – Linear interpolation (lerp)
  - `dare` – Deterministic DARE: prune the smallest diffs, merge the rest
  - `weighted_sum` – `weight_1*model1 + (1-weight_1)*model2` (independent of base)
  - `sigmoid_average` – Nonlinear remap of the amount via a sigmoid for softer blending
  - `tensor_addition` – Explicit delta: `t1 + amount * (t2 - t1)`
  - `difference_maximization` – Emphasize where `|t2| > |t1|`
  - `auto_similarity` – Per-param cosine similarity scales the effective blend automatically
- `block_preset` (Custom, Balanced, Style-lean, Subject-lean, Text-obedient, Structure-keeper, Detail-boost, Minimal-change)
- `apply_preset` (bool) — when on, the preset overrides all sliders before merging
| Preset | Intent |
|---|---|
| Balanced | Neutral 50/50 baseline |
| Style-lean | Lean into style: middle + upsampling |
| Subject-lean | Lean into subject/identity: downsampling |
| Text-obedient | Stronger prompt adherence (conditioning-heavy) |
| Structure-keeper | Keep composition/structure; conservative everywhere |
| Detail-boost | Sharpen details; upsampling heavy |
| Minimal-change | Very light touch; exploratory |
Exact preset weights (0.0 keep base ←→ 1.0 take secondary):
```json
{
"Balanced": {
"Image Hint": 0.50, "Timestep Embedding": 0.50, "Text Conditioning": 0.50,
"Early Downsampling (Composition)": 0.50, "Mid Downsampling (Subject & Concept)": 0.50, "Late Downsampling (Refinement)": 0.50,
"Core Middle Block": 0.50, "Early Upsampling (Initial Style)": 0.50, "Mid Upsampling (Details)": 0.50, "Late Upsampling (Final Textures)": 0.50,
"Final Output Layer": 0.50, "Other": 0.50
},
"Style-lean": {
"Image Hint": 0.50, "Timestep Embedding": 0.50, "Text Conditioning": 0.50,
"Early Downsampling (Composition)": 0.30, "Mid Downsampling (Subject & Concept)": 0.40, "Late Downsampling (Refinement)": 0.50,
"Core Middle Block": 0.80, "Early Upsampling (Initial Style)": 0.75, "Mid Upsampling (Details)": 0.85, "Late Upsampling (Final Textures)": 0.90,
"Final Output Layer": 0.60, "Other": 0.50
},
"Subject-lean": {
"Image Hint": 0.50, "Timestep Embedding": 0.50, "Text Conditioning": 0.50,
"Early Downsampling (Composition)": 0.80, "Mid Downsampling (Subject & Concept)": 0.80, "Late Downsampling (Refinement)": 0.70,
"Core Middle Block": 0.60, "Early Upsampling (Initial Style)": 0.45, "Mid Upsampling (Details)": 0.40, "Late Upsampling (Final Textures)": 0.35,
"Final Output Layer": 0.50, "Other": 0.50
},
"Text-obedient": {
"Image Hint": 0.25, "Timestep Embedding": 0.60, "Text Conditioning": 0.85,
"Early Downsampling (Composition)": 0.55, "Mid Downsampling (Subject & Concept)": 0.55, "Late Downsampling (Refinement)": 0.50,
"Core Middle Block": 0.55, "Early Upsampling (Initial Style)": 0.50, "Mid Upsampling (Details)": 0.50, "Late Upsampling (Final Textures)": 0.50,
"Final Output Layer": 0.50, "Other": 0.50
},
"Structure-keeper": {
"Image Hint": 0.30, "Timestep Embedding": 0.40, "Text Conditioning": 0.50,
"Early Downsampling (Composition)": 0.20, "Mid Downsampling (Subject & Concept)": 0.25, "Late Downsampling (Refinement)": 0.30,
"Core Middle Block": 0.40, "Early Upsampling (Initial Style)": 0.40, "Mid Upsampling (Details)": 0.40, "Late Upsampling (Final Textures)": 0.40,
"Final Output Layer": 0.40, "Other": 0.20
},
"Detail-boost": {
"Image Hint": 0.50, "Timestep Embedding": 0.50, "Text Conditioning": 0.50,
"Early Downsampling (Composition)": 0.35, "Mid Downsampling (Subject & Concept)": 0.40, "Late Downsampling (Refinement)": 0.45,
"Core Middle Block": 0.65, "Early Upsampling (Initial Style)": 0.75, "Mid Upsampling (Details)": 0.85, "Late Upsampling (Final Textures)": 0.90,
"Final Output Layer": 0.65, "Other": 0.50
},
"Minimal-change": {
"Image Hint": 0.10, "Timestep Embedding": 0.10, "Text Conditioning": 0.10,
"Early Downsampling (Composition)": 0.10, "Mid Downsampling (Subject & Concept)": 0.10, "Late Downsampling (Refinement)": 0.10,
"Core Middle Block": 0.10, "Early Upsampling (Initial Style)": 0.10, "Mid Upsampling (Details)": 0.10, "Late Upsampling (Final Textures)": 0.10,
"Final Output Layer": 0.10, "Other": 0.10
}
}
```

Presets don’t bypass safety toggles. If `skip_norms` is on, those params stay from the base regardless of preset weights.
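In sketch form, the override semantics are simply (helper name hypothetical):

```python
def effective_weights(sliders: dict, preset_weights: dict, apply_preset: bool) -> dict:
    # When apply_preset is on, the preset's per-block weights replace the
    # slider values; safety toggles are still honored downstream.
    return dict(preset_weights) if apply_preset else dict(sliders)
```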
- Image Hint – IP-Adapter / control influence
- Timestep Embedding – how the model interprets noise level
- Text Conditioning – prompt adherence
- Early Downsampling (Composition) – composition & layout
- Mid Downsampling (Subject & Concept) – subject identity & concepts
- Late Downsampling (Refinement) – pre-style refinement
- Core Middle Block – global style / identity
- Early/Mid/Late Upsampling – detail creation & textures
- Final Output Layer – output head / latent projection
- Other – anything not matched by the heuristics
Rule of thumb: Downsampling = what, Upsampling = how it looks, Middle = style core.
Context: Every block has a slider (0.0–1.0) used as the blend amount. Some modes add extra knobs. The slider and those knobs work together as described below.
**`simple`**
- What: Straight crossfade between base and secondary (sketched below).
- Uses slider? Yes (directly).
- Extra knobs: none.
- Use when: Baseline blending; “more of model2 here.”
- Try: 0.2–0.6 on the blocks you care about.
- Pitfall: If the models disagree wildly, results can look muddy → consider `auto_similarity` or `dare`.
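A minimal per-tensor sketch of this mode, assuming torch tensors (function name hypothetical):

```python
import torch

def simple_merge(t1: torch.Tensor, t2: torch.Tensor, amount: float) -> torch.Tensor:
    # amount is the block slider: 0.0 keeps base, 1.0 takes secondary.
    return torch.lerp(t1.float(), t2.float(), amount).to(t1.dtype)
```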
**`weighted_sum`**
- What: `final = weight_1*model1 + (1-weight_1)*model2` for every merged block (sketched below).
- Uses slider? Gate only. A slider > 0 applies the same global mix; 0 keeps base.
- Extra knobs: `weight_1` (0..1) — always refers to model1.
- Use when: You want a clean, consistent “70/30”-style global blend.
- Try: `weight_1 = 0.5` to start; nudge to 0.6–0.8 if the base should dominate.
- Pitfall: Sliders don’t shape the ratio (beyond on/off). For per-block nuance, use `simple` / `auto_similarity`.
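A per-tensor sketch, assuming torch tensors (name hypothetical); note the ratio is global and is applied wherever the block slider gates it on:

```python
import torch

def weighted_sum_merge(t1: torch.Tensor, t2: torch.Tensor, weight_1: float) -> torch.Tensor:
    # weight_1 always scales model1, regardless of which model is the base.
    return (weight_1 * t1.float() + (1.0 - weight_1) * t2.float()).to(t1.dtype)
```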
**`sigmoid_average`**
- What: Same as `simple`, but remaps the slider through a sigmoid (sketched below).
- Uses slider? Yes (curved).
- Extra knobs: `sigmoid_strength` (higher = steeper middle).
- Use when: Softer, less brittle transitions (style transfer without harsh artifacts).
- Try: sliders 0.3–0.7; `sigmoid_strength = 2–4`.
- Pitfall: Very high strength “snaps” near the middle.
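One plausible form of the remap (the exact curve used by the node is an assumption): a logistic function centered at 0.5, normalized so the endpoints stay fixed:

```python
import math

def sigmoid_remap(amount: float, strength: float) -> float:
    # Logistic curve centered at 0.5; normalized so 0.0 -> 0.0 and 1.0 -> 1.0.
    sig = lambda x: 1.0 / (1.0 + math.exp(-x))
    lo, hi = sig(-0.5 * strength), sig(0.5 * strength)
    return (sig(strength * (amount - 0.5)) - lo) / (hi - lo)

# Higher strength steepens the middle: sigmoid_remap(0.55, 8) > sigmoid_remap(0.55, 2)
```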
**`tensor_addition`**
- What: `final = t1 + amount * (t2 - t1)` (equivalent to `simple`, framed as applying a delta).
- Uses slider? Yes (linearly).
- Extra knobs: none.
- Use when: You think in terms of “add model2’s change” on top of model1.
- Try: 0.2–0.5.
**`difference_maximization`**
- What: Where `|t2| > |t1|`, lean toward model2; otherwise keep model1. The slider controls the strength of the leaning (sketched below).
- Uses slider? Yes (strength).
- Extra knobs: none.
- Use when: Push style/details where model2 is clearly stronger without washing out model1 elsewhere.
- Try: 0.2–0.5 on mid/late upsampling; keep `skip_norms` on.
- Pitfall: Can amplify noise with incompatible models.
**`auto_similarity`**
- What: Per-param cosine similarity; similar tensors → small change, different tensors → larger change. Effective amount = slider × auto-weight (sketched below).
- Uses slider? Yes (scales the auto weight).
- Extra knobs: `auto_k` (higher = more aggressive when the difference is high).
- Use when: A great default: preserve stable layers, lean into meaningful differences.
- Try: sliders 0.25–0.6; `auto_k = 4–8`.
- Pitfall: If the models are wildly different, it will go big everywhere → lower the sliders or `auto_k`.
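A sketch of how the effective amount could be derived; the exact auto-weight curve is an assumption, but it rises with dissimilarity and saturates, with `auto_k` sharpening the response:

```python
import math
import torch
import torch.nn.functional as F

def auto_similarity_amount(t1: torch.Tensor, t2: torch.Tensor,
                           slider: float, auto_k: float) -> float:
    cos = F.cosine_similarity(t1.flatten().float(), t2.flatten().float(), dim=0)
    dissim = (1.0 - cos.item()) / 2.0              # 0 = identical, 1 = opposite
    return slider * (1.0 - math.exp(-auto_k * dissim))
```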
**`dare`**
- What (sketched below):
  1. Compute `diff = |t2 - t1|`
  2. Prune the smallest diffs via quantile (`dare_prune_amount`)
  3. On the unpruned elements, blend using slider × `dare_merge_amount`

  Deterministic: the quantile is computed in float32 for stable thresholds.
- Uses slider? Yes (strength on the kept elements).
- Extra knobs:
  - `dare_prune_amount` (0..1): fraction of the smallest diffs to ignore
  - `dare_merge_amount` (0..1): cap/scale of merge strength on the kept parts
- Use when: Bringing a fine-tune into a base while keeping only the important changes.
- Try: `dare_prune_amount = 0.05–0.15`, `dare_merge_amount = 1.0`; normal sliders in the blocks you want.
- Pitfall: Too much prune = “nothing changed”; too little = a regular blend. Keep `skip_norms` on.
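A per-tensor sketch of the three steps, assuming torch tensors (name hypothetical):

```python
import torch

def dare_merge(t1: torch.Tensor, t2: torch.Tensor, amount: float,
               prune_amount: float, merge_amount: float) -> torch.Tensor:
    diff = t2.float() - t1.float()                 # math in float32
    # torch.quantile has an element-count limit; a real implementation may
    # need to chunk or sample very large tensors.
    thresh = torch.quantile(diff.abs().flatten(), prune_amount)
    keep = (diff.abs() > thresh).float()           # prune the smallest diffs
    return (t1.float() + keep * diff * (amount * merge_amount)).to(t1.dtype)
```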
| Mode | Does the block slider matter? | How it’s used |
|---|---|---|
| `simple` | Yes | Direct linear blend |
| `weighted_sum` | Gate only | > 0 = apply global `weight_1`; 0 = keep base |
| `sigmoid_average` | Yes | Blended through a sigmoid curve (softer tails) |
| `tensor_addition` | Yes | Add scaled delta `(t2 - t1)` |
| `difference_maximization` | Yes | Strength where `\|t2\| > \|t1\|` |
| `auto_similarity` | Yes | Scales auto weight from cosine similarity |
| `dare` | Yes | Strength only on unpruned elements (after prune) |
| Toggle | Protects (keeps base) | Why/When |
|---|---|---|
| `skip_bias` | All `.bias` params | Reduce drift; maintain activation centering |
| `skip_norms` | LayerNorm, GroupNorm, BatchNorm (weights + biases) | Preserve calibration; strongly recommended |
| `lock_time_embed` | Timestep/time-embedding layers | Keep denoising-schedule interpretation stable |
| `lock_conditioner` | Text conditioning & related projections | Maintain prompt adherence while changing style/identity elsewhere |
| `lock_output_layer` | Final projection / head (e.g., `to_rgb`, `out.`, `final_*`) | Avoid breaking the last conversion stage |

Tip: Turn on `skip_norms` by default. Use the locks when you want minimum drift in those areas.
Semantics: after computing a candidate merged value for a parameter, the mask blends it with the base value:
```python
final = lerp(base_param, merged_param, mask)
# mask = 0.0 → keep base; mask = 1.0 → take merged
```

- `mask_model` — a third model whose tensors (matched by the same name) act as masks.
- `mask_regex` + `mask_value` — create a constant mask where the param name matches the regex.

Precedence: `mask_model` wins for the names it covers; the regex applies to the rest.
Broadcasting supported:
- exact same shape → used as-is
- `[C,1,1]` or `[C]` → broadcast to conv weights `[C,H,W,...]`
- scalar → broadcast to any shape

If a mask can’t be broadcast, it’s ignored (with a warning in the console); see the sketch below.
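A minimal sketch of how such a mask could be applied with broadcasting, assuming torch tensors (the helper name `apply_mask` is hypothetical):

```python
import torch

def apply_mask(base: torch.Tensor, merged: torch.Tensor,
               mask: torch.Tensor) -> torch.Tensor:
    # A 1-D channel mask [C] is reshaped to [C,1,...,1] so it broadcasts
    # over conv weights; scalars and matching shapes broadcast directly.
    if mask.ndim == 1 and base.ndim > 1 and mask.shape[0] == base.shape[0]:
        mask = mask.view(-1, *([1] * (base.ndim - 1)))
    try:
        mask = mask.to(base.dtype).expand_as(base)
    except RuntimeError:
        print("mask not broadcastable; ignoring")  # mirrors the documented warning
        return merged                              # merged value passes through unmasked
    return torch.lerp(base, merged, mask)          # 0 → keep base, 1 → take merged
```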
- Protect all norms (keep base):
  `mask_regex: (?i)(layernorm|groupnorm|batchnorm|ln_|norm)`, `mask_value: 0.0`
- Force the final head from secondary:
  `mask_regex: (?i)(to_rgb|out\.|final_layer|output_layer)`, `mask_value: 1.0`
- Freeze text conditioning:
  `mask_regex: (?i)(text|cond|token|clip|context)`, `mask_value: 0.0`
- Hit only the early downsample blocks (the trailing `\.` keeps `input_blocks.10` and up from matching):
  `mask_regex: (?i)^input_blocks\.(?:0|1|2|3)\.`, `mask_value: 1.0`
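If in doubt, a regex can be previewed outside ComfyUI before running the merge; this standalone snippet (parameter names are stand-ins) shows why the trailing `\.` matters:

```python
import re

# Preview which parameter names a candidate mask_regex would hit.
pattern = re.compile(r"(?i)^input_blocks\.(?:0|1|2|3)\.")
names = ["input_blocks.1.0.weight",      # early block: should match
         "input_blocks.10.0.weight",     # would match without the trailing \.
         "middle_block.1.norm.weight"]   # unrelated: should not match
print([n for n in names if pattern.search(n)])
# -> ['input_blocks.1.0.weight']
```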
- `analysis_only` — dry run: returns the JSON summary; doesn’t modify weights
- `calc_dtype` — math dtype (`float32` or `bfloat16`). Final params keep their original dtype.
- `seed` — sets torch seeds for deterministic behavior
- `MODEL` — the merged model
- `STRING` — JSON summary including:
  - Sizes (GB) for model1, model2, final
  - Merge mode + full settings (including preset name/applied)
  - Preflight counts (missing in base/secondary, shape mismatches)
  - Run info (seed + key-set fingerprints)
  - Result counts (merged/kept, up to 50 error lines)
Example JSON (trimmed):
```json
{
"base_model": "model1",
"sizes_gb": {"model1": 3.21, "model2": 3.21, "final": 3.21},
"merge_mode": "dare",
"settings": {
"preset": {"name": "Style-lean", "applied": true},
"weights": {"Core Middle Block": 0.8, "...": 0.5},
"dare": {"prune_amount": 0.1, "merge_amount": 1.0}
},
"preflight": {"missing_in_secondary": 0, "missing_in_base": 0, "shape_mismatches": 0},
"run": {
"seed": 0,
"base_keys_fingerprint": "a1b2c3d4e5f6",
"secondary_keys_fingerprint": "f6e5d4c3b2a1"
},
"result": {"merged_params": 1234, "kept_from_base": 56, "errors": 0}
}
```

| UI Label | Typical param name patterns |
|---|---|
| Early Downsampling (Composition) | input_blocks.0–3 |
| Mid Downsampling (Subject & Concept) | input_blocks.4–8 |
| Late Downsampling (Refinement) | input_blocks.9+ |
| Core Middle Block | middle_block, mid_block |
| Early Upsampling (Initial Style) | output_blocks.0–3 |
| Mid Upsampling (Details) | output_blocks.4–8 |
| Late Upsampling (Final Textures) | output_blocks.9+ |
| Final Output Layer | to_rgb, out., final_layer, output_layer |
| Text Conditioning | text, cond, token, clip, context |
| Image Hint | ipadapter, control, image.*hint |
| Other | Anything else |
Heuristics are designed to work across Flux and SD-like repos.
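The mapping could be approximated with a small classifier like the sketch below; the patterns and index cutoffs mirror the table, but the real node’s heuristics may differ (the parenthetical label suffixes are omitted for brevity):

```python
import re

def route(name: str) -> str:
    n = name.lower()
    if re.search(r"ipadapter|control|image.*hint", n):
        return "Image Hint"
    if re.search(r"text|cond|token|clip|context", n):
        return "Text Conditioning"
    if re.search(r"middle_block|mid_block", n):
        return "Core Middle Block"
    m = re.match(r"(input|output)_blocks\.(\d+)", n)
    if m:
        side, idx = m.group(1), int(m.group(2))
        stage = "Early" if idx <= 3 else ("Mid" if idx <= 8 else "Late")
        return f"{stage} {'Downsampling' if side == 'input' else 'Upsampling'}"
    if re.search(r"to_rgb|out\.|final_layer|output_layer", n):
        return "Final Output Layer"
    return "Other"
```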
**Style over structure**
- Base = structure model
- Raise Core Middle Block + Upsampling
- `skip_norms` ON

**Concept blending**
- Raise Downsampling (subject/identity lives there)

**Auto as a smart default**
- `auto_similarity` with moderate sliders preserves stable parts and leans into meaningful diffs

**Fine-tune into base (DARE)**
- `dare_prune_amount` 0.05–0.15, `dare_merge_amount` 1.0
- Adjust block sliders normally

**Keep fragile parts steady**
- `skip_norms` (and often `skip_bias`) ON
- Lock `time_embed`, `conditioner`, `output_layer` for minimal drift
- Node not found / old UI appears — You’re loading an older file. Remove duplicates in `custom_nodes/`, delete `__pycache__`, and restart Comfy.
- Nothing changes — Check the block sliders; `0.0` means “keep base”. In DARE, too high a prune amount can mask changes.
- VRAM pressure — Use `calc_dtype = bfloat16`. The math uses less memory; write-back preserves the original dtype.
- Weighted sum feels “different” — By design: `weight_1` always maps to model1, independent of the base (predictable semantics).
- Import error — Comfy will skip the node. Check the console traceback; fix the file or report the error.
- Set `seed` for deterministic merges.
- The summary includes key-set fingerprints — if they differ, you’re not merging the same models.
- Respect original model licenses; you’re responsible for redistribution rights.
- Deterministic DARE (quantile in float32)
- Extra modes: `difference_maximization`, `auto_similarity`, `sigmoid_average`
- Skip/lock safety toggles
- Regex + model-based masks with broadcasting
- Preset library with UI controls
- Analysis-only dry run and detailed JSON summary