Improvements to ACE-Step 1.5 text encoding (part 2) #12350

comfyanonymous merged 5 commits into Comfy-Org:master from
Conversation
Force-pushed from 8282a28 to 7f1094c
comfy/text_encoders/ace15.py
Outdated
```python
out["qwen3_06b"] = self.qwen3_06b.tokenize_with_weights("# Instruction\nGenerate audio semantic tokens based on the given conditions:\n\n# Caption\n{}\n# Metas\n{}\n<|endoftext|>\n<|endoftext|>".format(text, meta_cap), return_word_ids, **kwargs)
out["lm_metadata"] = {"min_tokens": duration * 5,
out = {
    prompt_key: self.qwen3_06b.tokenize_with_weights(prompt, prompt_key == "qwen3_06b" and return_word_ids, **kwargs)
```
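The refactored tokenize step can be sketched roughly like this (a hypothetical standalone sketch, with names borrowed from the diff; `tokenize_sections` and its arguments are illustrative, not ComfyUI's actual API). Each prompt section goes through the same tokenizer, word IDs are only requested for the primary `qwen3_06b` key, and prompt weighting is disabled so literal `()` characters survive, per the review discussion below:

```python
# Hypothetical sketch of the refactored tokenize loop (names assumed from the
# diff above, not ComfyUI's real implementation). Every prompt section is
# tokenized with the same encoder; word IDs are only returned for the primary
# "qwen3_06b" key, and weight parsing is disabled so literal () survive.
def tokenize_sections(tokenize_with_weights, prompts, return_word_ids=False):
    out = {}
    for prompt_key, prompt in prompts.items():
        out[prompt_key] = tokenize_with_weights(
            prompt,
            # only the primary key gets word IDs back
            prompt_key == "qwen3_06b" and return_word_ids,
            disable_weights=True,
        )
    return out
```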
You need to pass disable_weights=True here, or else it's going to strip out the () for token weights.
Ah, sorry, that was a dumb mistake. The original code actually didn't have that enabled when encoding the qwen3_06b key. Was that correct behavior that should be preserved, or do we want to always set disable_weights=True?

Edit: Assuming the original implementation was correct, this now replicates that behavior and does not set disable_weights when encoding qwen3_06b.
Did you test whether the weights actually work? If they don't, it's better to just disable them on all the encoders.
No, I didn't try it, because that generally doesn't work for LLMs. I think it makes sense to just disable them globally, since there is no practical way to pass separate prompts for the different parts. If one of them used prompt weighting and another used the literal characters, you'd be parsing the prompt incorrectly in one of the cases.
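To illustrate the conflict: a minimal, simplified sketch of ()-style prompt weighting (this is not ComfyUI's actual parser, just a toy regex version) shows how `(text:1.2)` is consumed as a weight directive, so a prompt that intends literal parentheses gets mangled unless weight parsing is disabled:

```python
import re

# Toy sketch of ()-style prompt weighting, NOT ComfyUI's real parser:
# "(text:1.2)" becomes ("text", 1.2) and the parentheses are consumed.
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weights(prompt, disable_weights=False):
    if disable_weights:
        # treat the whole prompt literally with weight 1.0
        return [(prompt, 1.0)]
    parts, pos = [], 0
    for m in WEIGHT_RE.finditer(prompt):
        if m.start() > pos:
            parts.append((prompt[pos:m.start()], 1.0))
        parts.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        parts.append((prompt[pos:], 1.0))
    return parts
```

With weights enabled, `parse_weights("piano (loud:1.3) melody")` drops the parentheses; with `disable_weights=True` the characters pass through untouched, which is why mixing the two behaviors across encoders that share one prompt would mis-parse one of them.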
The last commit has some additional changes to try to get TE encoding as close to the official implementation as possible. It seems I was wrong to add the caption to the CoT, even though the code apparently does add it. With these changes, comparing the official Gradio app (on the left) to ComfyUI for a dumb LLM-generated sample prompt:
(LM prompt comparison screenshot)
(DIT prompt comparison screenshot)
I still don't believe in the double EoT tokens, but the official app didn't dump the lyrics encoding or the negative LM prompt (if it even encoded one, not sure), and I don't have time to add debug code for that, so I didn't mess with it.
I forced the language to not get passed using my node; unfortunately there isn't a way to avoid passing the language with ComfyUI's current conditioning encode node, so there would be something like `language: en` in the metas.
This is as close as I can get it to the official app for now. I'm also trusting that it's not lying to me with the debug output. There is some weird stuff in the official app, like https://github.com/ace-step/ACE-Step-1.5/blob/08f56edd7fc68452608933706c59da5102bc602e/acestep/handler.py#L2377, where it appears to append the lyrics text (with a # Languages section) to a text prompt after it has already dumped the debug info for the prompt. I don't know if that's just informational or if it actually gets encoded and used.
> did you test if the weights actually work?
I tested this. Even if the weights are set in the token/weight pairs, encode_from_tokens_scheduled doesn't seem to respect them. I verified that a token had an absurd weight like 2000 in the token/weight pairs, but it didn't actually affect anything.
Reorder meta caps to match the official implementation. Remove the caption from the CoT. Ensure sections are double-spaced for the LM prompt. Also strip the /4 from timesignature in meta caps.
I double-checked it, and with #12376 it matches the reference implementation completely.
This pull attempts (hopefully successfully) to:
I have some ACE related nodes here which have initial support for this: https://gist.github.com/blepping/d0f6a26b1f59ed705999945821a3ee8a
For example, with the backend support you can now do something like this:
The compress node strips off duplicate codes (getting like 20 or so of the same code at the end is pretty common, usually it's just what I assume is the code for silence). Since the number of codes is dynamic when using a flexible min/max token limit (or possibly pruning codes), it's necessary to calculate the latent size from the number of codes.
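The compress idea could be sketched like this (a hypothetical standalone version, not the actual node code): drop the trailing run of identical codes, then derive the latent length from the remaining code count. The 5-codes-per-second rate is taken from the `min_tokens = duration * 5` line in the diff above; the latent frames-per-second ratio here is a made-up placeholder, not ComfyUI's real value:

```python
# Hypothetical sketch of the "compress" behavior described above.
def strip_trailing_repeats(codes, keep=1):
    """Drop the trailing run of identical codes (often repeated silence),
    keeping at least `keep` copies of the final code."""
    if not codes:
        return codes
    i = len(codes) - 1
    while i > 0 and codes[i - 1] == codes[-1]:
        i -= 1
    return codes[: max(i + keep, 1)]

def latent_frames_from_codes(num_codes, codes_per_second=5, frames_per_second=25):
    """Derive latent length from the (now dynamic) number of codes.
    codes_per_second=5 matches min_tokens = duration * 5 from the diff;
    frames_per_second is an assumed placeholder value."""
    seconds = num_codes / codes_per_second
    return int(seconds * frames_per_second)
```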
I was hesitant to mess with ComfyUI's built-in text encoding node, because adding or reordering fields breaks workflows: ComfyUI uses order/slot-based associations when loading a workflow rather than keys (unless something has changed recently). But at least with this backend support, custom node authors have a way to add extended functionality without having to completely reimplement text encoding.
I've done a number of generations using these features and they don't seem to cause any issues, but I can't say it's extensively tested, and I'm not sure I fully understood how the sampling function works. (Since ComfyUI requires transformers, it would probably make sense to use the logits-processing functions from Transformers, like top-p, etc. Sampling seems like it could really benefit from extended features like repetition or n-gram penalties, which Transformers provides.)
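For reference, the two sampling features mentioned there can be implemented standalone in a few lines (this is a plain-Python sketch of the same ideas; Transformers ships equivalents such as `TopPLogitsWarper` and `RepetitionPenaltyLogitsProcessor`, which would be the better choice in ComfyUI since transformers is already a dependency):

```python
import math
import random

# Standalone sketch of top-p (nucleus) sampling plus a CTRL-style
# repetition penalty, mirroring what transformers' logits processors do.
def apply_repetition_penalty(logits, generated, penalty=1.2):
    out = list(logits)
    for tok in set(generated):
        # penalize already-generated tokens: divide positive logits,
        # multiply negative ones (the CTRL-style formulation)
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

def top_p_sample(logits, p=0.9, rng=random):
    # softmax with max-subtraction for numeric stability
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)
    # keep the smallest prefix of tokens whose cumulative mass reaches p
    kept, cum = [], 0.0
    for prob, idx in probs:
        kept.append((prob, idx))
        cum += prob
        if cum >= p:
            break
    # renormalize over the nucleus and draw
    norm = sum(pr for pr, _ in kept)
    r = rng.random() * norm
    for prob, idx in kept:
        r -= prob
        if r <= 0:
            return idx
    return kept[-1][1]
```

An n-gram penalty would follow the same shape: scan the generated tail for banned continuations and push those logits to negative infinity before sampling.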
The te_debug thing is a bit hacky and can be removed (or changed) if necessary. It is really useful to be able to see exactly what is getting encoded, though, and there isn't really another way to do that without changing ComfyUI's TE source files.

Edit: I just removed it.