fix: improves typedef of Message by philnash · Pull Request #1525 · huggingface/transformers.js

philnash · 2026-02-13T06:02:07Z

Message is defined as an object with a role and content, both of which are strings. But examples throughout the codebase that use a vision model show that content can also be an array of objects, each with a type property of "text" or "image" and, for text entries, a text property that is a string.

This expands the JSDoc typedef of Message to cover both cases.

nico-martin

Hi @philnash, great to see you here!
You're bringing up a pretty important issue there!
Traditionally, Message was only used for TextGeneration, so text only. But as you correctly pointed out, the content could also be { type: string, text?: string }[].
However, I think we will see more multimodal models in the future, which is why we need to find a clean solution here.
Could you change the type to { type: "text", text: string }[] so we can merge it?
Only type "text" is currently supported, and it requires a text property.

philnash · 2026-02-13T11:12:36Z

Hey @nico-martin, it's good to be here! I saw you were working on v4 so I had to check it out!

Not sure I understand, it looks to me like { type: "image" } is also supported, it's tested here for example.

I could narrow the type to:

{ type: "text", text: string } | { type: "image" }

if that works better?
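
To make the proposed narrowing concrete, here is a minimal JSDoc sketch of the union written out above. The typedef names TextContent, ImageContent and MessageContent and the helper messageText are made up for illustration and are not taken from the library; only the two object shapes come from this thread.

```javascript
/**
 * Hypothetical names, used only for this sketch.
 * @typedef {{ type: "text", text: string }} TextContent
 * @typedef {{ type: "image" }} ImageContent
 * @typedef {TextContent | ImageContent} MessageContent
 */

/**
 * Message with content as either a plain string or an array of content parts.
 * @typedef {Object} Message
 * @property {string} role
 * @property {string | MessageContent[]} content
 */

/**
 * Collects the text of a message, skipping any image parts.
 * @param {Message} message
 * @returns {string}
 */
function messageText(message) {
  // The original, string-only form of content.
  if (typeof message.content === "string") return message.content;
  // The array form: checking `type` narrows each part to TextContent.
  return message.content
    .filter((part) => part.type === "text")
    .map((part) => part.text)
    .join("");
}
```

Because every part carries a literal type, a type checker can tell that text exists only on "text" parts, which a looser { type: string, text?: string } shape cannot express.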

nico-martin · 2026-02-13T12:11:50Z

Oh wait. You're right. I was only looking at the TextGeneration pipeline. There it makes sense to only allow text. But if you create your own "pipeline" with a vision language model, then you could also pass an image in the content array.

However, I think we should distinguish between the different types of content so that it is clear, for example, that text generation models can only process text input, whereas VLMs can also process images.
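
One way this distinction could look, as a rough sketch only: two message typedefs, one limited to text parts and one that also admits images, plus a runtime check. The names TextMessage, VisionMessage and isTextOnlyChat are assumptions for illustration, not the design that was eventually merged.

```javascript
/**
 * Hypothetical typedefs separating text-only input from VLM input.
 * @typedef {{ type: "text", text: string }} TextContent
 * @typedef {{ type: "image" }} ImageContent
 * @typedef {{ role: string, content: string | TextContent[] }} TextMessage
 * @typedef {{ role: string, content: string | (TextContent | ImageContent)[] }} VisionMessage
 */

/**
 * Returns true when every message could be handled by a text-only model,
 * i.e. its content is a string or contains only "text" parts.
 * @param {VisionMessage[]} messages
 * @returns {boolean}
 */
function isTextOnlyChat(messages) {
  return messages.every(
    (message) =>
      typeof message.content === "string" ||
      message.content.every((part) => part.type === "text")
  );
}
```

With types split this way, a text generation entry point could accept TextMessage[] while a vision language model accepts VisionMessage[], so passing an image to a text-only model shows up as a type error rather than at runtime.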

Would you like to pursue this further, or is it okay with you if I dive in?

philnash · 2026-02-15T22:42:04Z

I have had a look through and I'm afraid I'm a bit lost with how the processors work and how that applies to the model I'm trying to use. (The app I'm playing around with uses granite-docling-258m-onnx and I'm not sure, for example, how it gets its settings.)

I did find some interesting things though:

- The Janus processor defines a MultimodalConversation by combining Message with an images property of type (RawImage | string | URL)[]. This allows you to pass the images, but the original Message type still restricts the content to string.
- The TextGeneration pipeline could presumably also receive image inputs if the generator was a vision model that still only generated text. The _call method for the TextGeneration pipeline doesn't seem to be typed at all, so the inputs can be any.
- The Idefics3Processor has a bunch of type issues in its test file, particularly that the Message type doesn't fit the data that it's tested with.
- A list of Message[] is redefined as a Chat in the TextGeneration pipeline, so sometimes when you're trying to find Message[] you need to look for a different type instead.
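
The first finding can be illustrated with a small JSDoc sketch, assuming the shape described above. MultimodalEntrySketch and hasImages are hypothetical names; RawImage refers to the library's image class and is only mentioned in the type annotation, not imported; whether the real MultimodalConversation is a single entry or an array of entries is not confirmed here.

```javascript
/**
 * The string-only Message shape as described at the top of this PR.
 * @typedef {Object} Message
 * @property {string} role
 * @property {string} content
 */

/**
 * Hypothetical sketch: Message combined with an images property.
 * Images travel alongside the message, while content stays a string.
 * @typedef {Message & { images: (RawImage | string | URL)[] }} MultimodalEntrySketch
 */

/**
 * Reports whether an entry carries at least one image.
 * @param {MultimodalEntrySketch} entry
 * @returns {boolean}
 */
function hasImages(entry) {
  return Array.isArray(entry.images) && entry.images.length > 0;
}

/** @type {MultimodalEntrySketch} */
const example = {
  role: "user",
  content: "What is in this picture?", // still a plain string
  images: [new URL("https://example.com/cat.png")], // placeholder URL
};
```

The sketch shows why this approach sidesteps the typedef problem rather than solving it: images are attached next to the message, so an array-of-parts content value would still fail the original Message type.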

I'm happy to help, but maybe don't have as much time, and definitely not as much experience, with the codebase, so please take it from here if you want!

nico-martin requested changes Feb 13, 2026

nico-martin self-assigned this Feb 13, 2026

extended Messages and typed TextGenerationPipeline messages

039ea04

philnash wants to merge 2 commits into huggingface:main from philnash:philnash/message-type
