
[Bug]: Cannot parse images #11043

@weblerson

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (Language Policy).
  • Non-English title submissions will be closed directly (Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

de24e74

RAGFlow image version

de24e74(v0.21.1)

Other environment information

Actual behavior

I ALWAYS get this error when I try to parse an image on its own or a PDF file that contains an image:

03:09:34 Task has been received.
03:09:35 Page(1100000001): Finish OCR: ( ...)
03:09:35 Page(1100000001): Use CV LLM to describe the picture.
03:09:36 Page(1100000001): [ERROR]cannot identify image file <_io.BytesIO object at 0x7d7b84415e90>
03:09:36 Page(1100000001): No chunk built from image.jpg

Has anyone else faced the same issue? Is there something I might be missing?
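
To help narrow this down, here is a minimal diagnostic sketch. It assumes the "cannot identify image file" message means the image decoder was handed a byte stream it could not recognise, either because the bytes are not image data or because the stream had already been read to the end. The helper name `sniff_image` and the `MAGIC` table are hypothetical and are not part of RAGFlow; the sketch only uses the Python standard library.

```python
import io

# Leading "magic" bytes for a few common image formats (not exhaustive).
MAGIC = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
}


def sniff_image(stream):
    """Return (format_or_None, current_position, total_size).

    The stream position is restored before returning, so the check
    does not change what a later reader would see.
    """
    pos = stream.tell()            # where the next read would start
    stream.seek(0, io.SEEK_END)
    size = stream.tell()           # total number of bytes in the stream
    stream.seek(0)
    head = stream.read(12)         # enough bytes for the signatures below
    stream.seek(pos)               # restore the original position

    fmt = None
    for magic, name in MAGIC.items():
        if head.startswith(magic):
            fmt = name
            break
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if fmt is None and head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        fmt = "webp"
    return fmt, pos, size
```

If the returned format is `None`, the bytes are probably not raw image data (for example, they might be base64 text or an error response). If the format looks right but the position equals the size, the stream was likely consumed earlier and would need `seek(0)` before being decoded again. Both readings are guesses about the cause, not confirmed behavior.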

Expected behavior

No response

Steps to reproduce

- Create a Gemini provider with your API key
- Attach gemini-2.5-flash as image2text model
- Create a dataset
- Populate the dataset with image files
- Try to parse the images

Additional information

No response

Metadata

Assignees

No one assigned

    Labels

    🐞 bug: Something isn't working, pull request that fix bug.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
