
Multimodal Prompt Collaboration #98

@lmr2706

Description


Dear authors,

I'm very interested in the text-visual interaction and collaboration described in your paper. In real-world tasks, multiple prompts are often used together, and it is challenging to make these different prompts coordinate effectively to achieve the best results. I've read the paper and tried to find the corresponding implementation of text-visual collaboration in the code, but most of it appears to delegate the processing directly to the T-Rex2 API, which didn't fully answer my questions. I'd like to understand the specific implementation details of how multiple types of prompts (such as text and visual prompts) are coordinated to work together. Could you please provide some insights or guidance on this?
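
For context, below is a rough sketch of the kind of coordination I currently have in mind on my side: running the two prompt types independently and merging their detections with a simple score-based NMS. The helpers `detect_with_text_prompt` and `detect_with_visual_prompt` are purely hypothetical placeholders (not actual T-Rex2 API calls), and the box-level late fusion is only my own assumption, not something taken from the paper.

```python
import numpy as np

def detect_with_text_prompt(image, text):
    """Hypothetical placeholder: detection conditioned on a text prompt.
    Assumed to return (boxes [N, 4] in xyxy format, scores [N])."""
    raise NotImplementedError

def detect_with_visual_prompt(image, exemplar_boxes):
    """Hypothetical placeholder: detection conditioned on visual exemplar boxes."""
    raise NotImplementedError

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of kept boxes."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        if order.size == 1:
            break
        # IoU between the current top-scoring box and the remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter + 1e-6)
        order = order[1:][iou <= iou_thresh]
    return keep

def fuse_text_and_visual(image, text, exemplar_boxes, iou_thresh=0.5):
    """Naive late fusion (my assumption): run both prompt types independently,
    concatenate their detections, and resolve duplicates with NMS."""
    boxes_t, scores_t = detect_with_text_prompt(image, text)
    boxes_v, scores_v = detect_with_visual_prompt(image, exemplar_boxes)
    boxes = np.concatenate([boxes_t, boxes_v], axis=0)
    scores = np.concatenate([scores_t, scores_v], axis=0)
    keep = nms(boxes, scores, iou_thresh)
    return boxes[keep], scores[keep]
```

What I'd really like to understand is whether your method instead fuses the prompts at the feature/embedding level inside the model rather than at the box level like this, and if so, where that happens in the code.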

Thank you!
