[Feature Request]: Pinning Documents or Chunks #5405

Open
@Peterson047

Description

Is there an existing issue for the same feature request?

  • I have checked the existing issues.

Is your feature request related to a problem?

Yes. I'm always frustrated when the model fails to recognize important terms or context, either because of the large number of distinct files or because of the proprietary terminology my team uses. This leads to inconsistent responses, and adding these details to the system instructions by hand becomes inefficient as the data volume grows.

Describe the feature you'd like

Description:
It would be beneficial to add a feature to "pin" documents or chunks in Ragflow. The idea is to allow specific content to always be sent to the model as part of the context, ensuring that critical information is always available for response generation.

Motivation:
In other platforms, this functionality has proven to be very useful, especially in scenarios where:

  • There is a large volume of distinct files.
  • The team uses many proprietary terms that the model does not easily recognize.
  • It is necessary to ensure that certain information is always considered by the model, regardless of the user's query.

Currently, the workaround is to include this information directly in the system instructions. With a large volume of information, however, this approach is not ideal: it overloads the instructions and does not guarantee that the model uses the data efficiently.

Benefits of the Feature:

  • Ensures that essential information is always in context.
  • Facilitates data organization for projects handling extensive documentation.
  • Improves the consistency of model responses when dealing with internal terms and specific instructions.
  • Reduces the need to rephrase queries or repeatedly load documents.

If there is already a way to implement this functionality in Ragflow, I would appreciate guidance on how to use it. If there is another recommended approach to address this need, any insights would be very helpful.

Thank you!

Describe implementation you've considered

One possible implementation is adding a "pinned" attribute to documents or chunks within Ragflow’s data management system. This could be a simple toggle that marks selected content as always included in the context window for model queries.
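
To make the toggle idea concrete, here is a minimal sketch, assuming a hypothetical `Chunk` record with a `pinned` flag and a hypothetical `build_context` helper. None of these names come from Ragflow; they only illustrate how pinned content could be merged with whatever the retriever returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    text: str
    pinned: bool = False  # hypothetical flag, not an existing Ragflow field


def build_context(store: list[Chunk], retrieved: list[Chunk]) -> list[Chunk]:
    """Return pinned chunks first, then retrieved chunks, without duplicates."""
    context: list[Chunk] = []
    seen: set[str] = set()
    pinned = [c for c in store if c.pinned]
    for chunk in pinned + retrieved:
        if chunk.chunk_id not in seen:
            seen.add(chunk.chunk_id)
            context.append(chunk)
    return context


store = [
    Chunk("glossary", "ACME-7 is the internal name for the billing service.", pinned=True),
    Chunk("faq", "Refunds are processed within five days."),
    Chunk("howto", "Restart the worker with the admin console."),
]
retrieved = [store[1], store[0]]  # pretend the retriever returned these two
context_ids = [c.chunk_id for c in build_context(store, retrieved)]
```

In this sketch the pinned glossary chunk is always placed first, even when the retriever ranks it lower or does not return it at all.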

Another approach could involve creating a persistent context layer where pinned documents or chunks are dynamically appended to user queries before retrieval, ensuring they are always part of the response generation.
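
As a rough illustration of such a layer, the sketch below keeps pinned snippets in memory and appends them to each incoming query. The class name `PinnedContextLayer` and its methods are assumptions made for this example, not part of Ragflow.

```python
class PinnedContextLayer:
    """Hypothetical layer that appends pinned snippets to every query."""

    def __init__(self) -> None:
        # Maps a caller-chosen key to the pinned text; insertion order is kept.
        self._pinned: dict[str, str] = {}

    def pin(self, key: str, text: str) -> None:
        self._pinned[key] = text

    def unpin(self, key: str) -> None:
        self._pinned.pop(key, None)

    def augment(self, query: str) -> str:
        """Return the query followed by all pinned snippets, if any."""
        if not self._pinned:
            return query
        pinned_block = "\n".join(f"- {text}" for text in self._pinned.values())
        return f"{query}\n\nPinned context:\n{pinned_block}"


layer = PinnedContextLayer()
layer.pin("glossary", "ACME-7 is the billing service.")
augmented = layer.augment("Why did ACME-7 fail?")
```

Keeping the pinned text in a separate layer means it can be changed without editing the system instructions, which is the overload problem described above.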

A more advanced method would be introducing priority-based retrieval, where pinned documents are given higher weighting in the retrieval process, ensuring their presence without overwhelming the token limit.
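
One way this weighting could look is sketched below, assuming each candidate already has a relevance score. The `pin_boost` parameter, the `ScoredChunk` type, and the whitespace-based token count are all illustrative assumptions. A real system would use the model's tokenizer and its own scoring.

```python
from dataclasses import dataclass


@dataclass
class ScoredChunk:
    text: str
    score: float          # relevance score from the retriever (assumed 0..1)
    pinned: bool = False  # hypothetical flag


def count_tokens(text: str) -> int:
    # Crude approximation: one token per whitespace-separated word.
    return len(text.split())


def select_chunks(
    candidates: list[ScoredChunk], token_budget: int, pin_boost: float = 0.5
) -> list[ScoredChunk]:
    """Greedily pick chunks by boosted score until the token budget is used."""
    ranked = sorted(
        candidates,
        key=lambda c: c.score + (pin_boost if c.pinned else 0.0),
        reverse=True,
    )
    selected: list[ScoredChunk] = []
    used = 0
    for chunk in ranked:
        cost = count_tokens(chunk.text)
        if used + cost <= token_budget:
            selected.append(chunk)
            used += cost
    return selected


candidates = [
    ScoredChunk("refund policy details for customers", 0.8),
    ScoredChunk("ACME-7 means billing service", 0.4, pinned=True),
    ScoredChunk("office lunch menu for this week", 0.3),
]
selected_texts = [c.text for c in select_chunks(candidates, token_budget=9)]
```

Note that a score boost only raises the priority of pinned chunks. It does not strictly guarantee inclusion: a pinned chunk larger than the remaining budget is still skipped in this sketch.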

If there are existing mechanisms to achieve similar results within Ragflow, I would appreciate any guidance on leveraging them.

Documentation, adoption, use case

Additional information

No response
