Description
Is there an existing issue for the same feature request?
- I have checked the existing issues.
Is your feature request related to a problem?
Yes. I'm always frustrated when the model fails to recognize important terms or context, because my team works with a large number of distinct files and a lot of proprietary terminology. This leads to inconsistent responses, and manually adding these details to the system instructions becomes inefficient as the data volume grows.
Describe the feature you'd like
Description:
It would be beneficial to add a feature to "pin" documents or chunks in Ragflow. The idea is to allow specific content to always be sent to the model as part of the context, ensuring that critical information is always available for response generation.
Motivation:
In other platforms, this functionality has proven to be very useful, especially in scenarios where:
- There is a large volume of distinct files.
- The team uses many proprietary terms that the model does not easily recognize.
- It is necessary to ensure that certain information is always considered by the model, regardless of the user's query.
Currently, the only workaround is to include this information directly in the system instructions. However, with a large volume of information this approach is not ideal: it overloads the instructions and does not guarantee that the model uses the data efficiently.
Benefits of the Feature:
- Ensures that essential information is always in context.
- Facilitates data organization for projects handling extensive documentation.
- Improves the consistency of model responses when dealing with internal terms and specific instructions.
- Reduces the need to rephrase queries or repeatedly load documents.
If there is already a way to implement this functionality in Ragflow, I would appreciate guidance on how to use it. If there is another recommended approach to address this need, any insights would be very helpful.
Thank you!
Describe implementation you've considered
One possible implementation is adding a "pinned" attribute to documents or chunks within Ragflow’s data management system. This could be a simple toggle that marks selected content as always included in the context window for model queries.
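To make the toggle idea concrete, here is a minimal sketch in Python. It is not based on Ragflow's actual data model: the `Chunk` class, its `pinned` field, and the `build_context` helper are hypothetical names used only for illustration. It shows pinned chunks always being placed in the context ahead of whatever the retriever returned, without duplicates.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical chunk record; not Ragflow's actual schema."""
    chunk_id: str
    text: str
    pinned: bool = False  # the proposed toggle


def build_context(all_chunks: list[Chunk], retrieved: list[Chunk]) -> list[Chunk]:
    """Return pinned chunks first, then retrieved chunks, skipping duplicates."""
    context: list[Chunk] = []
    seen: set[str] = set()
    pinned = [c for c in all_chunks if c.pinned]
    for chunk in pinned + retrieved:
        if chunk.chunk_id not in seen:
            seen.add(chunk.chunk_id)
            context.append(chunk)
    return context


glossary = Chunk("c1", "ACME-ID: internal customer identifier.", pinned=True)
faq = Chunk("c2", "Refunds are processed within 5 days.")
guide = Chunk("c3", "Use the admin console to reset passwords.")

# Pretend the retriever returned these chunks for some user query.
context = build_context([glossary, faq, guide], retrieved=[guide, glossary])
print([c.chunk_id for c in context])  # ['c1', 'c3']
```

The pinned glossary chunk comes first regardless of the query, and the retrieved chunks follow in their original order.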
Another approach could involve creating a persistent context layer where pinned documents or chunks are dynamically appended to user queries before retrieval, ensuring they are always part of the response generation.
A more advanced method would be introducing priority-based retrieval, where pinned documents are given a higher weight in the retrieval process, favoring their inclusion without exceeding the token limit.
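The priority-based variant could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `select_with_priority` function, the additive `boost` value, and the whitespace-based token estimate are not part of Ragflow. Pinned chunks get a score bonus before ranking, and chunks are then added greedily until the token budget is used up.

```python
from __future__ import annotations


def select_with_priority(
    scored: list[tuple[str, str, float]],
    pinned_ids: set[str],
    budget: int,
    boost: float = 0.5,
) -> list[str]:
    """Pick chunk ids by boosted score until the token budget is spent.

    scored holds (chunk_id, text, retrieval_score) tuples. The boost and
    the token estimate are placeholders, not tuned values.
    """
    ranked = sorted(
        scored,
        key=lambda item: item[2] + (boost if item[0] in pinned_ids else 0.0),
        reverse=True,
    )
    selected: list[str] = []
    used = 0
    for chunk_id, text, _score in ranked:
        cost = len(text.split())  # crude token estimate: whitespace-separated words
        if used + cost <= budget:
            selected.append(chunk_id)
            used += cost
    return selected


scored = [
    ("glossary", "ACME-ID means internal customer identifier", 0.20),
    ("faq", "Refunds are processed within five business days", 0.60),
    ("guide", "Use the admin console to reset passwords", 0.55),
]

print(select_with_priority(scored, pinned_ids={"glossary"}, budget=14))  # ['glossary', 'faq']
print(select_with_priority(scored, pinned_ids=set(), budget=14))         # ['faq', 'guide']
```

With the glossary chunk pinned, it is ranked first and fits in the budget even though its raw retrieval score is the lowest. Without pinning, the same budget is filled by the two higher-scoring chunks and the glossary is dropped.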
If there are existing mechanisms to achieve similar results within Ragflow, I would appreciate any guidance on leveraging them.
Documentation, adoption, use case
Additional information
No response