Design_Doc_Examples/RAG_Q&A_for collaborative_work_platform.md
### **IV. Validation Schema**
For validation purposes, we will use a dataset generated from the original documents with RAGAS's synthetic test-set generation. This approach allows us to create a comprehensive validation set that closely mirrors the real-world usage of our system.
#### i. Question Selection and Dataset Creation
RAGAS takes the original documents and their associated metadata and generates a structured dataset with the following components:

* Question: a simulated user query
* Context: the relevant part(s) of the document(s)
* Answer: the expected answer

This structure allows us to evaluate both the retrieval and generation aspects of our RAG system.
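To make this concrete, the snippet below is a minimal sketch of generating such a Question/Context/Answer set. It assumes the ragas 0.1.x test-set generation API together with LangChain document loaders and OpenAI models; the module paths, model names, `test_size` and `distributions` values are illustrative placeholders rather than final choices.

```python
# Minimal sketch: generate a Question/Context/Answer validation set with RAGAS.
# Assumes the ragas 0.1.x test-set generation API and LangChain loaders;
# module paths and signatures differ in other ragas releases.
from langchain_community.document_loaders import DirectoryLoader
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

# Load the original documents; source metadata is attached by the loader.
documents = DirectoryLoader("docs/", glob="**/*.md").load()

generator = TestsetGenerator.from_langchain(
    generator_llm=ChatOpenAI(model="gpt-3.5-turbo"),
    critic_llm=ChatOpenAI(model="gpt-4"),
    embeddings=OpenAIEmbeddings(),
)

# Control the mix of question types (simple factual, reasoning, multi-document).
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=100,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)

# Each row holds question, contexts, ground_truth and document metadata.
validation_df = testset.to_pandas()
```

The resulting dataframe then feeds into the question selection, curation and sampling steps described below.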
To create a comprehensive and representative validation dataset, we'll employ a multi-faceted approach to question selection:
1. Automated Question Generation

* Use natural language processing (NLP) techniques to automatically generate questions from the documents.
* Apply techniques such as named entity recognition, key phrase extraction and syntactic parsing to identify potential question targets.
* Use question generation models (e.g. T5 or BART fine-tuned for question generation) to create different types of questions (see the sketch after this list).

2. Human-in-the-Loop Curation

* Engage subject matter experts to review and refine auto-generated questions.
* Have experts create additional questions, especially for complex scenarios or edge cases that automated systems might miss.
* Ensure questions cover various difficulty levels and reasoning types.

3. Real User Query Mining

* Analyse logs of actual user queries (if available) to identify common question patterns and topics.
* Include anonymised versions of real user questions in the dataset to ensure relevance to actual use cases.

4. Question Diversity. Ensure a balanced distribution of question types:

* Factual questions (e.g. "Who is the author of this document?")
* Inferential questions (e.g. "What are the implications of the findings in section 3?")
* Comparative questions (e.g. "How does the methodology in version 2 differ from that in version 1?")
* Multi-document questions (e.g. "Summarise the common themes across these three related documents.")
* Version-specific questions (e.g. "What changes have been made to the conclusion between versions 3 and 4?")

5. Context Selection

* For each question, select a relevant context from the document(s).
* Include both perfectly matching contexts and partially relevant contexts to test the system's ability to handle nuanced scenarios.

6. Answer Generation

* Generate a gold standard answer for each question-context pair.
* Use a combination of automated methods and human expert review to ensure answer quality.

7. Metadata Inclusion

* Include relevant metadata for each question-context-answer triplet, such as document version, page numbers or section headings.

8. Edge Case Scenarios

* Deliberately include edge cases, such as questions about rare document types or extremely long documents.
* Create questions that require an understanding of document structure, such as tables of contents or footnotes.

9. Negative Examples

* Include some questions that cannot be answered from the given context to test the system's ability to recognise when it doesn't have sufficient information (also covered in the sketch below).
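Below is a rough sketch of how the automated question generation of step 1 and the negative examples of step 9 could be wired together. The Hugging Face checkpoint, its prompt format and the helper functions are illustrative assumptions, not settled design choices; generated questions would still pass through the human-in-the-loop curation of step 2.

```python
# Sketch of steps 1 and 9: auto-generate candidate questions from document passages,
# then add unanswerable "negative" examples by pairing questions with unrelated contexts.
# The checkpoint name and its "answer: ... context: ..." prompt format are assumptions;
# any seq2seq model fine-tuned for question generation can be substituted.
import random
from transformers import pipeline

qg = pipeline(
    "text2text-generation",
    model="mrm8488/t5-base-finetuned-question-generation-ap",  # placeholder QG checkpoint
)

def generate_questions(passages: list[str], answer_spans: list[str]) -> list[dict]:
    """One candidate question per (answer span, passage) pair, marked as answerable."""
    records = []
    for passage, answer in zip(passages, answer_spans):
        prompt = f"answer: {answer}  context: {passage}"
        question = qg(prompt, max_new_tokens=64)[0]["generated_text"]
        records.append({"question": question, "context": passage,
                        "answer": answer, "answerable": True})
    return records

def add_negative_examples(records: list[dict], n: int, seed: int = 42) -> list[dict]:
    """Pair existing questions with unrelated contexts so the gold answer is 'not answerable'."""
    rng = random.Random(seed)
    negatives = []
    for rec in rng.sample(records, k=min(n, len(records))):
        others = [r for r in records if r["context"] != rec["context"]]
        if not others:  # only one distinct context available, nothing to swap in
            continue
        negatives.append({"question": rec["question"],
                          "context": rng.choice(others)["context"],
                          "answer": "Not answerable from the given context.",
                          "answerable": False})
    return records + negatives
```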
#### ii. Periodic Updates
The validation dataset will be updated periodically to maintain its relevance and comprehensiveness. This includes:

* Adding newly uploaded documents
* Including new versions of existing documents
* Updating the question set to reflect evolving user needs

We recommend updating the validation set monthly, or whenever there is a significant influx of new documents or versions.
#### iii. Stratified Sampling
To ensure balanced representation, we'll use stratified sampling when creating the validation set. Strata may include:
* Document length (short, medium, long)
* Document type (text, scanned image)
* Topic areas
* Query complexity (simple factual, multi-step reasoning, version comparison)
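As a minimal sketch of this sampling step, assuming each candidate question carries the strata above as dataframe columns (the column names below are placeholders for however the metadata from the earlier steps is stored):

```python
# Minimal sketch of stratified sampling over the candidate question pool.
# The strata column names are assumptions about how the metadata is stored.
import pandas as pd

STRATA = ["doc_length_bucket", "doc_type", "topic", "query_complexity"]

def stratified_sample(candidates: pd.DataFrame, n_total: int, seed: int = 42) -> pd.DataFrame:
    """Draw a validation set in which each stratum keeps its share of the candidate pool."""
    frac = min(1.0, n_total / len(candidates))
    sampled = (
        candidates
        .groupby(STRATA, group_keys=False)
        .apply(lambda g: g.sample(frac=frac, random_state=seed))  # proportional within stratum
    )
    return sampled.reset_index(drop=True)
```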
**Key Takeaways:**

1. The selection of a validation schema is crucial for accurately measuring a model's performance on unseen data, requiring careful consideration of the specific characteristics of the dataset and the problem at hand.