feat: Add support for custom Q&A in the knowledge base #10873 #10874

dajianguo · 2024-11-20T01:14:27Z

Summary

I need to import existing Q&A text without having the LLM generate the pairs for me.
I modified the code to support uploading Q&A files in Excel and CSV format.
The processing logic is as follows: when a CSV or Excel file contains only two columns and QA mode is selected, the LLM is not called.
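
A minimal sketch of the two-column rule described above, using only the standard library. The helper name `load_qa_pairs` is hypothetical and this is not the code in the PR; it assumes the first column holds the question, the second holds the answer, and there is no header row.

```python
import csv
import io
from typing import List, Optional, Tuple


def load_qa_pairs(csv_text: str) -> Optional[List[Tuple[str, str]]]:
    """Return (question, answer) pairs if every non-empty row has exactly
    two columns; return None otherwise so the caller can fall back to
    LLM-based generation. Hypothetical helper, not part of the PR."""
    rows = [
        row
        for row in csv.reader(io.StringIO(csv_text))
        if any(cell.strip() for cell in row)  # ignore blank rows
    ]
    if not rows or any(len(row) != 2 for row in rows):
        return None
    return [(question.strip(), answer.strip()) for question, answer in rows]
```

With this shape, a file such as `question one,answer one` yields pairs directly, while a three-column file returns `None` and would still go through the LLM path.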

Resolves #4664
Resolves #6904
Resolves #7735
Resolves #7430
Resolves #10873

Screenshots

Checklist

Important

Please review the checklist below before submitting your pull request.

This change requires a documentation update, included: Dify Document
I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
I've updated the documentation accordingly.
I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

crazywoola · 2024-11-20T04:23:16Z

api/core/indexing_runner.py

-                response = LLMGenerator.generate_qa_document(
-                    current_user.current_tenant_id, preview_texts[0], doc_language
-                )
+                if "Q00001:" in preview_texts[0] and "A00001:" in preview_texts[0]:


This one doesn't seem very generic.

Do you mean the separators "Q00001:" and "A00001:"? I can change them to a more common separator.

I based this on the format_split_text method.

It is only a delimiter variable used at runtime; it is not actually stored.
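
To make the role of the `Q00001:` / `A00001:` markers concrete, here is a sketch of rendering pairs into marker-delimited text and parsing them back. The regex and both helper names are assumptions for illustration; they are not taken from the PR and are not claimed to match `format_split_text`.

```python
import re
from typing import List, Tuple

# Assumed marker format: "Q<digits>:" followed by "A<digits>:".
QA_PATTERN = re.compile(r"Q\d+:\s*(.*?)\s*A\d+:\s*(.*?)(?=Q\d+:|$)", re.DOTALL)


def render_qa_text(pairs: List[Tuple[str, str]]) -> str:
    """Join pairs into one string, numbering markers from 00001."""
    lines = []
    for index, (question, answer) in enumerate(pairs, start=1):
        lines.append(f"Q{index:05d}: {question}")
        lines.append(f"A{index:05d}: {answer}")
    return "\n".join(lines)


def parse_qa_text(text: str) -> List[Tuple[str, str]]:
    """Recover (question, answer) pairs from marker-delimited text."""
    return [(q.strip(), a.strip()) for q, a in QA_PATTERN.findall(text)]
```

One limitation of this kind of in-band delimiter: if a question or answer itself contains text that looks like a marker (for example `Q2:`), the parse will split in the wrong place, which is part of why a literal check for `"Q00001:"` reads as fragile.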

AkisAya · 2024-11-22T06:23:10Z

This should not be done by implicitly switching the LLM-based QA mode to plain QA extraction from a template.
I think the user should explicitly choose to disable LLM QA mode in the UI when importing a file.

Something like this.

And if the user chooses to generate QA pairs from a template, the UI should show a hint describing the expected template for the file's extension.
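
One way to picture this suggestion on the backend side is an explicit mode value plus a per-extension hint lookup. Everything here (the `QAMode` enum, the hint strings, and `template_hint`) is hypothetical and only sketches the idea; none of it comes from the PR or from the existing codebase.

```python
from enum import Enum
from pathlib import Path
from typing import Optional


class QAMode(Enum):
    LLM_GENERATED = "llm_generated"  # the LLM produces the Q&A pairs
    TEMPLATE = "template"            # pairs are read directly from the file


# Hypothetical hint texts, keyed by lowercase file extension.
TEMPLATE_HINTS = {
    ".csv": "Two columns per row: question, answer.",
    ".xlsx": "Two columns per row: question, answer.",
}


def template_hint(filename: str, mode: QAMode) -> Optional[str]:
    """Return the hint to show for this file, or None if no hint applies."""
    if mode is not QAMode.TEMPLATE:
        return None
    return TEMPLATE_HINTS.get(Path(filename).suffix.lower())
```

The point of the explicit enum is that the choice is made by the user and passed along, rather than inferred from the file contents.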

kiendn1 · 2024-11-22T10:11:25Z

api/core/rag/splitter/text_splitter.py

I think you shouldn't modify the create_documents function; instead, create a separate splitter, QASplitter.
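
A rough sketch of what a separate splitter could look like. The `Document` class below is a simplified stand-in so the example runs on its own, and the choice to keep the question in `page_content` and the answer in `metadata["answer"]` is an assumption, not something confirmed in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Document:
    """Simplified stand-in for the project's document type (hypothetical)."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class QASplitter:
    """Hypothetical splitter: one Document per (question, answer) row,
    leaving the existing create_documents logic untouched."""

    def split_rows(self, rows: List[Tuple[str, str]]) -> List[Document]:
        documents = []
        for question, answer in rows:
            question, answer = question.strip(), answer.strip()
            if not question or not answer:
                continue  # skip incomplete pairs instead of guessing
            documents.append(
                Document(page_content=question, metadata={"answer": answer})
            )
        return documents
```

Keeping this in its own class means the generic text splitter does not need to know about Q&A files at all.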

crazywoola · 2024-11-23T15:48:27Z

I discussed this with @JohnJyong, and we decided not to merge this PR. If you have any other questions, please feel free to contact @JohnJyong.

feat: Add support for custom Q&A in the knowledge base langgenius#10873

862e7f4

dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and 📚 feat:datasource (Data sources like web, Notion, Logseq, Lark, Docs) labels Nov 20, 2024

crazywoola assigned JohnJyong Nov 20, 2024

crazywoola requested a review from JohnJyong November 20, 2024 04:22

crazywoola reviewed Nov 20, 2024

kiendn1 reviewed Nov 22, 2024

crazywoola closed this Nov 23, 2024
