Description
Hi everyone,
I'm thinking about creating a chatbot (or at least a query system with enhanced results) based on the Symfony/PHP documentation.
Tell me what you think, whether you see any blockers, or if you have any suggestions. Happy coding!
1. Issues
- Mainstream chatbots are trained on outdated data (2021/2022). When we look something up in technical documentation, we need up-to-date information.
- Mainstream chatbots usually don't cite their sources.
- Mainstream chatbots are generalists; a specialized one could be more efficient.
2. Solution
Of course, I'm not crazy: I don't want to train a model, since that is quite expensive and I don't have the skills. My idea is to build a RAG (retrieval-augmented generation) system. By indexing the documentation of several projects in the Symfony ecosystem in a vector database, and then using a small open-source model (Mistral 7B) to analyse and enhance the results, I think I can create something really nice.
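To make the retrieve-then-generate idea concrete, here is a minimal sketch in plain Python. It is only an illustration under loud assumptions: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `DOCS` list stands in for the vector database, the file names and texts are made up, and the final call to the language model is left out entirely. It does show how keeping a `source` field next to each chunk lets the prompt ask for cited answers.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (assumption: a real setup uses an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)
    return ranked[:k]

def build_prompt(query: str, hits: list[dict]) -> str:
    """Assemble the prompt that would be sent to the model, with numbered sources."""
    context = "\n".join(
        f"[{i}] {h['text']} (source: {h['source']})" for i, h in enumerate(hits, 1)
    )
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

# Hypothetical chunks; names and texts are illustrative only.
DOCS = [
    {"source": "routing.rst", "text": "Routes map a URL path to a controller action."},
    {"source": "cache.rst", "text": "The cache component stores items in pools."},
    {"source": "forms.rst", "text": "Forms are built from form types and handle request data."},
]

question = "How do I map a URL to a controller?"
hits = retrieve(question, DOCS, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

In a real deployment the ranking would be done by the vector database and the prompt would be passed to the hosted model; only the overall shape carries over.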
3. Steps
The POC
- index the latest Symfony documentation (7.0) into Weaviate (with Haystack or LangChain to chunk the data)
- use Weaviate's generative search with Anyscale and Mistral 7B as the model
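The chunking step in the POC can be pictured with a small sketch. This is not the Haystack or LangChain splitter, just a hand-rolled stand-in that cuts text into overlapping word windows; the function name `chunk_words` and the default sizes are assumptions chosen for illustration.

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks

sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, size=4, overlap=1))
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk. A splitter that follows the documentation's headings would probably give cleaner chunks than fixed word counts.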
V1
- index the Symfony LTS docs back to 4.4, the PHP docs back to 7.4, and popular PHP/Symfony libraries (API Platform, Doctrine, PHPUnit...)
- AI module written in Python around Weaviate and served by FastAPI
- Back-end to handle users, caching, auto-completion, and rate limiting (Symfony)
- Simple console client to query the back-end (Symfony)
- Dockerize and script everything
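Indexing several versions and libraries side by side implies that each chunk carries metadata, so a question about Symfony 4.4 is not answered from the 7.0 docs. A minimal sketch of that idea follows; the `Chunk` fields and the `filter_chunks` helper are hypothetical names, the texts are placeholders, and in practice the filtering would be expressed as a filter in the vector database query rather than in Python.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    project: str   # e.g. "symfony", "php", "doctrine" (illustrative values)
    version: str   # e.g. "7.0", "4.4"
    text: str

def filter_chunks(chunks: list[Chunk], project: str, version: str) -> list[Chunk]:
    """Keep only the chunks that belong to the requested project and version."""
    return [c for c in chunks if c.project == project and c.version == version]

# Placeholder index entries, not real documentation text.
index = [
    Chunk("symfony", "7.0", "placeholder text from the 7.0 docs"),
    Chunk("symfony", "4.4", "placeholder text from the 4.4 docs"),
    Chunk("php", "7.4", "placeholder text from the PHP 7.4 docs"),
]

candidates = filter_chunks(index, project="symfony", version="4.4")
print(candidates)
```

Similarity ranking would then run only over `candidates`, which also keeps the cited sources consistent with the version the user asked about.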
V2
- move from a query system to a chatbot (conversation history and context)
- Simple website interface with Symfony UX
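Moving from single queries to a chatbot mostly means sending recent turns along with the new question. One simple way to do that, sketched under assumptions, is a sliding window over the conversation: the function name `build_chat_prompt`, the `User:`/`Assistant:` labels, and the `max_turns` default are all illustrative, and a real version would also include the retrieved documentation chunks.

```python
def build_chat_prompt(history: list[tuple[str, str]], question: str, max_turns: int = 3) -> str:
    """Build a prompt from the last `max_turns` (user, assistant) exchanges plus the new question."""
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = []
    for user_msg, bot_msg in recent:
        lines.append(f"User: {user_msg}")
        lines.append(f"Assistant: {bot_msg}")
    lines.append(f"User: {question}")
    lines.append("Assistant:")
    return "\n".join(lines)

# Illustrative conversation; the answers are placeholders.
history = [
    ("first question", "first answer"),
    ("second question", "second answer"),
    ("third question", "third answer"),
]
chat_prompt = build_chat_prompt(history, "follow-up question", max_turns=2)
print(chat_prompt)
```

Capping the window keeps the prompt within the context size of a small model, at the price of forgetting older turns.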
More features
- ability to include code with the question
- index Stack Overflow questions and answers
- add a link in the profiler so that one click sends the error to the chatbot
4. Business model
Hosting the infrastructure and using the Mistral 7B model will have a cost that I can afford for a side project, but not if many people start using it. I'm open to any suggestions.