Inspired by mayooear's gpt4-pdf-chatbot-langchain repository, we have integrated ConvoStack to create a ChatGPT playground for multiple large PDF files.
Amazingly, we achieved the same outcome with a much more concise codebase.
Join the Discord if you have questions.
Star our repo to support ConvoStack as an open-source project!
- Run `npm install` to install all necessary dependencies.
- Set up your `.env` file:
- Copy `.env.example` into `.env`. Your `.env` file should look like this:
```
OPENAI_API_KEY=
PINECONE_API_KEY=
PINECONE_ENVIRONMENT=
PINECONE_INDEX_NAME=
```
- Visit OpenAI to retrieve your API key and insert it into your `.env` file.
- Visit Pinecone to create and retrieve your API key, and also retrieve your environment and index name from the dashboard.
- In the `config` folder, replace the `PINECONE_NAME_SPACE` with a `namespace` where you'd like to store your embeddings on Pinecone when you run `npm run ingest`. This namespace will later be used for queries and retrieval.
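As a rough illustration of what that namespace setting looks like (the exact file name and layout inside the `config` folder may differ — check the repo itself):

```typescript
// Illustrative sketch only — the actual file in the `config` folder may differ.
// PINECONE_NAME_SPACE controls where `npm run ingest` stores your embeddings,
// and the same namespace is used later for queries and retrieval.
export const PINECONE_NAME_SPACE = "my-pdf-docs"; // pick any namespace name you like
```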
This repo can load multiple PDF files.
- Create a `docs` folder. Inside the `docs` folder, add your PDF files or folders that contain PDF files.
- Run the script `npm run ingest` to ingest and embed your docs.
- Check the Pinecone dashboard to verify that your namespace and vectors have been added.
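Because the `docs` folder can contain nested folders of PDFs, the ingest step has to discover files recursively. Here is a minimal sketch of that discovery step (`findPdfFiles` is a hypothetical helper, not the repo's actual ingest code, which also handles chunking and embedding):

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper: recursively collect every .pdf path under a directory,
// mirroring how `npm run ingest` picks up PDFs in nested folders.
function findPdfFiles(dir: string): string[] {
  const results: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      // Descend into nested folders of PDFs.
      results.push(...findPdfFiles(fullPath));
    } else if (entry.name.toLowerCase().endsWith(".pdf")) {
      results.push(fullPath);
    }
  }
  return results;
}
```

Case-insensitive matching on the extension means `report.PDF` is picked up alongside `report.pdf`.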
- In `src/index.ts`, change the `templates.qaPrompt` for your own use case. Change `modelName` in `new OpenAI` to `gpt-4` if you have access to the `gpt-4` API. Please verify outside this repo that you have access to the `gpt-4` API; otherwise the application will not work.
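As a rough illustration of what a customized `templates.qaPrompt` might look like — the `{context}`/`{question}` placeholder names follow the common LangChain convention; check `src/index.ts` for the template's actual shape:

```typescript
// Hypothetical custom QA prompt in the spirit of templates.qaPrompt.
// {context} and {question} are filled in at query time.
const qaPrompt = `You are a helpful assistant answering questions about the provided PDF documents.
Use only the context below. If the answer is not in the context, say you don't know.

Context:
{context}

Question: {question}
Helpful answer:`;

// Illustration only: substitute the placeholders the way a prompt template would.
function fillPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_, key) => vars[key] ?? "");
}
```

Grounding the prompt in the retrieved context like this is what keeps answers tied to your embedded PDFs rather than the model's general knowledge.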
Once you've verified that the embeddings and content have been successfully added to your Pinecone index, run `npm run dev` to launch the ConvoStack chatbot playground.
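If the app misbehaves at this point, a common cause is a missing or empty value in `.env`. A quick sanity check you could run before launching (`missingEnvVars` is a hypothetical helper, not part of the repo):

```typescript
// Hypothetical startup check, not part of the ConvoStack repo:
// verifies that every variable from .env is present before the app boots.
const REQUIRED_VARS = [
  "OPENAI_API_KEY",
  "PINECONE_API_KEY",
  "PINECONE_ENVIRONMENT",
  "PINECONE_INDEX_NAME",
] as const;

function missingEnvVars(env: Record<string, string | undefined> = process.env): string[] {
  // Treat unset and empty-string values as missing.
  return REQUIRED_VARS.filter((name) => !env[name]);
}

const missing = missingEnvVars();
if (missing.length > 0) {
  console.error(`Missing required environment variables: ${missing.join(", ")}`);
}
```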
Feel free to give our repo a ⭐ to support open-source AI projects: https://github.com/ConvoStack/convostack