A Streamlit application for crawling websites or processing PDFs, then answering questions about their content using retrieval-augmented generation.
- Crawl websites and index their content
- Process PDF documents
- Ask questions using similarity search
- Generate AI-powered answers via OpenAI
- Save and manage multiple indices
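
The similarity-search step in the list above can be illustrated with a small, standard-library-only sketch. It is not the app's implementation: the dependency list suggests the app uses sentence-transformers embeddings with a FAISS index, whereas this example swaps in a toy word-count embedding and a brute-force cosine comparison. The names `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question, passages, k=2):
    """Return the k passages most similar to the question, best first."""
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


passages = [
    "Streamlit turns Python scripts into web apps.",
    "FAISS is a library for efficient similarity search over dense vectors.",
    "pdfplumber extracts text and tables from PDF files.",
]
best = top_k("Which library does similarity search?", passages, k=1)
print(best[0])
```

A real index would replace `embed` with a dense encoder and the sorted scan with a nearest-neighbour lookup, but the flow is the same: embed the question, score it against stored passages, keep the top matches.
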
# Install dependencies
pip install streamlit faiss-cpu numpy sentence-transformers openai pdfplumber beautifulsoup4 requests
# Run the app
streamlit run webrag-streamlit.py

- Crawl a website or upload a PDF
- Ask questions about the content
- Get relevant passages and AI-generated answers
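
To show how retrieved passages might feed the answer-generation step, here is a minimal sketch that assembles a prompt from a question and its top passages. The function name `build_prompt`, the `max_chars` budget, and the prompt wording are all assumptions made for illustration. The app's actual prompt is not shown in this document, and the call to the OpenAI API is deliberately left out.

```python
def build_prompt(question, passages, max_chars=2000):
    """Combine retrieved passages and the question into a single prompt string."""
    context_parts = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        entry = f"[{i}] {passage.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the (hypothetical) context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt(
    "What does FAISS do?",
    ["FAISS is a library for efficient similarity search.", "Streamlit builds web apps."],
)
print(prompt)
```

Numbering the passages lets the generated answer point back to the passage it relied on, which pairs naturally with showing the relevant passages next to the answer.
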
- Python 3.7+
- OpenAI API key (for AI answer generation)
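
The two requirements above can be checked before launching the app with a short script such as the one below. This is a sketch under one assumption: that the key is supplied through the `OPENAI_API_KEY` environment variable, which the `openai` Python library reads by default. The app may instead accept the key some other way, for example through its interface. The helper name `check_environment` is hypothetical.

```python
import os
import sys


def check_environment(min_version=(3, 7), env=os.environ):
    """Return a list of human-readable problems; an empty list means ready to run."""
    problems = []
    if sys.version_info < min_version:
        problems.append(f"Python {min_version[0]}.{min_version[1]}+ is required")
    # Assumption: the key is provided via the OPENAI_API_KEY environment variable.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set; AI answer generation will be unavailable")
    return problems


for problem in check_environment():
    print("warning:", problem)
```
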