VertexCodeStudio/summarise_flashcards

# AI-Powered Flashcard Generator

## Overview 🧠

This project is an AI-powered web application that automates the creation of study materials. Built with Streamlit, it processes content from several sources: YouTube video transcripts, PDF documents, and text extracted from images via OCR.

The application splits the content into chunks and summarizes each one with the facebook/bart-large-cnn model. It then generates question-and-answer pairs with the google/flan-t5-base model and exports them as a ready-to-use Anki flashcard deck (.apkg file), complete with spaced repetition metadata for efficient learning.
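The flow described above (chunk, summarize, generate Q&A) can be sketched as follows. This is a minimal illustration, not the code in `src/app.py`: `build_flashcards` and `load_hf_callables` are hypothetical names, the prompts passed to flan-t5 are assumptions, and the pipeline task names are those available in Transformers 4.x.

```python
from typing import Callable, Iterable, List, Tuple

Summarizer = Callable[[str], str]
QAGenerator = Callable[[str], Tuple[str, str]]


def build_flashcards(
    chunks: Iterable[str],
    summarize: Summarizer,
    generate_qa: QAGenerator,
) -> List[Tuple[str, str]]:
    """Turn text chunks into (question, answer) pairs.

    The two model steps are passed in as plain callables so the flow
    can be exercised without downloading any model weights.
    """
    cards: List[Tuple[str, str]] = []
    for chunk in chunks:
        if not chunk.strip():
            continue  # skip empty or whitespace-only chunks
        summary = summarize(chunk)
        question, answer = generate_qa(summary)
        if question and answer:
            cards.append((question.strip(), answer.strip()))
    return cards


def load_hf_callables() -> Tuple[Summarizer, QAGenerator]:
    """Assumed wiring to the two models named in this README.

    Imported lazily because Transformers is a heavy dependency and the
    weights are downloaded on first use. The prompts are illustrative.
    """
    from transformers import pipeline

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    generator = pipeline("text2text-generation", model="google/flan-t5-base")

    def summarize(text: str) -> str:
        result = summarizer(text, max_length=130, min_length=30, do_sample=False)
        return result[0]["summary_text"]

    def generate_qa(summary: str) -> Tuple[str, str]:
        question = generator(
            f"Generate a question about this text: {summary}"
        )[0]["generated_text"]
        answer = generator(
            f"Answer the question using the context.\n"
            f"question: {question}\ncontext: {summary}"
        )[0]["generated_text"]
        return question, answer

    return summarize, generate_qa
```

Passing the models in as callables keeps the orchestration testable with stubs; in the app, `summarize, generate_qa = load_hf_callables()` would supply the real models.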

## Key Features ✨

  • 📄 Multi-Source Input: Ingests text directly from YouTube video links, uploaded PDFs, or images.
  • 🤖 OCR Capabilities: Extracts text from images using EasyOCR and Tesseract for preprocessing.
  • ✂️ Intelligent Text Chunking: Cleans up long, noisy transcripts or documents and splits them into manageable segments.
  • 🧠 AI-Powered Summarization: Uses the facebook/bart-large-cnn model to create concise summaries of the text chunks.
  • ❓ Automatic Q&A Generation: Uses the google/flan-t5-base model to generate question-and-answer pairs from the content for use as flashcards.
  • 🗂️ Direct Anki Export: Packages the generated Q&A pairs into a standard Anki deck file (.apkg) using genanki.
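As an illustration of the chunking step, here is a minimal sketch that normalizes whitespace and groups sentences under a word budget. It is an assumption about how such chunking could work, not the repository's implementation: `chunk_text` and `max_words` are hypothetical names, and a simple regex stands in for the spaCy or NLTK sentence splitting the project may use. The word budget is a rough proxy for the token limit of the summarization model (1024 input tokens for bart-large-cnn).

```python
import re
from typing import List


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Split text into chunks of whole sentences, each at most max_words long.

    A single sentence longer than max_words is kept intact as its own
    oversized chunk rather than being cut mid-sentence.
    """
    # Collapse newlines, tabs, and repeated spaces left over from extraction.
    cleaned = re.sub(r"\s+", " ", text).strip()
    if not cleaned:
        return []

    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", cleaned)

    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries keeps each chunk readable on its own, which tends to matter more for summarization quality than hitting an exact size.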

## Tech Stack 🛠️

  • App Framework: Streamlit
  • AI & Machine Learning: Transformers (Hugging Face), PyTorch, spaCy, NLTK
  • Data Extraction & OCR: PyMuPDF (for PDFs), EasyOCR, Pytesseract, youtube-transcript-api
  • Data Handling: Pandas, NumPy
  • Flashcard Generation: genanki
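To show how genanki fits into the stack, here is a hedged sketch of writing Q&A pairs to an .apkg file. The helper names (`stable_id`, `export_deck`), the note model layout, and the default file name are assumptions for illustration; the project's own export code may differ. genanki asks for fixed model and deck IDs, so this sketch derives them from a name instead of generating them randomly on each run.

```python
import hashlib
import html
from typing import Iterable, Tuple


def stable_id(name: str) -> int:
    """Derive a repeatable ID in the range 1 << 30 to 1 << 31 from a name."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return (1 << 30) + int.from_bytes(digest[:4], "big") % (1 << 30)


def export_deck(
    pairs: Iterable[Tuple[str, str]],
    deck_name: str = "Generated Flashcards",
    path: str = "flashcards.apkg",
) -> str:
    """Write (question, answer) pairs to an Anki package and return its path."""
    import genanki  # imported lazily so stable_id works without genanki installed

    model = genanki.Model(
        stable_id(deck_name + ":model"),
        "Basic Q&A (sketch)",
        fields=[{"name": "Question"}, {"name": "Answer"}],
        templates=[
            {
                "name": "Card 1",
                "qfmt": "{{Question}}",
                "afmt": '{{FrontSide}}<hr id="answer">{{Answer}}',
            }
        ],
    )
    deck = genanki.Deck(stable_id(deck_name), deck_name)
    for question, answer in pairs:
        # Note fields are HTML, so escape model output before adding it.
        note = genanki.Note(
            model=model,
            fields=[html.escape(question), html.escape(answer)],
        )
        deck.add_note(note)
    genanki.Package(deck).write_to_file(path)
    return path
```

For example, `export_deck([("What is mitosis?", "Cell division.")])` would write `flashcards.apkg` to the current directory, ready to import into Anki.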

## How to Run Locally

To get this project running on your local machine, follow these steps:

  1. Clone the repository:
    git clone https://github.com/VertexCodeStudio/summarise-flashcards.git
  2. Navigate to the project directory:
    cd summarise-flashcards
  3. Install the required Python packages:
    pip install -r requirements.txt
  4. Download the necessary spaCy language model:
    python -m spacy download en_core_web_sm
  5. Run the Streamlit application:
    streamlit run src/app.py

Streamlit then serves the application locally and opens it in your web browser (by default at http://localhost:8501).
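If the app fails to start, a quick check of the installed dependencies can help narrow things down. The script below is a small optional sketch: the module list is inferred from the Tech Stack section (for example, PyMuPDF is commonly imported as `fitz`), so treat `requirements.txt` as the authoritative source. `missing_modules` is a hypothetical helper name.

```python
import importlib.util
from typing import Iterable, List

# Import names assumed from the Tech Stack list; requirements.txt is authoritative.
EXPECTED_MODULES = (
    "streamlit",
    "transformers",
    "torch",
    "spacy",
    "nltk",
    "fitz",  # PyMuPDF
    "easyocr",
    "pytesseract",
    "youtube_transcript_api",
    "pandas",
    "numpy",
    "genanki",
)


def missing_modules(names: Iterable[str] = EXPECTED_MODULES) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Note that pytesseract also needs the Tesseract binary installed on the system, which this check does not cover.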
