
AI learning platform with roadmaps, content search, PDF/audio analysis, and project profiles. Built with FastAPI


itzTiru/LearnLinkAI-internal-server

learnLinkAI-internal-server

Overview

LearnLinkAI is an AI-powered search engine that recommends educational content from YouTube and the web. It uses natural language processing, embeddings, and external APIs to fetch, rank, and store relevant content. The backend is built with FastAPI, integrates with Google APIs for search, and uses AssemblyAI for audio transcription. Content is ranked by cosine similarity between embeddings generated by a HuggingFace model, and results are stored in a SQLite database for efficient retrieval.

Features

  • Multi-Platform Search: Retrieves educational content from YouTube and web sources using Programmable Search and YouTube APIs.
  • AI-Powered Ranking: Uses sentence-transformers (all-MiniLM-L6-v2) to generate embeddings and rank results by cosine similarity.
  • Audio Transcription: Supports audio file transcription via AssemblyAI for voice-based queries.
  • Database Storage: Stores content metadata and embeddings in a SQLite database for persistence and efficient querying.
  • FastAPI Backend: Provides a high-performance API with CORS support for integration with frontends (e.g., Next.js).
  • Gemini API Integration: Fetches concise educational answers for queries using Google's Gemini model.
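
As a rough illustration of the ranking idea, the sketch below scores candidates by cosine similarity against a query vector and sorts them. It is not the code from main.py: the function names and the toy 3-dimensional vectors are assumptions made for the example, standing in for real all-MiniLM-L6-v2 embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as unrelated
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, items):
    """Sort (title, embedding) pairs from most to least similar to the query."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in items]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors (hypothetical data); real all-MiniLM-L6-v2
# embeddings have 384 dimensions.
query = [1.0, 0.0, 1.0]
candidates = [
    ("Cooking pasta", [0.0, 1.0, 0.0]),
    ("Intro to ML", [1.0, 0.0, 1.0]),
    ("Statistics primer", [1.0, 1.0, 0.0]),
]
ranked = rank_by_similarity(query, candidates)
```

Here `ranked` lists "Intro to ML" first because its vector points in the same direction as the query, and "Cooking pasta" last because it is orthogonal to it.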

Requirements

To run LearnLinkAI, ensure you have Python 3.8+ and the dependencies listed in requirements.txt. Key dependencies include:

  • FastAPI: For the API server.
  • HuggingFace Embeddings: For text embeddings (sentence-transformers/all-MiniLM-L6-v2).
  • SQLAlchemy: For SQLite database interactions.
  • Google APIs: For YouTube and web search.
  • AssemblyAI: For audio transcription.
  • PyTorch: For handling embeddings.

Install dependencies using:

    pip install -r requirements.txt

Setup

  1. Clone the Repository:

         git clone <repository-url>
         cd LearnLinkAIfile

  2. Environment Variables: Create a .env file in the project root with the following:

         YOUTUBE_API_KEY=
         GOOGLE_API_KEY=
         GOOGLE_CSE_ID=
         GEMINI_API_KEY=
         ASSEMBLYAI_API_KEY=

  3. Database Initialization: The SQLite database (content.db) is automatically created when you run the application for the first time, thanks to the Base.metadata.create_all call in database.py.

  4. Run the Application: Start the FastAPI server using Uvicorn:

         uvicorn main:app --reload

    The API will be available at http://localhost:8000.
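
Before starting the server, it can help to confirm that the keys from step 2 are actually set. The sketch below is one possible check, not part of the project: `missing_keys` is a hypothetical helper, and it assumes the .env values have already been loaded into the process environment (how the project loads them is not described here).

```python
import os

# Key names taken from the .env listing in step 2.
REQUIRED_KEYS = [
    "YOUTUBE_API_KEY",
    "GOOGLE_API_KEY",
    "GOOGLE_CSE_ID",
    "GEMINI_API_KEY",
    "ASSEMBLYAI_API_KEY",
]


def missing_keys(env=None):
    """Return the required keys that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Calling `missing_keys()` with no arguments inspects `os.environ`; an empty list means every key has a non-empty value.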

API Endpoints

  • GET /: Returns a welcome message for the API.
  • POST /search: Searches for educational content based on a query. Expects a JSON payload with query (string), max_results (int, default 5), and platforms (list of strings, e.g., ["youtube", "web"]).
    • Example:

          {
            "query": "machine learning basics",
            "max_results": 50,
            "platforms": ["youtube", "web"]
          }

    • Response includes query, counts (YouTube, web, ranked, stored_new), and ranked results.

  • POST /aiinfo: Fetches concise AI-generated answers for a query using the Gemini API. Expects a JSON payload with query.
    • Example:

          {
            "query": "machine learning basics",
            "max_results": 5,
            "platforms": ["ai"]
          }

    • Response includes query and AI-generated answers with references.

  • POST /transcribe: Transcribes an uploaded audio file using AssemblyAI. Expects a file upload.
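
For calling /search from a script, a minimal client sketch using only the standard library might look like the following. The helper name `build_search_request` is hypothetical; the payload fields, their defaults, and the base URL follow the descriptions above. The request is built but not sent, so the snippet runs without a live server.

```python
import json
import urllib.request


def build_search_request(query, max_results=5, platforms=("youtube", "web"),
                         base_url="http://localhost:8000"):
    """Build (but do not send) a POST /search request with a JSON body."""
    payload = {
        "query": query,
        "max_results": max_results,
        "platforms": list(platforms),
    }
    return urllib.request.Request(
        url=f"{base_url}/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("machine learning basics")

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

The same pattern should work for /aiinfo by changing the path and payload; /transcribe expects a file upload instead of JSON, so it needs a multipart request.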

Project Structure

  • main.py: FastAPI application with search, AI info, and transcription endpoints.
  • database.py: SQLAlchemy setup for SQLite database and ORM model for content storage.
  • requirements.txt: List of Python dependencies.
  • .env: Environment variables for API keys (not included in version control).

Usage

  1. Search for Content: Use the /search endpoint to fetch and rank educational content. Results are ranked by relevance using cosine similarity on embeddings and stored in the database for future queries.

  2. AI-Generated Answers: Use the /aiinfo endpoint to get concise answers with references for educational queries.

  3. Transcribe Audio: Upload audio files to the /transcribe endpoint to convert voice queries into text.

Notes

  • Performance: The application uses batched embeddings for efficient ranking and minimizes database commits for speed.
  • Scalability: The SQLite database is suitable for small to medium-scale applications. For larger datasets, consider switching to a more robust database like PostgreSQL.
  • Error Handling: The API includes comprehensive error handling for API failures, invalid inputs, and transcription errors.
  • Hardware: For faster embeddings, use a GPU by setting model_kwargs={"device": "cuda"} in main.py. CPU is used by default for compatibility.
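
Rather than hard-coding the device, one option is to pick it at startup. The sketch below is an assumption about how that could be done, not the project's code: `pick_device` is a hypothetical helper that falls back to CPU when PyTorch is missing or reports no usable GPU.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed; fall back to CPU
    return "cuda" if torch.cuda.is_available() else "cpu"


# The model_kwargs name comes from the Hardware note above; how main.py
# passes it to the embedding model is not shown here.
model_kwargs = {"device": pick_device()}
```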
