LearnLinkAI is an AI-powered search engine that recommends educational content from platforms such as YouTube and the web. It uses natural language processing, embeddings, and external APIs to fetch, rank, and store relevant content. The backend is built with FastAPI, integrates Google APIs for search, and uses AssemblyAI for audio transcription. Content is ranked by cosine similarity over embeddings generated by a HuggingFace model, and results are stored in a SQLite database for efficient retrieval.
- Multi-Platform Search: Retrieves educational content from YouTube and web sources using Programmable Search and YouTube APIs.
- AI-Powered Ranking: Uses sentence-transformers (all-MiniLM-L6-v2) to generate embeddings and rank results by cosine similarity.
- Audio Transcription: Supports audio file transcription via AssemblyAI for voice-based queries.
- Database Storage: Stores content metadata and embeddings in a SQLite database for persistence and efficient querying.
- FastAPI Backend: Provides a high-performance API with CORS support for integration with frontends (e.g., Next.js).
- Gemini API Integration: Fetches concise educational answers for queries using Google's Gemini model.
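The ranking step above can be sketched as follows. This is a minimal illustration in plain Python rather than a call to the actual sentence-transformers model, and the toy vectors are made up; all-MiniLM-L6-v2 actually produces 384-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, items):
    """Sort (title, embedding) pairs by similarity to the query embedding."""
    scored = [(title, cosine_similarity(query_vec, vec)) for title, vec in items]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional embeddings stand in for real model output.
query = [0.9, 0.1, 0.0]
items = [
    ("Intro to ML", [0.8, 0.2, 0.1]),
    ("Cooking basics", [0.0, 0.1, 0.9]),
]
ranked = rank_by_similarity(query, items)
```

In the application itself, the query and each candidate result are embedded with the same model, so documents about similar topics end up close in the vector space and score near 1.0.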
To run LearnLinkAI, ensure you have Python 3.8+ and the dependencies listed in requirements.txt. Key dependencies include:
- FastAPI: For the API server.
- HuggingFace Embeddings: For text embeddings (sentence-transformers/all-MiniLM-L6-v2).
- SQLAlchemy: For SQLite database interactions.
- Google APIs: For YouTube and web search.
- AssemblyAI: For audio transcription.
- PyTorch: For handling embeddings.
Install dependencies with:

```bash
pip install -r requirements.txt
```
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd LearnLinkAIfile
  ```
- Environment variables: create a .env file in the project root with the following keys:

  ```plaintext
  YOUTUBE_API_KEY=
  GOOGLE_API_KEY=
  GOOGLE_CSE_ID=
  GEMINI_API_KEY=
  ASSEMBLYAI_API_KEY=
  ```
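Since all five keys must be set before the external APIs will work, a startup check can be sketched with only the standard library. The missing_keys helper below is hypothetical (the project may load .env differently, e.g. with python-dotenv):

```python
import os

REQUIRED_KEYS = [
    "YOUTUBE_API_KEY",
    "GOOGLE_API_KEY",
    "GOOGLE_CSE_ID",
    "GEMINI_API_KEY",
    "ASSEMBLYAI_API_KEY",
]

def missing_keys(environ=os.environ):
    """Return the names of any required API keys not set (or empty)."""
    return [key for key in REQUIRED_KEYS if not environ.get(key)]
```

Calling missing_keys() at startup and failing fast with the list of absent keys is easier to debug than a late HTTP error from one of the upstream APIs.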
- Database initialization: the SQLite database (content.db) is created automatically the first time the application runs, via the Base.metadata.create_all call in database.py.
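For illustration, the same storage idea can be sketched with the standard-library sqlite3 module. The actual project uses SQLAlchemy in database.py, and the table and column names below are assumptions:

```python
import json
import sqlite3

# Illustrative schema only; the real ORM model lives in database.py and
# its column names may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS content (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL,
    url TEXT UNIQUE NOT NULL,
    platform TEXT NOT NULL,
    embedding TEXT NOT NULL  -- embedding vector serialized as JSON
)
"""

def store_content(conn, title, url, platform, embedding):
    """Insert a result; the UNIQUE url plus OR IGNORE skips duplicates."""
    conn.execute(
        "INSERT OR IGNORE INTO content (title, url, platform, embedding) "
        "VALUES (?, ?, ?, ?)",
        (title, url, platform, json.dumps(embedding)),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
store_content(conn, "Intro to ML", "https://example.com/ml", "web", [0.1, 0.2])
rows = conn.execute("SELECT title, platform FROM content").fetchall()
```

Persisting the serialized embedding alongside the metadata is what lets previously stored results be re-ranked against new queries without re-embedding them.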
- Run the application: start the FastAPI server with Uvicorn:

  ```bash
  uvicorn main:app --reload
  ```

  The API will be available at http://localhost:8000.
- GET /: Returns a welcome message for the API.
- POST /search: Searches for educational content based on a query. Expects a JSON payload with query (string), max_results (int, default 5), and platforms (list of strings, e.g., ["youtube", "web"]).
  Example request:

  ```json
  {
    "query": "machine learning basics",
    "max_results": 5,
    "platforms": ["youtube", "web"]
  }
  ```
  The response includes the query, result counts (YouTube, web, ranked, stored_new), and the ranked results.
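  As a rough illustration, a /search response might look like the following. The field names follow the description above, but the exact shape and the values are assumptions:

  ```json
  {
    "query": "machine learning basics",
    "counts": { "youtube": 3, "web": 2, "ranked": 5, "stored_new": 4 },
    "results": [
      {
        "title": "Machine Learning Crash Course",
        "url": "https://www.youtube.com/watch?v=...",
        "platform": "youtube",
        "score": 0.91
      }
    ]
  }
  ```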
- POST /aiinfo: Fetches concise AI-generated answers for a query using the Gemini API. Expects a JSON payload with query.
  Example request:

  ```json
  { "query": "machine learning basics" }
  ```
  The response includes the query and AI-generated answers with references.
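  An /aiinfo response might look roughly like this; the field names and values below are assumptions for illustration:

  ```json
  {
    "query": "machine learning basics",
    "answer": "Machine learning is a branch of AI in which ...",
    "references": ["https://example.com/intro-to-ml"]
  }
  ```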
- POST /transcribe: Transcribes an uploaded audio file using AssemblyAI. Expects a file upload.
  Example (using curl):

  ```bash
  curl -X POST -F "file=@audio.wav" http://localhost:8000/transcribe
  ```
  The response includes the transcribed text.
- main.py: FastAPI application with search, AI info, and transcription endpoints.
- database.py: SQLAlchemy setup for SQLite database and ORM model for content storage.
- requirements.txt: List of Python dependencies.
- .env: Environment variables for API keys (not included in version control).
- Search for content: use the /search endpoint to fetch and rank educational content. Results are ranked by relevance using cosine similarity on embeddings and stored in the database for future queries.
- AI-generated answers: use the /aiinfo endpoint to get concise answers with references for educational queries.
- Transcribe audio: upload audio files to the /transcribe endpoint to convert voice queries into text.
- Performance: The application uses batched embeddings for efficient ranking and minimizes database commits for speed.
- Scalability: The SQLite database is suitable for small to medium-scale applications. For larger datasets, consider switching to a more robust database like PostgreSQL.
- Error Handling: The API includes comprehensive error handling for API failures, invalid inputs, and transcription errors.
- Hardware: For faster embeddings, use a GPU by setting model_kwargs={"device": "cuda"} in main.py. CPU is used by default for compatibility.
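The batching mentioned in the performance note can be illustrated with a small helper. Here embed_batch is a hypothetical stand-in for a real model call such as HuggingFaceEmbeddings.embed_documents; the toy "model" just embeds each text as its length, purely to show the control flow:

```python
def batched(items, batch_size):
    """Yield successive fixed-size batches from a list."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def embed_all(texts, embed_batch, batch_size=32):
    """Embed texts in batches so the model sees many inputs per call."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors

# Toy "model": embeds each text as [length], standing in for a real encoder.
fake_embed = lambda batch: [[float(len(t))] for t in batch]
vectors = embed_all(["a", "bb", "ccc"], fake_embed, batch_size=2)
```

Passing a whole batch to the model per call amortizes per-call overhead (and, on GPU, fills the device better) compared with embedding one text at a time.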