Stars
A New Tamil Large Language Model (LLM) Based on Llama 2
An evolving list of electronic media data sets used to model mental-health status.
Performing sentiment analysis for binary classification with neural networks.
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
The open source Firebase alternative. Supabase gives you a dedicated Postgres database to build your web, mobile, and AI applications.
This repository holds the code for working with data from counselchat.com
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
Generating responses with pretrained XLNet and GPT-2 in PyTorch.
The OWASP Cheat Sheet Series was created to provide a concise collection of high value information on specific application security topics.
Convert PDF to markdown + JSON quickly with high accuracy
The dataset comprises resumes collected from various sources, including Google Images, Bing Images, and the website LiveCareer. Each resume entry consists of two columns: "Category" and "Text".
A roadmap describing the required skills, learning resources and sample tools to become an AI Engineer
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data…
Robust Speech Recognition via Large-Scale Weak Supervision
Granite Snack Cookbook: easily consumable recipes (Python notebooks) that showcase the capabilities of the Granite models
Agno is a lightweight framework for building multi-modal Agents
Access large language models from the command-line
AgentQL is an AI-powered query language for web scraping and automation. It uses natural language selectors to find data on any page, including authenticated content. AgentQL queries are self-heali…
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
This repository contains Python code to create a corpus of 12,215 terms of service documents scraped from TOSDR, intended for legal, privacy, and natural language processing research.