This is a dataset intended to train an LLM on entirely CVE-focused inputs and outputs.
Updated Jun 22, 2025 - Python
Structured dataset of Valmiki Ramayana 📜 | Sanskrit Shlokas, Translations, & Explanations for AI & NLP🚀 Contributions welcome!
Python tool for capturing and logging human-computer interactions. Generate rich datasets for training multi-modal LLMs in autonomous computer control. Features screenshot, mouse, keyboard, and audio recording.
AI SEO platform built with Nuxt
Open-source repository providing AI-generated IELTS practice tests (Listening, Reading, Writing) in JSON/Markdown. High-quality, royalty-free datasets designed for seamless integration into enterprise educational applications.
First Open Nepal-Specific Agricultural AI Dataset & Model
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Ugaritic[Alpbabet] category for AI2001, containing Ugaritic alphabet linguistic data
Star Wars: Legion rules and unit data translated into pure JSON and Markdown. Built specifically to help AI agents and developers accurately parse Legion 2.5 gameplay mechanics.
AI-first machine-readable cryptocurrency reference dataset
This repository contains structured data about speculative fiction books, comics, and short stories. Created specifically for AI assistants and recommender systems.
The book "Притяжения не существует"
Public dataset of Agent Manifest declarations registered through the Agent Manifest registry.
Enterprise-grade AI dataset generator with 93M+ samples across 23 categories. Pure Python, zero dependencies.
Weton Personality Dataset is an Indonesian–Javanese hybrid conversational NLP dataset designed for personality-conditioned AI response modeling based on traditional Javanese weton characteristics. The dataset includes contextual dialogues, emotional interactions, multilingual responses, and human evaluation scoring for NLP and LLM training.
VID2IMG LITE is a lightweight offline desktop tool for extracting frames from video files into image sequences. Built with Python, OpenCV, and ttkbootstrap, it supports batch processing, drag-and-drop input, configurable frame intervals, and real-time progress tracking. Designed for creators, editors, and AI dataset workflows, it delivers fast and
Cross-engine game development for AI research: Godot 4, Defold, Solar 2D, Panda 3D, Stride (Xenko). Five projects generated from one Python pipeline — deterministic, diff-friendly, CI-driven.
Code-first EDA portfolio — KiCad, LibrePCB, Ngspice, Qucs-s pipelines for AI research
Senior 2D & 3D digital media portfolio — GIMP, Inkscape, Krita, Libresprite, Blender, Godot. Reproducible Python pipelines producing AI-ready assets.
Fesothe TechDeck: a structured identity system containing persona profiles and datasets in machine-readable formats.
Open technical documentation standard for residential roof inspection evidence capture, hail damage documentation, wind uplift documentation, and AI-readable roofing reports.