Scalable data pre-processing and curation toolkit for LLMs
Visual AI development framework for training and inference of ML models, scaling pipelines, and automating workflows with Python.
convtools is a specialized Python library for dynamic, declarative data transformations with automatic code generation.
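The core idea behind such libraries is compiling a declarative spec into a specialized function once, so per-row work is cheap. A minimal pure-Python sketch of that idea (illustrative only; the `gen_converter` function below is hypothetical and is not convtools' actual API):

```python
def gen_converter(field):
    """Compile a tiny converter that extracts `field` from each dict."""
    # Build source code for a specialized function, then exec it once,
    # so running the converter avoids re-interpreting the spec per row.
    src = f"def _converter(rows):\n    return [row[{field!r}] for row in rows]"
    namespace = {}
    exec(src, namespace)
    return namespace["_converter"]

extract_temp = gen_converter("temperature")
extract_temp([{"temperature": 21.5}, {"temperature": 19.0}])  # [21.5, 19.0]
```

Real implementations generate far richer code (nested lookups, aggregations, type casts), but the compile-once, run-many pattern is the same.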
Making it easier to navigate and clean TAHMO weather station data for ML development
A simple, general-purpose pipeline framework.
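A general-purpose pipeline framework usually reduces to an ordered list of callables applied to the data in sequence. A minimal sketch of that pattern (hypothetical class, not any specific library's API):

```python
from functools import reduce

class Pipeline:
    """Apply a fixed sequence of stages, each a callable, in order."""

    def __init__(self, *stages):
        self.stages = stages

    def run(self, data):
        # Thread the data through each stage left to right.
        return reduce(lambda acc, stage: stage(acc), self.stages, data)

clean = Pipeline(str.strip, str.lower)
clean.run("  Hello World  ")  # "hello world"
```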
Artifician is an event-driven framework designed to simplify and accelerate the process of preparing datasets for Artificial Intelligence models.
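In an event-driven design, preparation steps subscribe to dataset events and run when an event fires, so new processors can be added without touching the ingestion code. A sketch of that idea (the `EventBus` names below are hypothetical, not Artifician's API):

```python
from collections import defaultdict

class EventBus:
    """Route payloads through all handlers subscribed to an event."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event, handler):
        self._handlers[event].append(handler)

    def publish(self, event, payload):
        # Each handler receives the previous handler's output.
        for handler in self._handlers[event]:
            payload = handler(payload)
        return payload

bus = EventBus()
bus.subscribe("sample_added", lambda s: {**s, "text": s["text"].strip()})
bus.subscribe("sample_added", lambda s: {**s, "tokens": s["text"].split()})
bus.publish("sample_added", {"text": "  hello world "})
```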
A pipeline that consumes Twitter data to extract meaningful insights about a variety of topics using the following technologies: Twitter API, Kafka, MongoDB, and Tableau.
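The shape of such a pipeline is produce-to-broker, consume-and-transform, then persist. A toy stand-in using only the standard library, with `queue.Queue` in place of Kafka and a list in place of a MongoDB collection (illustrative only):

```python
import queue

broker = queue.Queue()   # stands in for a Kafka topic
store = []               # stands in for a MongoDB collection

def produce(tweets):
    for tweet in tweets:
        broker.put(tweet)

def consume():
    while not broker.empty():
        tweet = broker.get()
        # Extract a simple "insight": hashtags mentioned in the tweet.
        tags = [w for w in tweet["text"].split() if w.startswith("#")]
        store.append({"id": tweet["id"], "hashtags": tags})

produce([{"id": 1, "text": "loving #python and #kafka"}])
consume()
# store now holds [{"id": 1, "hashtags": ["#python", "#kafka"]}]
```

In the real system the consumer would run continuously and a BI tool like Tableau would read from the store; the decoupling through the broker is what lets each side scale independently.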
Streamlit app to export Plex music metadata and bulk-update tags from CSV
The Resume Application Tracking System uses Google Gemini Pro Vision to automatically parse, analyze, and categorize resumes for efficient recruitment. It integrates AI-driven vision capabilities to enhance resume processing and candidate selection.