Supports Yolov5(4.0)/Yolov5(5.0)/YoloR/YoloX/Yolov4/Yolov3/CenterNet/CenterFace/RetinaFace/Classify/Unet. Converts darknet/libtorch/pytorch/mxnet models to ONNX and then to TensorRT.
Updated Aug 2, 2021 - C++
TorchServe server running a YoloV5 model in Docker with GPU support and static batch inference, for production-ready, real-time inference.
Analyze and generate unstructured data using LLMs, from quick experiments to billion token jobs.
Batch LLM Inference with Ray Data LLM: From Simple to Advanced
PipelineScheduler optimizes workload distribution between servers and edge devices, setting optimal batch sizes to maximize throughput and minimize latency amid content dynamics and network instability. It also addresses resource contention with spatiotemporal inference scheduling to reduce co-location interference.
Ray Saturday Dec 2022 edition
Torchfusion provides very opinionated torch inference on datafusion.
Serve PyTorch inference requests with Redis-backed batching for faster performance.
Supports batch inference for Grounding DINO. "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Self-hosted batch LLM pipeline for analyzing customer feedback from Excel. Upload xlsx, describe the task, configure output fields — get structured results. Works with any OpenAI-compatible API and Ollama.
LightGBM Inference on Datafusion
Simple Ollama JSONL batch inference tool.
Neural network classifier with training, evaluation, calibration, and prediction using PyTorch.
This repository provides sample code showing how to use AutoML image classification or object detection in the Azure ML (AML) environment.
🚀 Process JSON data in batches with `llm-batch`, leveraging sequential or parallel modes for efficient interaction with LLMs.
This repo simulates how an ML model moves to production in an industry setting. The goal is to build, deploy, monitor, and retrain a sentiment analysis model using Kubernetes (minikube) and FastAPI.
sdkgenai 🛠️🔃📦: Gen AI SDK covering model parameters, safety filters, multi-turn chat, content streaming, asynchronous requests, token counting, context caching, function calling, batch prediction, and text embeddings.
Fast self-hosted embedding engine for search, RAG, and reindexing workloads on NVIDIA GPUs. Built in Rust + TensorRT for teams that care about scale, cost, and control.
Production-ready credit risk modeling platform built with Streamlit and scikit-learn to predict loan default probability, generate 300–900 credit scores, explain decisions with SHAP, run what-if simulations, batch-score CSV files, and export PDF assessment reports.
Indian Data Club: Databricks 14-Days Challenge-2 is designed to help beginners build a strong foundation in Databricks through daily learning, hands-on practice, and problem solving.