Stars
Make websites accessible for AI agents
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
YOLO v5 Object Detection on Triton Inference Server
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
fabric is an open-source framework for augmenting humans using AI. It provides a modular framework for solving specific problems using a crowdsourced set of AI prompts that can be used anywhere.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Large Language Model Text Generation Inference
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
A modular graph-based Retrieval-Augmented Generation (RAG) system
A curated list of awesome synthetic data for text location and recognition
Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.
A synthetic data generator for text recognition
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Experiments with the Microsoft Phi-3 vision language model: image captioning, OCR, and data extraction.
Quick exploration into fine-tuning Florence-2
An annotated implementation of the Transformer paper.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Run Mixtral-8x7B models in Colab or consumer desktops
TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients.
This project presents a RAG chat app for the Speckle Developer Documentation.
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.