- Karachi, Pakistan
- https://muhammadbilal848.github.io
- in/bilal-haneef-32014a1a2
- bilalhaneef484
- @MBHQs
Stars
Code and Slides
UTRNet: High-Resolution Urdu Text Recognition In Printed Documents (ICDAR'23)
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
New repo collection for NVIDIA Cosmos: https://github.com/nvidia-cosmos
smolagents: a barebones library for agents that think in python code.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
A diffusers pipeline for zero shot stylised portrait creation
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Document to Markdown OCR library with Llama 3.2 vision
Face Analysis: Detection, Age Gender Estimation & Recognition
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Inpaint anything using Segment Anything and inpainting models.
Welcome to the Llama Cookbook! This is your go to guide for Building with Llama: Getting started with Inference, Fine-Tuning, RAG. We also show you how to solve end to end problems using Llama mode…
Collection of Kaggle Solutions and Ideas
The Microsoft Bot Framework provides what you need to build and connect intelligent bots that interact naturally wherever your users are talking, from text/sms to Skype, Slack, Office 365 mail and …
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Text-Prompted Generative Audio Model
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Robust Speech Recognition via Large-Scale Weak Supervision
Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
Faster Whisper transcription with CTranslate2