- Carnegie Mellon University
- Pittsburgh, PA
- https://akshaj.ai
- in/akshaj-jain
Highlights
- Pro
Stars
Whisper realtime streaming for long speech-to-text transcription and translation
ViT Prisma is a mechanistic interpretability library for Vision Transformers (ViTs).
thevoicecompany / gazelle-train
Forked from tincans-ai/gazelle
Joint speech-language model - respond directly to audio!
Website for hosting the Open Foundation Models Cheat Sheet.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Mass-editing thousands of facts into a transformer memory (ICLR 2023)
A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)
Erasing concepts from neural representations with provable guarantees
Examples and guides for using the OpenAI API
A ChatGPT plugin that allows you to load and edit your local files in a controlled way, as well as run any Python, JavaScript, and bash script.
ImageBind: One Embedding Space to Bind Them All
Unsupervised Speech Decomposition Via Triple Information Bottleneck
A dataset featuring diverse dialogues between two ChatGPT (gpt-3.5-turbo) instances with system messages written by GPT-4. Covering various contexts and tasks (task-oriented dialogue systems, abstr…
Neural network-based singing voice synthesis library for research
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Diffusers Stable Diffusion as a Cog model
A collection of resources and papers on Diffusion Models
CMU Lecture: Machine Learning In Production / AI Engineering / Software Engineering for AI-Enabled Systems (SE4AI)
Software Engineering for AI/ML -- An Annotated Bibliography
A Python library to easily use and configure a suite of services provided by Lethical.AI (https://lethical.ai)